Job Description
Summary
At ETS Team, we take pride in developing groundbreaking, world-changing platforms and services. Our ETS applications play a crucial role in supporting the Apple ecosystem by offering identity management, factory and device support, infrastructure support, platform support, and collaboration tools. Whether you're logging into Apple, making a purchase, or enabling Apple devices, our applications are there every step of the way, ensuring a flawless and secure experience.
Description
This role is designed for driven individuals who:
- Love learning new technologies and thrive in solving sophisticated challenges.
- Are independent, motivated, and excited to take on ambitious projects.
- Excel at collaborating with engineering teams and can stay calm under pressure.
- Have a passion for delivering quality, reliable solutions in a dynamic, high-energy workplace.
We are seeking dedicated Site Reliability Engineers (SREs) at all levels of experience, from junior to senior, to join our teams.
Minimum Qualifications
- 3+ years of experience in Site Reliability Engineering, DevOps, Software Engineering, or a related field
- Strong foundation in a programming language (Java) or a scripting language (Python / Bash / Lua)
- Hands-on experience with one or more databases, relational or NoSQL (e.g., Oracle, MongoDB)
- Education: Bachelor’s or Master’s degree in Computer Science or a related field (or equivalent practical experience)
Preferred Qualifications
- Hands-on experience with monitoring and logging tools (e.g., Prometheus, Splunk, Grafana, CloudWatch)
- Proficiency in Linux and networking concepts (TLS/SSL, DNS, load balancers, etc.), with troubleshooting skills in large-scale environments
- Experience with source control management such as Git, and an understanding of CI/CD, release engineering, and DevOps
- Understanding of security standards, policies, and cryptography
- Experience with incident and problem management and root cause analysis (RCA)
- Strong networking and load balancing experience (Nginx, Envoy, NetScaler) is a huge plus
- Solid understanding of Kubernetes concepts such as networking, storage, Secrets, Deployments, and containers. Experience with AWS or GCP is preferred
- Knowledge of or experience in governance and compliance
- Understanding of SRE principles, including observability, error budgeting, service reliability measurement through SLAs, SLOs, and SLIs, the corresponding telemetry standards and practices, and product feedback
- Strong analytical skills