Job Description
Summary
Do you want to make Siri and Apple products even smarter for our users? The Foundation Model Batch Inference team is building groundbreaking technology for large-scale inference of foundation models, including innovative large language models (LLMs) and multimodal models. Our batch inference platform powers billions of foundation model inference queries across a variety of Apple products.
As part of this group, you will work in one of the most exciting high-performance computing environments for foundation model inference, with petabytes of data and billions of queries, and you will have the opportunity to imagine and build products that delight our customers every single day.
Description
Join us and:
* build and optimize large-scale batch inference solutions
* build scalable and efficient systems to fully utilize a diverse, high-performance GPU fleet
Minimum Qualifications
- Strong coding skills
- Strong background in computer science: algorithms and data structures
- Strong experience with Docker containerization and Kubernetes orchestration; familiarity with AWS EKS, Amazon S3, or GCP
- Extensive expertise in designing robust, large-scale backend systems, accounting for performance, scalability, security, and maintainability
- Excellent interpersonal skills; able to work independently as well as in a team
Preferred Qualifications
- Familiarity with a popular ML framework such as PyTorch or TensorFlow
- Familiarity with foundation model architectures such as Transformers and encoder/decoder models
- Familiarity with foundation model inference
- Familiarity with MapReduce-style batch jobs