Job Description
Summary
As a Senior ML Product Evaluation Engineer, you will contribute to the high-quality release of innovative features across multiple platforms by leading evaluation efforts for Apple Intelligence and next-generation Siri. You will play a critical role in shaping the future of Siri and Apple Intelligence. This involves defining evaluation strategies for LLM-powered products that align with the functional testing scope and ensure seamless, reliable user experiences. Join our team to push the boundaries of AI technology and enhance how users interact with intelligent systems.
Description
In this role, you will:
* Lead the design and execution of test plans for features across various platforms.
* Test and evaluate the ML models powering Siri for accuracy, performance, and stability.
* Create datasets and conduct model performance evaluations to ensure ML models meet required standards.
* Debug complex issues by analyzing logs and collaborating with developers to resolve root causes efficiently.
* Provide detailed test reports, highlight risks, and ensure issues are addressed before product release.
* Collaborate with data scientists and ML engineers to validate model deployment and performance in production environments.
Minimum Qualifications
- 7+ years of experience with ML model testing and evaluation in production environments.
- 7+ years of experience working with distributed systems that include multiple sub-systems and orchestration components.
- Strong experience in creating datasets for model evaluation and conducting performance benchmarks.
- Advanced debugging skills, including log stream analysis and issue reproduction.
- Familiarity with ML/MLOps frameworks such as TensorFlow, PyTorch, or MLflow.
- Proactive and creative mindset with a can-do attitude and a strong focus on delivering high-quality results.
Preferred Qualifications
- Experience validating the performance and scalability of machine learning models in a production setting.
- Ability to lead and influence testing initiatives in fast-paced, dynamic environments.
- Knowledge of Swift, XCTest, or equivalent tools is a plus.
- Proficiency in configuring and maintaining CI/CD pipelines using tools such as GitHub, TeamCity, Jenkins, or similar platforms.
- Master’s degree in Machine Learning, Data Science, or a related field. A PhD in Machine Learning or Artificial Intelligence is preferred.