Job Description
Summary
Description
- Develop novel ML/DL models, including LLMs, for natural language and conversational understanding.
- Work with software and platform engineers to convert and compile the models to run on device and integrate them with runtime systems.
- Optimize the latency and performance of the models.
- Diagnose model errors and perform accuracy hill climbing (iterative accuracy improvement) for shipped products.
Minimum Qualifications
- Solid understanding of state-of-the-art technology in machine learning, deep learning (including LLMs) and natural language understanding.
- Excellent problem-solving (e.g., building forward-looking prototype systems), critical-thinking, communication, and collaboration skills.
Preferred Qualifications
- 3-5+ years of proven programming experience with standard ML languages and tools such as C/C++, Python, PyTorch, TensorFlow, Hugging Face, etc.
- Hands-on experience training, fine-tuning, optimizing, and deploying large models (e.g., LLMs).
- Hands-on experience applying common machine learning optimization techniques, such as quantization and distillation, to reduce resource consumption and/or latency.
- Publications at top ML/DL/NLP conferences such as NeurIPS, ACL, EMNLP, etc. are a plus.