Key job responsibilities
- Collaborate closely with applied scientists on machine learning tasks ranging from ML code & data management to training and deployment of ML models.
- Research and develop stability improvements and optimizations to continuously improve training KPIs such as uptime, throughput, and goodput.
- Collaborate with software engineering teams to operationalize and scale training improvements across experimentation and production workloads.
About the team
As part of the Amazon General Intelligence team, AGI Modeling Services provides training capabilities and services to accelerate the invention of state-of-the-art (SoTA) models across all modalities and their derivatives. These services include high-performance ML infrastructure, modeling toolkits, and optimized MLOps workflows for AGI scientists to build, train, and release their models. We need your help to build the advancements required to make that a reality.
Basic qualifications
- 3+ years of non-internship professional software development experience
- 2+ years of non-internship experience in the design or architecture (design patterns, reliability, and scaling) of new and existing systems
- Experience programming with at least one software programming language
- 3+ years of experience across the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
- Bachelor's degree in computer science or equivalent