Job Description
Responsibilities
- Engage directly with key partners to understand state-of-the-art LLMs and diffusion models, and run them at scale in a performant and cost-effective manner
- Leverage the latest improvements in hardware stack technologies such as CUDA and InfiniBand, along with the fast-moving software stack, to deliver best-in-class inference
- Anticipate, identify, assess, track, and mitigate project risks and issues in a fast-paced, startup-like environment
- Build constructive and effective relationships and solve problems collaboratively
Qualifications
Required and Preferred
- B.Tech or M.Tech in computer science, engineering, mathematics, or a related field, or equivalent industry experience
- 1+ year(s) of software development experience focused on C/C++ and/or Python development
- Knowledge of and experience with OSS, Docker, Kubernetes, and the Python and Go (Golang) programming languages
- Good communication and collaboration skills; a great team player
- Experience working in a geo-distributed team
- Practical experience hosting and running large-scale machine learning models in enterprise-grade applications
- Experience in building enterprise-grade applications in C++, Python
- Experience in developing and operating low-latency, high-scale, reliable online services