✨ About The Role
- The role involves owning the architecture, design, development, and operation of large-scale systems built for AI/ML workloads.
- Responsibilities include prototyping, optimizing, and maintaining scalable back-end services that power new ML and foundation model development workflows.
- The position requires designing extensible and testable interfaces between internal services, including storage and data models.
- Keeping CI/CD pipelines healthy and providing on-call support for customers in production is a key responsibility.
- The job offers a hybrid work schedule, requiring one or two days per week at the Redwood City HQ, with the option to work remotely on the other days.
⚡ Requirements
- A bachelor's degree in Computer Science or a related field is required for this position.
- The ideal candidate will have at least 2 years of experience delivering distributed systems and machine learning systems in a production setting.
- Strong communication and coding skills are essential, with an emphasis on designing for scale and robustness.
- Experience with distributed compute frameworks and data processing pipelines is necessary for success in this role.
- The candidate should possess strong development and debugging skills in Python.
- A proactive, engaged, team-player attitude is important, especially in a customer-focused, cross-functional environment.
- Experience working with machine learning systems and foundation models is preferred.