✨ About The Role
- The Staff Compute Platform Engineer will lead the development of the compute platform, focusing on scalability, reliability, and performance.
- Responsibilities include designing and implementing orchestration workflows using tools like Prefect and Ray for distributed computing.
- The role involves overseeing the design and development of the SnorkelFlow SDK, ensuring it meets user needs and is well-documented.
- The engineer will collaborate with cross-functional teams to ensure interoperability between compute workflows and other platform layers.
- The position requires defining observability strategies to monitor compute platform performance and optimize system efficiency.
⚡ Requirements
- The ideal candidate will have a Bachelor's or Master's degree in Computer Science, Software Engineering, or a related field.
- A minimum of 8 years of experience in backend or infrastructure engineering is required, with a strong focus on MLOps and SDK development for AI applications.
- The successful candidate will possess proven expertise in architecting and deploying scalable infrastructure for distributed systems.
- Strong programming skills in Python and familiarity with backend frameworks such as FastAPI or Flask are essential.
- The candidate should have extensive experience with CI/CD pipelines, containerization, and orchestration technologies like Docker and Kubernetes.