Ceph DevOps Engineer
Mirantis is the Kubernetes-native AI infrastructure company, enabling organizations to build and operate scalable, secure, and sovereign infrastructure for modern AI, machine learning, and data-intensive applications. By combining open source innovation with deep expertise in Kubernetes orchestration, Mirantis empowers platform engineering teams to deliver composable, production-ready developer platforms across any environment—on-premises, in the cloud, at the edge, or in sovereign data centers. As enterprises navigate the growing complexity of AI-driven workloads, Mirantis delivers the automation, GPU orchestration, and policy-driven control needed to manage infrastructure with confidence and agility. Committed to open standards and freedom from lock-in, Mirantis ensures that customers retain full control of their infrastructure strategy.
Job Description
Mirantis, a leading open cloud company, is dedicated to freeing enterprise application owners from infrastructure and operational complexities through open-source innovation. We are committed to delivering a true cloud experience that is consistent across any infrastructure and built on open standards.
We are seeking a talented DevOps Engineer with deep expertise in Ceph and cloud technologies to join our cloud storage team. This role offers a comprehensive opportunity to design, deploy, and rigorously test our product's cloud infrastructure, leveraging cutting-edge open-source components and ensuring robust, high-quality solutions.
Responsibilities
- Design, develop, and maintain software solutions that integrate and support Ceph/Rook for a distributed storage solution within Kubernetes.
- Work with Kubernetes clusters and Docker containers to deploy and manage the storage backend for cloud workloads.
- Implement best practices for Kubernetes and Ceph cluster management, including high availability, scaling, backup, and disaster recovery.
- Contribute to Ceph/Rook and adjacent open-source projects through bug fixes, tests, and new features.
- Design and maintain CI/CD pipelines for automated deployment and management of Kubernetes and Ceph resources.
- Troubleshoot and resolve complex technical issues related to Kubernetes and Ceph/Rook at scale, including analyzing logs and identifying resource bottlenecks.
- Develop and enforce policies for security, monitoring, logging, and alerting in Kubernetes and Ceph/Rook environments.
- Collaborate with cross-functional teams, including DevOps, QA, and product management, to deliver reliable and scalable solutions.
Qualifications
- Deep expertise in Go is required, as all core components are written in it.
- Proven experience building Kubernetes operators and controllers, including deep knowledge of the Kubernetes API, CRD development, and libraries like controller-runtime.
- Strong understanding of distributed storage technologies and hands-on experience with Ceph. This includes Ceph fundamentals such as OSDs and block, object, and file storage, as well as Rook, the Kubernetes operator for Ceph.
- 3+ years of experience working with distributed systems, with a strong focus on Kubernetes, containerization, Ceph, and Rook.
- Experience with Linux system programming, writing shell scripts, and a thorough understanding of CI/CD principles for developing robust pipelines.
- Experience with version control systems, such as Git
- Familiarity with code-review systems, like Gerrit or GerritHub
- Bachelor's degree in CS or related field
- Good communication skills, a can-do attitude, and a focus on results
- Upper-intermediate spoken and written English
Will be a plus
- Familiarity with the AWS S3 API, OpenStack Cinder, and OpenStack Swift.
- Experience using Kubernetes CSI drivers.
- Experience with specific cloud providers (AWS, GCP, Azure) and their related services.
- Familiarity with workloads that rely on storage backends like Ceph, such as OpenStack, AWS, and Ceph-backed Kubernetes Persistent Volumes.
- Familiarity with observability tools like Prometheus, Grafana, and Fluentd.
- Active participation in open-source communities (especially Ceph and Rook) and certifications such as CKA (Certified Kubernetes Administrator).