We are looking for a DevOps/MLOps Engineer to build and manage scalable, automated infrastructure for our LLM-powered GenAI platform. You’ll enable fast iteration and reliable deployment of models and services through robust CI/CD pipelines, container orchestration, and ML lifecycle tooling.
Key Responsibilities:
- Design and maintain CI/CD pipelines using Jenkins, GitHub Actions, or similar.
- Automate infrastructure provisioning using Terraform and manage services with Kubernetes.
- Write and maintain Bash/Python scripts for automation and operational tooling.
- Implement and monitor MLOps workflows using tools like MLflow, Azure ML, or similar.
- Support deployment and monitoring of LLM-based models and APIs in production.
Required Skills:
- Hands-on experience with Jenkins, GitHub Actions, or equivalent CI/CD tools.
- Proficiency with Terraform, Kubernetes, Docker, and cloud-native practices.
- Strong scripting skills in Bash and Python.
- Experience with ML model tracking, versioning, and deployment using MLflow or similar.
- Familiarity with cloud platforms (e.g., Azure, AWS, or GCP).
Nice to Have:
- Exposure to LLM/GenAI deployment workflows.
- Experience with model performance monitoring and observability tools (e.g., Prometheus, Grafana).
- Familiarity with security and cost optimization best practices for ML infrastructure.