Job Description:
• The Machine Learning Operations (MLOps) Engineer will support our AI/ML initiatives by streamlining the deployment, monitoring, and scaling of machine learning models in production environments.
• Implement and maintain CI/CD pipelines for deploying machine learning models to production environments.
• Ensure seamless integration of machine learning models into existing software systems.
• Design and manage scalable infrastructure for training, testing, and serving machine learning models.
• Automate data preprocessing, model training, and deployment workflows.
• Monitor the performance of deployed models and systems, identifying and resolving issues proactively.
• Optimize model inference latency, scalability, and resource utilization.
• Work closely with data scientists, software engineers, and product teams to understand requirements and deliver operational solutions.
• Collaborate with DevOps and cloud engineering teams to ensure infrastructure reliability and security.
• Maintain version control for datasets, models, and code.
• Implement best practices for data and model governance, ensuring compliance with organizational and regulatory requirements.
• Stay current with the latest trends in MLOps tools, frameworks, and practices.
• Recommend and implement improvements to MLOps processes and infrastructure.
Requirements:
• Education Required: Bachelor’s degree in Computer Science, Data Science, Engineering, or a related field.
• Experience Required: 2-3 years of hands-on experience in MLOps, DevOps, or related roles.
• Experience with MLOps tools and platforms like MLflow, Kubeflow, or SageMaker.
• Experience with feature stores and model versioning systems.
• Experience in building CI/CD pipelines using tools like Jenkins, GitLab CI, or similar.
• Proficiency in Python.
• Strong understanding of containerization and orchestration tools (e.g., Docker, Kubernetes).
• Familiarity with distributed computing frameworks (e.g., Apache Spark).
• Knowledge of cloud platforms such as AWS, Azure, or Google Cloud.
• Solid understanding of model monitoring, logging, and debugging tools.
• Familiarity with database technologies and data pipelines (SQL, NoSQL, ETL/ELT processes).
Benefits: