Senior MLOps Engineer
DeepRec.ai - San Jose, CA
Job Description
Senior MLOps Engineer - Real-Time AI & Video Applications (Hybrid)
Office Location: San Jose (Hybrid)
Job Type: Full-time

We're hiring for an impressive AI company focused on real-time AI and video applications. Their team is made up of leading experts in computer graphics and generative modeling, and they are on a rapid growth trajectory. We're looking for experienced MLOps Engineers who want to work on real-time AI applications that are shaping the future of media.

The Role

We're looking for a talented MLOps Engineer to build and maintain robust machine learning pipelines and infrastructure. You'll work closely with AI researchers, data scientists, and software engineers to deploy state-of-the-art models into production, optimize real-time inference, and ensure systems scale effectively.

What You'll Do

- Design and optimize ML pipelines for training, validation, and inference
- Automate deployment of deep learning and generative models for real-time use
- Implement versioning, reproducibility, and rollback capabilities
- Deploy and manage containerized ML solutions on cloud platforms (AWS, GCP, Azure)
- Optimize model performance using TensorRT, ONNX Runtime, and PyTorch
- Work with GPUs, distributed computing, and parallel processing to power AI workloads
- Build and maintain CI/CD pipelines using tools like GitHub Actions, Jenkins, and ArgoCD
- Automate model retraining, monitoring, and performance tracking
- Ensure compliance with privacy, security, and AI ethics standards

What You Bring

- 3+ years of experience in MLOps, DevOps, or AI model deployment
- Strong skills in Python and frameworks like TensorFlow, PyTorch, and ONNX
- Proficiency with Docker, Kubernetes, and serverless architectures
- Hands-on experience with ML tools (Argo Workflows, Kubeflow, MLflow, Airflow)
- Experience deploying and optimizing GPU-based inference (CUDA, TensorRT, DeepStream)
- Solid grasp of CI/CD practices and scalable ML infrastructure
- Passion for automation and clean, maintainable system design
- Strong understanding of distributed systems
- Bachelor's or Master's in Computer Science or equivalent work experience

Bonus Skills

- Experience with CUDA programming
- Exposure to LLMs and generative AI in production
- Familiarity with distributed computing (Ray, Horovod, Spark)
- Edge AI deployment experience (Triton Inference Server, TFLite, CoreML)
- Basic networking knowledge

Please apply now for more details and next steps.
We look forward to hearing from you.
Created: 2025-05-31