StaffAttract

Senior MLOps Engineer

DeepRec.ai - San Jose, CA

Job Description

Senior MLOps Engineer - Real-Time AI & Video Applications (Hybrid)
Office Location: San Jose (Hybrid)
Job Type: Full-time

We're hiring for an impressive AI company focused on real-time AI and video applications. Their team is made up of leading experts in computer graphics and generative modeling, and they are on a rapid growth trajectory. We're looking for experienced MLOps Engineers who want to work on real-time AI applications that are shaping the future of media.

The Role

We're looking for a talented MLOps Engineer to build and maintain robust machine learning pipelines and infrastructure. You'll work closely with AI researchers, data scientists, and software engineers to deploy state-of-the-art models into production, optimize real-time inference, and ensure systems scale effectively.

What You'll Do

  • Design and optimize ML pipelines for training, validation, and inference
  • Automate deployment of deep learning and generative models for real-time use
  • Implement versioning, reproducibility, and rollback capabilities
  • Deploy and manage containerized ML solutions on cloud platforms (AWS, GCP, Azure)
  • Optimize model performance using TensorRT, ONNX Runtime, and PyTorch
  • Work with GPUs, distributed computing, and parallel processing to power AI workloads
  • Build and maintain CI/CD pipelines using tools like GitHub Actions, Jenkins, and ArgoCD
  • Automate model retraining, monitoring, and performance tracking
  • Ensure compliance with privacy, security, and AI ethics standards

What You Bring

  • 3+ years of experience in MLOps, DevOps, or AI model deployment
  • Strong skills in Python and frameworks like TensorFlow, PyTorch, and ONNX
  • Proficiency with Docker, Kubernetes, and serverless architectures
  • Hands-on experience with ML tools (ArgoWorkflow, Kubeflow, MLflow, Airflow)
  • Experience deploying and optimizing GPU-based inference (CUDA, TensorRT, DeepStream)
  • Solid grasp of CI/CD practices and scalable ML infrastructure
  • Passion for automation and clean, maintainable system design
  • Strong understanding of distributed systems
  • Bachelor's or Master's in Computer Science or equivalent work experience

Bonus Skills

  • Experience with CUDA programming
  • Exposure to LLMs and generative AI in production
  • Familiarity with distributed computing (Ray, Horovod, Spark)
  • Edge AI deployment experience (Triton Inference Server, TFLite, CoreML)
  • Basic networking knowledge

Please apply now for more details and next steps. We look forward to hearing from you.

Created: 2025-05-31
