MLOps Engineer
InfoVision Inc. - Dallas, TX
Job Description
We're hiring an experienced MLOps Engineer to productionize and scale ML and GenAI systems, with a focus on LLM deployment, orchestration, and reliability in production environments.

Key Responsibilities
- Deploy, manage, and scale ML/DL models in production
- Build and operate Kubernetes-based infrastructure for ML workloads
- Handle model packaging, serialization, and versioning
- Design scalable inference systems (batch and real-time)
- Deploy and optimize local LLMs (latency, throughput, cost)
- Implement GenAI workflows (RAG, prompt pipelines, orchestration)
- Build and manage agentic systems with tool integration
- Design and manage LLM memory (short-term, long-term, vector stores)
- Integrate and manage API gateways for model access, routing, and rate limiting
- Monitor performance, drift, and system reliability

Requirements
- Strong Kubernetes fundamentals (pods, services, autoscaling, deployments)
- Hands-on experience with ML/DL models and serialization
- Proven experience in model deployment, scaling, and monitoring
- Experience with local LLM deployment and optimization
- Solid understanding of LLM memory patterns (context windows, retrieval, persistence)
- Experience with API gateways, load balancing, and service routing
- Familiarity with GenAI workflows (RAG, orchestration frameworks)
- Experience building agentic / multi-step LLM systems
- Proficiency in Python and modern ML/infra tooling
Created: 2026-05-09