Staff Machine Learning Engineering (Remote)

Cisco - Indianapolis, IN

Job Description

The application window is expected to close on: 02/28/2026. Job posting may be removed earlier if the position is filled or if a sufficient number of applications are received.

This role can be performed remotely from locations within the United States.

Meet the Team

Splunk, a Cisco company, is building a safer, more resilient digital world with an end-to-end, full-stack platform designed for hybrid, multi-cloud environments. The Splunk AI Platform and Services team provides the core runtime and developer experience that power AI across Splunk and Cisco. We manage large-scale, multi-tenant LLM inference across major cloud providers and build platform services to support these workloads. We also provide VectorDB/RAG services and MCP services that make AI workloads secure, observable, and cost-efficient for product teams. On top of this foundation, we deliver agentic frameworks, SDKs, tools, and evaluation/guardrail capabilities that help teams quickly build reliable GenAI assistants and automation features. You'll join a group that sits at the intersection of distributed systems, ML, and developer experience, grounded in operational excellence and a culture of impact-driven, cross-functional collaboration.

Your Impact

+ Lead the end-to-end architecture for key areas of the AI Platform: multi-tenant LLM serving (vLLM/Ray), routing and orchestration layers, VectorDB/RAG integration, and agentic/SDK surfaces used by product teams.
+ Design and drive implementation of high-scale inference services, including parallelism strategies (TP/PP/EP/MoE), autoscaling policies, and cross-region capacity management for GPU/CPU workloads.
+ Optimize latency, throughput, and cost for large-scale LLM and generative workloads using techniques such as batching, chunked prefills, caching, and mixed precision.
+ Design and tune distributed inference configurations (TP/PP/EP/MoE) across multi-GPU and multi-node clusters and modern GPU architectures.
+ Implement platform capabilities such as telemetry, metering & throttling, guardrails, and rollout/rollback to ensure AI services are safe, observable, and multi-tenant by default.
+ Lead the design of GenAI application services (chat assistants and automation APIs) grounded in robust RAG pipelines, agentic workflows (LangChain/LangGraph or similar), and MCP-based tool ecosystems.
+ Drive operational excellence with runbooks, readiness checklists, CI/CD safeguards, on-call rotations, and post-incident improvements.
+ Provide technical mentorship and leadership for senior and mid-level engineers: review designs, guide trade-offs around quality/latency/COGS, and help grow the next generation of tech leads.
+ Collaborate closely with applied scientists to productionize new models and techniques, ensuring that research prototypes become robust, observable, and cost-efficient services.

Minimum Qualifications:

+ Bachelor's degree in Computer Science, Engineering, or equivalent practical experience.
+ 8+ years of hands-on experience building and operating backend or distributed systems in production, or 5+ years of experience with a Master's degree, or 3+ years with a PhD.
+ Proven track record as a technical lead for complex systems: driving architecture, aligning stakeholders, and delivering high-impact projects end-to-end.
+ Strong proficiency in at least one modern programming language (e.g., Python, Go, or Java) and deep experience with software design, debugging, and performance tuning.
+ Significant experience with cloud-native architectures (containers, Kubernetes, service discovery, configuration management, CI/CD) and building reliable microservices (REST/gRPC).
+ Demonstrated ownership of production services at scale, including on-call participation, incident response, and post-incident RCAs that led to concrete improvements.

Preferred Qualifications:

+ Hands-on experience running LLM or deep learning inference at scale using frameworks such as vLLM, TensorRT-LLM, Triton Inference Server, or similar.
+ Deep understanding of GPU and distributed systems performance: latency/throughput trade-offs, pipelining, model parallelism (TP/PP/EP/MoE), mixed precision (BF16/FP8/nvFP4), and profiling tools.
+ Experience designing and operating RAG systems and GenAI application layers: document ingestion, chunking/embedding strategies, metadata design, hybrid retrieval, context ranking, and evaluation of retrieval quality.
+ Practical experience with agentic frameworks (LangChain, LangGraph, LlamaIndex, Semantic Kernel, or similar) and multi-agent coordination, including integration with MCP tools and internal/external APIs.
+ Background building platform or developer experience capabilities (shared services, SDKs, templates, micro-frontends) that are adopted by multiple product teams.
+ Familiarity with LangSmith or similar evaluation platforms, including experiment design, offline/online evals, hallucination/groundedness metrics, and feedback loops.
+ Strong knowledge of AWS, Azure, or GCP (EC2/VMs, IAM roles/ARNs/principals, VPC networking, security best practices) for AI workloads.
+ Experience defining and monitoring dashboards and alerts for high-availability systems using Prometheus, Grafana, or cloud-native tooling.
+ Excellent communication and collaboration skills, comfortable influencing cross-functional partners and other senior engineers, and explaining trade-offs between quality, latency, and cost to both technical and non-technical audiences.

Why Cisco?

At Cisco, we're revolutionizing how data and infrastructure connect and protect organizations in the AI era and beyond. We've been innovating fearlessly for 40 years to create solutions that power how humans and technology work together across the physical and digital worlds. These solutions provide customers with unparalleled security, visibility, and insights across the entire digital footprint. Simply put, we power the future. Fueled by the depth and breadth of our technology, we experiment and create meaningful solutions. Add to that our worldwide network of doers and experts, and you'll see that the opportunities to grow and build are limitless. We work as a team, collaborating with empathy to make really big things happen on a global scale. Because our solutions are everywhere, our impact is everywhere. We are Cisco, and our power starts with you.

Message to applicants applying to work in the U.S. and/or Canada:

The starting salary range posted for this position is $193,800.00 to $245,300.00 and reflects the projected salary range for new hires in this position in U.S. and/or Canada locations, not including incentive compensation, equity, or benefits. Individual pay is determined by the candidate's hiring location, market conditions, job-related skillset, experience, qualifications, education, certifications, and/or training.
The full salary range for certain locations is listed below. For locations not listed below, the recruiter can share more details about compensation for the role in your location during the hiring process.

U.S. employees are offered benefits, subject to Cisco's plan eligibility rules, which include medical, dental and vision insurance, a 401(k) plan with a Cisco matching contribution, paid parental leave, short and long-term disability coverage, and basic life insurance. Please see the Cisco careers site to discover more benefits and perks. Employees may be eligible to receive grants of Cisco restricted stock units, which vest following continued employment with Cisco for defined periods of time.

U.S. employees are eligible for paid time away as described below, subject to Cisco's policies:

+ 10 paid holidays per full calendar year, plus 1 floating holiday for non-exempt employees
+ 1 paid day off for employee's birthday, paid year-end holiday shutdown, and 4 paid days off for personal wellness determined by Cisco
+ Non-exempt employees receive 16 days of paid vacation time per full calendar year, accrued at a rate of 4.92 hours per pay period for full-time employees
+ Exempt employees participate in Cisco's flexible vacation time off program, which has no defined limit on how much vacation time eligible employees may use (subject to availability and some business limitations)
+ 80 hours of sick time off provided on hire date and each January 1st thereafter, and up to 80 hours of unused sick time carried forward from one calendar year to the next
+ Additional paid time away may be requested to deal with critical or emergency issues for family members
+ Optional 10 paid days per full calendar year to volunteer

For non-sales roles, employees are also eligible to earn annual bonuses subject to Cisco's policies.
Employees on sales plans earn performance-based incentive pay on top of their base salary, which is split between quota and non-quota components, subject to the applicable Cisco plan. For quota-based incentive pay, Cisco typically pays as follows:

+ 0.75% of incentive target for each 1% of revenue attainment up to 50% of quota;
+ 1.5% of incentive target for each 1% of attainment between 50% and 75%;
+ 1% of incentive target for each 1% of attainment between 75% and 100%; and
+ Once performance exceeds 100% attainment, incentive rates are at or above 1% for each 1% of attainment with no cap on incentive compensation.

For non-quota-based sales performance elements such as strategic sales objectives, Cisco may pay 0% up to 125% of target. Cisco sales plans do not have a minimum threshold of performance for sales incentive compensation to be paid.

The applicable full salary ranges for this position, by specific state, are listed below:

New York City Metro Area: $212,300.00 - $317,100.00
Non-Metro New York state & Washington state: $193,800.00 - $282,100.00

For quota-based sales roles on Cisco's sales plan, the ranges provided in this posting include base pay and sales target incentive compensation combined.

Employees in Illinois, whether exempt or non-exempt, will participate in a unique time off program to meet local requirements.

Cisco is an Affirmative Action and Equal Opportunity Employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, gender, sexual orientation, national origin, genetic information, age, disability, veteran status, or any other legally protected basis. Cisco will consider for employment, on a case-by-case basis, qualified applicants with arrest and conviction records.

Created: 2026-02-09
