Principal Engineer

Insight Global - San Diego, CA

Job Description

Insight Global is seeking a Principal Software Engineer with AI experience for a direct hire opportunity to sit fully remote in the US. You will join a team improving AI governance and compliance platforms that help organizations manage and monitor AI systems securely and transparently. You will drive the end-to-end technical strategy, architecture, and productionization of the client's machine learning systems, large language model (LLM) capabilities, and AI infrastructure. You will own how models, evaluation pipelines, data workflows, and observability components are designed, deployed, monitored, and continuously improved to meet reliability, quality, safety, and cost goals. You will also provide deep AI/ML expertise and leadership across engineering teams, guiding model integration, AI/ML platform decisions, and scalable distributed systems that support enterprise-grade GenAI workloads.

Job responsibilities

• Define and own the architecture for scalable AI/ML systems, including training, fine-tuning, inference, evaluation, and monitoring pipelines.
• Translate ambiguous business and product requirements into robust AI/ML system designs and staged delivery plans.
• Make strategic decisions on model selection, LLM integrations, evaluation frameworks, model gateways, guardrails, and safety mechanisms.
• Lead design reviews, architecture forums, and technical decision-making across teams.
• Build and deploy production-grade AI/ML/LLM models, transformers, and generative AI features, from initial concept through production rollout.
• Establish standards for model readiness, evaluation gates, rollout/rollback, drift detection, observability, and ongoing performance management.
• Partner with engineering teams to integrate models into distributed systems with clear SLOs, telemetry, and error-budget mechanisms.
• Design and improve data pipelines, feature stores, and data quality/lineage workflows supporting model training and inference.
• Develop scalable MLOps/AIOps practices for automation of training, testing, deployment, and monitoring.
• Evaluate and implement AI/ML workflow orchestration platforms (e.g., MLflow, Kubeflow, Vertex AI) and CI/CD for AI/ML.
• Own evaluation pipelines: latency, accuracy, cost, hallucination metrics, prompt versioning, and model performance insights.
• Instrument tracing and model observability using best-practice frameworks and telemetry standards.
• Implement guardrails and safety systems to ensure consistent, controlled behaviour of LLM-powered features.
• Partner closely with product, engineering, and leadership to shape platform strategy and AI feature roadmap.
• Provide trade-off analyses that incorporate model performance, security, compliance, scalability, and long-term maintainability.
• Write clear technical documents, proposals, and mechanism-based recommendations to guide executive decision-making.
• Mentor senior/junior engineers in AI/ML best practices, distributed systems, experimentation, and model governance.
• Support hiring, leveling, performance feedback, and the growth of a high-calibre engineering team.

We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request. To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy.

Skills and Requirements

• 10+ years of software engineering experience
• 5+ years of hands-on AI/ML development experience
• Full stack development with Java and C+
• Bachelor's Degree in Computer Science or related field
• Proven experience deploying production AI/ML or LLM systems at scale (not prototypes)
• Extensive experience with Python programming
• Experience with cloud platforms (AWS/GCP/Azure) and Kubernetes
• Experience in AI/ML flow: Kubeflow, Vertex AI, SageMaker, or a similar platform
• Expertise with LLM productionization, including fine-tuning, retrieval-augmented generation (RAG), safety/guardrails, and evaluation
• Master's or PhD in Computer Science, Machine Learning, etc.
• Cloud platform experience with deploying AI/ML workloads at scale
• Contributions to an AIOps/MLOps platform
• Previous experience with AI observability and troubleshooting

Created: 2026-01-23
