Machine Learning Engineer - Reinforcement Learning & ...

Integrated Research Ltd. - Denver, CO

Job Description

Machine Learning Engineer - Reinforcement Learning & Adaptive Systems

Posted: 06/08/2025
Closing Date: 03/10/2025
Job Type: Permanent - Full Time
Job Category: Information Technology

IR Labs is the innovation lab inside Integrated Research where small, cross-functional squads chase outsized, industry-defining opportunities. We operate like a funded startup (rapid sprints, bold experimentation, zero bureaucracy), backed by the global footprint and resources of a public company. Our charter is simple: turn cutting-edge AI research into products that customers can’t imagine working without. We target the hardest problems in software and then move fast to ship solutions that create 10x impact. If you thrive on autonomy, crave world-class technical challenges, and want to see your ideas hit production quickly, IR Labs is your launch pad. Join us and help build the future, one breakthrough at a time.

What You’ll Do

- Own the RL strategy as the third founding MLE, partnering with LLM and Graph leads to close the adaptive learning loop.
- Design vectorized, GPU-accelerated environments/gyms for graph reasoning, code-intelligence, and multi-tool agent workflows.
- Apply SOTA RL (PPO, DPO, ReLoRA, Decision Transformer, CQL, RLAIF) to continuously fine-tune policies that interact with knowledge graphs and code-aware agents.
- Build robust reward models and safety layers: adversarial probing, constraint optimization, variance penalties, and constitutional-style rules to reduce reward hacking.
- Define benchmarks and dashboards that unite task success, robustness, latency, and cost; integrate HELM/RL4LMs and custom graph/LLM tests.
- Scale distributed RL on GPU fleets (RLlib / DeepSpeed-RL) with actor-learner sharding, mixed precision, and Flash-Attention for sample efficiency.
- Leverage offline RL and active learning to use logged corpora while collecting targeted rollouts for high-uncertainty [...]
- Instrument for observability and security; expose auditable, safe APIs and mentor the team on RL best practices.

Desired Skills and Experience

What You Bring to the Table

- 8+ years end-to-end ML experience; 4+ years shipping deep RL in production (preferably on language or graph tasks).
- Expert in PyTorch (+Lightning/Accelerate) and RL stacks (RLlib, trlX, CleanRL, Sample Factory); able to author CUDA/Triton kernels when needed.
- Proven experience building Gymnasium/PettingZoo environments and GPU-native sims (Isaac Gym, Brax) with curriculum design.
- Deployed RLHF/RLAIF pipelines with reward-model training, safety guarding, and preference data.
- Strong evaluation chops (HELM, HumanEval, red-team/robustness suites) and tooling for latency/cost-aware metrics.
- Practical experience with offline RL, dataset curation, and active sampling for label [...]
- Comfortable designing action spaces from low-level code reps (AST/CFG/IR) for repair/refactor/optimization tasks.
- Security-first mindset (sandboxing, IAM scoping, prompt-injection defenses, SOC2/HIPAA); clear communicator and mentor.

Our job descriptions often reflect our ideal candidate. If you have a strong foundation of relevant skills and a passion for this field, we encourage you to apply, even if you don't check every box.

What We Offer

- High Impact – Ship real features in weeks, not quarters.
- Cutting-Edge Tech – Work to solve problems no one has cracked before.
- Remote & Flexible – Work from anywhere with a culture built on trust, autonomy, and balance.
- Growth & Ownership – Own features end-to-end, learn rapidly, and grow with the company as we scale.
- Top-Tier Compensation – Competitive salary, performance bonuses, equity upside, and strong benefits.
- Team & Culture – Small, senior team that values collaboration, creativity, and building something meaningful together.
- 401k with Employer Contributions.
- Health Savings Account (HSA) Contributions with High-Deductible Health Plan.
- Short-Term/Long-Term Disability Insurance.
- And more!

Compensation Range

$190,000 - $210,000 base
$53,000 - $63,000 variable compensation

Actual compensation offer to candidate may vary from posted hiring range based upon geographic location, work experience, education, and/or skill level. The pay ratio between base pay and target incentive (if applicable) will be finalized at the offer stage.

At IR we celebrate, support, and thrive on difference for the benefit of our employees, our products, and our community. We are proud to be an Equal Employment Opportunity employer and encourage applications from all suitable candidates; we never discriminate based on race, religion, national origin, gender identity or expression, sexual orientation, age, or marital, veteran, or disability status.

Created: 2025-09-17
