AI/Machine Learning Engineer
US Tech Solutions - Mountain View, CA
Job Description
About the Role

We are looking for an Evaluation Scientist who can work across both hands-on experimentation and automation infrastructure. This role begins with running manual evaluations (e.g., executing and monitoring individual experiments) and progresses toward building scripts, tools, and infrastructure that streamline and automate these processes, with the long-term goal of reducing manual work as much as possible.

The ideal candidate will also bring expertise in coding agents and quality evaluation, enabling them to design robust experiments and improve workflows. While the role will receive high-level guidance, candidates should be able to independently define and implement the lower-level details of experiment setup after ramping up. For example, given a high-level requirement for a new type of evaluation, the candidate should be able to propose and execute an implementation plan with detailed steps, metrics, and automation in place.

Key Responsibilities

- Run and manage manual evaluation experiments across AI/ML systems.
- Develop and maintain automation infrastructure (scripts, pipelines, tools) to reduce manual evaluation work.
- Design and execute new types of evaluations, translating broad research questions into structured experiment setups.
- Work with coding agents and applied ML workflows to define and measure quality.
- Define metrics, benchmarks, and evaluation criteria to assess performance and identify gaps.
- Collaborate with research leads to align evaluation design with project goals while owning implementation details.
- Ensure reproducibility, consistency, and scalability of evaluation processes.

Qualifications

- Strong coding skills in Python (or equivalent) for scripting, automation, and experiment design.
- Experience with running and analyzing experiments, including quality evaluation methodologies.
- Knowledge of coding agents, ML models, or applied automation frameworks.
- Ability to work independently: take high-level requirements and define detailed steps for execution.
- 2–4 years of hands-on experience in evaluation, scripting, or applied data science/ML (academic or industry).
- Strong analytical skills with experience in data handling, reporting, and experiment analysis.

Preferred Skills

- Familiarity with evaluation frameworks and automation tools in AI/ML research.
- Experience in building scalable infrastructure for experiments or evaluations.
- Knowledge of experimental design, statistical testing, or quality benchmarking.
Created: 2026-02-27