Sr. Software Engineer - AI/ML, AWS Neuron Apps
Amazon - Seattle, WA
Job Description
Job Summary: Amazon Web Services (AWS) is seeking a Senior Software Engineer to join its Machine Learning Applications team, focusing on the AWS Neuron software stack for AI accelerators. The role involves optimizing and deploying advanced AI models, collaborating with silicon architects, and driving performance improvements for AI inference solutions.

Responsibilities:
• Pioneer distributed inference solutions for industry-leading LLMs such as GPT, Llama, and Qwen
• Optimize breakthrough language and vision generative AI models
• Collaborate directly with silicon architects and compiler teams to push the boundaries of AI acceleration
• Drive performance benchmarking and tuning that directly impacts millions of inference calls globally
• Architect the bridge between ML frameworks, including PyTorch and JAX, and AI hardware
• Spearhead distributed inference architecture for PyTorch and JAX using XLA
• Engineer breakthrough performance optimizations for AWS Trainium and Inferentia
• Develop ML tools to enhance LLM accuracy and efficiency
• Transform complex tensor operations into highly optimized hardware implementations
• Pioneer benchmarking methodologies that shape next-gen AI accelerator design

Qualifications:
Required:
• 5+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
• 5+ years of programming experience using Python or C++ and PyTorch
• Experience with AI acceleration via quantization, parallelism, model compression, batching, KV caching, and vLLM serving
• Experience with accuracy debugging & tooling, performance benchmarking of AI accelerators
• Fundamentals of machine learning and deep learning models, their architectures, and their training and inference lifecycles, along with work experience optimizing model execution
Preferred:
• Master's degree in computer science or equivalent
• Master's degree in machine learning or equivalent
• Experience with accuracy debugging & tooling, performance benchmarking of AI accelerators
• Experience in developing CUDA kernels, HPC and inference optimization, tensor operations

Company: Launched in 2006, Amazon Web Services (AWS) began exposing key infrastructure services to businesses in the form of web services -- now widely known as cloud computing. Founded in 2002, the company is headquartered in Seattle, USA, with a team of 10001+ employees. The company is currently Late Stage.
Created: 2026-03-09