Nvidia Corporation
Santa Clara, CA
We are looking for experienced engineers to help build and scale next-generation AI infrastructure using PyTorch, one of the world's most widely used deep learning frameworks. This role sits at the intersection of machine learning systems, compilers, and high-performance computing, enabling researchers and product teams to train and deploy large-scale models efficiently. You will work on core components of the PyTorch ecosystem, including model execution, distributed training, performance optimization, and developer experience.

What you'll be doing:

- Design and build core PyTorch capabilities across runtime, autograd, distributed training, and model execution
- Optimize performance across GPU/accelerator backends (CUDA, Triton, etc.)
- Contribute to or lead development of large-scale ML systems and infrastructure
- Improve model training efficiency, scalability, and reliability across multi-node environments
- Work on compilers / graph transformations / kernel...