Nvidia Corporation
Santa Clara, CA, USA
We are now looking for a DL Algorithms Engineer!

We are seeking a highly skilled Deep Learning Algorithms Engineer with hands-on experience optimizing and deploying Large Language Models (LLMs), Vision-Language Models (VLMs), and World Foundation Models (WFMs) in production environments. In this role, you will focus on optimizing and deploying deep learning models for efficient and fast inference across diverse GPU platforms, particularly for physical AI and generative AI applications. You will collaborate with research scientists, software engineers, and hardware specialists to bring cutting-edge AI models from prototype to production.

What you will be doing:

Optimize deep learning models for low-latency, high-throughput inference, with a focus on LLMs, VLMs, diffusion models, and World Foundation Models (WFMs) designed for physical AI applications.

Convert, deploy, and optimize models for efficient inference using frameworks such as TensorRT, TensorRT-LLM, vLLM, and SGLang....