Nvidia Corporation
Santa Clara, CA, USA
At NVIDIA, we're building GeForce G-Assist, an on-device AI assistant that combines Small Language Models (SLMs), retrieval systems, and hybrid cloud capabilities to deliver responsive, context-aware assistance inside the GeForce ecosystem. We work closely across engineering and product teams to ensure G-Assist performs reliably in real-world scenarios. Together, we focus on how models behave in production, not just on benchmarks.

What you'll be doing:

- Evaluate and improve the Small Language Models used in GeForce G-Assist, with an emphasis on accuracy, robustness, and conversational reliability.
- Identify and mitigate conversation and context contamination, including state drift, prompt leakage, and retrieval cross-talk.
- Work with SLM and VLM architectures to support text and multimodal interactions.
- Collaborate on hybrid architectures that combine local SLMs with cloud-based models.

We value engineers who enjoy thinking across the full system, from model behavior to runtime...