May 30, 2025
Senior • On-site
$168,000 - $322,000/yr
Santa Clara, CA
NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by phenomenal technology—and amazing people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing and transform industries. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Join the team and make a lasting impact on the world!
We’re looking for a Solutions Architect to join our AI Operations Team to architect, lead, and deliver large-scale AI projects for our Digital Marketing Organization. This position requires a deep knowledge of the latest trends in applied AI, with a strong understanding of Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG). The ideal candidate should have specialized expertise in implementing end-to-end AI workflows and be an excellent communicator, able to work with globally dispersed development, product, and business groups. Come help lead our efforts to use NVIDIA's latest generative AI technologies in production-ready AI features across our websites.
What You Will Be Doing:
Architect end-to-end generative AI applications for the Digital Marketing Organization with a focus on LLM deployment and RAG workflows.
Get hands-on and use advanced Python programming knowledge to make valuable contributions at both the application and infrastructure levels.
Provide technical leadership and guidance on standard methodologies for training LLMs and implementing RAG-based solutions.
Work with our primary collaborators, NVIDIA’s Marketing Team, to understand their requirements and deliver tailored solutions, and partner with the Digital Marketing Org’s AI Development Team and other development resources to complete projects.
Collaborate closely with our globally dispersed development, MLOps, product, engineering, and business teams.
Develop strategies for implementing AI workflows and agents efficiently to achieve optimal performance on NVIDIA’s hardware and software platforms.
Lead workshops and design sessions with our Digital Marketing Development Teams to define and refine generative AI solutions focused on LLMs and RAG workflows.
Design and implement RAG-based workflows to enhance content generation and information retrieval.
Work closely with NVIDIA engineering and product teams to provide feedback and contribute to the evolution of generative AI software.
Work closely with the Digital Marketing Org’s Web and Platform Teams to integrate RAG workflows into their applications and systems.
What We Need To See:
Master's or Ph.D. in Computer Science, Artificial Intelligence, or a related field; or equivalent experience in building and deploying AI-powered solutions at scale.
8+ years of hands-on experience in a technical role, including experience with generative AI.
Advanced proficiency in Python programming, with the ability to contribute at both the application and infrastructure levels.
Knowledge of building agentic frameworks and multi-agent applications using tools such as LangChain and LangGraph.
Hands-on experience with or understanding of NVIDIA’s hardware and software technologies (e.g., CUDA, Triton, TensorRT, NeMo, RAPIDS).
Proven record of successfully deploying and optimizing LLMs for inference in production environments.
In-depth understanding of state-of-the-art language models, such as modern open models (e.g. Llama, Mistral) and proprietary APIs (e.g. ChatGPT, Claude, Gemini).
Expertise in training and fine-tuning LLMs using NVIDIA NeMo Framework and other popular frameworks.
Strong knowledge of cloud and datacenter GPU systems.
Excellent communication and collaboration skills with the ability to articulate complex technical concepts to both technical and non-technical team members.
Experience leading workshops, training sessions, and communicating technical solutions to diverse audiences.
Ways To Stand Out From The Crowd:
Experience in deploying LLMs in cloud environments (e.g., AWS, Azure, GCP) and on-premises infrastructure.
Experience working with any agentic models/frameworks.
Working experience with observability and evaluation tools.
Familiarity with containerization technologies (e.g., Docker) and orchestration tools (e.g., ECS, Kubernetes) for scalable and efficient model deployment.
Hands-on experience with NVIDIA GPU technologies and GPU cluster management, and the ability to design and implement scalable and efficient workflows for LLM training and inference on GPU clusters.
With competitive salaries and a generous benefits package, we are widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us and, due to unparalleled growth, our best-in-class teams are expanding rapidly. If you’re creative and autonomous with a real passion for your work, we want to hear from you!
The base salary range is 168,000 USD - 322,000 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.