October 1, 2025
Junior • On-site
$120,000 - $189,750/yr
Durham, NC, +1
Today, NVIDIA is tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, encouraging environment where everyone is inspired to do their best work. Come join the team and see how we can make a lasting impact on the world.
We are now looking for a Deep Learning Inference Performance Architect - New College Graduate. NVIDIA is seeking creative programmers and architects who love to squeeze out every cycle of performance from deep learning software. The Inference Architecture team does groundbreaking hardware-software co-design work that focuses on accelerating AI Inference workloads. You will write performance-optimized, low-level code on today’s GPUs while helping guide our future GPU architecture decisions. If you enjoy digging deep into GPU architecture details, are passionate about AI, and know where every cycle goes when you write highly tuned software, this role may be a great fit for you!
This position offers the opportunity to have real impact in a dynamic, technology-focused company.
What you’ll be doing:
Develop innovative HW, DSP, GPU and system architectures to extend the state of the art in AI Inference performance and efficiency
Analyze and prototype key deep learning and data analytics algorithms and applications
Understand and analyze the interplay of hardware and software architectures on future algorithms and applications
Write efficient software for AI Inference, including CUDA kernels, framework-level code, and application-level code
Collaborate across the company to guide the direction of AI, working with software, research, and product teams
What we need to see:
Recently completed an MS or PhD in Computer Science, Electrical Engineering, Math, or a related field (or equivalent experience)
Strong mathematical foundation in machine learning and deep learning
Expert programming skills in C, C++, or Python
Familiarity with GPU computing (CUDA or similar) and HPC (MPI, OpenMP)
Strong knowledge and coursework in computer architecture
Ways to stand out from the crowd:
Background with systems-level performance modeling, profiling, and analysis
Experience in characterizing and modeling system-level performance, executing comparison studies, and documenting and publishing results
Experience in optimizing AI Inference workloads with CUDA kernel development
NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most brilliant and talented people on the planet working for us. If you're creative and autonomous, we want to hear from you!
Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 120,000 USD - 189,750 USD for Level 2, and 148,000 USD - 235,750 USD for Level 3. You will also be eligible for equity and benefits.