July 28, 2025
Senior • Hybrid • On-site • Remote
$184,000 - $287,500/yr
Santa Clara, CA, +2
NVIDIA Aerial CUDA Accelerated RAN (ACAR) is a framework for building high-performance, software-defined, cloud-native Radio Access Network functions on NVIDIA CPU/GPU/DPU-based systems. We are seeking a self-motivated senior performance engineer to drive the performance and scalability of our platform. This position offers the opportunity to work on cutting-edge technology for 5G and 6G networks, using NVIDIA's world-class compute platforms to advance the software-defined digital signal processing stack.
What you'll be doing:
As a member of the Aerial RAN team working on 5G and 6G networks, you will be responsible for:
Optimizing CPU, GPU and NIC sub-systems for predictable low-latency and maximum efficiency
Crafting and implementing performance verification tools, frameworks and dashboards
Monitoring and prioritizing performance regressions reported by CI/CD
Collaborating with multi-functional teams to solve performance bottlenecks in CPU, GPU and NIC sub-systems
Benchmarking performance use-cases on different platforms
What we need to see:
BS/MS in a relevant field with 10+ years’ experience, PhD with 5+ years’ experience, or equivalent experience.
Strong software design, development, debugging and testing skills.
Hands-on experience with performance analysis, characterization and optimization.
Experience programming latency-sensitive, real-time, multi-threaded applications on CPUs and one or more of GPUs, DSPs, or vector processors.
Deep knowledge of CPU, DSP or GPU architecture, as well as memory, I/O and networking interfaces.
Familiarity with data science and using visualization tools to summarize large quantities of data.
Experience in one or more programming / scripting languages: C/C++, Python, shell scripting.
Ways to stand out from the crowd:
Experience in designing and managing firmware timelines for wireless SoCs used in cellular wireless networks and/or terminals.
Track record in E2E design/testing of signal processing algorithms at the PHY layer or resource allocation optimization at MAC level.
CUDA experience highly desired.
Appetite to learn the details of how next-generation GPUs will operate and to build an outstanding software-radio 5G/6G stack that fully demonstrates their power.
You will also be eligible for equity and benefits.