August 23, 2025
Senior • On-site
$176,000 - $276,000/yr
Santa Clara, CA
NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology—and amazing people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.
We are seeking a dedicated Base Command Manager (BCM) Engineer to support product deployments/escalations and collaborate with Engineering and our Field Organization.
What you'll be doing
Play a key role in NVIDIA’s NPI team, acting as the link between engineering and the NVIS field team for cluster deployment and management solutions.
Collaborate closely with engineering and product teams to review and influence design decisions for products centered around large-scale, BCM-managed clusters.
Evaluate changes in BCM and underlying OS/software stacks, communicating the impact to the field organization to maintain robust and scalable deployment workflows.
Define and relay detailed cluster management requirements to engineering, enabling the successful New Product Introduction (NPI) of next-generation GPU platforms.
Describe architectural and design changes and build clear, actionable tasks for the field, including standardized deployment guides, standard configuration methodologies, and validation workflows.
Validate complex cluster configurations including Slurm and Kubernetes orchestrators for performance, scalability, and resilience, ensuring they meet real-world customer scenarios.
Support the NPI team by bridging knowledge gaps, tracking progress, and aligning collaborators throughout the product development lifecycle.
Support NVIDIA's mission by ensuring our breakthrough technologies are successfully deployed for global customers and OEM partners.
What we need to see
Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience).
10+ years of experience in at least two of the following: HPC/large-scale cluster administration, Linux systems engineering, infrastructure automation (e.g., Ansible, Salt), or data center operations.
5+ years of direct, hands-on experience provisioning, managing, and fixing clusters using NVIDIA Base Command Manager (BCM).
Deep, practical knowledge of how Slurm and Kubernetes are coordinated, deployed, and managed by BCM, including workload submission and resource management.
Proficiency in Python and Bash scripting for automation, cluster validation, and workflow optimization.
In-depth experience with cluster management and monitoring tools (e.g., Prometheus, Grafana, DCGM, and similar observability stacks).
Outstanding written and verbal communication skills, with the ability to explain complex technical concepts to both technical and non-technical collaborators.
A customer-first attitude, self-motivation, and a proactive approach to leadership in diverse environments.
Ways to stand out from the crowd:
Proficiency with cluster networking including InfiniBand and Spectrum-X.
Experience with NVIDIA Mission Control.
Familiarity with CI/CD workflows in an infrastructure context, including tools such as Git, GitLab, and Jenkins.
Background in Professional Services, customer-facing deployment, and solutions optimization.
Industry certifications such as CKA/CKAD (Certified Kubernetes Administrator/Developer), RHCE, or other advanced Linux/HPC credentials.
NVIDIA is widely considered one of the world's most desirable employers in technology. We have some of the world's most forward-thinking and passionate people working for us. If you're creative and autonomous, we want to hear from you!
Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 176,000 USD - 276,000 USD for Level 5, and 208,000 USD - 327,750 USD for Level 6. You will also be eligible for equity and benefits.