June 8, 2026

Lead Linux System Administrator

Senior • Remote

Łódź, Poland

About the role

We are looking for a Lead Linux System Administrator to take technical ownership of the Linux environment supporting large-scale GPU infrastructure used for AI training and inference workloads.

This role combines hands-on system administration with team leadership. You will be responsible for the stability, performance, security, and day-to-day management of Linux-based GPU servers, while also supporting and mentoring a team of administrators working in a complex production environment.

Responsibilities

  • Lead, mentor, and support a team of Linux System Administrators responsible for GPU infrastructure operations

  • Manage the full Linux server lifecycle, including provisioning, patching, configuration management, hardening, and performance tuning

  • Maintain and optimize the NVIDIA GPU software stack, including drivers, CUDA, cuDNN, NCCL, and GPU management tools such as DCGM and nvidia-smi

  • Support and manage MIG and GPU time-slicing configurations where needed

  • Develop and maintain automation for bare-metal provisioning, OS image management, and server configuration using tools such as Ansible, Terraform, and scripting

  • Tune Linux systems for demanding workloads, including kernel parameters, local storage, parallel file systems, networking, and scheduler settings

  • Troubleshoot complex issues across hardware, drivers, the operating system, and cluster-level services

  • Work closely with DevOps/SRE, Site Operations, and AI/ML teams to ensure smooth integration between OS-level infrastructure and higher-level orchestration platforms

  • Support security hardening, vulnerability management, patch compliance, and operational standards across the server fleet

  • Participate in on-call support and contribute to continuous improvements in reliability, performance, and operational efficiency

Requirements

  • 7+ years of hands-on experience in Linux system administration in production environments

  • At least 3 years of experience in a technical lead, lead administrator, or people leadership role

  • Strong expertise in administering Linux systems at scale

  • Hands-on experience with NVIDIA GPUs in Linux environments, including drivers, CUDA ecosystem components, and GPU management tools

  • Strong experience with Ansible or other configuration management tools

  • Good scripting skills in Python and/or Bash

  • Experience with Infrastructure as Code and infrastructure automation

  • Good understanding of high-performance computing, storage systems, and high-speed networking technologies such as InfiniBand or RoCE

  • Experience supporting AI/ML or HPC workloads

  • Ability to troubleshoot complex production issues and work effectively in a high-availability environment

  • English at a communicative level or better, as you will be working in an international team

Nice to have

  • Experience with cluster management and orchestration tools such as Slurm, Kubernetes, or Run:ai

  • Familiarity with bare-metal provisioning tools and large server fleet management

  • Experience in AI infrastructure companies, hyperscalers, or HPC/research environments

  • Knowledge of Linux performance tuning for GPU-accelerated workloads

  • A university degree in Computer Science, Engineering, or a related field

What we offer

  • Benefits package

  • Opportunity to lead Linux infrastructure supporting advanced AI workloads at scale

  • Work with modern GPU hardware and software stacks in a technically demanding environment

  • Collaboration with experienced engineers across infrastructure, platform, and AI teams

  • A dynamic workplace with room for ownership, technical influence, and professional growth