June 30, 2026
Sr. Software Engineer (Data Center Automation)
Senior • On-site
Memphis, TN
About the Role
We are seeking a highly skilled Sr. Software Engineer to manage and enhance reliability across a multi-data center environment. This role focuses on automating processes, building robust observability solutions, and ensuring seamless operations for mission-critical AI infrastructure. The position bridges software engineering principles with physical data center realities to deliver resilient, scalable systems with near-zero downtime.
The primary objective is to mitigate downtime and minimize end-user impact from scheduled and unscheduled maintenance through proactive automation, observability, and integrated software-physical reliability strategies.
Responsibilities
- Design, develop, and deploy scalable services (primarily in Python and Rust) to automate monitoring, alerting, incident response, and infrastructure provisioning.
- Implement and maintain observability solutions including metrics, logging, tracing, dashboards, and alerting systems.
- Collaborate with software, network, site, and facility operations teams to automate fault tolerance, disaster recovery, capacity planning, and environmental risk mitigation.
- Troubleshoot complex data center issues including hardware failures, software bugs, and network problems while adhering to SLAs and error budgets.
- Optimize Linux-based systems through kernel tuning, container orchestration (e.g., Kubernetes), and automation scripting.
- Analyze and troubleshoot large-scale network topologies across multi-data center environments.
- Participate in on-call rotations, incident response, and blameless postmortems.
- Mentor junior engineers and promote documentation and automation best practices.
Basic Qualifications
- Bachelor's degree in Computer Science, Computer Engineering, Electrical Engineering, or related field (or equivalent experience).
- 3+ years of experience in SRE, Infrastructure, DevOps, or Systems Engineering in distributed production environments.
- Strong production programming experience in Python; familiarity with Rust or other systems languages (Go, C++).
- Experience with Linux systems administration, performance tuning, and automation.
- Knowledge of containerization and orchestration tools such as Docker and Kubernetes.
- Experience implementing observability tools (e.g., Prometheus, Grafana) including metrics, logging, tracing, and alerting.
- Understanding of networking fundamentals including TCP/IP, routing, redundancy, and DNS.
- Experience with incident response, on-call rotations, and reliability best practices (SLAs, error budgets).
Preferred Skills and Experience
- 5+ years of SRE or infrastructure experience in hyperscale or AI/ML environments.
- Large-scale Kubernetes operations and automation experience.
- Proficiency in Rust for systems programming.
- Experience integrating software reliability with physical data center infrastructure.
- Experience building automated remediation and disaster recovery systems.
- Background optimizing Linux systems for AI workloads or GPU clusters.
- Experience with bare-metal provisioning and multi-site failover mechanisms.
- Mentoring experience and strong documentation skills.
Similar jobs you might like
Technology
New offer

xAI
Sr. Software Engineer (Data Center Automation)
Senior
On-site
Palo Alto, CA
🏢 Summary: Senior Software Engineer role focused on building automation and observability solutions to enhance reliability across multi-data center AI infrastructure. The position combines strong programming skills with hands-on data center and Linux systems expertise to minimize downtime and optimize performance. It involves developing scalable services, improving monitoring and incident response, and collaborating across infrastructure and facility teams. 🗂️ Requirements: Bachelor's degree in Computer Science, Computer Engineering, Electrical Engineering or related field (or equivalent experience), 3+ years experience in SRE, Infrastructure, DevOps, or Systems Engineering in large-scale production environments, Strong production experience in Python, Solid Linux systems administration and kernel-level knowledge, Experience with containerization and orchestration (Docker, Kubernetes or similar), Experience implementing observability solutions (metrics, logging, tracing, monitoring, alerting), Understanding of networking fundamentals (TCP/IP, routing, DNS, redundancy), Experience troubleshooting distributed systems, hardware and network issues, Experience with on-call rotations and incident response practices (SLAs, error budgets), Ability to collaborate with cross-functional technical teams 📃 Skills: Python, Rust, Linux, Kubernetes, Docker, Prometheus, Grafana, TCP/IP, DNS, Scripting, Automation, Observability, Monitoring, Tracing, Networking 🏢 Description: ABOUT xAI xAI's mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company's mission. 
Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important. All employees are expected to have strong communication skills. They should be able to concisely and accurately share knowledge with their teammates. ABOUT THE ROLE: We are seeking a highly skilled Sr. Software Engineer to join our team in managing and enhancing reliability across a multi-data center environment. This role focuses on automating processes, building and implementing robust observability solutions, and ensuring seamless operations for mission-critical AI infrastructure. The ideal candidate will combine strong coding abilities with hands-on data center experience to build scalable reliability services, optimize system performance, and minimize downtime—including close partnership with facility operations to address physical infrastructure impacts. In an era where AI workloads demand near-zero downtime, this position plays a pivotal role in bridging software engineering principles with physical data center realities. By prioritizing automation and observability, team members in this role can reduce mean time to recovery (MTTR) by up to 50% through proactive monitoring and automated remediation. The primary objective of this team is to mitigate downtime and minimize impact to end-users from both scheduled and unscheduled maintenance, as well as events affecting onsite data centers. This is achieved through proactive automation, robust observability, and integrated software-physical reliability strategies. RESPONSIBILITIES: - Design, develop, and deploy scalable code and services (primarily in Python and Rust) to automate reliability workflows, including monitoring, alerting, incident response, and infrastructure provisioning. 
- Implement and maintain observability tools and practices, such as metrics collection, logging, tracing, and dashboards, to provide real-time insights into system health across multiple data centers. - Collaborate with cross-functional teams to identify reliability bottlenecks and automate solutions for fault tolerance, disaster recovery, capacity planning, and physical/environmental risk mitigation. - Troubleshoot and resolve complex issues in data center environments, including hardware failures, environmental anomalies, software bugs, and network-related problems, while adhering to reliability principles like error budgets and SLAs. - Optimize Linux-based systems for performance, security, and reliability, including kernel tuning, container orchestration, and scripting for automation. - Understand network topologies and concepts in large-scale, multi-data center environments to troubleshoot connectivity, routing, redundancy, and performance issues. - Participate in on-call rotations, post-incident reviews (blameless postmortems), and continuous improvement initiatives. - Mentor junior team members and document processes to foster a culture of automation and knowledge sharing. BASIC QUALIFICATIONS: - Bachelor's degree in Computer Science, Computer Engineering, Electrical Engineering, or a closely related technical field (or equivalent professional experience). - 3+ years of hands-on experience in site reliability engineering (SRE), infrastructure engineering, DevOps, or systems engineering in large-scale, distributed, or production environments. - Strong programming skills with proven production experience in Python; experience with Rust or another systems-level language (e.g., Go, C++) is essential. - Solid experience with Linux systems administration, performance tuning, kernel-level understanding, and scripting/automation in production environments. - Practical knowledge of containerization and orchestration technologies, such as Docker and Kubernetes. 
- Experience implementing observability solutions, including metrics, logging, tracing, monitoring tools, alerting, and dashboards. - Familiarity with troubleshooting complex issues in distributed systems, including software bugs, hardware failures, network problems, and environmental factors. - Understanding of networking fundamentals (TCP/IP, routing, redundancy, DNS) in large-scale or multi-site environments. - Experience participating in on-call rotations, incident response, post-incident reviews, and reliability practices such as error budgets or SLAs. - Ability to collaborate effectively with cross-functional teams. PREFERRED SKILLS AND EXPERIENCE: - 5+ years of experience in SRE or infrastructure roles in hyperscale, cloud, or AI/ML training environments with multi-data center setups. - Hands-on experience operating or scaling Kubernetes clusters at large scale, including automation for provisioning and high availability. - Proficiency in Rust for systems programming and performance-critical components. - Experience integrating software reliability tools with physical data center infrastructure (power, cooling, environmental monitoring). - Experience building automated remediation, fault tolerance, disaster recovery, capacity planning, or predictive failure detection systems. - Background in optimizing Linux-based systems for AI workloads, GPU clusters, or high-throughput compute environments. - Experience with bare-metal provisioning, data center interconnects, or hybrid/multi-site failover mechanisms. - Mentoring experience and strong documentation skills. xAI is an equal opportunity employer. For details on data processing, view our Recruitment Privacy Notice.
Technology
New offer

xAI
Site Reliability Engineer - Cybersecurity
Senior
On-site
Palo Alto, CA
🏢 Summary: Cybersecurity / SRE role focused on securing and maintaining the reliability of a large-scale fintech platform operating in hybrid cloud environments. The position emphasizes Kubernetes and container security, SIEM management, CI/CD protection, and automation using Python and infrastructure-as-code tools. Candidates will work on mission-critical distributed systems, ensuring regulatory compliance and resilient security operations at scale. 🗂️ Requirements: Experience securing hybrid AWS/on-premises environments, Strong proficiency in Python, Strong proficiency in Terraform, Strong proficiency in Puppet, Deep expertise in Kubernetes, Experience with container security, Hands-on experience with GitHub Actions, Experience with Prometheus, Experience with Grafana, Experience with CloudWatch, Experience with Karma, Experience managing and integrating Wazuh, Experience with security scanning tools (Semgrep, Trivy, Falco), Experience with IAM and security posture management, Ability to comply with PCI and NIST CSF standards, Located in SF Bay Area or willing to relocate 📃 Skills: AWS, IAM, Python, Terraform, Puppet, Kubernetes, Docker, GitHub, Prometheus, Grafana, CloudWatch, Karma, Wazuh, Semgrep, Trivy, Falco, PCI, NIST, CI/CD 🏢 Description: ABOUT xAI xAI's mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company's mission. Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important. All employees are expected to have strong communication skills. 
They should be able to concisely and accurately share knowledge with their teammates. ABOUT THE ROLE: The Cybersecurity / SRE team is focused on ensuring the security and reliability of X Money. This role will primarily focus on the X Money platform but will also cross over with the X Social platform. The ideal candidate will have experience in the banking, money transmission, and P2P payments industry. We emphasize working with large distributed systems and security platforms at scale, with an automation-first mindset. You'll be responsible for securing and maintaining the reliability of X Money's infrastructure. You'll work closely with cross-functional teams to enhance security measures, improve system resilience, and implement best practices. RESPONSIBILITIES: - Build and secure mission-critical applications in a hybrid cloud environment. - Manage identities and roles effectively. - Monitor and remediate infrastructure to comply with regulations and best practices (e.g., PCI, NIST CSF). - Maintain a SIEM and all data pipelines needed for reliable alerting. - Design and implement secure container standards and automation to enable frictionless developer workflows. - Maintain Kubernetes security aligned with current best practices. - Build, deploy, and maintain security operations infrastructure using Python, Terraform, and Puppet. - Secure and enhance CI/CD pipelines. - Integrate and maintain code scanning platforms. - Develop dashboards and alerts from security metrics. - Own security projects: identify issues and implement solutions. - Apply critical analysis and problem-solving skills. BASIC QUALIFICATIONS: - Proven experience securing hybrid AWS/on-premises environments, including IAM and overall security posture. - Strong proficiency in Python, Terraform, and Puppet. - Certifications like CISA, CRISC, CGEIT, Security+, CASP+, or similar preferred. - Deep expertise in Kubernetes and container security. 
- Hands-on expertise building GitHub Actions and workflows. - Extensive experience with Prometheus, Grafana, CloudWatch, and Karma. - Well versed in management and integrations of Wazuh. - Hands-on experience with security scanning tools (Semgrep, Trivy, Falco). - Proactive mindset with strong ownership and problem-solving skills. - Excellent critical thinking and analytical abilities. - Located in the SF Bay Area or willing to relocate. COMPENSATION AND BENEFITS: $180,000 - $440,000 USD Base salary is just one part of our total rewards package, which also includes equity, comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short & long-term disability insurance, life insurance, and various other discounts and perks. xAI is an equal opportunity employer. For details on data processing, view our Recruitment Privacy Notice.
Technology
New offer

xAI
Backend Engineer - API
Senior
On-site
Palo Alto, CA
🏢 Summary: Engineering role focused on building and operating a highly scalable, low-latency API infrastructure that serves AI models globally. The position involves owning end-to-end distributed systems for high-throughput inference, including model serving, request routing, and observability. It requires deep expertise in Rust or C++ and strong experience with distributed systems and production-grade infrastructure. 🗂️ Requirements: Expert knowledge of Rust or C++, Experience designing and maintaining horizontally scalable distributed systems, Experience building reliable production infrastructure, Knowledge of service observability and reliability best practices, Experience operating PostgreSQL, Clickhouse, or MongoDB, Strong understanding of high-throughput, low-latency systems 📃 Skills: Rust, C++, Go, PostgreSQL, Clickhouse, MongoDB, gRPC, Docker, Kubernetes, TensorRT, vLLM, SGLang 🏢 Description: ABOUT THE ROLE: As an ideal candidate you have a good understanding of how highly scalable and reliable production infrastructure is built. Most of our backend infrastructure is written in Rust. Familiarity with a compiled language such as C++, Rust, or Go is highly beneficial. 
RESPONSIBILITIES:
- Build the xAI API that serves our models to developers worldwide.
- Own the end-to-end system responsible for high-throughput inference, handling billions of tokens per minute with low latency and high availability, including model serving infrastructure, request routing, SDK development, rate limiting, observability, and efficient scaling.
BASIC QUALIFICATIONS:
- Expert knowledge of either Rust or C++.
- Experience in designing, implementing, and maintaining reliable and horizontally scalable distributed systems.
- Knowledge of service observability and reliability best practices.
- Experience in operating commonly used databases such as PostgreSQL, Clickhouse, and MongoDB.
PREFERRED SKILLS AND EXPERIENCE:
- Experience with LLM inference engines and serving frameworks (e.g., SGLang, TensorRT, vLLM).
- Experience designing or building with agent SDKs and agent orchestration frameworks.
- Experience with Docker, Kubernetes, and containerized applications.
- Expert knowledge of gRPC (unary, response streaming, bi-directional streaming, REST mapping).
COMPENSATION AND BENEFITS: $180,000 - $440,000 USD
Base salary is just one part of the total rewards package, which also includes equity, comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short- and long-term disability insurance, life insurance, and various other discounts and perks.
xAI is an equal opportunity employer. For details on data processing, view the Recruitment Privacy Notice.
Technology

xAI
Backend Engineer - API
Senior
On-site
Palo Alto, CA
🏢 Summary: Engineering role focused on building and owning a high-throughput, low-latency API and backend infrastructure for large-scale model inference. The position involves designing and operating reliable, horizontally scalable distributed systems that serve billions of tokens per minute. You will develop and maintain model serving, routing, SDKs, and observability within a production-grade environment. 🗂️ Requirements: Expert knowledge of Rust or C++, Experience building and maintaining horizontally scalable distributed systems, Experience designing reliable high-availability production infrastructure, Knowledge of observability and reliability best practices, Experience operating PostgreSQL, Clickhouse, or MongoDB 📃 Skills: Rust, C++, Go, PostgreSQL, Clickhouse, MongoDB, gRPC, Docker, Kubernetes, TensorRT, vLLM, SGLang, REST, SDK 🏢 Description: About xAI xAI's mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company's mission. Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important. All employees are expected to have strong communication skills. They should be able to concisely and accurately share knowledge with their teammates.
ABOUT THE ROLE: As an ideal candidate you have a good understanding of how highly scalable and reliable production infrastructure is built. Most of our backend infrastructure is written in Rust, so familiarity with a compiled language such as C++, Rust, or Go is highly beneficial.
RESPONSIBILITIES:
- Build the xAI API that serves our models to developers worldwide.
- Own the end-to-end system responsible for high-throughput inference, handling billions of tokens per minute with low latency and high availability, including model serving infrastructure, request routing, SDK development, rate limiting, observability, and efficient scaling.
BASIC QUALIFICATIONS:
- Expert knowledge of either Rust or C++.
- Experience in designing, implementing, and maintaining reliable and horizontally scalable distributed systems.
- Knowledge of service observability and reliability best practices.
- Experience in operating commonly used databases such as PostgreSQL, Clickhouse, and MongoDB.
PREFERRED SKILLS AND EXPERIENCE:
- Experience with LLM inference engines and serving frameworks (e.g., SGLang, TensorRT, vLLM).
- Experience designing or building with agent SDKs and agent orchestration frameworks.
- Experience with Docker, Kubernetes, and containerized applications.
- Expert knowledge of gRPC (unary, response streaming, bi-directional streaming, REST mapping).
COMPENSATION AND BENEFITS: $180,000 - $440,000 USD
Base salary is just one part of our total rewards package at xAI, which also includes equity, comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short- and long-term disability insurance, life insurance, and various other discounts and perks.
xAI is an equal opportunity employer. For details on data processing, view our Recruitment Privacy Notice.
Technology
New offer

xAI
Software Engineer - Networking Software and Services
Senior
On-site
Palo Alto, CA
🏢 Summary: Opportunity to build and scale automation-first network software and services supporting large-scale GPU supercomputing fabrics for AI training and inference. The role focuses on developing tools for network management, metrics collection, provisioning, monitoring, and auto-remediation while implementing Infrastructure as Code best practices. You will design highly scalable, reliable systems that orchestrate tens of thousands of network devices in production environments. 🗂️ Requirements: Deep experience working with network engineers and network topologies, Strong knowledge of physical and logical network architectures, Strong knowledge of network protocols, Proven experience designing scalable and reliable software systems, Experience building systems that orchestrate large-scale network devices, Ability to implement Infrastructure as Code best practices, Experience enhancing deployment pipelines, Ability to create and define meaningful metrics for prioritization, Strong communication skills 📃 Skills: Python, Go, TCP/IP, BGP, RDMA, IaC, Networking, Automation 🏢 Description: ABOUT xAI xAI's mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company's mission. Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important. All employees are expected to have strong communication skills. They should be able to concisely and accurately share knowledge with their teammates. 
About the Role
As part of the Network Software and Services for AI (nssAI) team, you will build cutting-edge software, services, and frameworks to empower Network Development Engineers. You will work hands-on across all facets of network management, including metric collection, configuration, zero-touch provisioning, monitoring, and auto-remediation, driving automation-first solutions for production and ancillary networks. The role involves developing extensible tools, streamlining complex processes, and ensuring high reliability to support AI training and inference workloads.
Focus
- Building software and tools with extensive metrics coverage for large-scale GPU supercomputing network fabrics used for AI training and serving inference queries.
- Implementing Infrastructure as Code best practices, enhancing deployment pipelines, and ensuring robust and secure service delivery across production environments.
Preferred Skills and Experience
- Deep experience collaborating with network engineers using extensive knowledge of physical and logical network topologies and protocols.
- Expert knowledge and proven history of designing scalable and reliable software from the ground up.
- Experience building and orchestrating tens of thousands of network devices at high speed.
- Ability to thrive in ambiguity and create metrics to help prioritize team focus.
Tech Stack
- Python
- Go
- TCP/IP
- BGP
- RDMA
Annual Salary Range
$150,000 - $250,000
Benefits
Base salary is part of the total rewards package, which includes equity, comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short- and long-term disability insurance, life insurance, and additional discounts and perks.
xAI is an equal opportunity employer. For details on data processing, view the Recruitment Privacy Notice.
Technology

xAI
X Developer Platform – Forward Deployed Engineer, X API
Senior
On-site
Palo Alto, CA
🏢 Summary: Hands-on Forward Deployed Engineer role focused on building production solutions, sample applications, and developer tools on top of the X API platform for Enterprise and Indie developers. The position combines deep technical implementation with developer experience, emphasizing APIs, real-time data, and AI/agent integrations. You will work directly with customers to drive adoption and improve SDKs, documentation, and tooling across the ecosystem. 🗂️ Requirements: 5+ years in customer-facing technical role, Proficiency in at least two programming languages (Python, JavaScript, TypeScript, Java), Experience shipping production-quality software and developer tools, Strong knowledge of API design principles, Experience with REST or GraphQL APIs, Understanding of real-time streaming systems, Experience with authentication mechanisms, Hands-on experience building SDKs or developer tools, Experience creating technical documentation and guides, Experience with AI or LLM integrations 📃 Skills: Python, JavaScript, TypeScript, Java, REST, GraphQL, APIs, Streaming, Authentication, SDKs, CLI, LLM, AI, Rust, Scala, JVM 🏢 Description: ABOUT xAI xAI's mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company's mission. Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important. All employees are expected to have strong communication skills. 
They should be able to concisely and accurately share knowledge with their teammates.
ABOUT THE ROLE: X Developer Platform is responsible for the partner ecosystem of B2B and B2C developers that build solutions using X's extensive API set. With more than 500 million active users and billions of posts per week, X offers robust real-time and historical data, insights, and engagement opportunities across a wide range of organizations and industries. Our API gives you the ability to learn from and engage with the conversation on X, and we supply developers with the tools to further uncover, build on, and share the value of this conversation with the world. Our mission broadly is to achieve the state of X, "The Everything App", and be the clear digital town square of Earth. We make our ever-expanding universe of social media data available via our extensive API suite with consistent and reliable architecture so the world can realize the full potential of this amazing stream of information. Our team helps collect, process, enrich, and deliver hundreds of billions of signals a day through the X API platform. Our products are highly available, scalable, optimized, respectful of X's user base, and truly essential for our customers who build their businesses on X data. We are seeking an exceptional Forward Deployed Engineer who will work at the intersection of deep technical implementation and world-class developer experience. In this hands-on role, you will be on the front lines building production solutions for Enterprise customers, creating sample apps for both Enterprise and Indie developers, improving DevEx tools (SDKs, MCPs, CLIs, sandbox environments, and more), and authoring comprehensive how-to guides, with an emphasis on X API + Agentic/AI use cases and integrations. You are genuinely obsessed with developers, our documentation, and making every X API product super accessible, intuitive, and delightful to build with.
You will work directly with customers while collaborating closely with internal product and engineering teams to drive adoption, reduce friction, and maximize long-term value across the entire developer ecosystem. X Developer API Products Include: Real time streaming access to X data Historical access to archived X data Insight into X engagement data Enrichments on X objects derived from the latest machine learning technologies Flexible access to aggregate data End to end developer experience RESPONSIBILITIES: Partner directly with Enterprise customers to understand their business needs and rapidly build production-grade solutions, prototypes, integrations, and accelerators using the X API platform. Design, develop, and maintain high-quality sample applications, starter kits, reference implementations, and code examples for both Enterprise teams and Indie developers to accelerate adoption and showcase best practices. Build, enhance, and ship developer experience tools such as SDKs, MCPs, CLIs, Sandbox/Test Environments, and other internal/external tooling that dramatically improves developer productivity and ease of use. Research, write, and continuously improve comprehensive how-to guides, tutorials, cookbook recipes, technical blogs, and educational content — with special emphasis on X API integrations, real-time data, AI Agents, LLMs, and emerging use cases. Obsess over every aspect of developer experience: continuously audit and elevate our documentation, onboarding flows, code samples, and overall accessibility to make X API products the most approachable and powerful in the industry. Gather real-world feedback from customers and the broader developer community, advocate passionately for their needs internally, and collaborate with Product and Engineering teams to influence roadmap and feature prioritization. 
Diagnose complex technical issues, bugs, and edge cases; provide expert-level troubleshooting, workarounds, and long-term solutions while turning learnings into public guides and tooling improvements. Streamline support processes and create scalable materials that close knowledge gaps and accelerate success for both managed partners and self-serve developers. BASIC QUALIFICATIONS: Exceptional coding proficiency in two or more languages (Python, JavaScript/TypeScript, Java, etc.) with a proven track record of shipping production-quality software, sample apps, prototypes, and developer tools. Strong understanding of API design principles, REST/GraphQL, real-time streaming systems, authentication, and modern AI/agent workflows. Hands-on experience building developer-facing assets: sample applications, reference implementations, DevEx tools, and high-quality technical documentation/guides. Deep, genuine passion for developer experience (DevEx) — you instinctively identify friction and love removing it through better docs, tools, and accessible APIs. 5+ years of experience in a customer-facing technical role (partner engineering, solutions architecture, developer relations, forward-deployed engineering, or similar) working directly with enterprise customers. Ability to work comfortably and professionally with diverse stakeholders (software developers, product managers, technical executives, and business leaders) to define and deliver shared objectives. Excellent project management skills with the ability to scope, execute, and drive initiatives autonomously in a fast-paced environment. Outstanding verbal and written communication skills, including the ability to translate complex technical topics into clear, engaging documentation and presentations. Strong attention to detail and a solution-oriented mindset that turns customer problems into scalable improvements. 
PREFERRED SKILLS AND EXPERIENCE: Previous experience building or significantly contributing to developer platforms, tools, SDKs, interactive playgrounds, or educational content. Hands-on knowledge of AI/Agent frameworks, LLM integrations, or building AI-powered applications on top of data APIs. Industry experience in social media, enterprise software, data analytics, real-time/streaming data, or related spaces. Strong familiarity with Rust, Scala (ideally), or JVM-based programming languages. A public portfolio of sample apps, open-source contributions, technical blogs, guides, or DevEx tools that demonstrate your builder mindset and developer obsession. COMPENSATION AND BENEFITS: $180,000 - $440,000 USD Base salary is just one part of our total rewards package at xAI, which also includes equity, comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short & long-term disability insurance, life insurance, and various other discounts and perks. xAI is an equal opportunity employer. For details on data processing, view our Recruitment Privacy Notice.
Technology

xAI
Senior Data Analyst - Fraud & AML
Senior
On-site
Palo Alto, CA
148,000 - 220,000 USD/yr
🏢 Summary: Senior Data Scientist role focused on designing and optimizing AML and fraud detection models to strengthen financial crime compliance and transaction monitoring. The position involves building advanced analytics solutions, coverage assessment frameworks, and automated reporting to support BSA/AML, OFAC, and regulatory requirements. It is a cross-functional, high-impact role combining machine learning, compliance expertise, and scalable data solutions. 🗂️ Requirements: 7+ years data science experience in financial services, 4+ years experience in fraud and financial crime compliance, Master's degree in quantitative field, Experience building transaction monitoring models in regulated environment, Strong knowledge of BSA/AML and SAR processes, Understanding of sanctions screening and model risk management, Proficiency in Python and SQL, Experience supporting regulatory examinations, U.S. work authorization under ITAR requirements 📃 Skills: Python, SQL, MachineLearning, Statistics, AML, BSA, OFAC, SAR, FraudDetection, TransactionMonitoring, Compliance, ModelRisk, DataAnalytics, Dashboards, Automation, RPA 🏢 Description: About xAI xAI's mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company's mission. Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important. All employees are expected to have strong communication skills. 
They should be able to concisely and accurately share knowledge with their teammates. ABOUT THE ROLE: We are looking for a Senior Data Scientist to join our Compliance Program and play a pivotal role in modernizing and strengthening our financial crime detection capabilities. You will architect, build, and optimize data-driven transaction monitoring models, coverage assessment frameworks, and advanced analytics solutions that directly support BSA/AML regulatory compliance, including but not limited to SAR filing, Customer Identification Program elements, and Enhanced Due Diligence measures. The role will also support OFAC sanctions, fraud, and overall risk prioritization across multiple products and jurisdictions. This is a high-impact, cross-functional role that blends advanced analytics, machine learning, and deep compliance domain expertise. You will work closely with Compliance, Engineering, Model Risk, Product, and external regulators to ensure our controls are robust, defensible, and scalable. RESPONSIBILITIES: Design, develop, and enhance AML and fraud models, rules, and heuristics using Python, SQL, and AI-enabled tooling; partner with the Compliance Machine Learning team on model reviews to improve detection rates and reduce false positives. Build and maintain interactive performance dashboards and automated reporting solutions that track key risk, productivity, and capacity metrics for senior leadership and regulators. Architect and implement enterprise-wide Transaction Monitoring Coverage Assessment frameworks, including standardized methodologies for gap identification, root-cause analysis, remediation planning, and ongoing sustainability monitoring. Lead complex data initiatives, including extraction of SAR filing metrics with product-level breakdowns and development of jurisdiction- and typology-specific SAR narrative generator tools.
Embed data science best practices into product launches and feature rollouts to proactively identify and close monitoring coverage gaps. Support regulatory examinations (e.g., NYDFS Part 504) by preparing analytical documentation, third-party validation materials, and executive certification packages. Drive continuous improvement of compliance operations through automation, process optimization, and advanced analytics. BASIC QUALIFICATIONS: 7+ years of hands-on data science / advanced analytics experience in financial services, with at least 4 years focused on fraud and financial crime compliance. Master's degree (or higher) in Applied Mathematics, Statistics, Data Science, Actuarial Science, or a related quantitative field. Proven track record of building and optimizing transaction monitoring models, coverage frameworks, or compliance analytics programs in a regulated environment (fintech, bank, or payment company preferred). Deep understanding of BSA/AML regulations, suspicious activity reporting, customer due diligence, sanctions screening, and model risk management principles. Demonstrated ability to translate complex regulatory requirements into actionable data solutions and present findings to senior leadership and regulators. Certified Anti-Money Laundering Specialist (CAMS) or equivalent compliance certification is strongly preferred. PREFERRED SKILLS AND EXPERIENCE: Experience leading cross-functional initiatives involving Engineering, Legal, Product Compliance, and external consulting partners. Background in building internal case management systems, SAR automation tools, or RPA solutions. Familiarity with AML detection platforms. Track record of delivering measurable impact (e.g., reduced case volumes, improved detection of high-risk activity, increased operational efficiency). 
COMPENSATION AND BENEFITS: $148,000 - $220,000 USD Base salary is just one part of our total rewards package at xAI, which also includes equity, comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short & long-term disability insurance, life insurance, and various other discounts and perks. ITAR REQUIREMENTS: To conform to U.S. Government export regulations, applicant must be a (i) U.S. citizen or national, (ii) U.S. lawful permanent resident (aka green card holder), (iii) Refugee under 8 U.S.C. § 1157, or (iv) Asylee under 8 U.S.C. § 1158, or be eligible to obtain the required authorizations from the U.S. Department of State. xAI is an equal opportunity employer. For details on data processing, view our Recruitment Privacy Notice.
Technology
New offer

xAI
AI Tutor - Software Engineering Specialist
Senior
Remote
Palo Alto, CA
60 - 100 USD/hr
🏢 Summary: Opportunity to contribute to AI model training by curating, evaluating, and refining code while ensuring high standards of scalability, performance, and reliability. The role involves improving AI-generated code and collaborating cross-functionally to deliver enterprise-grade coding solutions. Flexible remote work options are available, with compensation based on experience and location. 🗂️ Requirements: Professional experience building scalable, high-performance applications, Deep expertise in at least one programming language, Proficiency in relevant frameworks and libraries, Strong understanding of software design principles, Experience with performance optimization, Experience implementing accessibility, security, and reliability standards, Strong debugging and profiling skills, Hands-on experience with testing frameworks and tools, Legal eligibility to work without visa sponsorship, Access to Chromebook, MacOS 11+, or Windows 10+ device if using personal computer 📃 Skills: Programming, Frameworks, Libraries, Debugging, Profiling, Testing, Docker, APIs, Databases, Authentication, Analytics, Monitoring, Security 🏢 Description: ABOUT THE ROLE: Contribute to AI model training initiatives by curating code examples, offering precise solutions, and providing meticulous corrections in specialized programming languages. Evaluate and refine AI-generated code, ensuring it adheres to industry standards for efficiency, scalability, and reliability. Collaborate with cross-functional teams to enhance AI-driven coding solutions, ensuring they meet enterprise-level quality and performance benchmarks. BASIC QUALIFICATIONS: Professional software engineering experience building scalable, high-performance applications. Deep expertise in one or more programming languages. Strong proficiency in relevant frameworks and libraries. Solid understanding of software design principles, performance optimization, and best practices. 
Experience implementing quality standards, including accessibility, security, and reliability where applicable. Strong debugging and profiling skills using development tools and performance monitoring. Hands-on experience with testing frameworks and tools relevant to your domain. PREFERRED SKILLS AND EXPERIENCE: The ideal candidate for this role is adaptable, possesses strong logical reasoning skills, is detail-oriented, and thrives in a fast-paced work environment. Experience integrating analytics, monitoring, and security best practices relevant to your technical domain. Experience with containerization technologies (e.g., Docker). Knowledge of complementary technologies (e.g., backend systems, APIs, databases, authentication) to enable effective cross-functional collaboration. LOCATION AND OTHER EXPECTATIONS: Tutor roles may be offered as full-time, part-time, or contractor positions, depending on role needs and candidate fit. For contractor positions, hours will vary widely based on project scope and contractor availability, with no fixed commitments required. On average, most projects may involve at least 10 hours per week to achieve deliverables effectively, though this is not a fixed commitment and depends on the scope of work. Contractors have full flexibility to set their own hours and determine the exact amount of time needed to complete deliverables. Tutor roles may be performed remotely from any location worldwide, subject to legal eligibility, time-zone compatibility, and role-specific needs. For US-based candidates, please note we are unable to hire in the states of Wyoming and Illinois at this time. We are unable to provide visa sponsorship. For those who will be working from a personal device, your computer must be a Chromebook, Mac with MacOS 11.0 or later, or Windows 10 or later.
COMPENSATION AND BENEFITS: US-based candidates: $60/hour - $100/hour depending on factors including relevant experience, skills, education, geographic location, and qualifications. International candidates: Information will be provided to you during the recruitment process. Benefits vary based on employment type, location, and jurisdiction. Benefits for eligible U.S.-based positions include health insurance, 401(k) plan, and paid sick leave. Specific details and role-specific information will be provided to you during the interview process. xAI is an equal opportunity employer. For details on data processing, view our Recruitment Privacy Notice.
Technology

xAI
Member of Technical Staff – Web Engineering
Senior
On-site
Palo Alto, CA
🏢 Summary: Fullstack/Web Engineer role focused on building and optimizing high-performance, real-time user-facing features for a large-scale social platform. The position emphasizes frontend excellence while contributing across the stack, including scalable backend systems and real-time analytics. You will own features end-to-end, driving architecture, performance, and reliability for globally deployed products. 🗂️ Requirements: 2+ years web development experience, Expertise in TypeScript, Expertise in Node.js, Expertise in React or modern web frameworks, Expertise in CSS/SASS, Experience with UI/UX design, Experience optimizing performance and security, Experience building scalable, high-concurrency systems, Backend development experience, Proficiency in Rust, Go, Java, Python, or Scala 📃 Skills: TypeScript, Node.js, React, CSS, SASS, Rust, Go, Java, Python, Scala, HTML, WebSockets, REST, GraphQL, SQL, NoSQL, Testing, CI/CD, Monitoring, Security 🏢 Description: About xAI xAI's mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company's mission. Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important. All employees are expected to have strong communication skills. They should be able to concisely and accurately share knowledge with their teammates. About the Role: We're looking for exceptional Fullstack / Web Engineers who can work across the stack but have a passion for frontend development and a keen eye for design.
You'll architect and optimize user-facing features that power real-time conversations for millions worldwide. Dive into cutting-edge technologies and scalable backend systems, collaborating with top-tier talent to push the boundaries of web performance and innovation. You have the ability to thrive in a fast-paced environment, where you proactively tackle high-impact challenges that shape the future of social media. This role is perfect for engineers passionate about crafting seamless, responsive experiences that drive global engagement and redefine digital interaction. Responsibilities: Own and drive features from inception and design to implementation and launch, being the web expert on your team. Build and maintain high-quality, performant products and features, leveraging the most modern and cutting-edge web standards, technologies, frameworks, and AI tooling. Be responsible for fullstack features, including user dashboards, personalized experiences, content delivery, interactive tools, assessments, and real-time analytics. Lead architecture, scalability, and reliability decisions for high-concurrency, low-latency systems. Uphold engineering excellence via testing, monitoring, deployment, and secure data handling. Drive technical/product decisions with teams and deploy global features to maximize user value. Basic Qualifications: 2+ years of web development experience. Expert in TypeScript, Node.js, and modern web frameworks (e.g., React). Expert in modern CSS/SASS. Experience in high-quality UI and UX design. Proven track record of optimizing applications for performance, security, and offline functionality. Preferred Skills and Experience: 5+ years of experience in a web frontend role, working on a large-scale consumer app. Experience with backend development, proficiency in one or more of the following: Rust, Go, Java, Python, Scala.
Compensation and Benefits: $180,000 - $440,000 USD Base salary is just one part of our total rewards package at xAI, which also includes equity, comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short & long-term disability insurance, life insurance, and various other discounts and perks. xAI is an equal opportunity employer. For details on data processing, view our Recruitment Privacy Notice.
Technology
New offer

xAI
Network Engineer
Mid
On-site
Memphis, TN
🏢 Summary: Opportunity for a Network Engineer to support and evolve a large-scale global network infrastructure, focusing on provisioning, monitoring, troubleshooting, and continuous improvement of corporate and backbone networking environments. The role involves hands-on work with routing, wireless, remote access, and network security technologies while participating in on-call operations. Ideal for candidates with solid networking fundamentals seeking to grow in a high-performance, production environment. 🗂️ Requirements: Basic understanding of BGP configuration and management, Basic knowledge of backbone and datacenter networks, Working knowledge of at least one internal routing protocol, Working knowledge of TCP/IP, Working knowledge of corporate networking, Working knowledge of remote access technologies, Working understanding of wireless technologies, Ability to provision and manage network devices and circuits, Ability to troubleshoot network hardware and software issues, Strong documentation skills, Strong communication skills, Eligibility to work under ITAR regulations (U.S. citizen, national, permanent resident, refugee, asylee, or authorized) 📃 Skills: BGP, TCP/IP, Routing, Wireless, RemoteAccess, NAC, Juniper, Cisco, Aruba, Arista, Scripting, Automation, NetworkSecurity, Firmware 🏢 Description: ABOUT THE ROLE: Our Network Engineering team handles a dynamic, constantly growing and evolving global network that provides a reliable, high-performance and secure network behind one of the few products in the world that touches over 1 billion people. The ideal candidate is a Network Engineer who is proficient with the fundamentals and eager to grow skills on a global network. 
RESPONSIBILITIES:
- Help to define configuration and architecture standards
- Contribute to the qualification of new platforms
- Provision network devices in corporate offices and POPs according to documented procedures
- Provision and manage corporate circuits
- Contribute to continuous improvement of configuration standards and operational procedures
- Contribute to continuous improvement of corporate device monitoring
- Liaise with third parties, IT, and remote office users to resolve issues
- Create and execute scheduled changes, processes, and documentation
- Participate in the on-call rotation
- Maintain and upgrade existing systems, including firmware updates and configurations
- Monitor network performance and ensure system availability and reliability
- Troubleshoot and resolve network hardware and software problems
BASIC QUALIFICATIONS:
- Basic understanding of BGP configuration and management
- Basic knowledge of backbone and datacenter networks
- Working understanding of wireless technologies
- Working knowledge of at least one internal routing protocol and its day-to-day operation
- Working knowledge of corporate networking
- Working knowledge of TCP/IP
- Working knowledge of remote access
- Strong documentation and communication skills
PREFERRED SKILLS AND EXPERIENCE:
- Familiarity with multiple networking vendors such as Juniper, Cisco, Aruba, Arista
- Familiarity with remote access technologies
- Knowledge of network security and NAC
- Experience with scripting and automation of network operations activities
ITAR REQUIREMENTS: To conform to U.S. Government export regulations, applicants must be a U.S. citizen or national, U.S. lawful permanent resident (green card holder), refugee under 8 U.S.C. § 1157, asylee under 8 U.S.C. § 1158, or be eligible to obtain the required authorizations from the U.S. Department of State.