June 8, 2026
Senior Site Reliability Engineer (SRE)
Senior • Remote
We are seeking a highly skilled and motivated Site Reliability Engineer (SRE) to join our team.
In this critical role, you will collaborate closely with software developers and operations teams to ensure high reliability, scalability, and efficiency of our systems, with a strong focus on meeting and exceeding customer expectations. Your expertise will be crucial in deploying, maintaining, and automating our infrastructure and application environments to ensure seamless user experiences.
Your proactive involvement will be key to enhancing system reliability, optimizing resource utilization, and ensuring continuous improvement in our operational practices.
Your responsibilities will include defining and tracking Service Level Objectives (SLOs), managing error budgets, and reducing toil through automation. You will play a pivotal role in driving the success of technology initiatives, maximizing their impact across the organization, and ensuring that solutions consistently meet the high standards our customers expect.
Responsibilities
Collaborate with development, security, quality, and operations teams to implement SRE practices and ensure system reliability
Define and support the required level of reliability, availability, and performance for services and applications
Design and deliver Cloud-based solutions tailored to client needs
Troubleshoot and mitigate infrastructure and application issues, and support their resolution, in a timely manner
Implement monitoring for infrastructure and application reliability
Communicate technical concepts clearly to both engineering teams and management stakeholders
Requirements
Bachelor’s degree in Computer Science, Engineering, or a related field
3+ years of hands-on experience in Site Reliability Engineering or related roles
Proven experience with at least one cloud platform (AWS, GCP, or Azure)
Experience implementing SRE practices such as SLOs/SLIs, error budgets, postmortems, toil reduction, capacity planning, and incident management
Proficiency in Python or another scripting/programming language
Strong background in monitoring tools
Proficiency in CI/CD tools, infrastructure as code, and configuration management
Solid knowledge of container and orchestration technologies (Docker, Kubernetes)
English language proficiency at an Upper-Intermediate level (B2) or higher
Nice to have
Expertise in deployment and management of LLMs, including technologies like RAG
Certification in Kubernetes, AWS/GCP/Azure, or similar technologies
Proven experience in DevOps
Knowledge of managing and optimizing AI/ML models in production environments, including basic deployment, monitoring, and maintenance
We offer/Benefits
We gather like-minded people:
Engineering community of industry professionals
Friendly team and enjoyable working environment
Flexible schedule and opportunity to work remotely within Poland
Chance to work abroad for up to 60 days annually
Business-driven relocation opportunities
We provide growth opportunities:
Outstanding career roadmap
Leadership development, career advising, soft skills, and well-being programs
Certification (GCP, Azure, AWS)
Unlimited access to LinkedIn Learning, getAbstract, A Cloud Guru
English classes
We cover it all:
Stable income (Employment Contract or B2B)
Participation in the Employee Stock Purchase Plan
Benefits package (health insurance, multisport, shopping vouchers)
Strategically located offices featuring entertainment and relaxation zones, table tennis and football, free snacks, fantastic coffee, and more
Referral bonuses
Corporate, social and well-being events
Please note:
The set of bonuses might vary based on the role you apply for – specifics will be discussed with our recruiter during the general interview.
We will reach out to selected candidates exclusively.
EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.