June 8, 2026

Observability Specialist

Senior • Hybrid

Warsaw, Poland

Introduction & Summary

We are seeking an experienced Observability Specialist dedicated to ensuring the reliability and performance of our systems. This role involves collaborating with enterprise architects and IT professionals to design, implement, and oversee a scalable telemetry infrastructure. The ideal candidate will possess deep expertise in ELK or similar technologies and modern telemetry standards.

 

Main Responsibilities

As our Observability Specialist, your core duties will include:

  • Architectural Collaboration: Partner with system architects and local engineering teams in Denmark to design resilient monitoring solutions.

  • Monitor Kubernetes environments with OpenTelemetry (OTel) standards for logs, traces, and metrics.

  • Manage centralized data collection and automate Elastic deployments using Ansible.

  • Utilize Elastic APM for identifying code-level bottlenecks and resolving latency issues.

  • Implement AIOps configurations for proactive anomaly detection and automated root-cause analysis.

  • Drive Site Reliability Engineering (SRE) methodologies across teams.

  • Elastic Stack Management: Deploy, scale, and maintain Elasticsearch, Logstash, and Kibana (ELK) environments.

 

Key Requirements

  • Cloud-Native Observability: Strong skills in monitoring Kubernetes (OpenShift) environments and integrating with major cloud providers.

  • APM & Distributed Tracing: Expertise in Application Performance Monitoring (APM) to identify code-level bottlenecks and latency issues.

  • OpenTelemetry (OTel): Hands-on experience implementing OpenTelemetry (or similar) standards for logs, traces, and metrics to ensure vendor-neutral telemetry.

  • Infrastructure as Code (IaC): Proficiency in automating Elastic environments with Ansible.

  • Performance Engineering: Expert-level knowledge of shard optimization, mapping, and Index Lifecycle Management (ILM) to balance high performance with cost control.

  • SRE Methodology: Experience defining and monitoring Service Level Objectives (SLOs) and managing Error Budgets.

  • Strong communication skills for collaboration with IT teams.

Nice to Have

  • Elastic Stack Mastery: Deep expertise in architecting and managing Elasticsearch, Logstash, and Kibana (ELK) at scale.

  • Data Ingestion & Fleet: Proven experience deploying Elastic Agent and Fleet for centralized agent management and data collection.

  • AIOps & Machine Learning: Ability to configure Elastic ML models for proactive anomaly detection and automated root-cause analysis.

 

Other Details

This position is based in Warsaw with a flexible hybrid model and focuses on leading-edge observability solutions in a dynamic, collaborative environment.

Similar jobs you might like

Technology

Link Group

DevOps Engineer (Observability)

Senior

Hybrid

Warsaw, Poland

130 - 145 PLN

🏢 Summary: Design and scale next-generation observability and logging solutions within an international DevOps team, focusing on building high-scale monitoring platforms and cloud-native infrastructure from the ground up. The role combines architecture, infrastructure as code, and reliability engineering for distributed systems. You will drive metrics, logging, tracing, and alerting solutions in a collaborative environment. 🗂️ Requirements: Hands-on experience with Prometheus and Grafana, Experience scaling observability tools such as Thanos or Mimir, Experience managing ELK stack or Loki logging platforms, Strong proficiency in Terraform and Terragrunt, Deep understanding of Kubernetes, Experience with distributed systems observability (metrics, logs, traces), Full professional proficiency in English 📃 Skills: Prometheus, Grafana, Thanos, Mimir, ELK, Loki, Terraform, Terragrunt, Kubernetes, Python, Go, GitHubActions, Puppet 🏢 Description: The Opportunity Join a high-performing, international team of six DevOps experts. This is not a "maintenance-only" role. You will have a seat at the table in designing, building, and scaling our next-generation observability and logging solutions from the ground up. We believe in "Attitude First." If you are an ambitious engineer who thrives on collaboration, knowledge sharing, and solving complex distributed systems challenges, we want to grow with you. Key Responsibilities Architect & Build: Design and implement end-to-end observability solutions, including metrics, logging, tracing, and advanced alerting. Platform Excellence: Operate and optimize high-scale monitoring platforms (Prometheus, Mimir, Grafana) and ELK stack logging infrastructure. Infrastructure as Code: Define and maintain all observability systems using Terraform and Terragrunt . Reliability Engineering: Ensure the scalability and performance of our systems while supporting incident detection and root cause analysis (RCA). 
Collaborate: Work across domains with a team that values mentoring, transparency, and collective problem-solving. Your Technical Core Observability Expert: Solid hands-on experience with Prometheus, Grafana, and scaling tools like Thanos or Mimir . Logging Architect: Proven experience managing enterprise-grade logging platforms (ELK stack or Loki). IaC Ninja: Strong proficiency in Terraform/Terragrunt to manage infrastructure. Cloud Native: Deep understanding of Kubernetes and the complexities of metrics/logs/traces in distributed systems. Language: Full proficiency in English for seamless global collaboration. Stand Out From The Crowd (Nice to Have) Coding: Ability to automate and integrate using Python or Go . CI/CD: Exposure to GitHub Actions and automated workflows. Configuration Management: Experience with Puppet. SRE Mindset: Understanding of Service Level Indicators (SLIs), Objectives (SLOs), and Error Budgets.

Technology

emagine Polska

Site Reliability Engineer

Senior

Remote

Lisbon, Portugal

🏢 Summary: Hands-on Observability Engineer role focused on building and automating enterprise-grade monitoring and observability solutions across AWS-based cloud and distributed systems. The position centers on developing infrastructure as code, CI/CD pipelines, and monitoring ecosystems to improve reliability, performance, and incident response. Approximately 90% of the role involves coding in Python and Terraform. 🗂️ Requirements: Strong hands-on experience with AWS, Strong Python development and scripting experience, Strong experience with Terraform, Experience building and maintaining CI/CD pipelines using Jenkins, Experience with Elasticsearch and ELK Stack, Experience with Linux systems, Shell scripting skills, Understanding of monitoring, logging, and alerting concepts, Experience working in Agile or DevOps environments 📃 Skills: AWS, Python, Terraform, Jenkins, Elasticsearch, ELK, Linux, Bash, CI/CD, Kubernetes, Grafana, Prometheus, Datadog, NewRelic, Snowflake, Databricks, dbt, Matillion 🏢 Description: Role Overview We are looking for a skilled and proactive Observability Engineer to implement, automate, and support enterprise-grade observability and monitoring solutions across cloud and application platforms. The ideal candidate should have strong AWS infrastructure knowledge, hands-on automation skills, and experience building reliable monitoring and alerting ecosystems for modern distributed applications. The role involves working closely with Platform Engineering, Data Engineering, and Application teams to develop observability solutions and bring operational visibility, reliability, incident detection, and platform performance. Main Responsibilities ·        Design, implement, and maintain observability solutions for cloud-native and distributed systems. ·        Build monitoring, logging, alerting, and dashboarding solutions across infrastructure and applications. ·        Develop automation scripts and tooling using Python. 
·        Implement and maintain Infrastructure as Code (IaC) using Terraform. ·        Build and support CI/CD pipelines using Jenkins and Git-based workflows. ·        Configure and optimize monitoring for AWS services, Kubernetes workloads, APIs, databases, and applications. ·        Create actionable alerts and operational dashboards to improve incident response and system reliability. ·        Work with engineering teams to onboard applications into observability platforms. ·        Support troubleshooting, root cause analysis, and performance optimization initiatives. ·        Ensure observability standards, governance, and best practices are followed across projects. Key Requirements ·        Strong hands-on experience with Amazon Web Services (AWS). ·        Solid Python development/scripting experience. ·        Strong experience with Terraform. ·        Experience building and maintaining CI/CD pipelines using Jenkins. ·        Elasticsearch / ELK Stack experience and building queries. ·        Worked with Data Platforms monitoring is preferred. ·        Experience with Linux systems and shell scripting. ·        Understanding of monitoring, logging, and alerting concepts. ·        Experience working in Agile/DevOps environments. Nice to Have Skills Experience with any of the following is highly desirable: ·        Snowflake ·        Databricks ·        dbt ·        Matillion ·        Grafana ·        New Relic ·        Datadog ·        Prometheus ·        Elasticsearch / ELK Stack experience NOTES: We are looking for an Engineer who loves to build. This is a highly technical role—90% of the job is hands-on coding in python and terraform.

Technology

Spyrosoft

Senior Platform Optimization & Observability Engineer

Senior

Remote

Wroclaw, Poland

150 - 200 PLN

🏢 Summary: The offer is for a technical role responsible for owning platform health, optimization, and the observability stack within a complex enterprise environment. The position focuses on improving virtualization and storage performance, enhancing security posture, optimizing disaster recovery, and migrating monitoring and security capabilities to a new ELK-based observability platform. The role combines deep operational work with migration and optimization initiatives across infrastructure and monitoring systems. 🗂️ Requirements: Hands-on experience optimizing virtualization platforms, Strong storage performance and capacity optimization skills, Experience with platform security hardening, Deep operational experience with ELK stack, Experience migrating dashboards, queries, and reports from Log Analytics, Strong understanding of disaster recovery optimization and recovery metrics 📃 Skills: ELK, APM, VMware, Hyper-V, KVM, Proxmox, LogAnalytics, Monitoring, Alerting, Virtualization, Storage, Security, DisasterRecovery, Azure 🏢 Description: Tech stack: ELK stack (observability, APM, security) VMware, Hyper‑V, KVM / Proxmox Log Analytics (migration source) Monitoring and alerting platforms Requirements: Hands‑on experience optimizing virtualization platforms Strong storage performance and capacity optimization skills Experience with platform security hardening Deep operational experience with ELK stack Experience migrating dashboards, queries, and reports from Log Analytics Strong understanding of DR optimization and recovery metrics Nice to have: Compliance scanning tools (CIS) SOC 1 / SOC 2 / C5 familiarity Sentinel rule migration experience Experienced in using AI tools in day-to-day workflow Project description: You will own platform health, optimization, and the observability stack in a complex enterprise environment. 
The project focuses on improving platform performance, security posture, DR effectiveness, and migrating monitoring and security capabilities to a new observability platform. Main responsibilities: Optimize virtualization and storage platforms Expand observability with APM and security capabilities Migrate monitoring and security assets from Azure tooling Optimize logging, alerting, and retention strategies Review and improve DR and firewall configurations Collaborate with network and security engineers

Technology

Caspian One

Site Reliability Engineer

Senior

Hybrid

Krakow, Poland

1,400 - 1,800 PLN

🏢 Summary: Hands-on Site Reliability Engineer role focused on ensuring stability, scalability, and observability of a mission-critical distributed risk and analytics platform in hybrid cloud environments. The position centers on production reliability, incident response, automation, and continuous improvement of monitoring and deployment processes. You will collaborate with engineering teams to strengthen system resilience, performance, and operational standards. 🗂️ Requirements: Strong Java experience in distributed systems, Experience with observability and monitoring tools, Hands-on experience with hybrid cloud environments (preferably GCP), Experience with CI/CD pipelines and automation tools, Solid knowledge of Linux systems administration, Understanding of RDBMS fundamentals, Experience with job schedulers (e.g., Control-M), Ability to lead incident response and root-cause analysis 📃 Skills: Java, Grafana, Prometheus, Loki, OpenTelemetry, GCP, Jenkins, Ansible, Linux, SQL, Control-M, CI/CD 🏢 Description: We’re looking for a seasoned Site Reliability Engineer to support a high‑performance, mission‑critical risk and analytics platform used across global trading and finance environments. You’ll play a key role in ensuring the stability, scalability, and observability of complex distributed systems running across hybrid cloud infrastructure. In this role, you’ll take ownership of production reliability driving incident response, conducting root‑cause analysis, improving monitoring capabilities, and delivering automation that reduces operational toil. You’ll work closely with development teams, platform engineers, and service management leads to strengthen resilience, refine processes, and enhance the engineering culture around availability and performance. This is a hands on technical position suited to someone who thrives in high‑throughput environments, communicates clearly, and enjoys solving deep engineering problems in real time. 
Core Responsibilities Maintain and improve the reliability, uptime, and performance of distributed applications. Lead incident response, triage complex issues, coordinate recoveries, and deliver structured post‑incident reviews. Enhance observability—designing and evolving monitoring, alerting, logging, and tracing frameworks. Drive continuous improvement across automation, deployment processes, and service stability. Collaborate with cross‑functional teams to influence architecture, design, and operational standards. Support CI/CD pipelines, environment configuration, and vulnerability remediation. Contribute to a knowledge‑driven culture through documentation, tooling, and best‑practice adoption. Required Skills & Experience Strong Java background with proven experience supporting or developing distributed systems. Observability tooling expertise (Grafana, Prometheus, Loki, OpenTelemetry or similar). Hands‑on with hybrid cloud environments , ideally with GCP or another major cloud provider. CI/CD and automation experience (e.g., Jenkins, Ansible). Solid understanding of Linux , RDBMS fundamentals , and job schedulers (e.g., Control‑M or equivalents). Strong analytical mindset with a methodical approach to troubleshooting. Excellent communication skills and comfort working in Agile teams.

Technology

Yard Corporate

Site Reliability Engineer (SRE)

Senior

Hybrid

Warsaw, Poland

40,000 - 55,000 PLN

🏢 Summary: Senior Site Reliability Engineer role focused on building and standardizing SRE practices across a hybrid AWS and on-prem infrastructure. The position centers on ensuring scalability, resilience, and high availability of high-frequency, data-intensive platforms through observability, automation, and Kubernetes optimization. You will define SLOs, enhance monitoring architecture, and drive reliability culture across engineering teams. 🗂️ Requirements: 5+ years experience in SRE, DevOps, or Infrastructure Engineering supporting distributed production systems, Bachelor’s degree in Computer Science, Computer Engineering, or related field (or equivalent experience), Deep expertise in Grafana, Prometheus, Loki, and Tempo (OpenTelemetry), Strong production experience with Docker and Kubernetes, Experience managing hybrid infrastructure (AWS and on-premises), Proficiency in at least one language: Python, Go, or Bash, Hands-on experience with CI/CD pipelines and Infrastructure-as-Code, Experience defining and managing SLOs and SLAs, Willingness to participate in on-call rotation 📃 Skills: AWS, Kubernetes, Docker, Prometheus, Grafana, Loki, Tempo, OpenTelemetry, Python, Go, Bash, CI/CD, IaC, Git, Hypervisors 🏢 Description: About the Client Our client is a premier, global investment management firm operating at the intersection of finance and technology. Known for their sophisticated, data-intensive systems, they build and maintain high-performance platforms that process massive volumes of market and operational data. To support their expanding footprint, they are looking for a senior-level Site Reliability Engineer (SRE) who will take ownership of shaping, standardizing, and scaling their SRE frameworks and reliability culture from the ground up. The Role In this role, you will serve as a foundational force for SRE practices, partnering directly with Cloud, Infrastructure, and Software Engineering squads. 
You will work across a hybrid infrastructure (combining advanced AWS cloud environments and physical on-premises servers) to guarantee the scalability, resilience, and maximum uptime of critical, high-frequency transactional platforms. Core Responsibilities SRE Evangelism: Design, implement, and champion core reliability principles, helping technology teams adopt sustainable scaling practices. Observability Architecture: Implement, scale, and maintain end-to-end monitoring, telemetry, and distributed tracing systems utilizing Prometheus, Grafana, Loki, and Tempo (OpenTelemetry framework). Kubernetes Optimization: Establish best-practice configurations for containerized workloads, ensuring applications running on Kubernetes are highly resilient, cost-effective, and performant. Incident Management & Culture: Participate in a balanced, shared on-call rotation (averaging one week per month). Automation & Engineering: Build custom tooling and CI/CD pipelines to automate routine tasks, system health checks, and rapid disaster recovery workflows. SLO/SLA Definition: Partner with product and engineering teams to define, monitor, and enforce Service Level Objectives (SLOs) and Error Budgets. What We Look For Experience: 5+ years of hands-on experience in a dedicated SRE, DevOps, or Infrastructure Engineering role supporting complex, distributed production systems. Education: A Bachelor’s degree in Computer Science, Computer Engineering, or a related technical discipline (or equivalent practical experience). Observability Expertise: Deep, subject-matter knowledge of modern monitoring stacks, specifically Grafana, Prometheus, Loki, and Tempo (OTel). Orchestration & Containers: Strong, production-grade expertise in containerization (Docker) and orchestration (Kubernetes). Hybrid Infrastructure: Experience navigating hybrid models—managing both cloud services (AWS preferred) and physical on-premise hardware resources. 
Scripting/Coding: Proficiency in writing clean, maintainable code in at least one scripting or programming language (e.g., Python, Bash, or Go) to build reliable automation. Methodologies: Solid grounding in CI/CD concepts, infrastructure-as-code (IaC), and agile development processes. Soft Skills: Excellent verbal and written communication skills, with a proven ability to convey complex infrastructure and reliability concepts to both technical and non-technical stakeholders. What We Offer Stable Employment: Full-time employment contract ( Umowa o Pracę - UoP ). Tax Optimization: Eligibility for creative tax-deductible costs ( KUP - Koszty Uzyskania Przychodu). Financial Reward: Highly competitive base salary accompanied by a generous annual performance bonus . Comprehensive Health: Premium private medical care package that fully includes dental coverage (stomatologia) . Wellness & Lifestyle: MultiSport card to keep you active and healthy. Daily Perks: Pre-funded lunch card for your daily meals. Tech Stack at a Glance Cloud & Virtualization: AWS, Kubernetes, Docker, On-Premises Hypervisors Observability: Prometheus, Grafana, Loki, Tempo, OpenTelemetry (OTel) Languages: Python, Go, Bash CI/CD & Automation: Git-based pipelines, Configuration Management, IaC

Technology

EPAM Systems

Senior C++ Engineer with Observability

Senior

Remote

Katowice, Poland

🏢 Summary: Senior C++ Engineer with strong observability expertise to lead the design and implementation of monitoring and telemetry solutions across the software development lifecycle. The role focuses on low-latency, high-performance distributed systems, ensuring measurable customer experience, reliability, and operational insight. Combines hands-on engineering with technical leadership in large-scale production environments. 🗂️ Requirements: 5+ years of observability engineering experience (metrics, tracing, logging), Strong C++ engineering skills, Experience with profiling and telemetry pipelines, Experience building large-scale monitoring or observability platforms, Knowledge of latency-sensitive market data systems, Expertise in OpenTelemetry, eBPF, and GitOps, Experience with API-driven automation and CI/CD-integrated observability, Knowledge of cloud-native, Kubernetes, and distributed systems architectures, Ability to define standards and guide engineering teams, English proficiency at B2 level or above 📃 Skills: C++, OpenTelemetry, eBPF, GitOps, Kubernetes, CI/CD, APIs, Cloud-native, Distributed-systems, Metrics, Tracing, Logging, Profiling, Telemetry 🏢 Description: We are looking for a Senior C++ Engineer with Observability expertise to spearhead our observability implementation. This position blends practical engineering with technical leadership, guaranteeing that customer experience, system reliability, and operational insight remain measurable and integrated across the entire software development lifecycle. The perfect candidate will offer substantial experience in low-latency trading, market data, or similar high-performance distributed systems, paired with a solid C++ foundation and a proven history of delivering production monitoring, telemetry, and operational tooling at scale. 
Responsibilities Spearhead observability implementation throughout the software development lifecycle Guarantee that customer experience, system reliability, and operational insight stay measurable and integrated Construct and maintain large-scale monitoring, observability, or control platforms Set standards and promote the adoption of observability best practices Collaborate effectively with development teams to deliver telemetry and operational tooling Deploy metrics, tracing, logging, profiling, and telemetry pipelines Advance customer-experience measurement for latency-sensitive market data systems Guide engineering teams toward shared observability goals Requirements More than 5 years of experience in observability engineering, covering metrics, tracing, and logging Strong skills in profiling and telemetry pipelines Knowledge of customer-experience measurement for latency-sensitive market data systems Experience building and maintaining large-scale monitoring, observability, or control platforms Strong C++ engineering skills along with the capacity to collaborate effectively with development teams Expertise in OpenTelemetry, eBPF, and GitOps Capability in API-driven automation and CI/CD-integrated observability practices Knowledge of cloud-native, Kubernetes, and distributed systems architectures Demonstrated ability to guide engineering teams and define standards English proficiency at B2 level or above Nice to have Experience in trading or market-making Knowledge of exchange connectivity Understanding of market data environments We offer We gather like-minded people: Engineering community of industry professionals Friendly team and enjoyable working environment Flexible schedule and opportunity to work remotely within Poland Chance to work abroad for up to 60 days annually Business-driven relocation opportunities We provide growth opportunities: Outstanding career roadmap Leadership development, career advising, soft skills, and well-being programs 
Certification (GCP, Azure, AWS) Unlimited access to LinkedIn Learning, Get Abstract, Cloud Guru English classes We cover it all: Stable income (Employment Contract or B2B) Participation in the Employee Stock Purchase Plan Benefits package (health insurance, multisport, shopping vouchers) Strategically located offices featuring entertainment and relaxation zones, table tennis and football, free snacks, fantastic coffee, and more Referral bonuses Corporate, social and well-being events Please, note: The set of bonuses might vary based on the role you apply for – specifics will be discussed with our recruiter during the general interview. We will reach out to selected candidates exclusively. EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.

Technology

EPAM Systems

Senior C++ Engineer with Observability

Senior

Remote

Lodz, LD, Poland

🏢 Summary: Senior C++ Engineer role focused on leading observability implementation across the software development lifecycle in low-latency, high-performance distributed systems. The position combines hands-on C++ engineering with technical leadership to deliver scalable monitoring, telemetry, and operational tooling. The role emphasizes measurable customer experience, system reliability, and operational insight in market data or trading environments. 🗂️ Requirements: 5+ years of observability engineering experience (metrics, tracing, logging), Strong C++ engineering skills, Experience with profiling and telemetry pipelines, Experience building and maintaining large-scale monitoring or observability platforms, Expertise in OpenTelemetry, eBPF, and GitOps, Experience with API-driven automation and CI/CD-integrated observability, Knowledge of cloud-native, Kubernetes, and distributed systems architectures, Knowledge of customer-experience measurement for latency-sensitive systems, Ability to define standards and guide engineering teams, English proficiency at B2 level or higher 📃 Skills: C++, OpenTelemetry, eBPF, GitOps, Kubernetes, CI/CD, APIs, Telemetry, Tracing, Logging, Profiling, Cloud-native 🏢 Description: We are looking for a Senior C++ Engineer with Observability expertise to spearhead our observability implementation. This position blends practical engineering with technical leadership, guaranteeing that customer experience, system reliability, and operational insight remain measurable and integrated across the entire software development lifecycle. The perfect candidate will offer substantial experience in low-latency trading, market data, or similar high-performance distributed systems, paired with a solid C++ foundation and a proven history of delivering production monitoring, telemetry, and operational tooling at scale. 
Responsibilities Spearhead observability implementation throughout the software development lifecycle Guarantee that customer experience, system reliability, and operational insight stay measurable and integrated Construct and maintain large-scale monitoring, observability, or control platforms Set standards and promote the adoption of observability best practices Collaborate effectively with development teams to deliver telemetry and operational tooling Deploy metrics, tracing, logging, profiling, and telemetry pipelines Advance customer-experience measurement for latency-sensitive market data systems Guide engineering teams toward shared observability goals Requirements More than 5 years of experience in observability engineering, covering metrics, tracing, and logging Strong skills in profiling and telemetry pipelines Knowledge of customer-experience measurement for latency-sensitive market data systems Experience building and maintaining large-scale monitoring, observability, or control platforms Strong C++ engineering skills along with the capacity to collaborate effectively with development teams Expertise in OpenTelemetry, eBPF, and GitOps Capability in API-driven automation and CI/CD-integrated observability practices Knowledge of cloud-native, Kubernetes, and distributed systems architectures Demonstrated ability to guide engineering teams and define standards English proficiency at B2 level or above Nice to have Experience in trading or market-making Knowledge of exchange connectivity Understanding of market data environments We offer We gather like-minded people: Engineering community of industry professionals Friendly team and enjoyable working environment Flexible schedule and opportunity to work remotely within Poland Chance to work abroad for up to 60 days annually Business-driven relocation opportunities We provide growth opportunities: Outstanding career roadmap Leadership development, career advising, soft skills, and well-being programs 
Certification (GCP, Azure, AWS) Unlimited access to LinkedIn Learning, Get Abstract, Cloud Guru English classes We cover it all: Stable income (Employment Contract or B2B) Participation in the Employee Stock Purchase Plan Benefits package (health insurance, multisport, shopping vouchers) Strategically located offices featuring entertainment and relaxation zones, table tennis and football, free snacks, fantastic coffee, and more Referral bonuses Corporate, social and well-being events Please, note: The set of bonuses might vary based on the role you apply for – specifics will be discussed with our recruiter during the general interview. We will reach out to selected candidates exclusively. EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.

Technology

EPAM Systems

Senior C++ Engineer with Observability

Senior

Remote

Krakow, Poland

🏢 Summary: Senior C++ Engineer role focused on leading observability implementation across the software development lifecycle for low-latency, high-performance distributed systems. The position combines hands-on engineering with technical leadership to ensure measurable customer experience, system reliability, and operational insight. It involves building large-scale monitoring platforms and advancing telemetry practices in latency-sensitive market data environments. 🗂️ Requirements: 5+ years of experience in observability engineering (metrics, tracing, logging), Strong proficiency in C++, Experience with profiling and telemetry pipelines, Experience building and maintaining large-scale monitoring or observability platforms, Expertise in OpenTelemetry, eBPF, and GitOps, Experience with API-driven automation and CI/CD-integrated observability, Knowledge of cloud-native, Kubernetes, and distributed systems architectures, Experience with customer-experience measurement in latency-sensitive systems, Ability to define standards and guide engineering teams, English proficiency at B2 level or higher 📃 Skills: C++, OpenTelemetry, eBPF, GitOps, Kubernetes, CI/CD, APIs, Telemetry, Tracing, Logging, Profiling, Cloud-native, DistributedSystems 🏢 Description: We are looking for a Senior C++ Engineer with Observability expertise to spearhead our observability implementation. This position blends practical engineering with technical leadership, guaranteeing that customer experience, system reliability, and operational insight remain measurable and integrated across the entire software development lifecycle. The perfect candidate will offer substantial experience in low-latency trading, market data, or similar high-performance distributed systems, paired with a solid C++ foundation and a proven history of delivering production monitoring, telemetry, and operational tooling at scale. 

Responsibilities

  • Spearhead observability implementation throughout the software development lifecycle
  • Guarantee that customer experience, system reliability, and operational insight stay measurable and integrated
  • Construct and maintain large-scale monitoring, observability, or control platforms
  • Set standards and promote the adoption of observability best practices
  • Collaborate effectively with development teams to deliver telemetry and operational tooling
  • Deploy metrics, tracing, logging, profiling, and telemetry pipelines
  • Advance customer-experience measurement for latency-sensitive market data systems
  • Guide engineering teams toward shared observability goals

Requirements

  • More than 5 years of experience in observability engineering, covering metrics, tracing, and logging
  • Strong skills in profiling and telemetry pipelines
  • Knowledge of customer-experience measurement for latency-sensitive market data systems
  • Experience building and maintaining large-scale monitoring, observability, or control platforms
  • Strong C++ engineering skills along with the capacity to collaborate effectively with development teams
  • Expertise in OpenTelemetry, eBPF, and GitOps
  • Capability in API-driven automation and CI/CD-integrated observability practices
  • Knowledge of cloud-native, Kubernetes, and distributed systems architectures
  • Demonstrated ability to guide engineering teams and define standards
  • English proficiency at B2 level or above

Nice to have

  • Experience in trading or market-making
  • Knowledge of exchange connectivity
  • Understanding of market data environments

We offer

We gather like-minded people:

  • Engineering community of industry professionals
  • Friendly team and enjoyable working environment
  • Flexible schedule and opportunity to work remotely within Poland
  • Chance to work abroad for up to 60 days annually
  • Business-driven relocation opportunities

We provide growth opportunities:

  • Outstanding career roadmap
  • Leadership development, career advising, soft skills, and well-being programs
  • Certification (GCP, Azure, AWS)
  • Unlimited access to LinkedIn Learning, Get Abstract, Cloud Guru
  • English classes

We cover it all:

  • Stable income (Employment Contract or B2B)
  • Participation in the Employee Stock Purchase Plan
  • Benefits package (health insurance, multisport, shopping vouchers)
  • Strategically located offices featuring entertainment and relaxation zones, table tennis and football, free snacks, fantastic coffee, and more
  • Referral bonuses
  • Corporate, social and well-being events

Please note: The set of bonuses might vary based on the role you apply for – specifics will be discussed with our recruiter during the general interview. We will reach out to selected candidates exclusively.

EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.

Technology

EPAM Systems

Senior C++ Engineer with Observability

Senior

Remote

Warsaw, Poland

🏢 Summary: Senior C++ Engineer role focused on leading observability implementation across the software development lifecycle in high-performance, low-latency distributed systems. The position combines hands-on engineering with technical leadership to build and scale monitoring, telemetry, and operational tooling for market data environments. It emphasizes reliability, customer-experience measurement, and cloud-native observability practices.

🗂️ Requirements:

  • 5+ years of experience in observability engineering (metrics, tracing, logging)
  • Strong C++ engineering skills
  • Experience with profiling and telemetry pipelines
  • Experience building large-scale monitoring or observability platforms
  • Knowledge of latency-sensitive market data systems
  • Expertise in OpenTelemetry
  • Expertise in eBPF
  • Experience with GitOps practices
  • Experience with API-driven automation
  • Experience integrating observability with CI/CD
  • Knowledge of Kubernetes
  • Knowledge of cloud-native architectures
  • Knowledge of distributed systems architectures
  • Ability to define standards and guide engineering teams
  • English proficiency at B2 level or higher

📃 Skills: C++, OpenTelemetry, eBPF, GitOps, Kubernetes, CI/CD, APIs, Telemetry, Tracing, Logging, Profiling, Cloud-native, Distributed Systems

🏢 Description: We are looking for a Senior C++ Engineer with Observability expertise to spearhead our observability implementation. This position blends practical engineering with technical leadership, guaranteeing that customer experience, system reliability, and operational insight remain measurable and integrated across the entire software development lifecycle. The perfect candidate will offer substantial experience in low-latency trading, market data, or similar high-performance distributed systems, paired with a solid C++ foundation and a proven history of delivering production monitoring, telemetry, and operational tooling at scale.

Responsibilities

  • Spearhead observability implementation throughout the software development lifecycle
  • Guarantee that customer experience, system reliability, and operational insight stay measurable and integrated
  • Construct and maintain large-scale monitoring, observability, or control platforms
  • Set standards and promote the adoption of observability best practices
  • Collaborate effectively with development teams to deliver telemetry and operational tooling
  • Deploy metrics, tracing, logging, profiling, and telemetry pipelines
  • Advance customer-experience measurement for latency-sensitive market data systems
  • Guide engineering teams toward shared observability goals

Requirements

  • More than 5 years of experience in observability engineering, covering metrics, tracing, and logging
  • Strong skills in profiling and telemetry pipelines
  • Knowledge of customer-experience measurement for latency-sensitive market data systems
  • Experience building and maintaining large-scale monitoring, observability, or control platforms
  • Strong C++ engineering skills along with the capacity to collaborate effectively with development teams
  • Expertise in OpenTelemetry, eBPF, and GitOps
  • Capability in API-driven automation and CI/CD-integrated observability practices
  • Knowledge of cloud-native, Kubernetes, and distributed systems architectures
  • Demonstrated ability to guide engineering teams and define standards
  • English proficiency at B2 level or above

Nice to have

  • Experience in trading or market-making
  • Knowledge of exchange connectivity
  • Understanding of market data environments

We offer

We gather like-minded people:

  • Engineering community of industry professionals
  • Friendly team and enjoyable working environment
  • Flexible schedule and opportunity to work remotely within Poland
  • Chance to work abroad for up to 60 days annually
  • Business-driven relocation opportunities

We provide growth opportunities:

  • Outstanding career roadmap
  • Leadership development, career advising, soft skills, and well-being programs
  • Certification (GCP, Azure, AWS)
  • Unlimited access to LinkedIn Learning, Get Abstract, Cloud Guru
  • English classes

We cover it all:

  • Stable income (Employment Contract or B2B)
  • Participation in the Employee Stock Purchase Plan
  • Benefits package (health insurance, multisport, shopping vouchers)
  • Strategically located offices featuring entertainment and relaxation zones, table tennis and football, free snacks, fantastic coffee, and more
  • Referral bonuses
  • Corporate, social and well-being events

Please note: The set of bonuses might vary based on the role you apply for – specifics will be discussed with our recruiter during the general interview. We will reach out to selected candidates exclusively.

EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.

Technology

EPAM Systems

Senior C++ Engineer with Observability

Senior

Remote

Poznan, WP, Poland

🏢 Summary: Senior C++ Engineer role focused on leading observability implementation across the full software development lifecycle for low-latency, high-performance distributed systems. The position combines hands-on engineering with technical leadership to build and scale monitoring, telemetry, and operational tooling for latency-sensitive market data environments. It requires strong C++ expertise and deep experience in production-grade observability practices.

🗂️ Requirements:

  • 5+ years of experience in observability engineering (metrics, tracing, logging)
  • Strong C++ engineering skills
  • Experience with profiling and telemetry pipelines
  • Experience building and maintaining large-scale monitoring or observability platforms
  • Knowledge of latency-sensitive market data systems
  • Expertise in OpenTelemetry, eBPF, and GitOps
  • Experience with API-driven automation and CI/CD-integrated observability
  • Knowledge of cloud-native, Kubernetes, and distributed systems architectures
  • Ability to define standards and guide engineering teams
  • English proficiency at B2 level or higher

📃 Skills: C++, OpenTelemetry, eBPF, GitOps, Kubernetes, CI/CD, APIs, Telemetry, Tracing, Logging, Profiling, Cloud-native, Distributed Systems

🏢 Description: We are looking for a Senior C++ Engineer with Observability expertise to spearhead our observability implementation. This position blends practical engineering with technical leadership, guaranteeing that customer experience, system reliability, and operational insight remain measurable and integrated across the entire software development lifecycle. The perfect candidate will offer substantial experience in low-latency trading, market data, or similar high-performance distributed systems, paired with a solid C++ foundation and a proven history of delivering production monitoring, telemetry, and operational tooling at scale.

Responsibilities

  • Spearhead observability implementation throughout the software development lifecycle
  • Guarantee that customer experience, system reliability, and operational insight stay measurable and integrated
  • Construct and maintain large-scale monitoring, observability, or control platforms
  • Set standards and promote the adoption of observability best practices
  • Collaborate effectively with development teams to deliver telemetry and operational tooling
  • Deploy metrics, tracing, logging, profiling, and telemetry pipelines
  • Advance customer-experience measurement for latency-sensitive market data systems
  • Guide engineering teams toward shared observability goals

Requirements

  • More than 5 years of experience in observability engineering, covering metrics, tracing, and logging
  • Strong skills in profiling and telemetry pipelines
  • Knowledge of customer-experience measurement for latency-sensitive market data systems
  • Experience building and maintaining large-scale monitoring, observability, or control platforms
  • Strong C++ engineering skills along with the capacity to collaborate effectively with development teams
  • Expertise in OpenTelemetry, eBPF, and GitOps
  • Capability in API-driven automation and CI/CD-integrated observability practices
  • Knowledge of cloud-native, Kubernetes, and distributed systems architectures
  • Demonstrated ability to guide engineering teams and define standards
  • English proficiency at B2 level or above

Nice to have

  • Experience in trading or market-making
  • Knowledge of exchange connectivity
  • Understanding of market data environments

We offer

We gather like-minded people:

  • Engineering community of industry professionals
  • Friendly team and enjoyable working environment
  • Flexible schedule and opportunity to work remotely within Poland
  • Chance to work abroad for up to 60 days annually
  • Business-driven relocation opportunities

We provide growth opportunities:

  • Outstanding career roadmap
  • Leadership development, career advising, soft skills, and well-being programs
  • Certification (GCP, Azure, AWS)
  • Unlimited access to LinkedIn Learning, Get Abstract, Cloud Guru
  • English classes

We cover it all:

  • Stable income (Employment Contract or B2B)
  • Participation in the Employee Stock Purchase Plan
  • Benefits package (health insurance, multisport, shopping vouchers)
  • Strategically located offices featuring entertainment and relaxation zones, table tennis and football, free snacks, fantastic coffee, and more
  • Referral bonuses
  • Corporate, social and well-being events

Please note: The set of bonuses might vary based on the role you apply for – specifics will be discussed with our recruiter during the general interview. We will reach out to selected candidates exclusively.

EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.