Site Reliability Engineer
Polygon.io
We’re hiring a high-performing SRE to lead ops tasks and build scalable, highly available tools and systems for our growing platform.
About this role
Embark on an exciting journey as an SRE on the Platform Engineering team at Polygon.io. We are seeking a high-performing site reliability engineer who will specialize in operational tasks and projects that help us build highly available, scalable tools and systems. You will have significant ownership over the platform, specifically technologies like bare-metal Kubernetes, container runtimes, Ceph, GitHub Actions, ArgoCD, Argo Workflows, and Atlantis.
Responsibilities
- Partner closely with development teams to build, maintain, and enhance large-scale distributed systems, ensuring high availability and excellent user experience.
- Write, configure, and deploy reliable code to improve existing or new systems, setting exemplary standards for code quality.
- Participate actively in the on-call rotation to support operational stability.
- Create clear documentation, including system designs, technical analyses, runbooks, and playbooks.
- Provide constructive design feedback and mentor team members to enhance their design capabilities.
- Collaborate with development teams to optimize system reliability and performance, applying a platform engineering approach to operational tasks.
- Develop and manage automation for operational processes, including monitoring, performance optimization, and disaster recovery.
- Identify, diagnose, and resolve issues across development, testing, and production environments.
- Lead or contribute to incident postmortems and implement preventative measures to minimize recurrence.
- Design, deploy, and manage comprehensive monitoring solutions using Prometheus, LGTM Stack, and OpenTelemetry for proactive issue detection and platform observability.
Skills & Qualifications
- 4+ years of experience as an operations engineer
- Experience operating Kubernetes
- Preferred experience supporting on-premises environments
- Preferred experience with distributed file systems such as Ceph
- Preferred experience with Juniper networking infrastructure
- Preferred experience with IaC tools such as Terraform and Puppet
About Polygon.io
At Polygon.io, we’re on a mission to modernize Wall Street by empowering developers with the tools to shape the future of finance. We’re reimagining financial market data for the 21st century, removing barriers, simplifying access, and creating frictionless, forward-thinking technologies.
Join us and become part of a passionate team that consistently sets new industry standards, creating a profound impact on the world of finance and technology, and leveling the playing field by providing fair access for all.