Staff Site Reliability Engineer

Company: Gradle Technologies
Location: New York City
Posted on: February 16, 2026

Job Description:

Who We Are

Develocity is a first-of-its-kind toolchain observability and acceleration platform that helps software teams adopt and improve DORA capabilities (including continuous delivery) in order to achieve software delivery excellence. It combines build and test acceleration with deep observability for builds and tests with Gradle Build Tool, Apache Maven™, sbt, npm, and Python, and applies to both CI and local builds and tests. Ultimately, Develocity provides an operational layer across an organization's toolchains to speed up, troubleshoot, and optimize local developer and remote CI feedback loops.

Our software is used by some of the world's leading software organizations, such as Netflix, Airbnb, SAP, several top ten banks, and many other major customers across all verticals. We regularly collaborate with these and other users to make our products continuously better.

We have partnered with the Apache Software Foundation, the Commonhaus Foundation, the Scala Center, the Micronaut Foundation, and other OSS projects like Spring, Quarkus, Kotlin, JUnit, AndroidX, and many more to also bring the values of Develocity to the OSS community.

Our Values

- Seek to Understand: Everything starts with listening and understanding, and we strive to understand different viewpoints, problems, and motivations. Before we take action, we ensure we truly grasp the challenges, perspectives, and goals.
- Know the Why: We approach our work with a clear sense of purpose, ensuring every step is deliberate and focused. We take meaningful action with urgency, but never at the expense of thoughtful consideration.
- Innovate & Iterate: We embrace challenges and are not afraid to try new things, even if they might fail. With deep understanding and a clear purpose, we can develop creative and bold solutions to tackle challenges.
- Own the Outcome: We are empowered to take initiative and we maintain transparency in our work and its outcomes. When we execute, we take responsibility for our decisions, measure the success of our innovations, and learn from the results.

Who You Are

We're building a new SRE team and looking for founding members to help shape how we operate. As a Lead SRE, you'll be a technical and operational leader for reliability across Develocity. You'll help define our SRE vision, set standards for how we operate production services, and mentor other SREs as the team grows. This is a hands-on role with broad influence across engineering, cloud platform, and customer-facing teams.

The SRE team will be responsible for the reliability, performance, and availability of Develocity instances serving paying customers, open-source projects, and public-facing services, plus supporting infrastructure like artifact registries. You'll work on our internally-built Cloud Application Platform, Kubernetes on AWS, and develop deep expertise in it. When incidents happen, you'll troubleshoot issues across the stack, from application to infrastructure. You'll collaborate with the Cloud Platform team to improve the tooling you depend on, and with engineering teams to build reliability into how we ship software. If you like automating things and hate doing the same task twice, you'll fit in well.

You'll be part of a distributed, remote-first team that values asynchronous communication and written documentation. Strong self-direction and clear communication across time zones are essential.

Responsibilities

- Operate and maintain all Develocity instances and supporting services in production.
- Define and evolve SRE standards, practices, and operating models, including on-call, incident response, postmortems, and SLOs.
- Participate in a follow-the-sun on-call rotation, acting as a technical escalation point for complex or high-severity incidents.
- Lead incident response and blameless retrospectives, ensuring learnings result in measurable reliability improvements.
- Set reliability priorities using risk, customer impact, business goals, SLOs, and error budgets.
- Identify systemic reliability risks and continuously evolve Develocity's SaaS operations as the platform and customer base grow.
- Lead and influence architectural and design reviews to ensure reliability, scalability, and operability.
- Drive automation across deployment, upgrades, monitoring, self-healing, recovery, and operational workflows.
- Build and maintain comprehensive observability for all managed services, including logging, metrics, tracing, and alerting.
- Own disaster recovery, backups, and business continuity planning and execution.
- Partner with engineering leadership to balance feature delivery with reliability and operational excellence.
- Mentor and coach SREs, supporting technical growth and strong operational practices.
- Help onboard new SREs and contribute to hiring by defining and assessing SRE excellence at Develocity.
- Communicate clearly with customers during incidents and maintenance windows.
- Optimize performance, resource utilization, and operational costs.

Minimum qualifications

- 7 years in SRE, DevOps, or an equivalent role operating production services at scale.
- Experience leading reliability initiatives across multiple teams or services.
- Demonstrated ability to influence technical direction without direct authority.
- Experience designing and operating systems with SLOs and error budgets, and exercising strong judgment in balancing reliability, velocity, and cost.
- Strong Kubernetes experience in production environments.
- Cloud infrastructure expertise, preferably AWS (EKS, RDS, S3, EC2).
- Proficiency with observability tools (Prometheus, Grafana) and Infrastructure as Code (Terraform).
- Track record of incident management and response in a 24/7 on-call environment.
- Scripting proficiency (Python, Bash) for automation.
- Strong written and verbal English communication skills.

Preferred qualifications

- Experience as a founding or early SRE establishing practices in a growing SaaS organization.
- Familiarity with Develocity.
- JVM language experience (Java, Kotlin).
- Experience with customer-facing and executive-level incident communications.

What We Offer

- A ground-floor role in a new SRE team - you'll shape how we do things, not inherit someone else's decisions.
- Real ownership of production systems used by engineers at companies you've heard of.
- Direct interaction with customers when things go wrong (and when they go right).
- A culture that values automation over heroics.
- In-person meetings, such as our annual company offsite and team meetings.
- Work from home in a remote-first environment.
- Competitive salaries and equity grants.

Compensation

The US salary range for this position is $180-220k, which reflects the target ranges for all US locations. Within this range, individual pay is determined by geographic location and additional factors including but not limited to experience, relevant skills, qualifications, seniority, performance, and travel requirements. Our recruiting team can share more information about the specific salary range for your location during the hiring process.

Location

Remote from anywhere in the EST time zone. While our team works remotely and is spread across the globe, we deeply value daily interactions and collaboration.


