Site Reliability Engineer II

Company: The Walt Disney Company
Location: New York City
Posted on: April 4, 2026

Job Description:

Job Posting Title: Site Reliability Engineer II
Req ID: 10143234

Department/Group Overview:

Our engineering fleet is a horizontal set of teams providing engineering services across the organization. Our specific team provides reliability engineering and operational support to backend service development teams.

Disney Entertainment and ESPN Product & Technology:

Technology is at the heart of Disney’s past, present, and future. Disney Entertainment and ESPN Product & Technology is a global organization of engineers, product developers, designers, technologists, data scientists, and more – all working to build and advance the technological backbone for Disney’s media business globally.

The team marries technology with creativity to build world-class products, enhance storytelling, and drive velocity, innovation, and scalability for our businesses. We are Storytellers and Innovators. Creators and Builders. Entertainers and Engineers. We work with every part of The Walt Disney Company’s media portfolio to advance the technological foundation and consumer media touch points serving millions of people around the world.

Here are a few reasons why we think you’d love working here:

Building the future of Disney’s media: Our Technologists are designing and building the products and platforms that will power our media, advertising, and distribution businesses for years to come.

Reach, Scale & Impact: More than ever, Disney’s technology and products serve as a signature doorway for fans’ connections with the company’s brands and stories. Disney. Hulu. ESPN. ABC. ABC News…and many more. These products and brands – and the unmatched stories, storytellers, and events they carry – matter to millions of people globally.

Innovation: We develop and implement groundbreaking products and techniques that shape industry norms and solve complex and distinctive technical problems.

Job Description:

The Streaming SRE squad drives improvements in performance, resiliency, and operational excellence. We take a consultative approach to reliability engineering, partnering with a variety of cross-functional teams to provide guidance, automation, education, and best practices that elevate the reliability and scalability of services that support our products and brands.

We are seeking a Site Reliability Engineer who will contribute to the stability and scalability of critical systems by building automation, improving operational workflows, enhancing observability, and participating in incident response. The ideal candidate has a strong understanding of distributed system fundamentals, cloud-native resources and operations, and performance optimization. This role requires a collaborative mindset and the ability to work closely with engineering teams to implement SRE principles across the organization.

Fostering innovation is a critical component of success here at Disney Entertainment and ESPN Product & Technology. Therefore, the ideal candidate will also need to be highly adaptable to change and able to pivot when required.

Responsibilities:

Contribute to the design, implementation, and improvement of systems to enhance reliability, scalability, and performance.
Build and maintain automation for deployment, monitoring, alerting, and operational workflows.
Collaborate with software engineering teams to implement SRE best practices, including SLIs, SLOs, error budgets, and automated remediation.
Support CI/CD pipelines and participate in optimizing the software delivery lifecycle.
Develop tools, dashboards, and instrumentation to improve observability across metrics, logs, and distributed tracing.
Participate in incident response, root cause analysis (RCA), and corrective actions to prevent recurrence.
Assist in capacity planning, performance tuning, and scaling strategies for distributed systems.
Maintain and improve Infrastructure-as-Code (IaC) definitions and cloud environment configurations.
Contribute to documentation, runbooks, architectural diagrams, and operational standards.
Collaborate with cross-functional teams to identify reliability risks and recommend improvements.
Participate in incident-based escalations and rotations to support high-availability production systems.
Continuously evaluate system architecture, tools, and practices to drive operational excellence and efficiency.

Basic Qualifications:

Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).
3 years of experience in Site Reliability Engineering, DevOps, Platform Engineering, or a related discipline.
Hands-on experience with cloud platforms: AWS (preferred), GCP, or Azure.
Proficiency in Python, Go, JavaScript, Bash, or equivalent scripting languages.
Working knowledge of Linux or Unix-based systems.
Experience with CI/CD systems (e.g., GitHub Actions, GitLab CI, Jenkins).
Familiarity with Infrastructure-as-Code (Terraform, CloudFormation, etc.).
Experience with containerization technologies such as Docker and Kubernetes.
Understanding of networking fundamentals, distributed systems, and system design basics.
Strong analytical and troubleshooting skills, including the ability to diagnose complex system issues.
Ability to work both independently and collaboratively.
Strong communication skills and the ability to collaborate effectively with cross-functional teams.

Preferred Qualifications:

Hands-on experience with observability stacks (Prometheus, Grafana, ELK/EFK, Datadog, Splunk, New Relic).
Exposure to GitOps tooling (Argo CD, Flux).
Experience contributing to SLO/SLI frameworks and implementing error budgets.
Knowledge of service mesh architectures (Istio, Linkerd).
Familiarity with performance testing and load testing tools.
Experience with message queues, event-driven systems, or distributed data platforms.
Cloud or DevOps-related certifications (AWS Associate/Specialty, GCP Professional, Kubernetes CKA/CKS).
Experience working in large-scale enterprise environments or with distributed global teams.
Experience using modern AI-assisted development tools (e.g., Copilot, Cursor, or similar) to improve code quality, accelerate development, and enhance documentation.
Understanding of foundational AI/ML concepts, familiarity with cloud-native AI services such as model hosting, and/or ability to use AI tools to automate cloud operations tasks.

Compensation:

The hiring range for this position in New York City is $123,000 - $165,000. The base pay actually offered will take into account internal equity and may also vary depending on the candidate’s geographic region, job-related knowledge, skills, and experience, among other factors. A bonus and/or long-term incentive units may be provided as part of the compensation package, in addition to the full range of medical, financial, and/or other benefits, dependent on the level and position offered.

Job Posting Segment: PE - Sports, News & Entertainment, Enablement
Job Posting Primary Business: PE - Sports, News & Entertainment, Enablement - Infrastructure Engineering
Primary Job Posting Category: Site/System Reliability Engineer
Employment Type: Full time
Primary City, State, Region, Postal Code: New York, NY, USA
Alternate City, State, Region, Postal Code:
Date Posted: 2026-02-26


