Senior Manager, Site Reliability Engineering
$164.6k - $235.1k per year
Tubi - Canada
About the Role:
Site Reliability Engineering (SRE) at Tubi is not a traditional operations team. We are a software engineering organization that applies a developer's mindset and toolkit to the challenges of building and running large-scale, distributed systems. Our mission is to engineer resilience from the ground up, enabling our product teams to innovate rapidly while ensuring our users have a stellar experience. We own the availability, latency, performance, and capacity of our platform, and we achieve our goals through a culture of data-driven decision-making, blameless learning, and relentless automation.
We are seeking an experienced and visionary Senior SRE Manager to lead and grow our newly built Site Reliability Engineering team. You are more than a people manager or a tech lead; you are the strategic leader responsible for architecting our reliability roadmap. You will build and mentor a team of talented engineers, foster a culture of blameless learning and continuous improvement, and champion the engineering practices that allow us to balance rapid innovation with rock-solid stability. You will be a key influencer in our engineering leadership, partnering with peers across the organization to ensure reliability is a shared responsibility and a core tenet of our engineering culture.
What You'll Do:
- Team Leadership & Mentorship:
- Lead, mentor, and grow a team of Site Reliability Engineers. Foster a culture of innovation and technical excellence where engineers feel empowered to do their best work. Provide personalized coaching, create professional development plans, and guide the careers of senior and emerging talent within the team.
- Establish equitable, sustainable on-call practices (including global coverage where applicable) that protect focus time and avoid burnout.
- Define team rituals - runbook reviews, game days, and incident retros - that reinforce quality and learning.
- Strategic Planning & Vision: Define and drive the multi-year technical strategy and vision for Tubi’s observability and automation platforms. Partner with the infrastructure lead to align Tubi’s infrastructure and SRE roadmaps. Partner with tech leaders to align the SRE roadmap with business objectives. Champion a data-driven approach to reliability, using Service Level Objectives (SLOs) and error budgets to facilitate productive conversations about risk and feature velocity.
- Operational Excellence & Incident Management:
- Own the end-to-end availability, performance, and efficiency of our critical user-facing services. Evolve our incident response practice to reduce Mean Time to Resolution (MTTR) and increase Mean Time Between Failures (MTBF). Champion a rigorous, blameless, and data-driven post-mortem culture to ensure we learn from both successes and failures, and drive engineering teams toward systemic fixes and automation that prevent incidents from recurring.
- Streamline and improve our existing processes and practices, and collaborate with other teams to raise our production release standards.
- Define and tune a 24×7 on-call rotation for low noise and fast response; act as executive escalation partner during major incidents.
- Own disaster-recovery strategy (playbooks, failover drills, recovery simulations) and track SLO gaps with time-bound remediations.
- Financial & Vendor Management: Own the SRE budget, tooling, and headcount. Manage relationships with key third-party vendors for our observability and SRE-related AI platforms. Work with the infrastructure lead and finance team on contract negotiations, and ensure we derive maximum value from our investments.
- Cross-Functional Collaboration: Act as a key influencer and strategic partner to leaders in Software Engineering, Product Management, and Infra/Sec. Drive the adoption of SRE best practices and principles throughout the organization, ensuring new services are designed for reliability, scalability, and observability from day one.
- The AI Mandate: Building the Future of Observability with AI. You will not just manage a team that uses AI; you will lead the charge in building an AI-native SRE function. This is a strategic mandate that requires a forward-thinking leader who understands both the potential and the pitfalls of integrating intelligent systems into critical operations. This includes:
- AIOps Strategy Development: Developing and executing the strategy for integrating AIOps and machine learning into our observability stack. Your goal will be to move the team from a reactive monitoring posture to one of predictive maintenance and automated anomaly detection, fundamentally changing how we ensure reliability.
- Accelerating Automation with AI: Championing the effective and responsible use of AI-assisted coding tools (e.g., Claude Code, Cursor) within the SRE team. You will set the standards and practices to leverage these tools to accelerate the development of automation, operational tooling, and infrastructure code.
- Building the Business Case: Building the techno-economic case for new AI tooling, managing vendor relationships, and ensuring the cost-effective and secure implementation of these powerful systems. You must be able to articulate the ROI of these investments in terms of reduced downtime, improved operational efficiency, and faster incident resolution.
- Fostering Critical AI Literacy: Fostering a culture that can critically evaluate, debug, and learn from the outputs of AI systems. This involves extending our blameless post-mortem philosophy to AI-driven actions and recommendations, ensuring that the team remains in control and understands the "why" behind automated decisions.
Your Background:
- 8+ years of experience in a technical field, with at least a year in an engineering leadership position managing SRE, DevOps, or Production Engineering teams.
- A deep, principled understanding of SRE tenets, including Service Level Indicators (SLIs), SLOs, error budgets, toil reduction, and capacity planning.
- Exceptional communication, negotiation, and influencing skills, with the ability to articulate complex technical concepts and strategies to both technical and non-technical stakeholders at all levels of the organization.
- A strong technical background as a hands-on software engineer or site reliability engineer prior to moving into management. Deep knowledge of AWS services (especially networking, IAM, EKS, ALBs/NLBs, Route 53, CloudWatch). Proven experience with Kubernetes in production (EKS preferred), including service exposure, networking, and availability engineering.
- Hands-on familiarity with modern SRE tools and technologies, including Infrastructure as Code (e.g., Terraform, Ansible), container orchestration (Kubernetes), observability platforms (e.g., Prometheus, Grafana, Datadog, Splunk), incident tooling (e.g., PagerDuty, FireHydrant), deployment-safety tooling (e.g., Argo Rollouts, LaunchDarkly), and observability standards (e.g., OpenTelemetry).
Pursuant to local pay disclosure requirements, the annual pay range for this role is listed below. The final offer amount depends on education, skills, experience, and location.
This role is also eligible for an annual discretionary bonus, long-term incentive plan, and various benefits including medical/dental/vision, insurance, vacation/paid time off and other benefits in accordance with applicable plan documents.
Toronto, Canada
$164,600—$235,100 CAD
Tubi Media Group is a division of Fox Corporation, and the FOX Employee Benefits summarized here cover the majority of employee benefits. The distinctions below outline the differences between the Tubi and FOX benefits:
For all salaried employees, in lieu of the FOX Vacation policy, Tubi offers a Flexible Time Off Policy to manage all personal matters.
For all full-time, regular employees, in lieu of FOX Paid Parental Leave, Tubi offers a generous Parental Leave Program, which allows parents twelve (12) weeks of paid bonding leave (top up in Canada) within the first year of birth, adoption, surrogacy, or foster placement of a child in addition to applicable government leave program(s) and FOX’s short-term disability policy (if applicable). This time is 100% paid through a combination of any applicable government leaves and wage-replacement programs in addition to contributions made by Tubi.
For all full-time, regular employees, Tubi offers a monthly wellness reimbursement.
About Tubi:
Boldly built for every fandom, Tubi is a free streaming service that entertains over 100 million monthly active users. Tubi offers the world's largest collection of Hollywood movies and TV shows, thousands of creator-led stories and hundreds of Tubi Originals made for the most passionate fans. Headquartered in San Francisco and founded in 2014, Tubi is part of Tubi Media Group, a division of Fox Corporation.
We are an equal opportunity employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, gender identity, disability, protected veteran status, or any other characteristic protected by law. We will consider for employment qualified applicants with criminal histories consistent with applicable law.
