Cloud Site Reliability Engineer

Temporary

Astra North Infoteck Inc.

Skills: Dynatrace, Observability, Monitoring Engineering, SRE Practices

Experience: 6-8 years

Job Description

We are seeking a highly skilled Dynatrace Monitoring Engineer / Site Reliability Engineer (SRE) responsible for designing, implementing, and maintaining observability solutions across enterprise applications and infrastructure. This role focuses on proactive monitoring, performance visibility, incident prevention, and enforcing reliability standards through service-level objectives (SLOs). The ideal candidate brings deep Dynatrace expertise along with strong troubleshooting, communication, and architectural awareness.

Key Responsibilities

Dynatrace Engineering & Monitoring

Design, configure, and maintain Dynatrace dashboards, alerting rules, and synthetic monitoring for business-critical URLs.

Build customized dashboards for:

Application Performance (APM)

Infrastructure monitoring (hosts, processes, services)

Kubernetes & cloud workloads

Business metrics & SLA/SLO insights

Use DQL (Dynatrace Query Language) to create advanced tiles, analytic views, and metric visualizations.

Standardize dashboards to be reusable, scalable, and aligned with business KPIs.

Observability & SRE Practices

Define and manage Service Level Objectives (SLOs) to measure availability, reliability, and operational performance.

Exercise key SRE decision rights (e.g., rejecting operationally substandard software, advising developers on improvements).

Implement observability requirements so that systems meet expected service levels and operational characteristics.

Focus on reliability, scalability, and performance of production computing systems, including complex distributed systems.

Develop observability standards that ensure predictable system behavior and early detection of errors or failures.

Incident Management & Problem Resolution

Conduct root cause analysis (RCA) through post-mortem reviews, ensuring permanent remediation and preventing recurrence.

Troubleshoot application, infrastructure, and integration-level monitoring issues.

Integrate Dynatrace and monitoring workflows with ITSM platforms.

Cross-Functional Collaboration

Work closely with infrastructure, application, cloud, and security teams to ensure seamless operational monitoring.

Lead or contribute to enterprise-wide initiatives as a subject matter expert.

Interact with governance, audit, compliance, and risk groups to provide observability insights and ensure adherence to standards.

Identify emerging technologies and propose innovative enhancements to monitoring and reliability engineering practices.

Essential Skills

Strong hands-on experience with Dynatrace SaaS/Managed, including dashboard creation, alert configuration, and synthetic monitoring.

Strong understanding of APM concepts, infrastructure monitoring, cloud monitoring, and (preferably) Kubernetes/microservices environments.

Familiarity with DQL, metrics, entity models, and relationships within Dynatrace.

Experience integrating Dynatrace or similar monitoring tools with ITSM systems.

Excellent troubleshooting and communication skills.

Strong foundation in networking, reliability engineering, scalability, and cloud operational characteristics.

Ability to drive SRE practices such as:

SLO creation

Release readiness assessments

Operational risk evaluation

Continuous improvement through automation and observability standards

Vacancy posted 8 hours ago