Site Reliability Engineer (SRE) - Azure AKS, Observability & Terraform
Temporary
Astra North Infoteck Inc.
Remote Role
Key Responsibilities
- Experience in Observability, SRE, or DevOps roles, with expertise in infrastructure and application reliability
- Dynatrace, ELK, Splunk, PagerDuty
- SLI/SLO frameworks
- Azure Kubernetes Service (AKS), Terraform, Azure managed services
What will you do?
- Design and implement observability-as-code solutions using Terraform for monitoring pipelines, dashboards, and alerting across distributed systems
- Drive observability improvements using Dynatrace, ELK, Splunk, and PagerDuty for real-time performance insights and system visibility
- Instrument applications for end-to-end observability including distributed tracing, metrics collection, and log aggregation across Node.js and .NET microservices and event-driven architectures
- Troubleshoot complex production incidents across service layers, databases, caches, and APIs using SLI/SLO frameworks
- Investigate and resolve Azure Kubernetes Service (AKS) infrastructure issues, ensuring the reliability and scalability of containerized workloads, using Terraform and Azure services (SQL MI, Redis, Functions, Event Grid)
- Translate business requirements into observable, resilient systems aligned to SLIs/SLOs
- Automate operational tasks using Infrastructure-as-Code and CI/CD to reduce toil and improve resilience
- Lead incident response and remediation for critical systems, including blameless postmortems and chaos engineering practices
- Collaborate with development, platform, and business teams to improve availability, scalability, and operational excellence
What do you need to succeed?
Must-have
- 8+ years of experience in SRE, DevOps, or Observability roles focused on infrastructure and application reliability
- Strong expertise in Dynatrace, ELK, Splunk, and PagerDuty, and in observability principles (instrumentation, correlation IDs, SLIs/SLOs)
- Advanced proficiency in Azure Kubernetes Service (AKS), Terraform, and Azure managed services (SQL MI, Redis, Functions, Event Grid)
- Hands-on experience with observability instrumentation (distributed tracing, metrics, logs) across Node.js and .NET microservices and event-driven systems
- Strong troubleshooting skills across distributed systems (services, databases, caches, APIs) in production environments
- Incident management expertise using PagerDuty and ServiceNow, including high-severity incident resolution and RCA
- Knowledge of incident, problem, and change management, SRE principles, blameless postmortems, and chaos engineering
- Strong communication and leadership skills for cross-functional coordination and incident handling
Vacancy posted a month ago

