
MÉCANICIEN MACHINES INDUSTRIELLES

$90k - $140k per year

Principal Site Reliability Engineering specialist (SRE)

Principal Site Reliability Engineer (SRE)

Languages: Bilingual (French and English)

Job type: Full-time

We are hiring a Principal Site Reliability Engineering specialist (SRE) to support the design, evolution, and operation of mission-critical technology platforms. In this strategic and hands-on role, you will lead the adoption of SRE best practices, shape cloud and application architectures, and drive the reliability, performance, and availability of client services. You will influence engineering standards, strengthen operational excellence, and collaborate across development, operations, security, and business teams to deliver resilient, scalable, and modern cloud solutions.

You are an experienced SRE professional with deep technical expertise and a strong ability to improve reliability at scale. You communicate effectively with technical and business stakeholders, collaborate naturally across teams, and consistently drive continuous improvement.

  • Recommend reliability-focused solutions based on business and technical needs.
  • Define and influence cloud and application architectures aligned with performance, availability, and resilience goals.
  • Build, enhance, and maintain monitoring, logging, and alerting capabilities.
  • Develop and improve observability frameworks (monitoring, alerting, logging).
  • Automate operational and reliability processes using Python, Bash, Ansible, and cloud-native tooling.
  • Integrate reliability automation into CI/CD pipelines and optimize delivery workflows.

Incident Management & Continuous Improvement

  • Lead major incident response, root cause analysis, and postmortem activities.

Collaboration & Technical Leadership

  • Partner with development, DevOps, architecture, security, and business stakeholders.
  • Act as a technical authority and trusted advisor on service reliability.
  • Promote knowledge sharing and foster continuous improvement in engineering practices.

Bachelor’s degree in Computer Science, Software Engineering, or a related field, or equivalent experience.

  • Bilingual (French/English)
  • 5+ years of experience in SRE, DevOps, operations, or distributed systems.
  • Strong experience with cloud platforms (AWS, Azure, or GCP) and modern architectural patterns.
  • Proficiency in Linux, automation scripting (Python, Bash), and Infrastructure as Code (Terraform, CloudFormation).
  • Ability to influence stakeholders and provide strategic technical guidance.
  • French proficiency required; English proficiency considered an asset or required based on client context.

________________________________________

  • Core: SRE, DevOps, Incident Management, Observability, SLIs/SLOs/SLAs
  • Cloud: AWS / Azure / GCP
  • Infrastructure: Linux, Terraform, CloudFormation
  • Automation: Python, Bash, Ansible

The determination of this range includes factors such as skill set level, geographic market, experience and training, and licenses and certifications. At CGI, we value the strength that diversity brings and are committed to fostering a workplace where everyone belongs.

Vacancy posted a month ago