Site Reliability Engineer, Metal

$100k per year

Tenstorrent

Tenstorrent is leading the industry on cutting-edge AI technology, revolutionizing performance expectations, ease of use, and cost efficiency. With AI redefining the computing paradigm, solutions must evolve to unify innovations in software models, compilers, platforms, networking, and semiconductors. Our diverse team of technologists has developed a high-performance RISC-V CPU from scratch, and shares a passion for AI and a deep desire to build the best AI platform possible. We value collaboration, curiosity, and a commitment to solving hard problems. We are growing our team and looking for contributors of all seniorities.

Tenstorrent is building large-scale AI systems across internal clusters and customer deployments. This role sits at the intersection of site reliability, infrastructure operations, and customer engineering, ensuring our systems are reliable, observable, and production-ready.

This role is hybrid, based out of Toronto, ON; Austin, TX; or Santa Clara, CA.

We welcome candidates at various experience levels for this role. During the interview process, candidates will be assessed for the appropriate level, and offers will align with that level, which may differ from the one in this posting.

 

Who You Are

  • Experienced in site reliability, infrastructure, or systems engineering in distributed environments.
  • Strong Linux systems knowledge with the ability to troubleshoot complex multi-layer issues.
  • Proficient with observability tools such as Prometheus, Grafana, and alerting systems.
  • Comfortable with scripting and automation using Python, Go, or similar languages.
  • Solid understanding of networking fundamentals and how systems behave at scale.

 

What We Need

  • Ensure reliability and operational health of Tenstorrent systems across internal and customer environments.
  • Troubleshoot complex issues across compute, networking, and software layers.
  • Partner with engineering teams and customers to resolve production incidents.
  • Design and improve monitoring, observability, and alerting systems.
  • Build automation to reduce operational toil and improve system reliability.

 

What You Will Learn

  • How large-scale AI infrastructure is operated across internal clusters and customer deployments.
  • How distributed systems behave under real-world production conditions.
  • How observability and automation drive reliability at scale.
  • How hardware, networking, and software systems interact in AI environments.
  • How customer-facing AI infrastructure is deployed, supported, and optimized.

 

Compensation for all engineers at Tenstorrent ranges from $100k to $500k, including base and variable compensation targets. Experience, skills, education, background, and location all impact the actual offer made.

Tenstorrent offers a highly competitive compensation package and benefits, and we are an equal opportunity employer.

This offer of employment is contingent upon the applicant being eligible to access U.S. export-controlled technology. Due to U.S. export laws, including those codified in the U.S. Export Administration Regulations (EAR), the Company is required to ensure compliance with these laws when transferring technology to nationals of certain countries (such as EAR Country Groups D:1, E:1, and E:2). These requirements apply to persons located in the U.S. and all countries outside the U.S. As the position offered will have direct and/or indirect access to information, systems, or technologies subject to these laws, the offer may be contingent upon your citizenship/permanent residency status or ability to obtain prior license approval from the U.S. Commerce Department or applicable federal agency. If employment is not possible due to U.S. export laws, any offer of employment will be rescinded.

Vacancy posted 10 hours ago
