Senior Research Engineer, Model Evaluation

Cohere

Who are we?

Our mission is to scale intelligence to serve humanity. We’re training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, RAG, and agents. We believe that our work is instrumental to the widespread adoption of AI.

We obsess over what we build. Each one of us is responsible for contributing to increasing the capabilities of our models and the value they drive for our customers. We like to work hard and move fast to do what’s best for our customers.

Cohere is a team of researchers, engineers, designers, and more, who are passionate about their craft. Each person is one of the best in the world at what they do. We believe that a diverse range of perspectives is a requirement for building great products.

Join us on our mission and shape the future!

Why this role?

Evaluation is critical to making progress in scaling intelligence. As models continue to become superhuman in many real-world use cases, we must continue to develop new techniques to accurately measure our models' performance on frontier capabilities. In this role, you are responsible for creating next-generation evaluation methods and scalable infrastructure to measure LLM progress.

As a Senior Research Engineer, Model Evaluation, you will:

  • Develop evaluation benchmarks, datasets, and environments for measuring the bleeding edge of model capabilities

  • Conduct research to push the state-of-the-art in LLM evaluation methods, including training LLM judges; improving evaluation efficiency; and scalably building high-quality datasets

  • Build scalable tools for investigating and understanding evaluation results that are used by all members of technical staff at Cohere, as well as leadership and our CEO

  • Learn from and work with the best researchers and engineers in the field

You may be a good fit if:

  • You enjoy pushing the limits of what LLMs are capable of, and you have built high-quality evaluation resources to measure those capabilities (datasets, simulators, environments, etc.)

  • You have a track record of developing new methods and/or data to evaluate LLMs, e.g. publications at top-tier conferences, popular benchmarks, etc.

  • You have deep experience building with and around LLMs, and you have built tools for analyzing and understanding their performance

  • You have strong software engineering skills

If some of the above doesn’t line up perfectly with your experience, we still encourage you to apply! If you want to work really hard on a glorious mission with teammates who want the same thing, Cohere is the place for you.

We value and celebrate diversity and strive to create an inclusive work environment for all. We welcome applicants from all backgrounds and are committed to providing equal opportunities. Should you require any accommodations during the recruitment process, please submit an Accommodations Request Form, and we will work together to meet your needs.

Full-Time Employees at Cohere enjoy these Perks:

An open and inclusive culture and work environment 

Work closely with a team on the cutting edge of AI research

Weekly lunch stipend, in-office lunches & snacks

Full health and dental benefits, including a separate budget to take care of your mental health 

100% Parental Leave top-up for 6 months for employees based in Canada, the US, and the UK

Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement

Remote-flexible, with offices in Toronto, New York, San Francisco, and London, plus a co-working stipend

✈️ 6 weeks of vacation

Note: This post is co-authored by both Cohere humans and Cohere technology.

Vacancy posted more than 2 months ago
