Senior Research Engineer, Model Evaluation
Cohere
Who are we?
Our mission is to scale intelligence to serve humanity. We’re training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, RAG, and agents. We believe that our work is instrumental to the widespread adoption of AI.
We obsess over what we build. Each one of us is responsible for contributing to increasing the capabilities of our models and the value they drive for our customers. We like to work hard and move fast to do what’s best for our customers.
Cohere is a team of researchers, engineers, designers, and more, who are passionate about their craft. Each person is one of the best in the world at what they do. We believe that a diverse range of perspectives is a requirement for building great products.
Join us on our mission and shape the future!
Why this role?
Evaluation is critical to making progress in scaling intelligence. As models continue to become superhuman in many real-world use cases, we must continue to develop new techniques to accurately measure our models' performance on frontier capabilities. In this role, you are responsible for creating next-generation evaluation methods and scalable infrastructure to measure LLM progress.
As a Senior Research Engineer, Model Evaluation, you will:
Develop evaluation benchmarks, datasets, and environments for measuring the bleeding edge of model capabilities
Conduct research to push the state of the art in LLM evaluation methods, including training LLM judges, improving evaluation efficiency, and building high-quality datasets at scale
Build scalable tools for investigating and understanding evaluation results that are used by all members of technical staff at Cohere, as well as leadership and our CEO
Learn from and work with the best researchers and engineers in the field
You may be a good fit if:
You enjoy pushing the limits of what LLMs are capable of, and you have built high-quality evaluation resources to measure those capabilities (datasets, simulators, environments, etc.)
You have a track record of developing new methods and/or data to evaluate LLMs, e.g. publications at top-tier conferences, popular benchmarks, etc.
You have deep experience building with and around LLMs, and you have built tools for analyzing and understanding their performance
You have strong software engineering skills
If some of the above doesn’t line up perfectly with your experience, we still encourage you to apply! If you want to work really hard on a glorious mission with teammates who want the same thing, Cohere is the place for you.
We value and celebrate diversity and strive to create an inclusive work environment for all. We welcome applicants from all backgrounds and are committed to providing equal opportunities. Should you require any accommodations during the recruitment process, please submit an Accommodations Request Form, and we will work together to meet your needs.
Full-Time Employees at Cohere enjoy these Perks:
An open and inclusive culture and work environment
Work closely with a team on the cutting edge of AI research
Weekly lunch stipend, in-office lunches & snacks
Full health and dental benefits, including a separate budget to take care of your mental health
100% Parental Leave top-up for 6 months for employees based in Canada, the US, and the UK
Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
Remote-flexible, with offices in Toronto, New York, San Francisco, and London, plus a co-working stipend
6 weeks of vacation
Note: This post is co-authored by both Cohere humans and Cohere technology.
