Evaluation Lead
Waabi
Waabi, founded by AI visionary Raquel Urtasun, is the leader in Physical AI. With a world-class team, we're unlocking the next era of autonomous transportation with technology that's powering commercial autonomous trucks and robotaxis. Waabi is backed by and partners with world leaders in AI, automotive, logistics, and deep tech.
With offices in Toronto, San Francisco, Dallas, and Pittsburgh, Waabi is growing quickly and looking for diverse, innovative and collaborative candidates who want to impact the world in a positive way. To learn more visit:
We are looking for a hands-on leader to build a new centralized Evaluation team. This team will be responsible for providing comprehensive, holistic analysis of all aspects of the autonomy system's performance. In this role, you will collaborate closely with the systems & safety team, which is responsible for defining the requirements & evaluation criteria, as well as with the autonomy teams to understand their evaluation needs. You will get to work with Waabi World, our highly realistic closed-loop simulation engine built with the latest in generative AI technologies, to deliver the evaluation capabilities needed to support the safe development of the next generation of autonomous vehicles!
You will...
- Lead and build a cross-functional team of software engineers, data analysts, and data scientists supporting automated workflows that provide high signal on autonomy performance.
- Design scalable production frameworks for sampling evaluation sets, developing and improving metrics, and systematically measuring the performance of both autonomy and the eval ecosystem itself.
- Design pipelines, tools, and dashboards to characterize autonomy performance for technical teams and executive leadership, collaborating closely with platform teams on implementation and with autonomy, systems and safety, and product teams on requirements.
- Work closely with simulation and software teams to build solutions that leverage our data, metrics and simulation platforms effectively.
- Lead technical projects, contributing as an IC while also managing the team.
- Participate and share ideas in technical and architecture discussions, collaborating with researchers and engineers.
- Conduct regular one-on-one meetings to offer guidance and constructive feedback to direct reports.
Qualifications:
- 6+ years of autonomous vehicle industry experience, including 2+ years managing high-performing teams
- Experience evaluating AI or machine learning models, ideally in self-driving or related fields
- MS/PhD or Bachelor's degree in Computer Science, Data Science, Robotics, and/or similar technical field(s) of study
- Strong statistical background
- Experience working with internal cross-functional partners/stakeholders
- Experience with system design/architecture and algorithms
- Open-minded and collaborative team player with willingness to help others
- Passionate about self-driving technologies, solving hard problems, and creating innovative solutions
Bonus/nice to have:
- Previous experience leading Autonomy Evaluation teams
- Experience with large-scale databases and analytics
