Agent Evaluation Jobs

Browse 13 Agent Evaluation jobs hiring now. Companies hiring include Preference Model, Artificial Intelligence Underwriting Company, and Fieldguide, with openings in St. Louis, Greensboro, and Shreveport.

Senior Software Engineer, RL Environments

Preference Model Hybrid San Francisco

Posted 8 hours ago

Lead design and delivery of realistic, multi-step RL environments at an early-stage startup partnering with frontier AI labs to improve model robustness and training quality.

Member of technical staff - core eng

Artificial Intelligence Underwriting Company Hybrid San Francisco

Posted 2 days ago

Design and ship production-grade evaluation infrastructure for cutting-edge AI agents while leading customer-facing certifications and shaping product strategy at AIUC.

AI Engineer, Quality (Evals)

Fieldguide Hybrid San Francisco

Posted 3 days ago

Lead the design and implementation of evaluation infrastructure and observability for enterprise-grade AI agents powering audit and assurance workflows at Fieldguide's San Francisco office.

Senior Software Engineer, GenAI Platform

Chime Financial, Inc Hybrid San Francisco, CA, USA

Posted 4 days ago

Help scale Chime's AI-powered Jade assistant by building platform tooling, backend services, and observability systems as a Senior Full-Stack Engineer.

Senior Staff Machine Learning Engineer - Agentic Systems

Spotify Hybrid New York, NY

Posted 7 days ago

Culture: Inclusive & Diverse, Empathetic, Take Risks, Transparent & Candid, Feedback Forward, Mission Driven, Collaboration over Competition, Work/Life Harmony

Benefits: Maternity Leave, Paternity Leave, Snacks, Medical Insurance, Dental Insurance, Vision Insurance, Mental Health Resources, Life Insurance, 401K Matching, Paid Sick Days, Paid Time-Off, Paid Volunteer Time

Lead the architecture and productionization of Spotify’s shared Agent Engine to power scalable, reliable agent-based experiences across the platform.

Director of AI Engineering

Cover Whale Hybrid No location specified

Posted 8 days ago

Lead and build the agentic AI platform that enables pods of engineers and AI agents to safely and reliably deliver production software at scale.

Machine Learning Engineer, AI Agent Platform

Arta Finance Hybrid Mountain View

Posted 11 days ago

Help build and deploy production AI agent platforms that power personalized financial advisory workflows for institutional clients at Arta.

ServiceNow AI.Accelerate Bootcamp

ServiceNow Hybrid Building A,B,C 2225 Lawson Lane, Santa Clara, CALIFORNIA, United States

Posted 20 days ago

Culture: Inclusive & Diverse, Mission Driven, Rise from Within, Diversity of Opinions, Work/Life Harmony, Empathetic, Feedback Forward, Take Risks, Collaboration over Competition

Benefits: Medical Insurance, Dental Insurance, Vision Insurance, Mental Health Resources, Life Insurance, Disability Insurance, Health Savings Account (HSA), Flexible Spending Account (FSA), Conferences Stipend, Paid Time-Off, Maternity Leave, Equity

A selective, eight-week (mostly virtual) unpaid bootcamp at ServiceNow for undergraduate students to learn agentic AI, build and evaluate agents, and present a capstone project during an in-person finale.

Applied AI & Agent Engineering Lead - Vice President

DB Hybrid Cary, 3000 CentreGreen Way

Posted 23 days ago

Senior engineering leader to design, evaluate and productionize agentic AI systems, prompt architectures and multi-agent orchestration for critical banking workflows at Deutsche Bank in Cary, NC.

Senior Software Engineer

Awesome Motive Hybrid Chicago

Posted 27 days ago

Experienced software engineers with strong system-design and ML/LLM experience are needed to build and productionize LLM-powered agents, evaluation pipelines, and scalable AI infrastructure at Permute.

Staff, Machine Learning Engineer

Fullscript Hybrid No location specified

Posted 27 days ago

Fullscript is looking for a Staff Machine Learning Engineer to architect and ship production LLM-driven clinical features that improve clinician workflows and patient outcomes.

AI Agent Engineer - San Francisco Only

TRM Labs Hybrid San Francisco

Posted 30 days ago

Work on TRM’s AI Engineering team to design and ship agentic LLM systems and scalable infrastructure that augment investigations and ensure safe, auditable behavior in high-sensitivity environments.

AI Engineer

Varick Agents Hybrid No location specified

Posted 30 days ago

Varick seeks an AI Engineer to architect and ship production-grade agent systems, evaluation pipelines, and retrieval-driven context strategies for enterprise AI deployments.
