Browse 7 exciting Evals job openings. Companies hiring include Fieldguide, Tetrix, and Artificial Intelligence Underwriting Company, in Grand Prairie, Greensboro, and Chicago.
Lead the design and implementation of evaluation infrastructure and observability for enterprise-grade AI agents powering audit and assurance workflows at Fieldguide's San Francisco office.
Lead the design and delivery of production AI systems—agents, extraction engines, and evaluation pipelines—at a fast-growing startup transforming private market data into reliable, auditable insights.
Join an early-stage AI safety startup as a founding Forward Deployed Engineer to design rigorous AI evals, lead customer implementations, and shape product strategy for certification of real-world AI agents.
Harvey is hiring engineers to build and optimize agent systems that automate complex legal workflows using LLMs, custom tools, and evaluation-driven iteration.
Instrument is hiring a Senior AI Engineer to design and implement the core multi-agent intelligence, context management, and evals infrastructure for a large-scale, stateful generative-AI simulation project.
Lead the Agent engineering team at Descript to deliver a best-in-class, scalable agentic video editing experience by driving technical execution, product-driven experimentation, and team growth.
At Variance, you will design and implement domain-specific benchmarks and evaluation systems that reveal failure modes and drive improvements in ML and agent behavior for fraud, identity, and risk workflows.