AI Evaluation Jobs

Browse 51 exciting jobs hiring in AI Evaluation now. Check out companies hiring such as Fieldguide, Awesome Motive, and Latitude in Little Rock, Louisville/Jefferson County, and Anchorage.

Posted 2 hours ago

Lead the design and implementation of evaluation infrastructure and observability for enterprise-grade AI agents powering audit and assurance workflows at Fieldguide's San Francisco office.

Posted 10 hours ago

Apply state-of-the-art AI to financial workflows at Rowspace by building retrieval systems, agentic pipelines, and evaluation frameworks that turn unstructured data into actionable investment insights.

Posted 12 hours ago
Inclusive & Diverse
Rise from Within
Mission Driven
Diversity of Opinions
Work/Life Harmony
Passion for Exploration
Dare to be Different
Growth & Learning
Medical Insurance
Paid Time-Off
Maternity Leave
Equity
Learning & Development
Dental Insurance
Vision Insurance

Latitude seeks a PhD AI research intern to build a benchmark library and evaluate SOTA LLM behavior within our story engine, producing publishable results and a public report.

Posted 3 days ago

Lead the design and delivery of scalable, secure AI-native systems for sophisticated legal customers as a Staff Software Engineer / Architect on Thomson Reuters' CoCounsel FDE team.

Posted 3 days ago

Sony AI’s Research Ethics team is hiring a remote Research Intern to work on generative AI ethics, evaluation, and harm-mitigation research with opportunities for publication.

Cover Whale Hybrid No location specified
Posted 5 days ago

Lead and build the agentic AI platform that enables pods of engineers and AI agents to safely and reliably deliver production software at scale.

LanguageWire Hybrid No location specified
Posted 5 days ago

LanguageWire is hiring an AI Engineer to design and productionize LLM-based translation workflows and bridge ML experimentation with production engineering.

EQL Tech Hybrid No location specified
Posted 6 days ago

Work on a mission-driven fintech team to build and ship core AI products (LLM/VLM and evaluation pipelines) that power eligibility and compliance for education savings accounts.

Mercor Hybrid No location specified
Posted 6 days ago

Lead and grow an Applied AI engineering team at Mercor to build scalable evaluation and data systems that measurably improve frontier model performance.

Lead the product vision and engineering for clinician-facing AI tools at knownwell, building and operating RAG-based clinical decision support with full product ownership and direct clinician partnership.

Brillio Hybrid New York, New York, United States
Posted 7 days ago

Experienced technical product leader needed to own prioritization, quality, and stakeholder alignment for LLM-driven products while staying hands-on with architecture, code reviews, and AI cost optimization.

Help build and deploy production AI agent platforms that power personalized financial advisory workflows for institutional clients at Arta.

Salesforce Hybrid California - San Francisco
Posted 10 days ago
Inclusive & Diverse
Rise from Within
Mission Driven
Diversity of Opinions
Work/Life Harmony
Feedback Forward
Take Risks
Collaboration over Competition
Medical Insurance
Dental Insurance
Vision Insurance
Paid Time-Off
Maternity Leave
Paternity Leave
Mental Health Resources
Life insurance
Disability Insurance
Health Savings Account (HSA)
Flexible Spending Account (FSA)
Employee Resource Groups

Lead Slack's search and AI platform as VP Product to set strategy, drive model and infrastructure decisions, and deliver reliable, scalable AI-powered search and knowledge services for enterprise users.

Posted 10 days ago

NiCE is hiring a Forward Deployed Engineer to design, ship, and operate production-scale conversational AI agents that solve high-impact enterprise problems.

Experienced domain experts in Business Operations & Communications or Education and Academic Research are needed for a remote, retainer-based 2‑week role evaluating and crafting prompts for AI writing models with US-contextual standards.

Join an early-stage AI safety startup as a founding Forward Deployed Engineer to design rigorous AI evals, lead customer implementations, and shape product strategy for certification of real-world AI agents.

Posted 11 days ago

Epoch AI is hiring remote Researchers and Senior Researchers to conduct data-driven investigations, build benchmarks, and forecast AI capabilities and trends.

Visa is hiring a Product Analyst to define and scale generative AI platform capabilities, combining product analytics, prototyping, and cross-functional collaboration to deliver responsible, enterprise-grade AI solutions.

Posted 11 days ago

Colibri Group is hiring an AI Engineering Intern to help design and evaluate AI-driven educational tools, focusing on model behavior, alignment, and responsible AI practices under senior mentorship.

Posted 13 days ago

Unstructured is hiring an AI Engineer to architect and ship production-grade RAG and agentic systems that process messy multimodal data for high-impact government and military contracts.

Weekday AI Hybrid No location specified
Posted 15 days ago

Contract opportunity to evaluate and improve LLM conversational responses in Hindi and English by performing fact-checking, annotation, and qualitative assessment.

Posted 15 days ago

Lead the design and production of LLM-driven coaching systems at Valence, applying deep ML and engineering expertise to build enterprise-grade, context-aware AI experiences.

Posted 17 days ago

LinkedIn seeks a Hybrid Machine Learning Engineer to build and deploy scalable relevance and evaluation models for recommender systems and generative/NLP-driven product features.

ServiceNow Hybrid Building A,B,C 2225 Lawson Lane, Santa Clara, CALIFORNIA, United States
Posted 17 days ago
Inclusive & Diverse
Mission Driven
Rise from Within
Diversity of Opinions
Work/Life Harmony
Empathetic
Feedback Forward
Take Risks
Collaboration over Competition
Medical Insurance
Dental Insurance
Vision Insurance
Mental Health Resources
Life insurance
Disability Insurance
Health Savings Account (HSA)
Flexible Spending Account (FSA)
Conferences Stipend
Paid Time-Off
Maternity Leave
Equity

A selective, eight-week (mostly virtual) unpaid bootcamp at ServiceNow for undergraduate students to learn agentic AI, build and evaluate agents, and present a capstone project during an in-person finale.

AIR is hiring a Technical Assistance Consultant to develop and deliver workforce-focused TA, training, and capacity-building to advance economic mobility, workforce development, and future-of-work strategies including AI integration.

ServiceNow Hybrid 15725 Dallas Pkwy, Addison, TX 75001, USA
Posted 19 days ago
Inclusive & Diverse
Mission Driven
Rise from Within
Diversity of Opinions
Work/Life Harmony
Empathetic
Feedback Forward
Take Risks
Collaboration over Competition
Medical Insurance
Dental Insurance
Vision Insurance
Mental Health Resources
Life insurance
Disability Insurance
Health Savings Account (HSA)
Flexible Spending Account (FSA)
Conferences Stipend
Paid Time-Off
Maternity Leave
Equity

Lead the strategic integration of AI across ServiceNow marketing by owning the MarTech and agentic product portfolio to drive adoption, efficiency, and measurable business impact.

Posted 19 days ago

Senior engineering leader to design, evaluate and productionize agentic AI systems, prompt architectures and multi-agent orchestration for critical banking workflows at Deutsche Bank in Cary, NC.

Generative AI Analyst at Welocalize to craft prompts, annotate and evaluate LLM outputs, and lead labeling workflows in a remote full-time role.

Posted 21 days ago

Lead the design and implementation of secure, scalable Generative AI and ML architectures for an EdTech organization focused on building production-ready RAG, retrieval, and MLOps solutions.

Posted 21 days ago

Build the internal tooling and evaluation infrastructure that empowers engineers and researchers to iterate quickly and reliably on Crosby’s LLM-powered legal platform.

Posted 21 days ago

Neighbors Bank is looking for a decisive, process-improvement focused Recruiting Coordinator to manage hiring pipelines, conduct candidate evaluations, and help evolve recruiting practices in a fully remote role.

Posted 22 days ago
Dental Insurance
Disability Insurance
Flexible Spending Account (FSA)
Health Savings Account (HSA)
Vision Insurance
Sabbatical
Paid Holidays

Handshake is hiring an ML Research Scientist to drive open scientific research, create public benchmarks, and collaborate with top AI labs to advance data and evaluation methods for frontier models.

MLabs Hybrid No location specified
Posted 23 days ago

Lead the design and evaluation of agentic LLM systems that power a fintech's financial intelligence platform, ensuring correctness, scalability, and production reliability.

SweetRush is hiring an Instructional Designer/eLearning Developer to create and deliver IT-focused learning solutions (AI, cybersecurity, workplace apps) for a global enterprise in a remote, Eastern Time–preferred contract role.

Posted 24 days ago

Experienced software engineers with strong system-design and ML/LLM experience are needed to build and productionize LLM-powered agents, evaluation pipelines, and scalable AI infrastructure at Permute.

Posted 24 days ago

Fullscript is looking for a Staff Machine Learning Engineer to architect and ship production LLM-driven clinical features that improve clinician workflows and patient outcomes.

Inclusive & Diverse
Diversity of Opinions
Growth & Learning
Mission Driven
Social Impact Driven
Empathetic
Dental Insurance
Flexible Spending Account (FSA)
Health Savings Account (HSA)
Vision Insurance
Performance Bonus
Family Medical Leave
Paid Holidays

Khan Academy is hiring a Senior AI Engineer (24-month fixed-term) to lead integration, evaluation, and quality improvements of generative AI features that support learning at scale.

Dental Insurance
Disability Insurance
Flexible Spending Account (FSA)
Health Savings Account (HSA)
Vision Insurance
Sabbatical
Paid Holidays

Handshake seeks experienced 3D Slicer users to remotely evaluate AI-generated medical imaging content and provide expert feedback on segmentation, DICOM workflows, and clinical research relevance.

Dental Insurance
Disability Insurance
Flexible Spending Account (FSA)
Health Savings Account (HSA)
Vision Insurance
Sabbatical
Paid Holidays

Handshake seeks experienced Shotcut users to evaluate AI-generated video edits and create tool-focused assessment materials on a flexible, remote, hourly contract basis.

ServiceNow Hybrid 15725 Dallas Pkwy, Addison, TX 75001, USA
Posted 26 days ago
Inclusive & Diverse
Mission Driven
Rise from Within
Diversity of Opinions
Work/Life Harmony
Empathetic
Feedback Forward
Take Risks
Collaboration over Competition
Medical Insurance
Dental Insurance
Vision Insurance
Mental Health Resources
Life insurance
Disability Insurance
Health Savings Account (HSA)
Flexible Spending Account (FSA)
Conferences Stipend
Paid Time-Off
Maternity Leave
Equity

Lead the AI product portfolio for marketing to turn enterprise AI strategy into a cohesive MarTech roadmap, measurable productivity gains, and durable automation at scale.

ServiceNow Hybrid 275 Wyman St 2nd floor, Waltham, MA 02451, USA
Posted 26 days ago
Inclusive & Diverse
Mission Driven
Rise from Within
Diversity of Opinions
Work/Life Harmony
Empathetic
Feedback Forward
Take Risks
Collaboration over Competition
Medical Insurance
Dental Insurance
Vision Insurance
Mental Health Resources
Life insurance
Disability Insurance
Health Savings Account (HSA)
Flexible Spending Account (FSA)
Conferences Stipend
Paid Time-Off
Maternity Leave
Equity

Lead the AI MarTech product portfolio at ServiceNow to convert AI strategy into scalable agentic workflows, measurable productivity gains, and sustained marketing leverage.

Work on TRM’s AI Engineering team to design and ship agentic LLM systems and scalable infrastructure that augment investigations and ensure safe, auditable behavior in high-sensitivity environments.

Posted 27 days ago

aiEDU is hiring a Senior Lead, Research & Evaluation to design and run impact measurement, lead research strategy, and build data systems that inform program decisions across the organization.

Varick Agents Hybrid No location specified
Posted 27 days ago

Varick seeks an AI Engineer to architect and ship production-grade agent systems, evaluation pipelines, and retrieval-driven context strategies for enterprise AI deployments.

Lead the design, production deployment, and continual improvement of AI-powered features for Savvas's flagship K-12 platform, applying deep LLM, cloud, and software engineering expertise to improve student learning at scale.

Posted 28 days ago

Rwazi is hiring a Decision Intelligence Analyst to validate and improve AI-driven decision outputs by identifying failure modes, formalizing evaluation rubrics, and refining judgment frameworks.

ServiceNow Hybrid 15725 Dallas Pkwy, Addison, TX 75001, USA
Posted 28 days ago
Inclusive & Diverse
Mission Driven
Rise from Within
Diversity of Opinions
Work/Life Harmony
Empathetic
Feedback Forward
Take Risks
Collaboration over Competition
Medical Insurance
Dental Insurance
Vision Insurance
Mental Health Resources
Life insurance
Disability Insurance
Health Savings Account (HSA)
Flexible Spending Account (FSA)
Conferences Stipend
Paid Time-Off
Maternity Leave
Equity

Lead the AI product portfolio for marketing at ServiceNow, defining and delivering a unified MarTech and agentic roadmap that drives measurable productivity and enterprise-scale adoption.

Lead architecture and delivery of scalable, secure AI and agentic systems at PointClickCare to drive measurable clinical and operational outcomes across the platform.

Contract reviewers are needed to compare AI-generated English text pairs, choose the clearer response, and provide concise explanations to help improve model output quality.

Posted 29 days ago

Virtue AI is seeking a hands-on Testing Engineer to lead product and backend QA, automate system testing, and perform model red-teaming for a cutting-edge AI security platform.

How much do AI evaluation jobs pay?

Below 50k*: 2 (33%)
50k-100k*: 0 (0%)
Over 100k*: 4 (67%)
*average yearly salary (USD)

Best cities to find AI evaluation jobs