Evaluation Jobs

Browse 138 Evaluation jobs hiring now, including openings at companies such as METR, Artificial Intelligence Underwriting Company, and Tech Firefly in Worcester, Los Angeles, and Cleveland.

Posted 7 hours ago

METR is seeking experienced researchers and research leads to develop benchmarks, run evaluations, and build infrastructure to measure and mitigate risks from advanced AI systems.

Design and ship production-grade evaluation infrastructure for cutting-edge AI agents while leading customer-facing certifications and shaping product strategy at AIUC.

Tech Firefly | Hybrid | No location specified
Posted 11 hours ago

Lead the technical architecture and cross-domain dependency mapping for a fast-paced, remote contract engagement supporting an academic medical center’s multi-year healthcare technology rollout.

Lead a U.S.-based team to migrate template systems to LLM autoraters and optimize model performance using advanced prompt engineering and evaluation methods.

Welo Data is seeking US-based English speakers to remotely evaluate and rate search results to improve search relevancy and AI performance.

Picogrid seeks a Strategy & Business Operations Lead to design and run the internal systems, metrics, and cross-functional programs that will let the company scale efficiently during rapid growth.

Posted 2 days ago

Lead the design and implementation of evaluation infrastructure and observability for enterprise-grade AI agents powering audit and assurance workflows at Fieldguide's San Francisco office.

Posted 2 days ago

Apply state-of-the-art AI to financial workflows at Rowspace by building retrieval systems, agentic pipelines, and evaluation frameworks that turn unstructured data into actionable investment insights.

Posted 2 days ago
Inclusive & Diverse
Rise from Within
Mission Driven
Diversity of Opinions
Work/Life Harmony
Passion for Exploration
Dare to be Different
Growth & Learning
Medical Insurance
Paid Time-Off
Maternity Leave
Equity
Learning & Development
Dental Insurance
Vision Insurance

Latitude seeks a PhD AI research intern to build a benchmark library and evaluate SOTA LLM behavior within our story engine, producing publishable results and a public report.

Posted 2 days ago

Energy Trust of Oregon seeks an Engineer, Planning & Evaluation to perform measure development, cost-benefit analyses, pilot design, and technical review to support cost-effective energy-efficiency programs.

Help scale Chime's AI-powered Jade assistant by building platform tooling, backend services, and observability systems as a Senior Full-Stack Engineer.

Posted 3 days ago

Experienced systems engineering and test & evaluation advisor needed to provide SETA support to the government for verification, test planning, execution, and evaluation of DoD systems.

Northwestern Medicine is hiring a licensed Occupational Therapist (OTR/L) for per-diem inpatient care in Winfield, IL to provide evaluations, treatment, documentation, and interdisciplinary collaboration.

Posted 4 days ago

Support ACS’s Employee Wellness program by coordinating and delivering on-site wellness activities across NYC locations while tracking participation and reporting outcomes.

Posted 4 days ago

Lead data-driven program performance analysis and provide actionable recommendations to support DoD and civilian federal programs as a Senior Program Management Analyst at One Federal Solution.

GoodAtNumbers is hiring a US-based remote Machine Learning Engineer Intern to push ML research into production by building, evaluating, and deploying reliable LLM-driven features during a paid 12-week summer internship.

Posted 5 days ago

Lead the design and delivery of scalable, secure AI-native systems for sophisticated legal customers as a Staff Software Engineer / Architect on Thomson Reuters' CoCounsel FDE team.

Posted 5 days ago

Sony AI’s Research Ethics team is hiring a remote Research Intern to work on generative AI ethics, evaluation, and harm-mitigation research with opportunities for publication.

Posted 6 days ago

Serve as Foster America's South Carolina Site Lead to coordinate partners, drive implementation of the OPT-In initiative, and translate learning into sustained local impact for families.

Evaluate luxury brand experiences in the Seattle/Bellevue area through short, flexible missions for CXG and help top brands improve service.

Inclusive & Diverse
Empathetic
Take Risks
Transparent & Candid
Feedback Forward
Mission Driven
Collaboration over Competition
Work/Life Harmony
Maternity Leave
Paternity Leave
Snacks
Medical Insurance
Dental Insurance
Vision Insurance
Mental Health Resources
Life insurance
401K Matching
Paid Sick Days
Paid Time-Off
Paid Volunteer Time

Lead the architecture and productionization of Spotify’s shared Agent Engine to power scalable, reliable agent-based experiences across the platform.

National Vision | Hybrid | 2435 Commerce Ave NW, Duluth, GA 30096, USA
Posted 6 days ago

Lead the People Development team at National Vision to design and deliver scalable, measurable learning solutions for corporate, retail, manufacturing, and clinical associates.

Cover Whale | Hybrid | No location specified
Posted 6 days ago

Lead and build the agentic AI platform that enables pods of engineers and AI agents to safely and reliably deliver production software at scale.

LanguageWire | Hybrid | No location specified
Posted 7 days ago

LanguageWire is hiring an AI Engineer to design and productionize LLM-based translation workflows and bridge ML experimentation with production engineering.

Evaluate luxury brand experiences for CXG through flexible in-store or online missions that provide actionable feedback to premium brands.

EQL Tech | Hybrid | No location specified
Posted 8 days ago

Work on a mission-driven fintech team to build and ship core AI products (LLM/VLM and evaluation pipelines) that power eligibility and compliance for education savings accounts.

Iambic Therapeutics seeks a Software Engineer II to co-develop and harden ML training, evaluation, and productization workflows that enable AI-driven drug discovery.

Mercor | Hybrid | No location specified
Posted 8 days ago

Lead and grow an Applied AI engineering team at Mercor to build scalable evaluation and data systems that measurably improve frontier model performance.

Posted 8 days ago

Application Engineering Intern at Renesas Hi-Rel to perform lab-based evaluations of power/ADC products, produce technical analysis, and present findings.

Evaluate machine-translated English (US) to Japanese (Japan) song lyrics for meaning, fluency, and cultural accuracy on a flexible, remote freelance project with Welo Data.

Posted 9 days ago

Anduril seeks an experienced manager to lead flight test integration and operations for UAS platforms, overseeing system integration, mesh networking, and Flight Test Operations as an RPIC.

Posted 9 days ago
Mission Driven
Social Impact Driven
Passion for Exploration
Reward & Recognition

Senior NDE Engineer (Radiography Testing) to design, prototype, and deploy advanced radiography and automated inspection solutions to improve manufacturing quality and flight reliability at SpaceX.

Lead the product vision and engineering for clinician-facing AI tools at knownwell, building and operating RAG-based clinical decision support with full product ownership and direct clinician partnership.

Brillio | Hybrid | New York, New York, United States
Posted 9 days ago

Experienced technical product leader needed to own prioritization, quality, and stakeholder alignment for LLM-driven products while staying hands-on with architecture, code reviews, and AI cost optimization.

Help build and deploy production AI agent platforms that power personalized financial advisory workflows for institutional clients at Arta.

Contract freelance raters in the United States will evaluate personalized map and search recommendations using their Google Maps activity history and follow project guidelines to rate relevance and usefulness.

Posted 11 days ago

Welo Data is building a flexible, remote contributor network of native English speakers to annotate, evaluate, and create prompts that improve AI systems.

Carilion Clinic is hiring a part-time Community Outreach Specialist to deliver evidence-based pediatric health education and support community partnerships across the Roanoke area.

Evaluate machine-translated English (US) to German (Germany) song lyrics for accuracy, fluency, and cultural appropriateness in a remote freelance role.

Salesforce | Hybrid | California - San Francisco
Posted 12 days ago
Inclusive & Diverse
Rise from Within
Mission Driven
Diversity of Opinions
Work/Life Harmony
Feedback Forward
Take Risks
Collaboration over Competition
Medical Insurance
Dental Insurance
Vision Insurance
Paid Time-Off
Maternity Leave
Paternity Leave
Mental Health Resources
Life insurance
Disability Insurance
Health Savings Account (HSA)
Flexible Spending Account (FSA)
Employee Resource Groups

Lead Slack's search and AI platform as VP Product to set strategy, drive model and infrastructure decisions, and deliver reliable, scalable AI-powered search and knowledge services for enterprise users.

Lead AbbVie's Neurosciences Search & Evaluation team to identify, assess, and advance high-value external partnering opportunities that strengthen the company’s neuroscience pipeline and strategic goals.

Posted 12 days ago

NiCE is hiring a Forward Deployed Engineer to design, ship, and operate production-scale conversational AI agents that solve high-impact enterprise problems.

Montefiore | Hybrid | 2532 Grand Concourse
Posted 12 days ago

Montefiore is hiring a licensed Psychologist (PhD/PsyD) to conduct disability-related psychological assessments and clinical consultations for participants in the WeCARE employment-focused program.

Experienced domain experts in Business Operations & Communications or Education and Academic Research are needed for a remote, retainer-based 2‑week role evaluating and crafting prompts for AI writing models with US-contextual standards.

Join an early-stage AI safety startup as a founding Forward Deployed Engineer to design rigorous AI evals, lead customer implementations, and shape product strategy for certification of real-world AI agents.

Posted 12 days ago

Work as a freelance luxury brand evaluator for CXG, discreetly assessing boutique and online experiences to help premium brands refine their service.

Serve as the MHPSS Technical Advisor for IRC RAI, providing evidence-based guidance, training, and partnership support to improve mental health and psychosocial services for forcibly displaced populations in the U.S.

Lead and develop a remote evaluation team in WGU’s School of Technology to ensure accurate, scalable competency-based assessment and continuous improvement for Electrical and Computer Engineering programs.

Posted 12 days ago

Epoch AI is hiring remote Researchers and Senior Researchers to conduct data-driven investigations, build benchmarks, and forecast AI capabilities and trends.

Visa is hiring a Product Analyst to define and scale generative AI platform capabilities, combining product analytics, prototyping, and cross-functional collaboration to deliver responsible, enterprise-grade AI solutions.

How much do evaluation jobs pay?

Salary range*    Jobs    Share
Below 50k        4       13%
50k-100k         9       28%
Over 100k        19      59%

*average yearly salary (USD)

Best cities to find evaluation jobs