Senior AI Engineer (Contract)

Instrument is a digitally native design and technology company built to help brands unlock their full potential. Since 2005, our team of makers, thinkers, and storytellers has partnered with leading brands like Google, Nike, Uber, ŌURA, and Eventbrite to craft digital experiences that create impact and drive results.

Unlike traditional agencies, we don’t just design—we build. Our work lives at the intersection of taste and technology, powered by curiosity, thoughtful curation, and a commitment to delivering the most fitting solution for every brief. We bring this to life across three core offerings: Brand, Marketing, and Product.

As a member of our freelance network, you’ll collaborate with our teams to bring bold ideas to life—whether launching new brands, building digital products, or shaping experiences that move people. We welcome collaborators from all backgrounds and experiences who share our curiosity, creativity, and care for craft.

We believe great work comes from diverse perspectives and shared purpose. If you’re passionate about learning, experimenting, and making work that matters, we’d love to hear from you.

The Project: We are developing a highly complex, stateful, multi-agent simulation engine driven by generative AI. This is a greenfield project exploring new forms of non-linear, narrative-based user interaction at scale. Unlike typical stateless web apps, this system requires persistent "memory" for thousands of unique, evolving user sessions. We are building an architecture where multiple AI agents orchestrate complex logic, maintain long-term context, and react to user inputs in real time.

The Role: You will serve as the Senior AI Engineer, focusing on the core intelligence and agentic logic that drives the application. You will be responsible for designing the multi-agent architecture that manages our core interactive loops, dynamic scenario generation, and global system state aggregation. You will be the team’s primary authority on how we talk to large language models, ensuring that our agents are fast, reliable, and strictly scoped to their specific domains to minimize latency and hallucination.

This is a part-time contract role (20 hrs/week) from 4/13–5/29, with a strong likelihood of extension through October at full-time hours.

What You'll Do

Agent Design & Orchestration: Build and manage the logic for complex multi-agent workflows. You will design the systems that handle user onboarding (profile generation), dynamic scenario creation, and real-time interactive simulation loops.
Context Engineering: Architect state management for the LLMs to prevent "context rot" and hallucination. You will strictly govern what each agent knows, structuring context dynamically to maximize token caching and minimize latency.
Advanced Prompting & Evals Infrastructure: Write, test, and version-control robust system instructions for standalone LLMs and multi-agent workflows. You will design, implement, and own a rigorous evaluations (evals) framework to programmatically score both individual prompt performance and end-to-end agent lifecycles. You will establish the CI/CD-style testing loops required to iterate on model behavior predictably and safely at scale.
Moderation and Security Risk Mitigation: Design and implement pipelines in collaboration with our backend team that moderate harmful or offensive user inputs while also mitigating prompt injection attacks and undesired LLM outputs.
Full-Stack Integration: Work closely with backend and frontend teams to seamlessly integrate AI outputs into the user interface, ensuring smooth data flow from the models down to the client.

What You'll Bring

Core Engineering Foundation: Strong traditional programming background. You must understand software architecture and be capable of writing production-grade code. You cannot rely solely on AI coding assistants or vibe coding.
Generative AI Experience: 1–3 years of deep, hands-on experience building and deploying LLM-backed applications in production.
Language Proficiency: Strong proficiency in Python and TypeScript, plus familiarity with modern reactive frontend frameworks (preferably Angular v21).
Agent Frameworks: Hands-on experience with modern agent harnesses (e.g., LangGraph, CrewAI, OpenAI Agents SDK, Claude Agent SDK, Google ADK). Strong preference for candidates with experience using ADK.
Context & Latency Optimization: Deep understanding of how LLMs process information. You must have proven experience optimizing token usage, leveraging caching, prompt and context engineering, and designing systems that fetch only the exact context an agent needs at any given moment.
Risk Mitigation: Hands-on experience with designing and employing guardrails for agents’ actions and outputs while also mitigating prompt injection attacks.

Ideally You Are

A Pioneer: You thrive in an emerging tech landscape where best practices are still being written, and you are excited to help define them.
A Precision Communicator: You understand that a single ambiguous word in a system prompt can derail an entire multi-agent workflow.
Latency-Obsessed: You don't just care that the model gets the right answer; you care about how many milliseconds it took to generate it, and you actively design to reduce that overhead, especially when combined with content moderation and prompt injection mitigation strategies.

Core Tech Skills

Android
Augmented Reality / Virtual Reality (AR/VR)
AWS
Back-end
Creative Technologist
Database
DevOps
Django
E-Commerce
Front-end
Full-stack *
GCP *
iOS
Java
JavaScript *
Leader
Marketing
Media
Mobile
Node
Objective-C
Product Development
Product Manager
Prototyping *
Python *
QA
React
Swift
Systems Architecture *
Tech Producer
TypeScript *
Unity
UX *
Wagtail

Additional Hard Skills/Knowledge:

Prompt Engineering
Context Engineering
LLM APIs
LLM Agents
Agent Orchestration Harness (e.g. OpenAI Agents SDK / Claude Agent SDK / Google ADK / LangGraph / CrewAI)
AI Evals
AI security, guardrails, and risk management

Pay Range

The expected pay range for this role is $61–$78 per hour, based on the US 3 pay range for a W-2 temporary engagement.
- Our company adheres to three regional pay bands, depending on your location. We refer to them as US 1, US 2, and US 3.
- US 3 is our base pay. Examples of cities in US 3 are Portland, Houston, and Miami.
- US 2 pay is 7.5% higher than US 3 to meet market rates. Examples of cities in US 2 are Los Angeles, Chicago, and Seattle.
- US 1 pay is 15% higher than US 3 to meet market rates. Examples of cities in US 1 are Brooklyn and San Francisco.
If you are curious which region you are in, please apply and get connected with our recruiting team!
