Job details

Senior Software Engineer, ML Infrastructure

About Decagon

Decagon is the leading conversational AI platform empowering every brand to deliver concierge customer experiences.

Our technology enables industry-defining enterprises like Avis Budget Group, Block’s Cash App and Square, Chime, Oura Health, and Hunter Douglas to deploy AI agents that power personalized, deeply satisfying interactions across voice, chat, email, SMS, and every other channel.

We’re building a future where customer experiences move beyond support tickets and hold music toward faster resolutions, richer conversations, and deeper relationships. We’re proud to be backed by world-class investors who share that vision, including a16z, Accel, Bain Capital Ventures, Coatue, and Index Ventures, among others.

We’re an in-office company, driven by a shared commitment to excellence and velocity. Our values — Just Get It Done, Invent What Customers Want, Winner’s Mindset, and The Polymath Principle — shape how we work and grow as a team.

About the Team

The ML Infrastructure team builds the systems that power every stage of Decagon's model lifecycle. We own the platforms for model training, the infrastructure for model evaluation and experimentation, and the routing layer that manages inference across multiple providers.

We work at the intersection of research and production: translating cutting-edge ML models into reliable, scalable systems that run in customer environments. We collaborate closely with Research, Infrastructure, and Product teams to ensure models train efficiently, serve reliably, and deliver exceptional user experiences.

The team values technical rigor, pragmatic decision-making, and building systems that others love to use.

About the Role

We're hiring a Senior ML Infrastructure Engineer to own the platforms powering Decagon's model training and inference. You'll build distributed training systems, design inference architecture across multiple providers, and create the frameworks that let our Research and Product teams ship faster.

This role is for someone who thrives on technical depth, can lead multi-quarter initiatives, and wants to shape the long-term architecture of our ML stack.

In this role, you will

Design and build distributed training platforms for LLM and multimodal fine-tuning and post-training at scale
Integrate state-of-the-art training algorithms into production pipelines
Own inference architecture and multi-provider routing, including failover and optimization
Lead initiatives to improve latency and cost efficiency across the training and serving stack
Build evaluation and experimentation infrastructure that enables rapid, reliable iteration
Drive technical direction, mentor engineers, and establish best practices for ML infrastructure

Your background looks something like this

6+ years building ML infrastructure or production systems at scale
Deep experience with distributed training: multi-node GPU clusters, fault tolerance, and optimization
Strong understanding of LLM inference: latency optimization, provider tradeoffs, and serving architecture
Proven track record leading complex, multi-quarter technical projects

Benefits

Medical, dental, and vision benefits
Take-what-you-need vacation policy
Daily lunches, dinners, and snacks in the office to keep you at your best

Compensation

$250K – $330K + equity

ML Infrastructure, Senior Software Engineer, LLM, Distributed Training, PyTorch, DeepSpeed, GPU, Inference, Kubernetes, AWS, GCP, Model Serving, Latency Optimization, MLOps
