About Decagon
Decagon is the leading conversational AI platform empowering every brand to deliver concierge customer experiences.
Our technology enables industry-defining enterprises like Avis Budget Group, Block’s Cash App and Square, Chime, Oura Health, and Hunter Douglas to deploy AI agents that power personalized, deeply satisfying interactions across voice, chat, email, SMS, and every other channel.
We’re building a future where customer experience moves beyond support tickets and hold music to faster resolutions, richer conversations, and deeper relationships. We’re proud to be backed by world-class investors who share that vision, including a16z, Accel, Bain Capital Ventures, Coatue, and Index Ventures, along with many others.
We’re an in-office company, driven by a shared commitment to excellence and velocity. Our values — Just Get It Done, Invent What Customers Want, Winner’s Mindset, and The Polymath Principle — shape how we work and grow as a team.
About the Team
The ML Infrastructure team builds the systems that power every stage of Decagon's model lifecycle. We own the platforms for model training, the infrastructure for model evaluation and experimentation, and the routing layer that manages inference across multiple providers.
We work at the intersection of research and production: translating cutting-edge ML models into reliable, scalable systems that run in customer environments. We collaborate closely with Research, Infrastructure, and Product teams to ensure models train efficiently, serve reliably, and deliver exceptional user experiences.
The team values technical rigor, pragmatic decision-making, and building systems that others love to use.
About the Role
We're hiring a Senior ML Infrastructure Engineer to own the platforms powering Decagon's model training and inference. You'll build distributed training systems, design inference architecture across multiple providers, and create the frameworks that let our Research and Product teams ship faster.
This role is for someone who thrives on technical depth, can lead multi-quarter initiatives, and wants to shape the long-term architecture of our ML stack.
Responsibilities
Design and build distributed training platforms for LLM and multimodal fine-tuning and post-training at scale
Integrate state-of-the-art training algorithms into production pipelines
Own inference architecture and multi-provider routing, including failover and optimization
Lead initiatives to improve latency and cost efficiency across the training and serving stack
Build evaluation and experimentation infrastructure that enables rapid, reliable iteration
Drive technical direction, mentor engineers, and establish best practices for ML infrastructure
Qualifications
6+ years building ML infrastructure or production systems at scale
Deep experience with distributed training: multi-node GPU clusters, fault tolerance, and optimization
Strong understanding of LLM inference: latency optimization, provider tradeoffs, and serving architecture
Proven track record leading complex, multi-quarter technical projects
Benefits
Medical, dental, and vision benefits
Take-what-you-need vacation policy
Daily lunches, dinners and snacks in the office to keep you at your best
Compensation: $250K – $330K plus equity