Browse 14 exciting LLM Serving jobs hiring now. Check out companies hiring such as FriendliAI, Zencore, and Intel in Laredo, Tampa, and Cincinnati.
Help architect and operate FriendliAI’s enterprise inference platform as a Senior Backend Engineer focused on APIs, multi-tenant SaaS features, and data/system reliability at scale.
Lead design and delivery of secure, scalable, production-grade AI/ML solutions as Zencore’s Principal Architect, advising clients and shaping cloud-native architectures.
Intel is hiring an AI Software Engineer to develop deployment, data, and evaluation infrastructure for agentic AI frameworks and model-serving systems.
Lead Zapier's AI Platform team to build reusable model-serving, evaluation, and MLOps tooling that helps product teams ship AI features quickly, safely, and cost-effectively.
Lead system- and hardware-focused optimizations for LinkedIn’s AI inference platform, improving GPU utilization, compiler workflows, and low-latency model serving at scale.
Lead the design and delivery of a closed-loop intelligence layer that enables an autonomous trading fleet to learn from real-time outcomes and improve profitability.
Straia seeks a Senior Platform Engineer to design and operate the data movement, model-serving, and platform infrastructure that powers low-latency AI analytics for higher education.
Join Actian as an AI Engineering Intern to help integrate ML models into production applications while gaining hands-on experience with model serving, data pipelines, and full-stack development.
Lead the design and scaling of Bumble’s matching, recommendation, and agentic AI systems to deliver low-latency, ML-powered experiences across the product.
Drive production-quality integrations of NVIDIA Grove into Dynamo and leading open-source AI frameworks, delivering adapters, runtime components, and developer tooling for scalable training and inference.
Shape and own the QA strategy for FriendliAI’s inference platform, covering backend, frontend, model deployments, and novel validation for LLM inference quality.
Senior Backend Engineer needed to design and operate production-grade APIs and backend systems for a fast-moving AI inference platform serving enterprise deployments.
Toyota Research Institute is hiring a Senior Machine Learning Engineer to build ML infrastructure, integrate and fine-tune LLMs, and operationalize multimodal research workflows for robotics, autonomy, energy, and materials programs.
Decagon is hiring a Senior ML Infrastructure Engineer to design and scale distributed training and multi-provider inference platforms for LLMs and multimodal models.
Below 50k*: 0 | 50k-100k*: 0 | Over 100k*: 1