Browse 66 exciting Inference jobs hiring now. Check out companies that are hiring, such as NVIDIA, Superhuman, and Hinge Health, in Norfolk, Virginia Beach, and Arlington.
Senior Architect role to design and implement high-performance AI communication and memory libraries while driving hardware-software co-optimization across GPUs, DPUs, NICs, and interconnects at NVIDIA.
Superhuman is hiring Data Scientists in San Francisco to lead experimentation, causal inference, and ML-driven analyses that influence product and growth decisions.
Lead the measurement, experimentation, and data architecture for HingeSelect as the first dedicated Staff Product Data Scientist driving causal analysis, funnel optimization, and supply-demand modeling.
Point72 is hiring a Machine Learning Infrastructure Engineer to build and operate scalable GenAI infrastructure that accelerates model development and production across cloud and on-prem environments.
Help architect and operate the systems that take neuroscience datasets from raw experiments through large-scale model training, evaluation, and optimized production inference at Metamorphic.
Thrad seeks an Applied Scientist to design, train, and productionize real-time contextual ad relevance and bidding models for LLM conversations at its San Francisco HQ.
Thrad seeks a hands-on Data Scientist to drive product and commercial decisions through rigorous causal analysis, experimentation, and analytics at its San Francisco HQ.
FanDuel is hiring a VP of Marketing Analytics / Marketing Science to lead measurement strategy and analytics that quantify ROI and commercial value across media, sponsorships, and partnerships.
Lead and grow Zoox's ML Platform engineering organization to deliver scalable training and low-latency inference infrastructure for large foundation and RL models across vehicle and cloud environments.
Develop and productionize agent systems and the Friendli Agent API at FriendliAI to enable developers to build reliable, high-impact AI agent applications.
Drive product decisions for Spotify Premium as a Data Scientist focused on experimentation, AI-enabled analytics, and insights that increase conversion and retention.
Senior Machine Learning Engineer needed to transform prototype AI models into optimized, production-ready systems for secure, distributed public sector and edge deployments.
Lead performance and scalability improvements for LLM inference by optimizing runtime components, multi-GPU execution, and open-source serving frameworks at scale.
Contribute to state-of-the-art robot learning and on-robot deployment at a fast-moving consumer robotics startup focused on dexterous home manipulation.
Aviator Health seeks a Technical Ex‑Founder to lead 0→1 consumer product development and build autonomous agent systems that navigate real healthcare workflows from our NYC office.
Drive production-ready model optimization, custom kernel development, and edge deployment to enable real-time inference of large-scale models on vehicle SoCs for Zoox's Perception team.
Multi Media LLC is hiring a Senior Data Scientist to lead rigorous statistical analyses and measurement efforts that drive product and business decisions for a high-traffic live streaming platform.
Lead system- and hardware-focused optimizations for LinkedIn’s AI inference platform, improving GPU utilization, compiler workflows, and low-latency model serving at scale.
Lead DeepWalk’s computer vision platform as a Staff Software Engineer, driving the architecture and productionization of ML systems that process millions of images for sidewalk inspection and city infrastructure decisions.
Lead a small analytics team to drive causal, hypothesis-driven investigations into network reliability and subscriber experience for a major communications client while producing executive-ready insights and recommendations.
pureIntegration is hiring a Mid-Level Data Analyst to analyze large-scale datasets, produce dashboards and reports, and deliver actionable insights to improve network reliability and subscriber experience on a remote contract.
Lead cutting-edge research on multimodal foundation models and efficient GenAI at Bosch Research Pittsburgh, translating innovations into industrial and product impact while publishing at top-tier venues.
Lead the design and delivery of a closed-loop intelligence layer that enables an autonomous trading fleet to learn from real-time outcomes and improve profitability.
Help scale production ML infrastructure and retrieval systems at Foxglove to enable high-performance semantic search and data mining over multimodal robotics data.
Twelve Labs is hiring a senior Machine Learning Engineer to optimize and scale multimodal video foundation models for deployment across cloud and data platforms.
Solace is seeking a hands-on Marketing Analytics Manager to build and own attribution, incrementality testing, and measurement infrastructure that drives data-informed growth decisions for a fast-scaling healthcare startup.
Deepgram is hiring an ML Ops Infrastructure Engineer to design and operate scalable model deployment, CI/CD, and monitoring systems that deliver production-grade voice AI at scale.
Lead the design and deployment of low-latency, production ML systems for voice, audio, and agentic control at an early-stage hardware and software startup in New York City.
Tavus is hiring a Multimodal AI Model Optimization Research Engineer to convert cutting-edge multimodal models into efficient, low-latency production systems.
Work with research teams to productionize large-scale generative models, build GPU inference infrastructure, and ensure reliable deployment and observability for production ML workloads.
Work across modeling, systems, and product to design, optimize, and ship production-grade AI systems for real-world users.
A Research Engineer role focused on GPU/kernel and distributed-training optimizations to scale and accelerate real-time world-model AI.
Lead and build True Anomaly’s AI platform and engineering team to deliver production-grade model hosting, agent infrastructure, and enterprise AI tooling that embed AI across the company.
Lead the development of custom quantization algorithms and low-precision techniques to maximize model performance on Quadric's Chimera GPNPU from our Burlingame engineering office.
Drive the design and implementation of experimentation methodologies, inference pipelines, and production tooling as a Full‑Stack Data Scientist on Netflix’s Experimentation Platform.
Fundamental is hiring a Model Serving Engineer to build and optimize production inference infrastructure for NEXUS, focusing on Triton-based pipelines, GPU efficiency, and low-latency, high-throughput serving.
Lead Blackbird’s analytics layer to translate product and customer data into strategic decisions that accelerate growth and retention.
Pluralsight seeks an experienced Data Scientist to design, validate, and deploy machine learning and NLP solutions that drive product and business impact.
Triumph is hiring a Data Scientist to build pricing, risk, and behavior models that drive monetization and retention for a high-growth real-money gaming platform.
Dentsu is hiring a VP of Data Science to lead and productize advanced measurement science (MMM, RBA, Bayesian methods) and scale a distributed team to deliver client-facing analytics products.
Lead ML-driven improvements to ad auction performance by building scalable models, running experiments, and partnering with engineering and product teams at a fast-paced ad tech organization.
Develop and optimize high-performance C++ AI and computer-vision software for embedded camera systems used in mission-critical public safety and security applications at Motorola Solutions.
Lead the design and productionization of mission-critical NLP and LLM-powered features at Laurel, shaping the AI platform that returns time to professional services firms.
Lead the Core GenerativeAgent team to design, build, and deploy low-latency, enterprise-grade conversational voice AI combining LLMs with speech-to-text, text-to-speech, and real-time streaming pipelines.
Lead product strategy and discovery for Kamiwaza’s on-prem enterprise AI orchestration platform, turning customer problems into coherent, outcome-driven releases.
Amazon Security seeks a Senior Security Engineer to lead offensive operations and research against AI systems, scaling automated threat emulation across the AI portfolio.
Shape and own the QA strategy for FriendliAI’s inference platform, covering backend, frontend, model deployments, and novel validation for LLM inference quality.
Senior-level embedded AI engineer role at Renesas to lead development of model translation tooling and high-performance inference for resource-constrained MCUs/MPUs.
Senior technical role focused on researching, engineering, and scaling privacy-preserving ML and LLM alignment solutions across LinkedIn's platforms.
Decagon is hiring a Senior ML Infrastructure Engineer to design and scale distributed training and multi-provider inference platforms for LLMs and multimodal models.
Below 50k*: 0 | 50k-100k*: 0 | Over 100k*: 26