Job details

Senior Software Engineer - AI Inference

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior Software Engineer - AI Inference in the United States.

This role offers an opportunity to work at the forefront of large language model inference, contributing directly to high-performance open-source serving frameworks used at scale. You will help shape how modern AI applications run efficiently on advanced GPU infrastructure by improving the performance, reliability, and scalability of inference systems. Working in a deeply technical and collaborative environment, you will focus on optimizing runtime behavior, reducing latency, and increasing throughput for production-grade AI workloads. The position combines systems engineering, low-level optimization, and open-source contribution, with direct impact on widely used AI frameworks. You will engage with a global engineering community while solving complex performance challenges across distributed GPU systems. This is an ideal role for a hands-on engineer passionate about AI infrastructure and high-performance computing.


Accountabilities:
  • Contribute features, optimizations, and fixes to open-source inference frameworks such as vLLM and SGLang
  • Design and improve inference runtime components including scheduling, batching, request handling, and KV-cache optimization
  • Profile and optimize performance-critical paths across Python, C++, and CUDA layers
  • Enhance multi-GPU inference performance through improved parallelism, communication strategies, and resource utilization
  • Develop benchmarking systems and regression tests to ensure performance stability and correctness across deployments
  • Investigate and resolve bottlenecks using profiling tools, GPU analysis, and data-driven performance evaluation
  • Collaborate with cross-functional teams to translate production needs into scalable, upstream-ready solutions
  • Participate in code reviews, architectural discussions, and open-source community contributions

Requirements:

  • 5+ years of experience in production software engineering with strong systems-level expertise
  • Hands-on experience with LLM inference or serving frameworks such as vLLM, SGLang, or similar systems
  • Strong programming skills in Python and in C++ and/or CUDA, with the ability to debug and optimize performance-critical code
  • Experience with performance profiling tools, benchmarking, and latency/throughput optimization techniques
  • Solid understanding of distributed systems, concurrency, and multi-GPU or multi-node architectures
  • Strong communication skills and experience working in or contributing to open-source projects
  • Bachelor’s or Master’s degree in Computer Science, Computer Engineering, or equivalent experience
  • Strong advantage: contributions to open-source AI, ML, or systems projects such as PyTorch, Triton, NCCL, or similar ecosystems
  • Strong advantage: experience with GPU memory optimization, kernel fusion, or advanced inference techniques such as quantization or speculative decoding
  • Strong analytical mindset with a focus on measurement-driven engineering

Benefits:

  • Competitive base salary ranging from $152,000 to $287,500 depending on level and experience
  • Equity participation in addition to base compensation
  • Comprehensive health, dental, and vision insurance coverage
  • Flexible work arrangements supporting work-life balance
  • Paid time off, holidays, and parental leave benefits
  • Professional development opportunities in advanced AI and systems engineering
  • Exposure to cutting-edge AI infrastructure and large-scale GPU computing systems
  • Inclusive and innovation-driven engineering culture


How Jobgether works:

We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.

We appreciate your interest and wish you the best!


 

Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.

 

 


Average salary estimate: $219,750 per year (est.)
Range: $152,000 (min) to $287,500 (max)



Employment type: Full-time, hybrid
Date posted: April 17, 2026