We're building the company that will de-risk the largest infrastructure build-out in history.
When people finance GPU clusters, the datacenters housing them, and the infrastructure powering them, they need "offtake" - meaning someone has signed a contract to lease the cluster for a period of time before it's even built.
Financing a GPU cluster is inherently risky, since margins are thin and volumes are huge. Lenders don't want to take on the risk that cluster developers can't repay their loan, and cluster developers really don't want to risk not selling their cluster. As a result, risk is offloaded to the customer using fixed-price long-term contracts.
If you don't mitigate this customer risk, there's a bubble. This isn't SaaS anymore - application layer companies sign multi-year contracts for compute and inference, but sell to customers on monthly subscriptions. If you mess up a purchase, it's game over: a minor shift in your revenue growth rate might mean the difference between profit and bankruptcy. But what if companies could exit their contract by selling it back to the market?
Otherwise, as AI scales, compute only becomes available to folks who can effectively take on that risk. A 2-person startup in a San Francisco Victorian can't realistically sign a 5-year take-or-pay contract on a $100m supercomputer. But they may be able to buy the month of compute that someone else sold back.
So that's what we make: a liquid market for GPU offtake.
We are a small team focused on making SFCompute engineering faster, more observable, and more reliable. Our work spans data infrastructure, developer experience, pre-production environments, and AI tooling — but the common thread isn't any specific domain. It's that we find the problems nobody else owns and make them solved problems.
Everyone on this team wears many hats. You'll work across the stack, collaborate with all parts of engineering, and regularly take on problems that don't fit neatly into a job description. If you want a narrow scope and a clear ticket queue, this team isn't it. If you want to have a large, legible impact on a small team building serious infrastructure, read on.
We're looking for a platform engineer who cares about the full pre-production experience — not just staging clusters, but the entire ecosystem of tooling that makes development fast and safe. Right now the gap between dev and prod is a real frustration. You'll close it. That means building a realistic staging environment, but it also means owning internal developer tooling, improving deployment pipelines, and eventually getting us off managed platforms like Vercel where self-hosting makes sense.
Design and build a pre-production EKS cluster that mirrors production fidelity without production cost
Own the infrastructure-as-code for the cluster (Terraform, Helm, or equivalent)
Integrate the cluster into CI/CD pipelines so changes are validated before they reach prod
Define promotion gates: what has to pass in pre-prod before a change is eligible for production
Collaborate with platform and application engineers to understand what needs to be testable
Own and evolve internal developer tooling that improves how the team builds, tests, and ships
Drive migration off managed platforms (like Vercel) where self-hosting is the right call
Explore and implement A/B testing and feature flagging infrastructure to support safer, incremental rollouts
Monitor and maintain the pre-production environment over time
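To make the promotion-gate responsibility concrete, here is a minimal sketch of the idea in Python. The gate names, threshold, and function are illustrative assumptions for this posting, not the team's actual pipeline or criteria:

```python
# Hypothetical promotion gate: decide whether a change validated in pre-prod
# is eligible for production. Check names and the error budget are assumptions.

REQUIRED_CHECKS = ("unit_tests", "integration_tests", "smoke_tests")
MAX_ERROR_RATE = 0.01  # hypothetical pre-prod error budget (1%)

def eligible_for_prod(check_results: dict, error_rate: float) -> bool:
    """A change promotes only if every required check passed in pre-prod
    and the observed pre-prod error rate stays inside the budget."""
    all_passed = all(check_results.get(c, False) for c in REQUIRED_CHECKS)
    return all_passed and error_rate <= MAX_ERROR_RATE

# Example: one failed smoke test holds the change back despite a healthy error rate.
held_back = eligible_for_prod(
    {"unit_tests": True, "integration_tests": True, "smoke_tests": False},
    error_rate=0.002,
)
print(held_back)  # False
```

In practice a gate like this would live in CI (e.g. a required status check) rather than a script, but the shape is the same: explicit named criteria, all of which must pass before a change leaves pre-prod.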
Hands-on EKS / Kubernetes experience: you've provisioned and operated clusters, not just deployed workloads onto them
Experience with infrastructure-as-code tools (Terraform, CDK, or similar)
Familiarity with CI/CD systems (GitHub Actions, ArgoCD, or similar)
Strong operational instincts: you know what "production-like" means and how to approximate it affordably
You can scope your own work. The pre-prod environment doesn't exist yet, so the first job is figuring out what it actually needs to be
Nice to have: experience with GPU workloads, bare metal networking, or marketplace-style platforms
We're shipping real workloads to bare-metal GPU clusters, and right now we validate too many infrastructure changes in production. That's the problem this role exists to solve. The cluster you build will be the default environment for every infrastructure change the team makes going forward. You'll own the design, the tooling, and the standards, with full backing from engineering leadership to do it right.
Team members are offered a competitive salary along with equity in the company
Yes, we sponsor visas and work permits
We match 401(k) plans up to 4%
We offer competitive medical, dental, and vision insurance for employees and dependents, and cover 100% of premiums
We offer unlimited paid time off as well as 10+ observed holidays
We offer biological, adoptive, and foster parents paid time off to spend quality time with family
We cover lunch daily for employees
You can buy as many books for the office as you want
The San Francisco Compute Company is committed to maintaining a workplace free from discrimination and harassment.
We make employment decisions based on business needs, job requirements, and individual qualifications, without regard to race, color, religion, belief, national origin, social or ethical origin, age, physical, mental, or sensory disability, sexual orientation, gender identity or expression, marital status, civil union or domestic partnership status, past or present military service, HIV status, family medical history or genetic information, family or parental status including pregnancy, or any other status protected by law.
We welcome the opportunity to consider qualified applicants with prior arrest or conviction records. Our commitment to diversity includes hiring talented individuals regardless of their criminal history, in accordance with local, state, and federal laws, including San Francisco’s Fair Chance Ordinance and California’s ban-the-box laws.
If you require reasonable accommodation for any reason, please reach out to us at hiring@sfcompute.com