This is a remote position based in the US. Please note that the recruitment and hiring process includes an in-person meeting.
We are seeking a skilled and experienced Staff Cloud Platform Engineer with expertise in Kafka to join our Cloud Platform team. The Staff Cloud Platform Engineer will design, deploy, operate, and optimize our Apache Kafka-based event streaming infrastructure at scale on Google Cloud Platform (GCP). The ideal candidate will have a strong background in DevOps practices, cloud infrastructure automation, and big data technologies. In this role you will partner closely with platform, data, and application engineering teams to ensure our Kafka clusters are reliable, performant, and secure while running natively on GCP or AWS.
Responsibilities:
Design, provision, and manage Apache Kafka clusters (self-managed on GCP/AWS or via Confluent Platform / MSK).
Configure and tune brokers, ZooKeeper/KRaft, topics, partitions, replication factors, and retention policies for high throughput and low latency.
Perform cluster upgrades, rolling restarts, and broker replacements with zero downtime.
Implement and manage Kafka Connect pipelines for data ingestion and egress across heterogeneous systems.
Administer Kafka Streams and ksqlDB deployments for real-time stream processing workloads.
Maintain Schema Registry and enforce schema governance standards across teams.
Define and track SLIs/SLOs for consumer lag, throughput, end-to-end latency, and broker health.
Design and implement cloud infrastructure using Infrastructure as Code (IaC) with Terraform.
Build automated deployment pipelines for Kafka configuration changes using GitOps workflows (ArgoCD, Flux).
Create self-service tooling and runbooks to reduce toil for development teams.
Automate topic provisioning, ACL management, and schema registration via APIs and CLI tooling.
Integrate tools like GitLab CI/CD or Cloud Build for automated testing and deployment.
Ensure seamless integration of data pipelines with other GCP services like BigQuery and Cloud Storage.
Monitor and optimize the performance, reliability, and cost of Kafka and streaming pipelines.
Implement security best practices for GCP resources, including IAM policies, encryption, and network security.
Ensure observability is an integral part of the infrastructure platforms and provides adequate visibility into their health, utilization, and cost.
Collaborate extensively with cross-functional teams to understand their requirements, educate them through documentation and training, and improve adoption of the platforms and tools.
Qualifications:
10+ years of overall experience in DevOps, cloud engineering, or data engineering.
5+ years of experience with Kafka at production scale.
Deep expertise in Kafka internals: replication protocol, log compaction, consumer group coordination, partition leadership, and KRaft mode.
Proficiency with container orchestration (Kubernetes / Helm) and deploying Kafka via Strimzi, Confluent Operator, or equivalent.
Strong understanding of networking (VPC, peering, private endpoints, DNS, load balancing) in cloud environments.
Hands-on experience with Kafka Connect, Schema Registry, and at least one stream processing framework (Kafka Streams, Flink, Spark Structured Streaming).
Proficiency in Google Cloud Platform (GCP) services, including Dataflow, Pub/Sub, Kafka, Dataproc, BigQuery, and Cloud Storage.
Expertise in Infrastructure as Code (IaC) tools like Terraform or Cloud Deployment Manager.
Familiarity with data orchestration tools like Apache Airflow or Cloud Composer.
Experience with CI/CD tools like Jenkins, GitLab CI/CD, or Cloud Build.
Knowledge of containerization and orchestration tools like Docker and Kubernetes.
Strong scripting skills for automation (e.g., Bash, Python).
Experience with monitoring tools like Cloud Monitoring, Prometheus, and Grafana.
Familiarity with logging tools like Cloud Logging or ELK Stack.
Strong problem-solving and analytical skills.
Excellent communication and collaboration abilities.
Ability to work in a fast-paced, agile environment.
The base pay range for this position varies by geographic location. More information about the pay range specific to the candidate's location and other factors will be shared during the recruitment process. Individual pay is determined by location of residence and multiple other factors, including job-related knowledge, skills, and experience.
San Francisco Bay Area:
156,400 - 265,700 USD Annual
All Other US Locations:
As a part of the total compensation package, this role may be eligible for a bonus.