Software Engineer - Cloud Engineering Job at Kumo.AI, Mountain View, CA

  • Kumo.AI
  • Mountain View, CA

Job Description

About Kumo.ai

Kumo is building a next-generation AI platform that empowers organizations to make predictive decisions faster—without the overhead of traditional ML pipelines. Backed by Sequoia and led by ex-Airbnb, Pinterest, and LinkedIn leaders, we’re scaling rapidly and looking for a cloud infrastructure engineer to build and run the backbone of our AI platform.

Your work will directly power the models and applications our customers rely on every day. If you’re passionate about multi-cloud infrastructure, Kubernetes at scale, and building the infrastructure that powers the next generation of AI applications, we’d love to talk.

Why Kumo.ai?

  • Work alongside world-class engineers & scientists (ex-Airbnb, Pinterest, LinkedIn, Stanford).
  • Be a foundational voice in designing a platform powering enterprise-scale AI.
  • Competitive Series B compensation package (salary + meaningful equity).

The Opportunity

The Cloud Infrastructure team builds and operates our Kubernetes-based, multi-cloud AI platform across AWS, Azure, and GCP.

  • As a Cloud Infrastructure Engineer, you’ll work on scaling, securing, and optimizing the platform that powers massive multi-tenant clusters running Big Data and AI/ML workloads.
  • You’ll collaborate closely with senior engineers, ML scientists, and product teams to deliver automation, improve reliability, and expand our multi-cloud capabilities.
  • This role offers the chance to deepen your Kubernetes and cloud expertise while taking ownership of impactful projects.

What You’ll Do

  • Deploy, operate, and maintain infrastructure across AWS, Azure, and GCP.
  • Build and manage Kubernetes clusters (EKS, AKS, GKE) with a focus on performance, availability, and cost efficiency.
  • Develop and maintain automation using Infrastructure-as-Code tools (Terraform, Pulumi, Crossplane).
  • Implement and enhance GitOps workflows using Argo CD or Flux.
  • Set up and maintain observability systems (Prometheus, Grafana, OpenTelemetry) to monitor workloads and clusters.
  • Collaborate with the team to design, test, and roll out improvements to scaling and reliability.
  • Troubleshoot incidents and participate in on-call rotations to ensure platform uptime.
  • Contribute to security best practices, including RBAC, tenant isolation, and cloud identity management.

What You Bring

  • 3–5 years of experience building or operating cloud-native infrastructure in production.
  • Hands-on experience with at least one major cloud provider (AWS, Azure, or GCP) and (preferably) exposure to multi-cloud environments.
  • Solid understanding of Kubernetes concepts and operational experience with production clusters.
  • Proficiency with Infrastructure-as-Code tools (Terraform, Pulumi, or similar).
  • Experience with GitOps workflows and tools like Argo CD, Flux, or Argo Workflows.
  • Familiarity with monitoring, logging, and tracing for distributed systems.
  • Scripting or programming skills in Python, Go, or Bash.
  • Strong problem-solving skills and a collaborative approach.

Nice to Have

  • Experience with multi-tenant Kubernetes clusters for AI/ML or big data workloads.
  • Knowledge of compliance and security standards (SOC 2, GDPR, ISO 27001).
  • Contributions to open-source cloud-native projects.
  • Familiarity with Kubernetes operators, controllers, or custom resources.

We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
