Compute Infrastructure Specialist

Arcee AI

Other Engineering

San Francisco, CA, USA

Posted on May 22, 2026

Role Overview

We’re looking for a highly operational, technically savvy Compute Infrastructure Specialist to help manage and scale the infrastructure that powers our AI workloads and customer deployments. This person will sit at the intersection of engineering, operations, and customer delivery, helping ensure that GPU resources are allocated efficiently, deployments run smoothly, and customers have a strong experience using our infrastructure.

This is a hands-on role for someone who enjoys solving infrastructure problems, coordinating across teams, and working directly with cutting-edge AI systems.

Responsibilities

  • Manage and track GPU/compute inventory across internal and customer environments
  • Coordinate infrastructure provisioning for customer deployments and internal research workloads
  • Monitor utilization, capacity, uptime, and cost efficiency across compute environments
  • Work cross-functionally with Engineering, Research, Product, and GTM teams on deployment readiness and customer needs
  • Support customer onboarding and infrastructure troubleshooting alongside Solutions and Customer Success teams
  • Maintain documentation for infrastructure processes, environments, and deployment standards
  • Help improve operational workflows for provisioning, monitoring, escalation management, and forecasting
  • Partner with vendors and cloud providers as needed
  • Assist with infrastructure planning related to scaling customer demand and new product launches

Qualifications

  • Experience working with cloud infrastructure, GPU environments, or AI/ML infrastructure operations
  • Familiarity with Kubernetes, Linux environments, containerization, or cloud platforms (AWS/GCP/Azure)
  • Strong operational and project management instincts
  • Comfortable working cross-functionally in a fast-moving startup environment
  • Ability to communicate technical concepts clearly to both technical and non-technical stakeholders
  • Strong organizational skills and attention to detail

Nice to Have

  • Experience supporting AI/ML workloads or model deployment infrastructure
  • Experience working directly with enterprise customers
  • Startup experience