Site Reliability Engineer (SRE) Job at Freeplay, Boulder, CO

  • Freeplay
  • Boulder, CO

Job Description

The Opportunity

We're hiring an experienced Site Reliability Engineer to own the reliability of the Freeplay platform and drive success for our most advanced enterprise customers. In this role, you will bridge the gap between core infrastructure engineering and high-stakes customer deployments. You won’t just be maintaining our internal SaaS environment; you will be the technical expert guiding Fortune 100 engineering teams as they deploy Freeplay into their own private clouds.

This is an exciting chance to join a fast-growing startup with a front‑row seat to how AI products are being built at some of the largest and most innovative companies in the world. You’ll be hands‑on with customers, learning about cutting‑edge AI architectures while ensuring our platform runs flawlessly in their diverse and complex environments.

What's Freeplay?

Freeplay is the end‑to‑end platform for software teams to ship great AI products. We give product development teams the power to test, evaluate, monitor, and optimize AI in production. Our customers use Freeplay to build better LLM features, chatbots, and agents. Today we serve leading software companies, from growing startups to the Fortune 100.

Your Mission

Build the infrastructure that powers Freeplay and ensure successful deployments for our enterprise customers.

  • Partner with Enterprise Customers: Act as a key technical contact for our "Bring Your Own Cloud" (BYOC) deployments. You will jump on calls with customer engineering teams to guide them through installation, debug configuration issues in their VPCs, and ensure they run Freeplay successfully.

  • Own the Multi‑Cloud Architecture: Help manage and improve our internal production infrastructure across AWS, GCP, and Azure, ensuring high availability and seamless networking.

  • Solve the "Shipped Software" Challenge: Drive the engineering efforts to package and distribute Freeplay using tools like Helm, Replicated, and KOTS. You will help ensure our software is portable, installing as reliably in a customer's cloud environment as it does in our SaaS.

  • Master Infrastructure as Code: Drive our Terraform strategy, building modular, reusable, and secure infrastructure definitions that treat operations with the same rigor as application code.

  • Champion Observability: Implement and tune our monitoring stack (Datadog) to provide deep visibility into system health, and help customers implement similar observability for their private instances.

  • Scale Data & Messaging: Manage the stateful components of our stack, including PostgreSQL, Elasticsearch, and NATS JetStream, ensuring data integrity and performance under load.

About You

  • Experience: We are open to candidates ranging from Mid‑Level (3+ years) to Senior/Staff (7+ years). We will tailor the scope and responsibilities to your expertise.

  • Customer‑facing confidence. You are comfortable interacting directly with external engineering teams. You can troubleshoot a failed deployment while on a Zoom call with a client and explain complex architectural requirements clearly.

  • Production Kubernetes fluency. You are confident managing EKS/GKE/AKS clusters, debugging complex pod failures, managing ingress controllers, and handling autoscaling in production.

  • Deep Terraform expertise. You have experience structuring IaC for scale and have managed multi‑environment setups.

  • Database operational experience. You aren't just an infrastructure plumber; you understand how to manage and tune databases (Postgres) and search indices (Elasticsearch) at scale.

  • Security‑first thinking. You are familiar with cloud security best practices, including VPC networking, IAM/Workload Identity, and secrets management, and you can explain these concepts to security‑conscious enterprise clients.

Bonus Points

  • Experience in a Solutions Engineering or Field Engineering capacity.
  • Experience with Replicated / KOTS or similar tools for packaging enterprise software for on‑premise/VPC deployments.
  • Experience operating message queues like NATS JetStream or Kafka.
  • Background in AI/ML infrastructure or high‑throughput data systems.

Compensation & Benefits

  • Competitive salary commensurate with experience, plus equity package.
  • Medical, dental, and vision insurance.
  • Premium hardware setup (MacBook, monitor, peripherals).
  • Four weeks of Paid Time Off per year (and we encourage you to take it!).

Location

We prefer candidates able to work full‑time on‑site in Boulder, CO, but we're open to exceptional remote candidates who can visit Boulder every 6 weeks for team collaboration.

Job Tags

Full time, Remote work
