Job Description
Our Technical Operations team manages the infrastructure, DevOps, and Site Reliability of our platform. We are looking for a Staff Cloud DevOps/Site Reliability Engineer to join our team.
Qualifications
- Bachelor's degree in Computer Science, Engineering, or a related field
- 7+ years of experience as a DevOps, Infrastructure, Operations, or Site Reliability Engineer (or as a software engineer with relevant experience)
- At least 2 years of experience with each of:
  - Terraform
  - Helm
  - Kubernetes
  - AWS, Azure, or GCP
  - CI/CD using modern tools (GitOps)
Optional (not required but considered a plus):
- MLOps (building, orchestrating, and maintaining Machine Learning Pipelines)
- Prometheus / Grafana
- Multi-cloud deployments (2 or more)
- ArgoCD
- Network management and VPNs
Responsibilities
- Infrastructure: Maintain and contribute to Infrastructure-as-Code (Terraform)
- DevOps and CI/CD Pipelines: Orchestrate pipelines using GitHub Actions, Helm, and ArgoCD
- Microservices scalability: Kubernetes Administration
- Cloud Administration
- Site Reliability: Measure and monitor availability, latency, and overall service health; drive incident management and post-mortem analysis
Inworld AI is a character engine for powering AI-driven characters in gaming, entertainment, and interactive experiences. Inworld is funded by VCs including Kleiner Perkins, CRV, Bitkraft, Founders Fund, Section 32, M12, Intel Capital, and Meta, as well as a team of all-star angels including corporate executives, top VC funds' partners, scouts, and industry veterans.