Staff DevOps Engineer

Remote (United States)

This opportunity is for a Staff DevOps Engineer focused on developer infrastructure, platform engineering, and internal tooling. The role is responsible for building the infrastructure that supports modern development pipelines, pre-production environments, automated quality gates, and AI-assisted engineering workflows.

This position sits at the intersection of DevOps, platform engineering, and developer tooling. The role builds reliable systems that let engineers and AI agents validate, test, and ship production-ready code with greater speed, visibility, and confidence.

Employment Type: Full-Time

Compensation: $96.15 – $120.19 per hour

What You’ll Do

Design and operate ephemeral, pre-warmed development environments that engineers and AI agents can launch on demand
Extend internal CLI tooling so new engineers or AI agents can start a validated local environment within minutes
Automate service discovery, dependency management, and local configuration for development environments
Build environment parity monitoring to ensure development environments closely match production behavior
Own infrastructure-level pre-production quality gates that validate deployments before they reach production
Build and operate automated load testing, performance benchmarking, and security scanning gates within CI/CD pipelines
Partner with QA and engineering teams to expand quality gate coverage across services
Build containerized mock services generated from OpenAPI specifications for local integration testing
Support realistic third-party dependency validation before pull requests are opened
Stand up Playwright-based UI validation within AI-agent and CI workflows
Create infrastructure that supports iterative self-refinement, allowing engineers or agents to run, test, fix, and improve output before review
Build review tooling, metrics dashboards, and operational controls that make development pipelines observable and improvable
Surface scoring signals, approval rate trends, gate pass rates, and common failure patterns
Create policy layers that define approval requirements by component, task type, or delivery workflow
Work across EKS, AWS services, and CI/CD systems to improve delivery reliability and engineering productivity
Collaborate closely with architects, product teams, QA, infrastructure, developer experience, and engineering teams
Use AI development tools as part of infrastructure operations, debugging, automation, and delivery workflows

Qualifications

6 to 10+ years of experience in DevOps, Platform Engineering, or Site Reliability Engineering roles
Strong experience building and operating production systems at scale
Active experience using AI development tools such as Claude Code, Codex, or similar tools in infrastructure workflows
Ability to use AI tools for Terraform changes, Kubernetes debugging, automation scripting, and operational investigations
Expertise with Kubernetes, especially EKS
Strong hands-on experience with AWS services including IAM, VPC, ECR, SSM, Secrets Manager, S3, SQS, Lambda, and RDS/Aurora
Strong Infrastructure as Code experience, with Terraform preferred
Experience with GitOps workflows using Argo CD or similar tools
Proven experience building ephemeral environments, developer tooling, internal platforms, CLIs, scaffolding tools, or developer portals
Experience with load testing frameworks such as k6, Locust, Gatling, or similar tools
Experience automating performance gates within CI/CD pipelines
Experience building mock or stub infrastructure for large-scale integration testing
Experience with containerized services, API mocking, and dependency isolation
Deep CI/CD experience with CircleCI, GitHub Actions, or similar platforms
Experience with caching, parallelism, artifact management, test reliability, and pipeline observability
Experience with release strategies such as canary releases, blue-green deployments, automated rollbacks, and progressive delivery
Strong observability fundamentals using tools such as Datadog and OpenTelemetry
Ability to define SLIs and SLOs and connect them to delivery decisions
Excellent cross-team communication skills
Ability to translate platform constraints into developer-friendly tools, solutions, and documentation

Technology Stack

Infrastructure: AWS, EKS, Terraform, Argo CD, Docker, Vault
CI/CD: CircleCI, Argo CD, GitHub Actions
Messaging: Kafka and Confluent Cloud
Observability: Datadog and OpenTelemetry
Languages and Applications: Node.js, TypeScript microservices, Python jobs, and React front ends

How You’ll Succeed

Treat infrastructure as a product by understanding engineer needs, measuring adoption, and improving tools based on feedback
Build systems that multiply team output rather than only maintaining existing infrastructure
Prioritize automation, reproducibility, and measurable outcomes
Create tools or gates when repeated manual work appears in the delivery process
Operate with high ownership across infrastructure, developer experience, QA, and product engineering teams
Use AI tools to move faster while maintaining strong verification, rigor, and engineering judgment
Help develop better team patterns for AI-assisted infrastructure work
