DevOps & Site Reliability Engineer

VoltaGrid · Houston

AWS, GCP, Azure, Kubernetes, Docker, Terraform, GitHub Actions, GitLab CI, Jenkins, Prometheus, Grafana, Datadog, Bash, Python, Go, Linux, Ubuntu, RHEL, CentOS, Virtualization, Networking, DNS, Load balancing, Security fundamentals

Job description

About the role

We are looking for a DevOps & Site Reliability Engineer to design, build and operate the cloud and on‑premises infrastructure that powers our applications. The role works closely with software engineering teams to keep services scalable, observable and resilient, and to foster a culture of operational excellence.

Key responsibilities

Design, implement and maintain cloud infrastructure on AWS (or other major cloud providers).
Manage and optimise Kubernetes clusters and containerised workloads in production.
Develop infrastructure‑as‑code using Terraform and related tooling.
Build and improve CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins, etc.) for fast and safe deployments.
Implement monitoring, alerting and observability solutions such as Prometheus, Grafana and Datadog.
Define and track SLIs/SLOs, and participate in incident response, root‑cause analysis and blameless post‑mortems.
Automate repetitive tasks and create self‑service tooling to reduce toil.
Configure and maintain on‑prem bare‑metal servers, Linux systems and virtualised assets.
Collaborate with development teams on system design, capacity planning and performance optimisation.
Participate in on‑call rotations and ensure production readiness of new services.

Required profile

Minimum 4 years of experience in DevOps, SRE or infrastructure engineering.
Strong experience with at least one major cloud provider (AWS preferred).
Hands‑on experience operating Kubernetes and Docker in production.
Proficiency with Terraform or comparable infrastructure‑as‑code tools.
Experience building CI/CD pipelines using GitHub Actions, GitLab CI, Jenkins or similar.
Solid understanding of monitoring, logging and tracing concepts.
Strong scripting abilities in Bash, Python or Go.
Proven incident‑management experience and familiarity with SLO‑based reliability practices.
Deep Linux administration skills (Ubuntu, RHEL/CentOS).
Knowledge of virtualization, networking, DNS, load balancing and security fundamentals.

Required skills

AWS (or GCP/Azure)
Kubernetes
Docker
Terraform
GitHub Actions / GitLab CI / Jenkins
Prometheus
Grafana
Datadog
Bash
Python
Go
Linux (Ubuntu, RHEL/CentOS)
Virtualization platforms
Networking, DNS, load balancing, security basics

Frequently asked questions

The recruiter has not publicly disclosed the salary. You can apply and negotiate directly with VoltaGrid.

Published 1 week ago

Expires 1 month from now
