Senior Operations Engineer (m/f/d) Kubernetes/ArgoCD, Remote and Berlin

Start date:

18.08.2025

End date:

3 months + extension option

Employment type:

Freelance

Region:

Remote and Berlin


Description:

For our client in Berlin, we are looking for a Senior Operations Engineer (m/f/d) Kubernetes/ArgoCD, starting 18.08., for an expected duration of 3 months with the option of an extension.

 

Your tasks:

 

- Validation of deployment artifacts from an operations perspective.

- Defining and enforcing quality assurance measures (e.g. required documentation of standard operating procedures, successful test reports, …) to ensure the high quality of delivered products and services.

- Ensuring rollback strategies and operational monitoring (observability) are in place for production deployments.

- Monitoring system health, performance metrics, and service availability across multi-tenant environments.

- Identifying, analyzing, and resolving incidents, minimizing service disruption.

- Triggering root cause analysis and the implementation of corrective and preventive actions.

- Addressing recurring operational issues by automating remedial standard operating procedures.

- Validating all automated procedures following the established software development lifecycle, including staging, testing, and validation reviews.

- Implementing monitoring and logging strategies to support audit and compliance requirements.

- Performing routine security scans and remediating identified vulnerabilities.

 

Your qualifications:

 

- At least 5 years of operational experience with self-managed Kubernetes clusters, self-managed services providing Kubernetes clusters, and production applications or systems in on-premises environments.

- Deep understanding of networking concepts, including protocols, load balancing, and security.

- Profound knowledge of and implementation experience with CI/CD processes, tooling (e.g. GitLab, Jenkins, Tekton, Argo Workflows, and Argo CD), concepts, and the associated quality and security assurance for software delivery.

- Fundamental understanding of core operations processes (incident management, change management, problem management, IT Service Management) as well as SRE concepts.

- Experience in gathering operational insights from monitoring or observability, including SLI/SLA/SLO management and tracking.

- Hands-on experience in documenting procedures properly and enforcing clear runbooks or playbooks.

- Hands-on experience with monitoring and logging tools (e.g., Prometheus, Grafana, Datadog).

 

Must-have language skills:

- Proficiency in spoken and written English (at least C1).

 

Preferred experience:

- Project experience in software engineering (in Go, C/C++, or Python) with significant experience in building RESTful services in distributed environments.