Tier 3 Operations Specialist (m/f/d) Storage

Start date:

05/2026

End date:

12/2026 + Option

Employment type:

Freelance

Region:


Description:

For our customer, we are looking for a Tier 3 Operations Specialist (m/f/d) Storage.

 

Full-time

Start: 05/2026

End: 12/2026 + Option

 

Objective: Provide Tier-3 operational ownership for Storage Products for Local Production (DE).

Tasks:

- Handling of complex incidents, deep troubleshooting, and root cause analysis; drive permanent fixes and preventive measures.

- Ensuring operational readiness for storage changes.

- Monitoring/alerting coverage, performance baselines, hardening, patch strategy, rollback and recovery procedures, runbooks.

- Executing and improving standard operational procedures through automation (reduce toil, improve MTTR and stability).

- Automation of standard operational tasks (capacity checks, validation procedures, provisioning workflows where applicable).

 

Support and Operational Readiness

Objective: Ensure operational readiness for deployments.

Tasks:

- Validation of deployment artifacts from an operations perspective.

- Defining and enforcing quality assurance measures (e.g. required documentation of standard operating procedures, successful test reports, …) to ensure the high quality of delivered products and services.

- Ensuring rollback strategies and operational monitoring (observability) are in place for production deployments.

 

Monitoring, Incident, Problem and Change Management in the specific context of providing Compute & Operating System

Objective: Ensure operational stability and responsiveness for the managed Kubernetes platform

Tasks:

- Monitoring system health, performance metrics, and service availability across multi-tenant environments.

- Identifying, analyzing, and resolving incidents, minimizing service disruption.

- Triggering root cause analysis and implementation of corrective and preventive actions.

 

Automation of operations-critical standard processes

Objective: Reduce operational toil and improve service reliability

Tasks:

- Addressing recurring operational issues by automating remedial standard operations processes.

- Validating all automated procedures following the established software development lifecycle, including staging, testing, and validation reviews.

 

Security and Compliance Enforcement

Objective: Ensure platform operations adhere to security and compliance standards

Tasks:

- Implementing monitoring and logging strategies to support audit and compliance requirements.

- Performing routine security scans and remediating identified vulnerabilities.

 

Profile Requirements

The contractor must be a senior-level professional with proven experience in operations management of private cloud solutions and proficiency in managing storage operations on the platform, with the following experience:

 

Must-have experience

- 5+ years in IT storage operations / service delivery / platform operations with demonstrated leadership in mission-critical environments.

- Proven experience implementing/leading Incident, Problem, Change, and Release governance in production.

- Experience supporting platform workloads that rely on shared storage services.

- Expertise with storage types: File Storage, Block Storage, Object Storage.

- Expertise with protocols/services: NFS; object storage operations (S3-like concepts).

- Experience with Kubernetes storage integration: CSI driver concepts and troubleshooting (understanding of the PV/PVC lifecycle).

- Virtualization (Storage): Experience operating storage virtualization in enterprise environments.

- Expertise within ITSM: Jira Service Management (JSM), Jira, Confluence.

- Fundamental understanding of core operations processes (incident management, change management, problem management, IT Service Management) as well as SRE concepts.

- Experience in gathering operational insights from monitoring or observability including SLI/SLA/SLO management and tracking.

- Hands-on experience in documenting procedures properly and enforcing clear runbooks or playbooks.

- Observability: hands-on experience with monitoring and logging tools (e.g., Prometheus, Grafana, Datadog, Mimir, Loki).

- Familiarity with enterprise DevOps toolchains is a plus (GitLab, JFrog Artifactory, Backstage, Harness).

- Strong understanding of modern platform operations (Kubernetes/containers, automation, observability), sufficient to govern specialists.

- Platform delivery concepts: GitOps and IaC awareness (Terraform/OpenTofu, ArgoCD, Helm) to govern deployment/readiness standards.

 

Must-have language skills:

- Proficiency in both speech and writing in English (at least C1).

- Proficiency in both speech and writing in German (at least C1).

 

Preferred experience

- Experience operating in regulated / high-availability industries (banking, telco, public sector, healthcare).

- Experience with SRE practices (SLOs/SLIs, error budgets) and reliability management.

- Experience operating storage services that integrate with Kubernetes platforms.

- Familiarity with IaC-based provisioning and GitOps-driven operational patterns.