Start date:
05/2026
End date:
12/2026 + Option
Employment type:
Freelance
Region:
Description:
For our customer, we are looking for a Tier 3 Operations Specialist (m/f/d) Storage.
Full-time
Start: 05/2026
End: 12/2026 + Option
Objective: Provide Tier-3 operational ownership for Storage Products for Local Production (DE).
Tasks:
- Handling of complex incidents, deep troubleshooting, and root cause analysis; drive permanent fixes and preventive measures.
- Ensuring operational readiness for storage changes.
- Monitoring/alerting coverage, performance baselines, hardening, patch strategy, rollback and recovery procedures, runbooks.
- Executing and improving standard operational procedures through automation (reduce toil, improve MTTR and stability).
- Automation of standard operational tasks (capacity checks, validation procedures, provisioning workflows where applicable).
Support and Operational Readiness
Objective: Ensure operational readiness for deployments
Tasks:
- Validation of deployment artifacts from an operations perspective.
- Defining and enforcing quality assurance measures (e.g. required documentation of standard operating procedures, successful test reports, …) to ensure the high quality of delivered products and services.
- Ensuring rollback strategies and operational monitoring (observability) are in place for production deployments.
Monitoring, Incident, Problem and Change Management in the specific context of providing Compute & Operating System
Objective: Ensure operational stability and responsiveness for the managed Kubernetes platform
Tasks:
- Monitoring system health, performance metrics, and service availability across multi-tenant environments.
- Identifying, analyzing, and resolving incidents, minimizing service disruption.
- Triggering root cause analysis and implementation of corrective and preventive actions.
Automation of operations-critical standard processes
Objective: Reduce operational toil and improve service reliability
Tasks:
- Addressing recurring operational issues by automating remedial standard operations processes.
- Validating all automated procedures following the established software development lifecycle, including staging, testing, and validation reviews.
Security and Compliance Enforcement
Objective: Ensure platform operations adhere to security and compliance standards
Tasks:
- Implementing monitoring and logging strategies to support audit and compliance requirements.
- Performing routine security scans and remediating identified vulnerabilities.
Profile Requirements
The contractor must be a senior-level professional with proven experience in operations management of private cloud solutions and proficiency in managing storage operations on the platform, with the following experience:
Must-have experience
- 5+ years in IT storage operations / service delivery / platform operations with demonstrated leadership in mission-critical environments.
- Proven experience implementing/leading Incident, Problem, Change, Release governance in production.
- Experience supporting platform workloads that rely on shared storage services.
- Expertise with storage types: File Storage, Block Storage, Object Storage.
- Expertise with protocols/services: NFS; object storage operations (S3-like concepts).
- Experience with Kubernetes storage integration: CSI driver concepts and troubleshooting (PV/PVC lifecycle understanding).
- Virtualization (Storage): Experience operating storage virtualization in enterprise environments.
- Expertise within ITSM: Jira Service Management (JSM), Jira, Confluence.
- Fundamental understanding of core operations processes (incident management, change management, problem management, IT Service Management) as well as SRE concepts.
- Experience in gathering operational insights from monitoring or observability, including SLI/SLA/SLO management and tracking.
- Hands-on experience in documenting procedures properly and enforcing clear runbooks or playbooks.
- Observability: hands-on experience with monitoring and logging tools (e.g., Prometheus, Grafana, Datadog, Mimir, Loki).
- Familiarity with enterprise DevOps toolchains is a plus (GitLab, JFrog Artifactory, Backstage, Harness).
- Strong understanding of modern platform operations (Kubernetes/containers, automation, observability), sufficient to govern specialists.
- Platform delivery concepts: GitOps and IaC awareness (Terraform/OpenTofu, ArgoCD, Helm) to govern deployment/readiness standards.
Must-have language skills:
- Spoken and written proficiency in English (at least C1).
- Spoken and written proficiency in German (at least C1).
Preferred experience
- Experience operating in regulated / high-availability industries (banking, telco, public sector, healthcare).
- Experience with SRE practices (SLOs/SLIs, error budgets) and reliability management.
- Experience operating storage services that integrate with Kubernetes platforms.
- Familiarity with IaC-based provisioning and GitOps-driven operational patterns.