Zain Kuwait
Site Reliability Engineer
Job Description
The Site Reliability Engineer at Zain Kuwait is responsible for maintaining the reliability, scalability, and availability of mission-critical telecommunications platforms and digital services. The role focuses on automating infrastructure, monitoring production systems, and minimizing service disruptions through proactive engineering practices. The engineer collaborates with software development, cloud, network, and operations teams to improve system performance and incident response. This position plays a key role in ensuring customers receive uninterrupted digital and telecom services.
Key Responsibilities
Track system health, availability, and application performance using monitoring tools.
Design and implement automation scripts for infrastructure and deployment processes.
Investigate and resolve production incidents within defined SLAs.
Improve platform reliability through proactive maintenance and optimization.
Collaborate with DevOps and development teams to enhance deployment pipelines.
Perform root cause analysis and implement preventive solutions.
Maintain cloud infrastructure, servers, and containerized environments.
Prepare technical documentation, operational procedures, and incident reports.
