Stack Digital

SRE Lead

📅 Date Posted

Feb 11, 2025

💼 Job Type

CONTRACTOR

📍 Location

Edinburgh

💵 Rate

£550.00 - £600.00 per day

Description

Job Title: SRE Lead
Work Location & Mode: Hybrid, 2 days in the office
Location: Edinburgh
Duration: 6 months
Start Date: ASAP
Number of Positions: 1
Day Rate: £550 - £600/day

Role Description:

We are looking for an experienced Site Reliability Engineer (SRE) to join our Everyday Banking Platform team. In this role, you will ensure the reliability, scalability, and performance of our cloud infrastructure and applications on GCP.

Your responsibilities will include:
- Automating infrastructure deployments
- Optimizing CI/CD pipelines
- Monitoring the health of the GCP environment
- Driving performance improvements
- Enhancing observability across cloud-based systems
- Managing incident resolution and cost optimization

You will collaborate closely with development and support teams to maintain high availability, resilience, and a seamless cloud experience for customers.

Key Responsibilities:

Cloud Infrastructure Automation
- Design, develop, and maintain GCP cloud infrastructure.
- Implement Infrastructure as Code (IaC) using Terraform, Ansible, etc.
- Optimize and maintain CI/CD pipelines using Jenkins, GitHub Actions, GitLab CI/CD.
- Automate operational tasks using Python and shell scripting.
- Ensure security best practices, cost optimization, and performance tuning for cloud workloads.

Site Reliability & Production Support
- Continuously monitor and ensure health checks for Google Kubernetes Engine (GKE), Compute Engine, etc.
- Implement automated health checks, capacity planning, and performance optimizations.
- Enhance observability and alerting using Google Cloud Operations Suite (Stackdriver), Dynatrace, Splunk.
- Define and maintain Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets.
- Troubleshoot and resolve production incidents with minimal downtime.
- Conduct Root Cause Analysis (RCA) for continuous improvement.

Collaboration & Operational Excellence
- Work closely with development, support, and cloud engineering teams.
- Improve monitoring, alerting, and self-healing capabilities.
- Support infrastructure upgrades, cloud migrations, and platform optimization.
- Develop and maintain playbooks, runbooks, and automation scripts.
- Ensure compliance with security policies and IAM configurations.

Key Skills, Knowledge & Experience:
- Strong expertise in GCP services: Google Kubernetes Engine (GKE), Cloud Run, Compute Engine, Pub/Sub.
- Hands-on experience with Terraform and Ansible for infrastructure automation.
- Experience in building and optimizing CI/CD pipelines using Jenkins, GitHub Actions, GitLab CI/CD.
- Knowledge of monitoring & observability tools: Google Cloud Operations Suite (Stackdriver), Dynatrace, Splunk.
- Strong background in incident response, troubleshooting, and cloud security best practices.
- Solid understanding of networking, IAM, and cloud security.
- Experience working in Agile teams, collaborating with development and operations teams to ensure system resilience and high availability.

Person Specification:
- Client-facing
- Assertive Engineering Leader
- Strong communication & team collaboration skills
