Employment Type: Contract to Hire
Location: Remote
Experience: 12+ Years
Strong hands-on experience in Kubernetes and Golang (Go) – must be excellent in Go programming
Candidates must be comfortable writing code hands-on and highly proficient in programming
Independent consultants only (no intermediary layers, no third-party candidates)
LinkedIn ID and a copy of the candidate's driver's license are mandatory
Candidate screenshot required at the time of submission
Design and implement monitoring & alerting using CloudWatch, Grafana, Prometheus, Datadog, ELK
Maintain system reliability through SLIs, SLOs, and error budgets
Build auto-healing, scaling, and health check mechanisms
Conduct RCA and post-incident reviews
Develop automation using Terraform, Ansible, CloudFormation
Build and manage CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins)
Manage containerized workloads (Kubernetes, ECS, EKS, Lambda)
Implement blue/green deployments, canary releases, automated rollbacks
Participate in 24/7 on-call rotation
Improve MTTD & MTTR
Create and maintain runbooks & playbooks
Ensure compliance with ISO 27001, SOC 2
Manage IAM, secrets, and secure network configurations
Integrate security into CI/CD and automation workflows
Work closely with development teams for scalable & resilient architecture
Drive DevOps and SRE best practices
Contribute to platform reliability improvements
6+ years in SRE / DevOps / Infrastructure Engineering
Strong AWS experience (EC2, EKS/ECS, RDS, Lambda, S3, IAM, VPC)
Expertise in Infrastructure as Code
Experience with observability tools
Strong scripting/programming in Go (preferred), Python, Bash, or PowerShell
Solid knowledge of networking, DNS, load balancing
Strong troubleshooting & RCA skills
Certifications: AWS SysOps, CKA, SRE Foundation
Experience with chaos engineering
Knowledge of SLO/SLI & error budgets
Experience with multi-region / hybrid architectures
Background in SaaS or regulated environments (SOC 2, HIPAA, GDPR)