New Position: IT Site Reliability Engineer
Location: Fort Worth, TX (Hybrid)
Duration: 12+ Months
Note: No H-1B candidates for this role
*** This role may also require an in-person interview. If you are sending local candidates, please be sure they are open to onsite interviews. ***
Job Description:
• Experience in setting up and managing environments on Azure using Terraform infrastructure as code
• Experience in setting up CI/CD pipelines for application and infrastructure using GitHub Actions and FluxCD
• Experience in on-prem to cloud migration
• Experience in containerization of applications using Kubernetes
• Experience in monitoring and alerting solutions: Azure Monitor and Datadog
• Experience in DevOps and Site Reliability Engineering
• Familiar with Azure & Cloud technologies: Azure Kubernetes Service, Azure Front Door, Azure Storage accounts, Azure Virtual Network, Azure Postgres, Microsoft Defender for Cloud, Kong, etc.
• Familiar with DevOps technologies: GitHub Enterprise, GitHub Actions pipelines, FluxCD, Helm charts, Kustomize, Terraform, Datadog, Jira, Confluence, etc.
• Familiar with security and compliance practices: Microsoft Cloud Security Benchmark, Coverity, Black Duck, Wiz, etc.
We are seeking an experienced Site Reliability Engineer to lead the migration of on-prem applications to the cloud and to maintain the cloud applications. This is a hands-on role involving the design, coding, and implementation of Azure infrastructure and CI/CD pipelines. The role is also responsible for reliability engineering of applications, monitoring, and incident response. It requires experience in Azure Cloud, software continuous integration & delivery, infrastructure automation, and site reliability engineering.
DevOps Support for Software Development and Testing
- Set up infrastructure environments in the cloud using an infrastructure-as-code approach
- Migrate on-prem applications to the Cloud
- Set up and maintain CI/CD pipelines for the application
- Set up monitoring and alerting solutions for the application
- Deliver corresponding code and scripts
- Manage suppliers to ensure delivery of IT services
- Support application DevOps lifecycle
- Perform proofs of concept (POCs) with new technology solutions recommended by architects (50%)
Site Reliability Engineering Support for Application
- Provide L3 Support for production incidents
- Maintain application, infrastructure, and DevOps pipelines
- Monitor application and infrastructure
- Collaborate with developers to test and fix bugs
- Onboard APIs to OneAPI/Kong gateway for the application
- Perform DB data refreshes from the production environment to the lower environment (40%)
Communication Support
- Guide developers and quality assurance engineers on how to use the software environment
- Communicate with upstream and downstream IT teams to ensure smooth application development & operations
- Provide on-call support for the application when needed (10%)
Bachelor's degree is required. Recommended majors: Computer Science or Information Science