Location: Fort Worth, TX (Hybrid)
Duration: 12+ Months
*** This role may also require an in-person interview. If you are sending local candidates, please be sure they are open to onsite interviews. ***
Job Description:
• Experience in setting up and managing environments on Azure using Terraform infrastructure as code
• Experience in setting up CI/CD pipelines for applications and infrastructure using GitHub Actions and FluxCD
• Experience in on-prem to cloud migration
• Experience in containerization of applications using Kubernetes
• Experience in monitoring and alerting solutions: Azure Monitor and Datadog
• Experience in DevOps and Site Reliability Engineering
• Familiar with Azure & Cloud technologies: Azure Kubernetes Service, Azure Front Door, Azure Storage account, Azure Virtual Network, Azure Postgres, Microsoft Defender for Cloud, Kong, etc.
• Familiar with DevOps technologies: GitHub Enterprise, GitHub Actions pipelines, FluxCD, Helm Chart, Kustomize, Terraform, Datadog, Jira, Confluence, etc.
• Familiar with security and compliance practices: Microsoft Cloud Security Benchmark, Coverity, Black Duck, Wiz, etc.
We are seeking an experienced Site Reliability Engineer to lead the migration of on-prem applications to the Cloud and to maintain the Cloud applications. This is a hands-on role involving the design, coding, and implementation of Azure infrastructure and CI/CD pipelines. The role is also responsible for reliability engineering of applications, monitoring, and incident response. It requires experience in Azure Cloud, software continuous integration & delivery, infrastructure automation, and site reliability engineering.
DevOps Support for Software Development and Testing (50%)
- Set up infrastructure environments on the Cloud using an infra-as-code approach
- Migrate on-prem applications to the Cloud
- Set up and maintain CI/CD pipelines for the application
- Set up monitoring and alerting solutions for the application
- Deliver corresponding code and scripts
- Manage suppliers to ensure delivery of IT services
- Support the application DevOps lifecycle
- Perform POCs with new technology solutions recommended by architects
Site Reliability Engineering Support for Application (40%)
- Provide L3 support for production incidents
- Maintain application, infrastructure, and DevOps pipelines
- Monitor application and infrastructure
- Collaborate with developers in testing and fixing bugs
- Onboard APIs to the OneAPI/Kong gateway for the application
- Perform DB data refresh from the Production environment to the lower environments
Communication Support (10%)
- Guide developers and quality assurance engineers on how to use the software environment
- Communicate with upstream and downstream IT teams to ensure smooth application development & operations
- Provide on-call support for the application when needed
Bachelor's degree is required. Recommended majors: Computer/Information Science