Role: Site Reliability Engineer
Location: Miami, FL (5 days onsite)
Duration: Long Term Contract
Experience: 12+ Years
Must have / Required Skills:
- 12+ years of experience in IT
- Site Reliability Engineering practices.
- Good understanding of microservices, Kubernetes, Docker, AWS Cloud, Oracle/IBM/Tomcat application servers, and New Relic.
- Good understanding of business flows, customer experience, KPIs, and SLAs.
- Good understanding of logging frameworks and tools such as Elastic/OpenSearch, Logstash, and Kibana.
- Experience in troubleshooting JVM failures, JDBC connection leaks, and service integration failures.
- Experience with application monitoring tools such as New Relic.
- Experience working in the telecom domain.
- In-depth knowledge of configuring, tuning, and maintaining Java application servers and microservices on the Kubernetes platform.
- Strong understanding of the SDLC.
- Experience working on CI/CD pipelines using FlexDeploy, Jenkins, Artifactory, etc.
Job Description
- Working experience with web servers, application servers, Java Message Service (JMS queues and topics), and containerized microservices.
- Good understanding of the Kubernetes platform and service mesh tools such as Istio, NGINX, etc.
- Hands-on experience with AWS services such as EC2, ALB, NLB, and RabbitMQ.
- Responsible for application reliability and for defining SLAs, SLIs, and SLOs.
- Capacity planning, JDBC tuning, and performance tuning.
- Should be able to provide requirements for performance and chaos tests and analyze their results.
- Strong experience with and understanding of SOAP and REST web services.
- Strong log analysis skills; should be able to distinguish system errors from business errors.
- Assess and implement best practices for observability and tracing.
- Strong incident management and problem management skills.
- Working experience with the APM tool New Relic, including creating dashboards.
- Strong knowledge of load balancers, HTTP/HTTPS protocols, and networking concepts.
- Collaborate with multiple teams on incident resolution.