Locals Only || Senior Site Reliability Engineer || Naperville, TN


Savi Technologies LLC

Jun 26, 2026, 11:32:50 AM
to idc.recru...@gmail.com
Please share suitable profiles with me.

Role: Senior Site Reliability Engineer
Location: Naperville, TN (locals only)

Job Summary:

We are looking for a highly experienced Senior Site Reliability Engineer (SRE) / Application Reliability Engineer with AWS knowledge and 10+ years of expertise in incident management, system reliability, and enterprise application support. The role focuses on ensuring high availability, operational stability, and continuous improvement of critical financial and ERP systems in a 24×7 environment.

The ideal candidate will have strong hands-on experience in monitoring, troubleshooting, root cause analysis, and supporting cloud-based and on-prem enterprise platforms.

Key Responsibilities:
Reliability Engineering & Operations
Ensure high availability and reliability of enterprise applications in a 24×7 production environment.
Monitor applications, batch jobs, and workflows to maintain operational continuity.
Incident, Problem & Change Management
Lead and manage major incidents (P1/P2) and drive resolution to minimize business impact.
Perform root cause analysis (RCA) and implement preventive measures.
Ensure adherence to SLA/SLO and ITIL-based incident, problem, and change management processes.
Monitoring & Observability
Design and maintain monitoring dashboards.
Implement proactive alerting and improve system observability.
Troubleshooting & Support
Diagnose and resolve application and data-related issues using SQL queries and log analysis.
Provide backend validation and technical support across distributed environments.
Release & Deployment Support
Support release deployments, change validation, and post-deployment activities.
Participate in disaster recovery testing and release readiness validation.
Collaboration & Documentation
Collaborate with infrastructure, DBA, and development teams to resolve technical issues.
Create and maintain operational documentation, runbooks, and knowledge base articles.

Required Skills & Qualifications:
Core Skills
Site Reliability Engineering (SRE) and Application Support
Incident & Problem Management
Root Cause Analysis (RCA)
SLA / SLO Compliance
Batch Monitoring & Scheduling
ITIL Framework

Technical Skills
CI/CD Tools: GitHub
Cloud Platforms: AWS (EC2, S3, VPC)
Databases: Oracle, SQL Server
Languages: SQL, SQR, Basic Java
Ticketing Tools: ServiceNow, Jira
Operating Systems: UNIX, Linux, Windows

Experience:
10+ years of experience in Application Support / Reliability Engineering roles.
Strong experience in BFSI or enterprise application environments.
Proven track record in managing production support operations and high-severity incidents.

--
Thanks
