C2C-Senior / Senior SRE Engineer

Madhu R

Oct 9, 2025, 12:33:18 PM
to c2c-requ...@googlegroups.com
Job Title: IT Applications Solutions Architect – Senior / Senior SRE Engineer
Location: Washington (local candidates only); hybrid, 3 days onsite per week
Duration: 12+ Months

Visa: H1B only; no OPT


Travel Requirement:
25% Travel required. Must be local to Amtrak’s Northeast Corridor, specifically near Wilmington, DE, NYC, D.C., or Philadelphia.
Train travel is required to support local workshops and engage with employees on premises and on trains.
The opportunity also requires in-person participation in quarterly PI Planning activities, which run 2-3 days in or near the Northeast Corridor.
 

Key Responsibilities:
  1. Deployment & Automation
    - Implement CI/CD pipelines using tools such as GitHub Actions, AWS CodePipeline, and Jenkins.
    - Automate infrastructure provisioning through Infrastructure-as-Code (IaC) using Terraform, CloudFormation, or AWS CDK.
    - Design and develop automation scripts and self-service tools to enhance operational efficiency.
  2. Capacity Planning & Performance
    - Develop capacity models and forecasting systems.
    - Lead cost optimization initiatives across services.
    - Design and execute resiliency and performance testing frameworks.
    - Configure and maintain auto-scaling policies and thresholds.
  3. Incident Management & Response
    - Proficient in the ITIL framework and ITSM tools (ServiceNow preferred).
    - Serve as a production on-call responder, leading during critical service outages and orchestrating disaster recovery failover activities.
    - Facilitate post-mortem meetings and drive improvement patterns.
    - Develop RCA documentation and knowledge articles.
    - Apply SRE principles, including SLIs, SLOs, and error budgets.
  4. Leveraging Observability Tools
    - Proven experience in leveraging observability tools such as Dynatrace, AppDynamics, or ELK; Dynatrace experience is strongly preferred.
    - Define, monitor, and enforce SLOs and error budgets to align reliability targets with business outcomes.
    - Leverage distributed tracing and context propagation to identify performance bottlenecks and root causes of failures.
    - Extensive experience with APM tools.
    - Design and implement custom dashboards and anomaly detectors to generate actionable insights for performance tuning and capacity planning.
    - Develop self-healing mechanisms based on observability data to reduce manual intervention and enhance system resilience.
  5. Security & Compliance Implementation
    - Lead security incident investigations and execute remediation plans.
    - Design automated compliance validation.
    - Develop security automation frameworks.
    - Implement zero-trust architecture patterns.
Education & Experience
- Bachelor’s degree in Computer Science, Engineering, or related field
- 5 to 8 years of experience in DevOps, SRE, or Platform Engineering
- 3+ years in high-availability production environments
- Proven track record of leading technical initiatives
- Expert-level knowledge of at least one cloud platform (AWS preferred)
- Deep expertise in cloud architecture, networking, and services
- Proficiency in multiple programming languages (Python, Go, Java)
- Ability to influence without authority across teams
- Knowledge of relational, cloud, and NoSQL databases
- Strong leadership and mentoring capabilities
- Excellent technical writing and documentation skills
- Availability to work outside of standard business hours as required



Thanks & Regards
Madhu