Job Title: Sr. Datadog Architect
Duration: 6-month contract
100% Remote
Job Description:
Architecture & Design
- Design end-to-end observability architecture using Datadog across Azure cloud, containers, Kubernetes, and on-prem workloads.
- Define monitoring standards, SLIs/SLOs, dashboards, alerting strategy, and tagging governance.
- Design and architect an end-to-end solution to integrate mainframe platforms.
- Architect log ingestion pipelines, retention policies, and cost-optimized indexing strategies.
- Build scalable APM instrumentation patterns for microservices, serverless, and distributed environments.
Implementation & Optimization
- Deploy Datadog agents, integrations, and custom checks across large-scale infrastructure.
- Configure APM, RUM, Logs, SIEM, Synthetics, Network Performance Monitoring, and CI/CD Observability.
- Work closely with DevOps, SRE, Cloud, and Application teams to instrument services and ensure visibility.
- Analyze and optimize Datadog costs: usage, retention settings, indexing, and billing insights.
Governance & Best Practices
- Establish organization-wide tagging standards, dashboards, alerting guardrails, and onboarding processes.
- Create reusable templates, Terraform modules, and automation scripts for Datadog deployment.
- Ensure compliance with security and observability best practices.
- Mentor teams on Datadog usage, training engineers on dashboards, logs, traces, and alerts.
Troubleshooting & Insights
- Lead RCA investigations using Datadog metrics, traces, logs, and correlated events.
- Collaborate with engineering teams to improve system reliability, resilience, and performance.
- Identify gaps in observability and propose improvements across the stack.
Required Skills & Experience
- 12 years of experience in Observability, Monitoring, SRE, DevOps, or Cloud Engineering.
- 6+ years of hands-on experience with Datadog.
- Strong understanding of distributed systems, microservices, and cloud-native architectures.
- Expertise with Kubernetes, Docker, and AWS/Azure/GCP cloud services.
- Experience with Infrastructure as Code (Terraform preferred).
- Strong knowledge of APM, Metrics, Logs, RUM, Synthetics, and Security Monitoring.
- Deep experience with Datadog dashboards, alerting, monitors, service maps, event correlation, and notebooks.
- Proficiency with Python, Bash, or similar scripting languages.
- Strong analytical, communication, and problem-solving skills.
Preferred Qualifications
- Datadog Certifications (Datadog Fundamentals, APM, Log Management, or Observability).
- Experience with observability tools in the retail domain.
- CI/CD observability experience (GitHub Actions, Jenkins, GitLab CI, etc.).
- Background in Performance Engineering, Reliability Engineering, or Platform Engineering.
Thank you,
For any suggestions or feedback, please contact my manager: Manish Bisht (mab...@eteaminc.com)