Client: TCS
Role: Data Engineer Tech Lead
Location: Malvern, PA
Role Description:
Experience: 15+ Years
Skills: Java | AWS | Python | PySpark | Event-Driven Pipelines | Data Architecture
Summary:
We are seeking an experienced Data Engineer Tech Lead (15+ years) with a strong background in Java, AWS, Python, PySpark, and event-driven architectures. You will design and build scalable batch and streaming data pipelines, optimize cloud data platforms, and deliver high-quality, reliable datasets that support analytics, reporting, and machine learning workloads.
Key Responsibilities:
· Architect, build, and maintain event-driven data pipelines using AWS services such as Kinesis, MSK/Kafka, Lambda, Step Functions, SQS/SNS, and Glue/EMR.
· Develop ETL/ELT workflows using Python and PySpark, ensuring performance, scalability, and cost efficiency.
· Implement and optimize Spark-based data transformations, partitioning strategies, and data processing frameworks.
· Design and manage data lake and warehouse structures using S3, Glue Catalog, Athena, and/or Redshift.
· Build streaming solutions with checkpointing, stateful transformations, idempotency, and schema evolution.
· Ensure high standards of data quality, observability, monitoring, and alerting (CloudWatch, Datadog, etc.).
· Implement data security best practices, including IAM, encryption (KMS), networking, and governance.
· Create reusable frameworks, internal libraries, and CI/CD pipelines for automated deployments.
· Collaborate with data scientists, analysts, and business teams to deliver well-modeled, reliable datasets.
· Lead design reviews, mentor junior engineers, and contribute to engineering best practices.
Required Qualifications:
· 15+ years of professional experience in Data Engineering.
· Strong expertise in Python and PySpark for large-scale data processing.
· Advanced hands-on experience with AWS (S3, Glue, EMR, Lambda, Step Functions, Kinesis/MSK, DynamoDB, Athena, Redshift).
· Deep experience building event-driven and streaming data pipelines.
· Strong SQL experience for analytical and ETL workloads.
· Hands-on experience with workflow orchestration tools such as Airflow or Step Functions.
· Experience with CI/CD, Git, and Infrastructure-as-Code (Terraform or CloudFormation).
· Strong understanding of distributed systems, Spark performance tuning, data modeling, and cloud cost optimization.
· Knowledge of data security, encryption, networking, and compliance best practices in cloud environments.
Soft Skills:
· Strong design and architectural understanding.
· Excellent communication and stakeholder interaction skills.
· Ability to work in a globally distributed team.
Skills: Core Java, Project Management
Experience Required: 8-10