Urgent role || Hadoop Data Engineer || Charlotte, NC (Onsite)

Rahul Pandey

Jun 4, 2026, 3:46:00 PM

to Recruiting Simplifies

Hello,

This is Rahul from Quantum World Technologies, where I work as a Senior Technical Recruiter. I have an onsite job opportunity with one of our clients. If the role described below interests you, please share your resume.

Role: Hadoop Data Engineer

Location: Charlotte, NC (Onsite)

Role Summary

Programming: Python/PySpark, Scala is a plus

Big Data: Hadoop (HDFS, YARN), Hive, Spark (optimization, tuning)

Orchestration: Apache Airflow

Databases/ETL: MongoDB (indexing, sharding, tuning); SQL Server & SSIS (development, migration); strong SQL & stored procedures

Data Lake: HDFS, Hive, Parquet/ORC, partitioning, compaction

APIs: REST-based ingestion; reverse engineering & lineage tools

CI/CD & DevOps: Git, Jenkins, Docker, IaC

Monitoring: logging, metrics, lineage

Key Responsibilities

Reverse Engineering & Data Mapping

Reverse engineer ETL pipelines (SSIS, Spark, stored procedures) to document data flows, logic, and transformations.

Perform detailed source-to-target mappings with field-level transformations and business rules.

Build data dictionaries, lineage, and mapping artifacts.

Collaborate with SMEs to uncover undocumented logic.

Identify data model gaps and recommend remediation.

ETL Pipeline Remediation

Design and refactor pipelines aligned to new source APIs and data contracts.

Re-engineer ETL for 1:1 functional parity during migrations.

Implement schema evolution, transformations, and mapping changes (batch & streaming).

Eliminate redundancy and optimize legacy logic.

Build modular, reusable pipelines using Spark/PySpark/Scala.

Modernize SSIS and integrate with orchestration frameworks.

Orchestrate workflows in Airflow (DAGs, dependencies, SLAs).

Implement logging, error handling, alerting, and metadata capture.

Data Storage Optimization

Simplify schemas; remove redundant/obsolete data across Hive and MongoDB.

Optimize partitioning, clustering, and file formats (Parquet, ORC, Avro).

Redesign MongoDB indexing, sharding, and collections.

Tune HDFS, Hive, MongoDB, and SQL Server for performance and cost.

Implement lifecycle management, archival, and retention.

Functional Skills

Experience in ETL migration/remediation projects
Strong reverse engineering of legacy ETL (SSIS, Spark, scripts)
Expertise in source-to-target mapping (STM), transformation specs, and lineage artifacts
Data modeling (dimensional, normalized, denormalized)
Schema evolution and zero-downtime migrations
Performance tuning across compute and storage layers
Strong debugging and problem-solving for distributed systems

Preferred Qualifications

AI/ML-assisted ETL remediation or code conversion

Experience with Wiz or Palo Alto Prisma (APIs, data models, risk metrics)

Prior experience with Prisma-to-Wiz (or similar CSPM/CNAPP) migrations

Knowledge of CSPM/CNAPP domains (vulnerabilities, identities, exposures)

Experience in regulated, compliance-heavy environments

Thanks & Regards

Rahul Pandey

rahul....@quantumworldit.com
