Hello,
This is Rahul from Quantum World Technologies, where I work as a Senior Technical Recruiter. I have an onsite job opportunity with one of our clients. If the role described below interests you, please share your resume.
Role: Hadoop Data Engineer
Location: Charlotte, NC (Onsite)
Role Summary
Programming: Python/PySpark, Scala is a plus
Big Data: Hadoop (HDFS, YARN), Hive, Spark (optimization, tuning)
Orchestration: Apache Airflow
Databases/ETL: MongoDB (indexing, sharding, tuning); SQL Server & SSIS (development, migration); strong SQL & stored procedures
Data Lake: HDFS, Hive, Parquet/ORC, partitioning, compaction
APIs: REST-based ingestion; reverse engineering & lineage tools
CI/CD & DevOps: Git, Jenkins, Docker, IaC
Monitoring: logging, metrics, lineage
Key Responsibilities
Reverse Engineering & Data Mapping
Reverse engineer ETL pipelines (SSIS, Spark, stored procedures) to document data flows, logic, and transformations.
Perform detailed source-to-target mappings with field-level transformations and business rules.
Build data dictionaries, lineage, and mapping artifacts.
Collaborate with SMEs to uncover undocumented logic.
Identify data model gaps and recommend remediation.
ETL Pipeline Remediation
Design and refactor pipelines aligned to new source APIs and data contracts.
Re-engineer ETL for 1:1 functional parity during migrations.
Implement schema evolution, transformations, and mapping changes (batch & streaming).
Eliminate redundancy and optimize legacy logic.
Build modular, reusable pipelines using Spark/PySpark/Scala.
Modernize SSIS and integrate with orchestration frameworks.
Orchestrate workflows in Airflow (DAGs, dependencies, SLAs).
Implement logging, error handling, alerting, and metadata capture.
Data Storage Optimization
Simplify schemas; remove redundant/obsolete data across Hive and MongoDB.
Optimize partitioning, clustering, and file formats (Parquet, ORC, Avro).
Redesign MongoDB indexing, sharding, and collections.
Tune HDFS, Hive, MongoDB, and SQL Server for performance and cost.
Implement lifecycle management, archival, and retention.
Functional Skills
Preferred Qualifications
AI/ML-assisted ETL remediation or code conversion
Experience with Wiz or Palo Alto Prisma (APIs, data models, risk metrics)
Prior Prisma to Wiz (or similar CSPM/CNAPP) migrations
Knowledge of CSPM/CNAPP domains (vulnerabilities, identities, exposures)
Experience in regulated, compliance-heavy environments
Thanks & Regards,
Rahul Pandey