Role: Data Engineer
Location: Malvern, PA (local)
Experience: 5-7+ years required
Key Responsibilities
· Develop and Implement Data Pipelines: Design, build, and maintain robust data pipelines primarily using AWS Glue and PySpark.
· Data Sourcing and Transformation: Source data from various systems, including Redshift and Aurora, performing necessary streaming transformations and heavy data cleaning.
· Data Delivery: Push the resulting cleaned datasets into S3 buckets.
· External Integration: Manage the secure transfer of the resulting files via SFTP to an external third-party company's server, adhering to non-negotiable external integration deadlines.
· Collaboration: Work closely with the team to identify the most efficient solutions for producing the required data outputs within the constraints of the AWS Glue/PySpark environment.
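As a rough illustration of the data-cleaning responsibility described above, the sketch below shows the kind of record normalization involved. It is a simplified plain-Python stand-in: the field names (`id`, `email`) and rules are hypothetical, and in the actual role this logic would be expressed as PySpark DataFrame transformations inside an AWS Glue job rather than a Python loop.

```python
def clean_records(rows):
    """Drop rows missing required fields and normalize string values.

    Hypothetical example of the 'heavy data cleaning' step; a real
    Glue job would implement equivalent logic with PySpark
    DataFrame operations (filter, withColumn, etc.).
    """
    required = ("id", "email")
    cleaned = []
    for row in rows:
        # Skip rows where any required field is missing or empty.
        if any(not row.get(field) for field in required):
            continue
        # Trim whitespace and lowercase emails for consistency.
        cleaned.append({
            "id": row["id"],
            "email": row["email"].strip().lower(),
        })
    return cleaned
```

In a Glue/PySpark pipeline, the same filtering and normalization would run distributed across the cluster before the cleaned output is written to S3.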
Regards,
Adarsh