Hi All,
Job Summary: PySpark / Python Data Engineer – Tech Lead
Location: Remote
Rate: $65/hr
We are seeking a seasoned professional with strong expertise in building
robust data engineering and analytics solutions within AWS ecosystems.
The ideal candidate should have deep experience in PySpark, Python, and
end‑to‑end data pipeline development, including job orchestration,
workflow design, and data mapping. The role requires the ability to
translate complex business logic, stored procedures, SQL triggers, and
source‑system rules into scalable PySpark implementations in a cloud
environment. Experience with data streaming on Spark clusters and API
design is highly desirable. Knowledge of Palantir Foundry is a strong plus.
Domain: Experience in Energy and Utility projects is a significant
advantage.
Time Zone: Should be able to work PST hours.
Key Responsibilities
• Translate business requirements into technical solutions using PySpark
and Python frameworks.
• Lead data engineering initiatives addressing moderately complex to
highly complex data and analytics challenges.
• Plan and execute tasks to meet shared objectives, maintain progress
tracking, and document work following best practices.
• Identify and implement internal process improvements, including
scalable infrastructure design, optimized data distribution, and
automation of manual workflows.
• Participate actively in Agile/Scrum ceremonies such as stand‑ups,
sprint planning, and retrospectives.
• Contribute to the evolution of data systems and architecture,
recommending enhancements to pipelines and frameworks.
• Provide technical guidance to team members on complex challenges
spanning multiple functional and technical domains.
• Build infrastructure that supports large‑scale data access and
analysis, ensuring data quality and proper metadata management.
• Collaborate with leadership to strengthen data‑driven decision‑making
through demos, mentorship, and best‑practice sharing.
• Exposure to front‑end reporting tools such as Power BI or Tableau is a
plus.
Minimum Qualifications
• MS or equivalent experience in Computer Science, MIS, or related
technical fields.
• 10–15 or more years of overall experience, including 5+ years in data
engineering/ETL ecosystems using PySpark, Python, and Java.
Required Skills
• Strong expertise in PySpark and Python.
• Experience with Pandas, APIs, and Spark Streaming.
• Solid understanding of database design fundamentals.
• Familiarity with CI/CD tools and infrastructure‑as‑code frameworks.
• Experience writing production‑grade code.
• Experience with unit tests, integration tests, schema validations, and
health checks.
• Knowledge of Palantir Foundry (Ontology modeling, API configuration,
Foundry Typescript) is a strong plus.