Data Engineer with experience in Databricks, Delta Lake, ETL, Spark, Dataiku, data migration, SAP (e.g., SAP HANA, SAP BW), SharePoint, Python, and PySpark; Databricks Certification required.
Location: Remote
Number of Positions: 2-3
Key Responsibilities:
Data Migration:
• Lead the end-to-end migration of data from SAP and SharePoint to Databricks Delta Lake.
• Develop and implement ETL (Extract, Transform, Load) pipelines to move large volumes of data into Databricks.
• Ensure data quality, accuracy, and integrity throughout the migration, within the client’s data policy framework and enterprise architecture.
• Design and manage robust data models and architectures within Databricks Delta Lake.
Integration:
• Work closely with data engineers and data architects to integrate Databricks Delta Lake data into Dataiku for downstream data processing and analytics.
• Build scalable solutions that facilitate seamless data integration across the enterprise.
• Collaborate with cross-functional teams to develop integration strategies that align with business goals.
Optimization and Performance:
• Optimize Databricks clusters for performance and scalability.
• Perform regular data validation and error checking to ensure the successful migration and integration of data.
• Monitor system performance and troubleshoot any issues that arise during migration and integration phases.
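One common form the validation bullet above takes is source-vs-target reconciliation after each migrated batch. A minimal plain-Python sketch (record layout and values hypothetical):

```python
import hashlib

def record_digest(records):
    """Order-independent digest over stringified records."""
    h = hashlib.sha256()
    for rec in sorted(",".join(map(str, r)) for r in records):
        h.update(rec.encode("utf-8"))
    return h.hexdigest()

def reconcile(source, target):
    """Return validation results for a migrated batch."""
    return {
        "row_count_match": len(source) == len(target),
        "checksum_match": record_digest(source) == record_digest(target),
    }

source = [("0001", "Alice"), ("0002", "Bob")]
target = [("0002", "Bob"), ("0001", "Alice")]  # same rows, different order
result = reconcile(source, target)  # both checks pass
```

In practice the same checks would run as Spark aggregations over the source extract and the Delta target rather than over in-memory lists.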
Collaboration:
• Work with SAP and SharePoint subject matter experts to understand source data structures and develop migration strategies.
• Collaborate with data architects, developers, and business stakeholders to ensure that solutions meet business and technical requirements.
• Provide technical guidance and mentorship to junior team members.
Key Qualifications:
• Databricks Certification is a must.
Technical Expertise:
• 4+ years of experience working with Databricks and Delta Lake.
• Strong experience with data migration processes, particularly from SAP and SharePoint.
• Expertise in building and managing ETL pipelines.
• Proficiency in Spark for data processing in Databricks.
• Familiarity with Dataiku or similar data platforms is a strong advantage.
SAP and SharePoint Knowledge:
• Experience in extracting data from SAP (e.g., SAP HANA, SAP BW) and SharePoint.
• Strong understanding of SAP and SharePoint data models and structures.
Programming and Data Management:
• Proficiency in Python, SQL, and PySpark for data manipulation and transformation.
• Familiarity with cloud-based data platforms (e.g., AWS, Azure, or GCP) and working within cloud environments.
• Understanding of data warehousing concepts and experience with large-scale data migration projects.
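As a small stand-alone illustration of combining SQL and Python for data manipulation, the sketch below uses SQLite in place of a warehouse (table name and data hypothetical; on Databricks this would be Spark SQL over Delta tables):

```python
import sqlite3

# In-memory database standing in for a warehouse table (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "EU", 100.0), (2, "EU", 50.0), (3, "US", 75.0)],
)

# SQL does the aggregation; Python shapes the result for downstream use.
totals = {
    region: total
    for region, total in conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region"
    )
}
conn.close()
```

The split shown (set-based work in SQL, result shaping in Python) is the same pattern used with `spark.sql(...)` and PySpark DataFrames.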
Problem Solving & Communication:
• Excellent problem-solving skills and the ability to troubleshoot complex data migration and integration issues.
• Strong communication skills to collaborate with both technical and non-technical stakeholders.
Preferred Qualifications:
• Experience with CI/CD pipelines and automation in Databricks environments.
• Knowledge of Delta Live Tables for streaming and incremental data processing in Databricks.
• Familiarity with data governance and data security best practices.