Persistent / Mizuho Bank
Role: Azure Data Engineer with Databricks Expertise
Location: Iselin, NJ (Hybrid/Local candidates only)
Note: Strong Databricks experience is required; the interviewer asked only Databricks-related questions in the L1 round.
Exp: 12+ years
Job Summary:
We are seeking a highly skilled Azure Data Engineer with strong expertise in SQL, Python, data warehousing, and cloud ETL tools to join our data team. The ideal candidate will design, implement, and optimize large-scale data pipelines, ensuring scalability, reliability, and performance. This role involves working closely with multiple teams and business stakeholders to deliver cutting-edge data solutions.
Key Responsibilities:
- Data Pipeline Development:
  - Build and maintain scalable ETL/ELT pipelines using Databricks.
  - Leverage PySpark/Spark and SQL to transform and process large datasets.
  - Integrate data from multiple sources, including Azure Blob Storage, ADLS, and other relational/non-relational systems.
- Collaboration & Analysis:
  - Work closely with multiple teams to prepare data for dashboards and BI tools.
  - Collaborate with cross-functional teams to understand business requirements and deliver tailored data solutions.
- Performance & Optimization:
  - Optimize Databricks workloads for cost efficiency and performance.
  - Monitor and troubleshoot data pipelines to ensure reliability and accuracy.
- Governance & Security:
  - Implement and manage data security, access controls, and governance standards using Unity Catalog.
  - Ensure compliance with organizational and regulatory data policies.
- Deployment:
  - Leverage Databricks Asset Bundles for seamless deployment of Databricks jobs, notebooks, and configurations across environments.
  - Manage version control for Databricks artifacts and collaborate with the team to maintain development best practices.
Technical Skills:
- Strong expertise in Databricks (Delta Lake, Unity Catalog, Lakehouse architecture, table triggers, Delta Live Tables pipelines, Databricks Runtime, etc.)
- Proficiency in Azure cloud services.
- Solid understanding of Spark and PySpark for big data processing.
- Experience with relational databases.
- Knowledge of Databricks Asset Bundles and GitLab.
Preferred Experience:
- Familiarity with Databricks Runtimes and advanced configurations.
- Knowledge of streaming frameworks like Spark Streaming.
- Experience in developing real-time data solutions.
Certifications:
- Azure Data Engineer Associate or Databricks Certified Data Engineer Associate certification (optional).
Thanks and Regards,
Ashish Kumar
Senior Technical Recruiter | Sibitalent Corp
✉️ ashish...@sibitalent.com