Urgent Role: Databricks Data Engineer with DevOps, Los Angeles, CA (Hybrid)

Yogesh Singh

Mar 2, 2026, 4:41:17 PM

to Yogesh Singh

Hi All

Please find the updated JD below. Please share H1B/H4 profiles for the hybrid role below with the client, Persistent Systems.

Must have 11+ years of experience. AWS Data Engineer (no Azure preference).

Job Title: Certified AWS Databricks Data Engineer with DevOps Skills (AWS Cloud)

Location: Los Angeles, CA (Hybrid)

Rate: $65/hr on C2C (sorry, no flexibility)

Job Summary

Must have skills:

We are looking for an experienced Databricks Data Engineer with strong DevOps expertise to join our data engineering team. The ideal candidate will design, build, and optimize large-scale pipelines on the Databricks Lakehouse Platform on AWS, while driving automated CI/CD and deployment practices. This role requires strong skills in PySpark, SQL, AWS cloud services, and modern DevOps tooling. You will collaborate closely with cross-functional teams to deliver scalable, secure, and high-performance data solutions.

Must Demonstrate (Critical Skills & Architectural Competencies)

Designing and implementing Databricks-based Lakehouse architectures on AWS
Clear separation of compute vs. serving layers
Ability to design low-latency data/API access strategies (beyond Spark-only patterns)
Strong understanding of caching strategies for performance and cost optimization
Data partitioning, storage optimization, and file layout strategy
Ability to handle multi-terabyte structured or time-series datasets
Skill in probing requirements to identify what matters architecturally
A player-coach mindset: hands-on engineering + technical leadership

Key Responsibilities

1. Data Pipeline Development

Design, build, and maintain scalable ETL/ELT pipelines using Databricks on AWS.
Develop high-performance data processing workflows using PySpark/Spark and SQL.
Integrate data from Amazon S3, relational databases, and semi-structured or unstructured sources.
Implement Delta Lake best practices, including schema evolution, ACID transactions, OPTIMIZE, ZORDER, partitioning, and file-size tuning.
Ensure architectures support high-volume, multi-terabyte workloads.

2. DevOps & CI/CD

Implement CI/CD pipelines for Databricks using Git, GitLab, GitHub Actions, or AWS-native tools.
Build and manage automated deployments using Databricks Asset Bundles.
Manage version control for notebooks, workflows, libraries, and environment configuration.
Automate cluster policies, job creation, environment provisioning, and configuration management.
Support infrastructure-as-code via Terraform (preferred) or CloudFormation.

3. Collaboration & Business Support

Work with data analysts and BI teams to prepare curated datasets for reporting and analytics.
Collaborate closely with product owners, engineering teams, and business partners to translate requirements into scalable implementations.
Document data flows, technical architecture, and DevOps/deployment workflows.

4. Performance & Optimization

Tune Spark clusters, workflows, and queries for cost efficiency and compute performance.
Monitor pipelines, troubleshoot failures, and maintain high reliability.
Implement logging, monitoring, and observability across workflows and jobs.
Apply caching strategies and workload optimization techniques to support low-latency consumption patterns.

5. Governance & Security

Implement and maintain data governance using Unity Catalog.
Enforce access controls, security policies, and data compliance requirements.
Ensure lineage, quality checks, and auditability across data flows.

Technical Skills

Strong hands-on experience with Databricks, including:

Delta Lake
Unity Catalog
Lakehouse Architecture
Delta Live Tables pipelines
Databricks Runtime
Table Triggers
Databricks Workflows

Proficiency in PySpark, Spark, and advanced SQL.
Expertise with AWS cloud services, including:

S3
IAM
Glue / Glue Catalog
Lambda
Kinesis (optional but beneficial)
Secrets Manager

Strong understanding of DevOps tools:

Git / GitLab
CI/CD pipelines
Databricks Asset Bundles

Familiarity with Terraform is a plus.
Experience with relational databases and data warehouse concepts.

Preferred Experience

Knowledge of streaming technologies like Structured Streaming/Spark Streaming.
Experience building real-time or near real-time pipelines.
Exposure to advanced Databricks runtime configurations and performance tuning.

Certifications (Optional)

Databricks Certified Data Engineer Associate / Professional
AWS Data Engineer or AWS Solutions Architect certification

Thanks

Yogesh Pratap Singh
