Job Title: Azure Databricks Consultant (with RAG Experience)
Location: Richardson, TX (Hybrid)
Work Authorization: US Citizens or Green Card Holders Only
Duration: 12+ Months (Contract)
Position Overview
We are seeking a highly skilled Azure Databricks Consultant with
strong expertise in PySpark, Kafka, Azure Cosmos DB, and hands-on
experience implementing RAG (Retrieval-Augmented Generation) pipelines.
The ideal candidate will architect, build, and optimize cloud-based data
engineering solutions that support enterprise-scale analytics and AI-driven
applications. This role requires deep technical capability in streaming
ingestion, ETL development, and big data system integration within the Azure
ecosystem.
Key Responsibilities
- Design, develop, and optimize scalable ETL and data pipelines in Azure Databricks using PySpark.
- Ingest, process, and transform high-volume streaming data from Kafka for real-time and batch analytics use cases.
- Build integrations and high-performance data flows that write to Azure Cosmos DB, ensuring optimized query and storage patterns.
- Implement and support RAG-based data processing flows, including vector creation, embeddings, retrieval optimization, and integration with LLM pipelines.
- Collaborate with engineering, data science, and analytics teams to gather requirements and deliver robust, scalable data solutions.
- Monitor, tune, and manage Databricks clusters to ensure high performance, cost efficiency, and operational reliability.
- Troubleshoot pipeline issues, improve workflow efficiency, and enforce data quality and governance standards.
- Document technical solutions, provide knowledge transfer, and support production deployments.
Required Skills & Experience
- Strong, hands-on experience with Azure Databricks and PySpark for large-scale data engineering and ETL pipelines.
- Expertise with Kafka for streaming ingestion, data processing, and pipeline reliability.
- Proficiency with Azure Cosmos DB, including data modeling, indexing, tuning, and integration with distributed systems.
- Experience implementing RAG (Retrieval-Augmented Generation) workflows: vector storage, embedding pipelines, retrieval optimization, and integration with LLM-based applications.
- Solid understanding of the Azure cloud ecosystem, including networking, storage, compute, and hybrid environments.
- Ability to work in cross-functional teams, troubleshoot complex data issues, and deliver high-quality solutions in fast-paced environments.