Hello,
We are actively looking for a Data Engineer – AI Systems. If you or your consultant are open to a new role, please share your profile.
Role: Data Engineer – AI Systems
Duration: 6+ Months
Location: St. Louis, Missouri (Onsite)
Data Engineer – AI Systems (Databricks)
We’re building intelligent, Databricks-powered AI systems that structure and activate information from diverse enterprise sources (Confluence, OneDrive, PDFs, and more). As a Data Engineer, you’ll design and optimize the data pipelines that transform raw and unstructured content into clean, AI-ready datasets for machine learning and generative AI agents.
You’ll collaborate with a cross-functional team of Machine Learning Engineers, Software Developers, and domain experts to create high-quality data foundations that power Databricks-native AI agents and retrieval systems.
Key Responsibilities
- Develop Scalable Pipelines: Design, build, and maintain high-performance ETL and ELT workflows using Databricks, PySpark, and Delta Lake.
- Data Integration: Build APIs and connectors to ingest data from collaboration platforms such as Confluence, OneDrive, and other enterprise systems.
- Unstructured Data Handling: Implement extraction and transformation pipelines for text, PDFs, and scanned documents using Databricks OCR and related tools.
- Data Modeling: Design Delta Lake and Unity Catalog data models for both structured and vectorized (embedding-based) data stores.
- Data Quality & Observability: Apply validation, version control, and quality checks to ensure pipeline reliability and data accuracy.
- Collaboration: Work closely with ML Engineers to prepare datasets for LLM fine-tuning and vector database creation, and with Software Engineers to deliver end-to-end data services.
- Performance & Automation: Optimize workflows for scale and automation, leveraging Databricks Jobs, Workflows, and CI/CD best practices.
What You Bring
- Experience with data engineering, ETL development, or data pipeline automation.
- Proficiency in Python, SQL, and PySpark.
- Hands-on experience with Databricks, Spark, and Delta Lake.
- Familiarity with data APIs, JSON, and unstructured data processing (OCR, text extraction).
- Understanding of data versioning, schema evolution, and data lineage concepts.
- Interest in AI/ML data pipelines, vector databases, and intelligent data systems.
Bonus Skills
- Experience with vector databases (e.g., Pinecone, Chroma, FAISS) or Databricks Vector Search.
- Exposure to LLM-based architectures, LangChain, or Databricks Mosaic AI.
- Knowledge of data governance frameworks, Unity Catalog, or access control best practices.
- Familiarity with REST API development or data synchronization services (e.g., Airbyte, Fivetran, custom connectors).
Thank you,
Satti Reddy