Job Summary
We are seeking a highly experienced Lead ML Engineer / Lead RAG Engineer to architect and deliver a production-grade Retrieval-Augmented Generation (RAG) platform. The ideal candidate has deep expertise in Python, OpenAI APIs, MongoDB Atlas Vector Search, and AWS data platforms, along with a track record of delivering enterprise-scale AI solutions.
This role leads the design and implementation of the end-to-end RAG architecture, including ingestion, indexing, retrieval, grounding, citations, tool integrations, observability, and production readiness.
Required Experience
- 10+ years of hands-on software engineering and Python development
- Proven experience leading the delivery of production-grade RAG/LLM platforms
- Strong expertise with OpenAI APIs (Embeddings, Chat Completions, Tool Calling)
- Hands-on experience with MongoDB Atlas Vector Search
- Experience integrating Amazon Aurora MySQL and Amazon DocumentDB
- Strong API development experience using FastAPI
- Experience with asynchronous processing, data pipelines, batch jobs, and event-driven architectures
- Expertise in CI/CD, testing frameworks, Docker, monitoring, and security best practices
- Experience mentoring engineers and leading architecture reviews
Key Responsibilities
- Own end-to-end RAG architecture from ingestion through generation
- Design scalable extraction pipelines from Aurora MySQL and DocumentDB
- Define and implement chunking, metadata, embedding, and indexing strategies
- Implement vector search, metadata filtering, and multi-tenant retrieval
- Integrate OpenAI models for grounded response generation and citations
- Establish Model Context Protocol (MCP)-style tool integrations with auditability and governance
- Drive production readiness, observability, security, and reliability
- Lead technical design reviews, coding standards, and mentoring initiatives
Must-Have Skills
- End-to-end RAG Architecture & LLM Orchestration – 10+ Years
- Python Backend/Data Engineering – 8+ Years
- OpenAI APIs (Embeddings, Chat Completions, Tool Calling) – 6+ Years
- MongoDB Atlas Vector Search – 6+ Years
- Amazon Aurora MySQL & Amazon DocumentDB Integration – 6+ Years
- FastAPI, Async Pipelines, CI/CD – 6+ Years
Nice to Have
- Hybrid Retrieval (Keyword + Vector)
- Reranking frameworks
- LLM evaluation frameworks (e.g., RAGAS, TruLens)
- OCR, Document Parsing, Knowledge Graphs
- AWS ECS/EKS/Lambda deployments
- Healthcare domain experience
Preferred Technologies
- Python, FastAPI
- OpenAI APIs
- MongoDB Atlas Vector Search
- Amazon Aurora MySQL
- Amazon DocumentDB
- Docker, Kubernetes, ECS/EKS
- Kafka, SQS, Celery, RQ
- Pytest, CI/CD Pipelines
- Monitoring & Observability Stack