AI Technical Architect
Location: Richardson, TX (onsite)
Candidates are expected to work from the office 3 days a week; local candidates preferred.
Contract
Key Responsibilities
Define AI/ML reference architecture and solution blueprints (batch/streaming ML, LLM+RAG, multimodal).
Lead end‑to‑end solution design: data ingestion → feature stores → model training → inference → monitoring.
Architect LLM applications (chatbots, copilots, agents, summarization, classification) with RAG, evaluation, safety, and guardrails.
Own MLOps/LLMOps: CI/CD for models, model registry, feature store, lineage, observability, drift and cost monitoring.
Choose the right cloud and runtime (managed services vs. self‑hosted; GPU/CPU; serverless vs. containerized).
Establish security, compliance, and governance (PII handling, encryption, auditability, Responsible AI).
Collaborate with product and business stakeholders to translate requirements into architectural decisions and delivery plans.
Perform technical spikes/POCs, benchmark models and infrastructure, and lead architecture reviews.
Create and maintain standards, patterns, and reusable components; mentor engineers across teams.
Drive performance & cost optimization (throughput/latency/SLA/SLO; caching; quantization/distillation; autoscaling).
Support vendor/product evaluations (cloud AI services, vector DBs, orchestration frameworks, monitoring).
Required Qualifications
Bachelor’s or Master’s degree in Computer Science, Engineering, Data/AI, or a related field.
15+ years of overall engineering experience, including 4+ years in AI/ML solution architecture.
Proven experience designing and deploying AI systems in production at scale (LLM and/or classical ML).
Strong hands‑on proficiency in Python and cloud-native architectures (AWS/Azure/GCP).
Must‑Have Technical Skills
AI/ML & LLM Architecture
Designing LLM/RAG systems: retrieval pipelines, chunking strategies, embeddings, reranking, prompt/response orchestration, evaluation and safety.
Model lifecycle: fine‑tuning, PEFT/LoRA, quantization/distillation, latency and cost management.
Classical ML/NLP: feature engineering, model selection, training, cross‑validation, metrics, A/B testing.
MLOps / LLMOps
CI/CD for ML (model/version promotion), feature stores, model registry, lineage and drift detection.
Inference stacks: PyTorch/TensorFlow, vLLM/TGI/ONNX Runtime, GPU orchestration, autoscaling, APM.
Pipelines & orchestration: Airflow, Kubeflow, MLflow, SageMaker, Vertex AI, Azure ML.