Cloud Architect with Expertise in Generative AI - Senior GenAI/Agentic Lead
Location: Santa Clara (onsite from day one; local candidates only)
Contract
We are seeking a highly skilled Cloud Architect with expertise in Generative AI,
Copilot Studio, and multi‑cloud platforms spanning Azure (including Azure AI
Foundry), AWS, and Google Cloud. In this role, you will design scalable, secure,
and production‑ready AI systems that enable RAG, agentic workflows, and
enterprise copilots.
Core Responsibilities:
- Architect end‑to‑end Generative AI solutions, including model serving (vLLM, TGI), API integration, and user interaction layers.
- Design and implement RAG architectures using vector stores, embeddings, hybrid search, and re‑ranking to ground LLM responses in enterprise knowledge.
- Create agentic systems, enabling multi‑agent collaboration for complex, stateful workflows and reasoning‑driven automation.
- Develop and govern Copilots in Copilot Studio, including connectors, actions, plugins, DLP rules, environment strategy, and integration with Microsoft 365 and enterprise systems.
- Leverage Azure AI Foundry (prompt flow, evaluators, safety, model orchestration) to operationalize LLM applications at scale.
- Evaluate and optimize AI system performance, balancing quality, latency, throughput, cost efficiency, and safety compliance.
- Implement Responsible AI, security, and HITL (Human‑in‑the‑Loop) controls, ensuring compliance in regulated environments.
- Produce clear, maintainable documentation for architecture, patterns, and operational processes.
Required Qualifications:
- 8–10 years of experience in cloud architecture or enterprise software engineering.
- 3+ years of hands‑on experience designing or delivering Generative AI or LLM applications.
- Proven experience with Azure AI Foundry, Azure OpenAI, and Copilot Studio (actions, connectors, governance, M365 integration).
- Experience deploying AI solutions on AWS (Bedrock, SageMaker) and/or GCP (Vertex AI).
- Hands‑on experience with RAG, vector databases (Azure AI Search, Pinecone, OpenSearch, Vertex Matching Engine), embeddings, and hybrid search.
- Deep understanding of cloud security (IAM/RBAC, Key Vault/KMS, VPC/PrivateLink, token safety).
- Experience with Kubernetes (AKS/EKS/GKE), containerization, and API frameworks (FastAPI, Node.js, .NET), with proficiency in Python, TypeScript, or C#.
- Working knowledge of transformer architectures and model adaptation techniques (fine‑tuning, LoRA, prompt engineering).
- Familiarity with AIOps/MLOps tools such as Prompt Flow, MLflow, SageMaker Pipelines, or Vertex Pipelines.
Preferred Qualifications:
- Experience implementing agent‑based systems using frameworks like LangChain, LlamaIndex, Semantic Kernel, or AutoGen.
- Background working with enterprise data ecosystems (Databricks, Snowflake, BigQuery, Redshift).
- Knowledge of Responsible AI frameworks, guardrails, safety filters, PII redaction, and evaluation methodologies.
- Experience in regulated industries (healthcare, finance, government), with understanding of compliance controls.
- Experience with observability (OpenTelemetry, Prometheus/Grafana, App Insights) for AI workloads.
Education:
- Bachelor’s or Master’s degree in Computer Science, Engineering, Information Systems, Data Science, or a related field (required).