New role :: C2C is workable :: Cloud Infrastructure Engineer at Charlotte, NC (5 Days onsite)

Hitesh Kumar Wadhwa

May 20, 2026, 11:45:23 AM

to c2cc2hreq...@googlegroups.com

New role::

Job Title: Cloud Infrastructure Engineer (AI/LLM Focused)

Location: Charlotte, NC (5 Days onsite)

Duration: 12+ months

Work authorization: GC and USC only; C2C is workable

Primary Skills

· vLLM

· TensorRT-LLM

· Triton Inference Server

· SGLang

· Kubernetes ML Serving

· KServe

· OpenShift AI

· GPU Orchestration

· GCP

· Terraform

Key Responsibilities

· Design and manage scalable AI/ML infrastructure for GenAI and LLM workloads.

· Deploy and optimize LLM inference pipelines using vLLM, TensorRT-LLM, Triton Inference Server, and SGLang.

· Implement inference optimization techniques including:

1. Continuous Batching

2. Speculative Decoding

3. KV Cache / Prefix Caching

4. FP8 / AWQ / GPTQ quantization

5. Tensor Parallelism

· Build and maintain Kubernetes-based ML serving platforms using KServe and OpenShift AI.

· Manage GPU orchestration and scheduling using technologies such as Run:AI, CUDA, NCCL, and MIG.

· Develop Helm charts, Kubernetes Operators, and platform automation for AI workloads.

· Conduct performance benchmarking and optimization for GPU-based inference systems.

· Implement monitoring and observability using Prometheus and Grafana.

· Collaborate with data science and ML engineering teams to productionize LLMs.

· Automate infrastructure provisioning and deployment using Terraform.

Required Qualifications

· 6+ years of experience in cloud engineering or platform engineering.

· Experience with LLMOps/MLOps platforms.

· Strong hands-on experience with Kubernetes and containerized AI/ML workloads.

· Experience with GPU infrastructure and distributed inference optimization.

· Proficiency in GCP cloud services and cloud-native architecture.

· Strong scripting/programming skills in Python.

· Experience with ML observability and production monitoring tools.

· Familiarity with OpenShift AI and enterprise Kubernetes ecosystems.

Preferred Qualifications

· Knowledge of GenAI frameworks and RAG architectures.

· Exposure to enterprise AI governance and security practices.

Thank You,

Hitesh Kumar Wadhwa
Head of Talent Deployment, Key2Source INC

P: +1-302-732-0114
E: hite...@key2source.com | www.key2source.com
5941 81st PL N, Pinellas Park, FL 33781, USA

Stay Connected! Follow us on LinkedIn

https://www.linkedin.com/company/key2source-inc/posts/?feedView=all
