New role :: C2C is workable :: Cloud Infrastructure Engineer in Charlotte, NC (5 Days onsite)

Hitesh Kumar Wadhwa

May 20, 2026, 11:45:23 AM
to c2cc2hreq...@googlegroups.com

New role::

 

Job Title: Cloud Infrastructure Engineer (AI/LLM Focused)

Location: Charlotte, NC (5 Days onsite)

Duration: 12+ months

Work authorization: GC (Green Card) and USC (US citizen) only; C2C is workable

 

Primary Skills

·         vLLM

·         TensorRT-LLM

·         Triton Inference Server

·         SGLang

·         Kubernetes ML Serving

·         KServe

·         OpenShift AI

·         GPU Orchestration

·         GCP

·         Terraform

 

Key Responsibilities

·         Design and manage scalable AI/ML infrastructure for GenAI and LLM workloads.

·         Deploy and optimize LLM inference pipelines using vLLM, TensorRT-LLM, Triton Inference Server, and SGLang.

·         Implement inference optimization techniques including:

1.       Continuous Batching

2.       Speculative Decoding

3.       KV Cache / Prefix Caching

4.       FP8 / AWQ / GPTQ quantization

5.       Tensor Parallelism

·         Build and maintain Kubernetes-based ML serving platforms using KServe and OpenShift AI.

·         Manage GPU orchestration and scheduling using technologies such as Run:AI, CUDA, NCCL, and MIG.

·         Develop Helm charts, Kubernetes Operators, and platform automation for AI workloads.

·         Conduct performance benchmarking and optimization for GPU-based inference systems.

·         Implement monitoring and observability using Prometheus and Grafana.

·         Collaborate with data science and ML engineering teams to productionize LLMs.

·         Automate infrastructure provisioning and deployment using Terraform.

 

Required Qualifications

·         6+ years of experience in cloud engineering or platform engineering.

·         Experience with LLMOps/MLOps platforms.

·         Strong hands-on experience with Kubernetes and containerized AI/ML workloads.

·         Experience with GPU infrastructure and distributed inference optimization.

·         Proficiency in GCP cloud services and cloud-native architecture.

·         Strong scripting/programming skills in Python.

·         Experience with ML observability and production monitoring tools.

·         Familiarity with OpenShift AI and enterprise Kubernetes ecosystems.

 

Preferred Qualifications

·         Knowledge of GenAI frameworks and RAG architectures.

·         Exposure to enterprise AI governance and security practices.

 

Thank You,

 

Hitesh Kumar Wadhwa
Head of Talent Deployment, Key2Source INC

P: +1-302-732-0114
E: hite...@key2source.com | www.key2source.com
5941 81st Pl N, Pinellas Park, FL 33781, USA

Stay Connected! Follow us on LinkedIn

https://www.linkedin.com/company/key2source-inc/posts/?feedView=all

 

 
