GenAI Engineer // Philadelphia, Pennsylvania (Hybrid) // Need Local Candidate // F2F


Birendra Kumar

<bkumar@staffxpertllc.com>
8:51 AM

Hi,

Hope you are doing well!

Job Title: GenAI Engineer

Location: Philadelphia, Pennsylvania (Hybrid) // Need Local Candidate

Duration: Contract (6+ Months)

Mode of Interview: In-Person

Preferred Visa: GC-EAD, H-1B, H-4 EAD

Job Summary

STAFFXPERT LLC is seeking a GenAI Engineer on behalf of our client in Philadelphia, Pennsylvania. This role focuses on designing and implementing on-premise large language model (LLM) solutions and vector database architectures. The ideal candidate will have strong hands-on experience with open-source LLMs, Retrieval-Augmented Generation (RAG) pipelines, and secure enterprise deployments.

Key Responsibilities

  • Deploy and optimize open-source LLMs such as Llama 3 and Mistral / Mixtral in on-prem or private environments
  • Develop and integrate LLM-based applications using Python, including prompt engineering and inference workflows
  • Implement CPU-based inference, model quantization, and performance tuning techniques
  • Design and build scalable Retrieval-Augmented Generation (RAG) pipelines
  • Work with vector databases to manage embeddings, indexing, and metadata filtering
  • Ensure security, data privacy, and compliance in air-gapped or enterprise environments
  • Collaborate with cross-functional teams to deliver architecture, prototypes, and documentation

Required Qualifications

  • Strong proficiency in Python for AI/ML application development
  • Hands-on experience with vector databases such as Qdrant, Chroma, Milvus, or pgvector
  • Proven experience implementing Retrieval-Augmented Generation (RAG) solutions
  • Experience deploying LLMs in on-premise or secure environments
  • Strong understanding of embeddings, semantic search, and data pipelines
  • Knowledge of enterprise security practices, including access controls and audit logging

Preferred Qualifications

  • Experience with LangChain or LlamaIndex
  • Familiarity with containerization tools such as Docker and Kubernetes
  • Exposure to inference frameworks like vLLM, llama.cpp, or Hugging Face Transformers
  • Experience with high-performance programming languages (Rust, Go, or C++)
  • Prior experience in regulated or enterprise environments

Best,

Birendra Kumar
Technical Recruiter

E: bku...@staffxpertllc.com | www.staffxpertllc.com
5851 Holmberg Rd, Apt 2114, Parkland, FL | United States - 33067

Disclaimer:
This email and any attachments are confidential and intended solely for the addressed recipient. If received in error, please notify the sender or contact bku...@staffxpertllc.com and delete it; any unauthorized use is prohibited.

STAFFXPERT LLC accepts no liability for errors, omissions, or any loss arising from the use of this email or transmission of viruses or malware.

 

 
