AI/ML Cloud Engineer / MLOps :: Bloomfield, CT
Visa: USC/GC/GC-EAD/H4 EAD/L2 EAD/TN
Relocation: OK
Must have: This project mainly focuses on deploying AI/ML models and
tracking their usage. The team is using a tool called LiteLLM to measure how
the models are being used.
Key Skills Required:
- Strong experience in MLOps (especially for model deployment)
- Experience with LLM gateways (this is a plus, as the team is starting to use one)
What You Will Do:
- Design, deploy, and manage cloud infrastructure for AI/ML workloads on AWS and Azure
- Work on AI platforms like:
  - Amazon SageMaker
  - Azure Machine Learning
- Support model training and deployment environments
- Help data scientists and ML engineers by setting up and optimizing infrastructure for:
  - Model training
  - Model deployment (inference)
Key Responsibilities
Cloud Infrastructure Management
- Design, deploy, and manage cloud infrastructure supporting AI/ML workloads on AWS and Azure.
- Manage compute resources such as EC2, Azure Virtual Machines, GPU instances, and Kubernetes clusters.
- Provision and configure storage, networking, and security services for AI platforms.
- Ensure high availability, scalability, and reliability of AI environments.
AI Platform Support
- Deploy and maintain AI/ML services such as:
  - Amazon SageMaker
  - Azure Machine Learning
  - AI model training and inference environments
- Support data scientists and ML engineers by providing optimized infrastructure for model training and deployment.
Automation & Infrastructure as Code
- Implement Infrastructure as Code (IaC) using tools such as:
  - Terraform
  - CloudFormation
  - ARM templates / Bicep
- Automate environment provisioning, patching, and scaling.
Containerization & Orchestration
- Deploy and manage containerized AI workloads using:
  - Docker
  - Kubernetes
  - Amazon EKS
  - Azure Kubernetes Service (AKS)
Monitoring & Performance Optimization
- Monitor system health, performance, and resource utilization using tools like:
  - CloudWatch
  - Azure Monitor
  - Datadog / Prometheus
- Optimize infrastructure for cost, performance, and GPU utilization.
Security & Compliance
- Implement cloud security best practices including:
  - IAM / RBAC management
  - Network security groups
  - Encryption and secrets management
- Ensure compliance with organizational and regulatory standards.
CI/CD & DevOps Integration
- Integrate AI infrastructure with CI/CD pipelines.
- Support automated deployment of models and AI services.
Required Qualifications
- Bachelor’s degree in Computer Science, Information Systems, or a related field.
- 5+ years of experience in infrastructure administration or cloud engineering.
- Strong hands-on experience with:
  - AWS cloud services
  - Microsoft Azure cloud services
- Experience supporting AI/ML infrastructure or data platforms.
- Proficiency with Linux administration and scripting (Python, Bash, PowerShell).
- Experience with Docker and Kubernetes.
Preferred Qualifications
- Experience with GPU infrastructure for AI workloads.
- Knowledge of ML pipelines and MLOps practices.
- Experience with data platforms (Snowflake, Databricks, or Spark).
- Familiarity with AI frameworks such as TensorFlow or PyTorch.
- Cloud certifications such as:
  - AWS Certified Solutions Architect
  - Azure Administrator or Azure AI Engineer
Key Skills
- Cloud Infrastructure (AWS, Azure)
- AI/ML Platform Support
- Kubernetes / Containers
- Infrastructure Automation
- Monitoring & Performance Tuning
- Security & Compliance
- DevOps & CI/CD
Regards,
Sandy M | 1Point System LLC
Business Development Manager
Direct: (803)-805-3884 • Email: sa...@1pointsys.com • Fax: (803)-828-2974 • www.1pointsys.com
115 Stone Village Drive • Suite C • Fort Mill, SC • 29708
LinkedIn: https://www.linkedin.com/in/sandy-m-74b06b212/
An E-Verified company | An Equal Opportunity Employer