Immediate Hire: Senior Java Developer / ForgeRock Developer

Shekhar K

Aug 21, 2025, 9:15:39 AM
to Shekhar K
Role      :      Senior Java Developer / ForgeRock Developer

Location  :      Alpharetta, GA (Onsite)

Duration  :      Long Term


1. Project Overview

Client is seeking a supplier to provide engineering, maintenance, and enhancement services for its Google Cloud Platform ("GCP") Supercomputer Solutions. The supplier will be responsible for supporting and enhancing two key product areas: Cluster Toolkit and HyperCompute Cluster Service (HCS). This work involves a combination of ongoing operational tasks, testing, documentation, and specific development deliverables.

2. Scope of Work & Deliverables

The supplier will be responsible for the services and deliverables detailed below.

2.1. Ongoing Maintenance

  • The contractor must provide ongoing maintenance and enhancements for all 6 projects covered under the original Statement of Work.

2.2. Cluster Toolkit

Cluster Toolkit is an open-source software solution that simplifies the deployment of high-performance computing (HPC), artificial intelligence (AI), and machine learning (ML) workloads on Google Cloud.

Ongoing Responsibilities:

  • Stability Testing: Test the stability of new products, beginning with A3U. This includes:
    • Building NVIDIA Collective Communications Library (NCCL) tests on a Slurm cluster.
    • Setting up and running pairwise tests to identify and report bad nodes.
  • Integration Test Triage: Perform rotational duties to manage and triage integration test failures. This includes:
    • Monitoring daily failure chats and flake tools.
    • Reporting on failures and performing advanced handling, such as creating new bug reports and categorizations.
  • Documentation: Improve, organize, and maintain the Cluster Toolkit documentation. This process involves:
    • Gathering existing documents and identifying information gaps.
    • Creating new documentation and updating existing materials.
    • Organizing the information in g3docs, consolidating it in a team Google Drive, and establishing a review process.
  • Project Cleanup: Once a week, clean up the 'hpc-toolkit-dev' project by identifying and deleting unused resources.
  • Security: Triage and address security alerts by checking for them, creating pull requests (PRs) to resolve them, and applying the necessary updates.
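
The pairwise tests under Stability Testing can be illustrated with a minimal sketch. It assumes each pair of nodes is tested once and produces a simple pass/fail result; the posting does not say how the NCCL tests are launched or scored, so the function names, the result format, and the `min_failures` threshold below are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def pairwise_schedule(nodes):
    """Return every unordered pair of nodes, so each pair is tested once."""
    return list(combinations(nodes, 2))


def suspect_nodes(results, min_failures=2):
    """Flag nodes that appear in at least `min_failures` failing pairs.

    `results` maps (node_a, node_b) -> True if that pair's test passed.
    The threshold is an assumption: a healthy node paired once with a bad
    node fails only once, while the bad node fails with every partner.
    """
    failures = Counter()
    for (node_a, node_b), passed in results.items():
        if not passed:
            failures[node_a] += 1
            failures[node_b] += 1
    return sorted(node for node, count in failures.items() if count >= min_failures)


# Made-up example: four nodes, where every pair involving "n3" fails.
nodes = ["n1", "n2", "n3", "n4"]
results = {pair: ("n3" not in pair) for pair in pairwise_schedule(nodes)}
print(suspect_nodes(results))
```

In a real run, the pass/fail values would come from the NCCL test output on the Slurm cluster rather than from a hard-coded rule.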

Key Deliverables:

  • HPC VM Image Releases: Deliver 4-6 High-Performance Computing Virtual Machine (HPC VM) image releases during 2025.
  • Software Widget Releases: Release new software widgets every two weeks during 2025, including managing any necessary hotfixes.

2.3. HyperCompute Cluster Service (HCS)

HCS is a service that enables the deployment and management of resilient, high-performance AI and HPC systems at scale.

Key Deliverables:

  • API Integration Testing: Add comprehensive integration tests for all HCS Application Programming Interface (API) surfaces. Coverage must include:
    • HypercomputeClusters: Create, Delete, Update, Get, and List requests and responses.
    • Network: NetworkInitialize params.
    • Storage: StorageInitialize, FileStoreInitialize, Filestore tier, ParallelstoreInitialize, and GcsInitialize params.
    • Compute: Resource request, Guest accelerator, Disk, Provisioning model, Reservation affinity and type, Orchestrator, Slurm, Node test, Storage configuration, and Slurm partition.
  • Critical User Journey (CUJ) Validation: Add integration tests to validate the following critical user journeys:
    • Creating a cluster that consumes a reservation.
    • Creating a cluster with a new network and new storage.
    • Creating a cluster using a pre-existing network and storage created both outside of HCS and by a previous HCS deployment.
    • Destroying all components of an HCS-created cluster.
    • Destroying a cluster while leaving the network and storage intact.
    • Updating a Slurm cluster to add a new reservation to both new and existing partitions.
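
The shape of such integration tests can be sketched against an in-memory stand-in. This is only an illustration: `FakeClusterService`, its method names, and its parameters are hypothetical and do not reflect the real HCS API, which the posting does not describe. The checks cover a create/get/list/update/delete cycle plus two of the journeys above: destroying a cluster while keeping its network and storage, then reusing them.

```python
class FakeClusterService:
    """In-memory stand-in for an HCS client; all names here are hypothetical."""

    def __init__(self):
        self.clusters = {}
        self.networks = set()
        self.storage = set()

    def create(self, name, network, storage):
        if name in self.clusters:
            raise ValueError(f"cluster {name} already exists")
        # Reusing an existing network or storage is allowed (set semantics).
        self.networks.add(network)
        self.storage.add(storage)
        self.clusters[name] = {"network": network, "storage": storage, "reservations": []}
        return self.clusters[name]

    def get(self, name):
        return self.clusters[name]

    def list(self):
        return sorted(self.clusters)

    def update(self, name, reservation):
        self.clusters[name]["reservations"].append(reservation)
        return self.clusters[name]

    def delete(self, name, keep_network_and_storage=False):
        cluster = self.clusters.pop(name)
        if not keep_network_and_storage:
            self.networks.discard(cluster["network"])
            self.storage.discard(cluster["storage"])


def run_lifecycle_checks(svc):
    """Exercise the basic requests and two destroy journeys; return True on success."""
    svc.create("c1", network="net-a", storage="fs-a")
    assert svc.list() == ["c1"]
    svc.update("c1", reservation="res-1")
    assert svc.get("c1")["reservations"] == ["res-1"]

    # Journey: destroy the cluster but leave network and storage intact.
    svc.delete("c1", keep_network_and_storage=True)
    assert svc.list() == [] and "net-a" in svc.networks and "fs-a" in svc.storage

    # Journey: create on the pre-existing network and storage, then destroy everything.
    svc.create("c2", network="net-a", storage="fs-a")
    svc.delete("c2")
    assert not svc.networks and not svc.storage
    return True


print(run_lifecycle_checks(FakeClusterService()))
```

Real integration tests would replace the fake with calls to the deployed service and verify the resulting cloud resources, but the arrange/act/assert structure per journey would look similar.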

Mandatory details to be filled in by the candidate:

 

Required Details                                             :  Details to be filled by candidate

Candidate Name                                               :
Position                                                     :  Senior Java Developer / ForgeRock Developer
Present location (city and state)                            :
Relocation (Yes/No)                                          :
Work Authorization (H-1B, EAD, GC, USC)                      :
Telephone No. (no Google Voice, TextNow, or VoIP numbers)    :
E-mail ID                                                    :
Currently Working (Yes/No)                                   :
Type of Hire (Contract / C2H)                                :
Onsite availability (post-selection)                         :
Total onsite experience working in the US                    :
Overall relevant experience of candidate                     :
Availability for Interview (Preferred Time)                  :
Rate / Salary                                                :
Bachelor's / Master's (University / Stream / Pass-out year / Location)  :
LinkedIn ID                                                  :
Current Employer                                             :
Current Client / Project                                     :
Candidate ID Submitted (Driver's License / Passport) & Work Authorization (if H-1B/EAD)  :

Thanks & Regards,

Shekhar
Talent Acquisition Group
  
197 Route 18 South  #3000 East Wing, East Brunswick, NJ 08816 