Role: Google Cloud Data Architect – IAM Data Modernization
Location: Dallas, TX / Charlotte, NC (Hybrid – 3 days in office)
OpenShift (OCP) experience highly preferred
Project/Program
Identity & Access Management (IAM) Data Modernization – migration of an
on-premises SQL data warehouse to a target-state Data Lake on Google Cloud (GCP).
The program enables metrics and reporting, advanced analytics, and GenAI use
cases (natural language querying, accelerated summarization, cross-domain trend
analysis). It relies on PySpark-based processing, cloud-native DevOps CI/CD
pipelines, and containerized deployments on OpenShift (OCP) to deliver scalable,
secure, and high-performance data solutions.
About Program/Project
The IAM Data Modernization project involves migrating an on-premises SQL data
warehouse to a target-state Data Lake in the GCP cloud environment. Key
highlights include:
- Integration Scope: 30+ source system data ingestions and multiple
downstream integrations
- Capabilities: Metrics, reporting, and Gen AI use cases with natural language
querying, advanced pattern/trend analysis, faster summarizations, and
cross-domain metric monitoring
Benefits:
- Scalability and access to advanced cloud
functionality
- Highly available and performant semantic layer
with historical data support
- Unified data strategy for executive reporting,
analytics, and Gen AI across cyber domains
This modernization establishes a single source of truth for enterprise-wide
data-driven decision-making.
Required Skills
DevOps / CI-CD
- Experience implementing CI/CD pipelines for data
and analytics workloads
- Familiarity with Git-based source control, build
automation, and deployment strategies
Containers & Platform
- Experience with OpenShift Container Platform
(OCP) for deploying data workloads and services
- Understanding of containerized architecture,
scaling, and environment management
- Proven ability to build CI/CD pipelines for data
and infrastructure workloads
- Experience managing secrets securely using GCP
Secret Manager
- Ownership of observability, SLOs, dashboards,
alerts, and runbooks
- Proficiency in logging, monitoring, and alerting
for data pipelines and platform reliability
Big Data & Processing
- Hands-on experience with PySpark for ETL/ELT,
data transformation, and performance optimization
- Solid understanding of distributed data
processing concepts
Data & Cloud Architecture
- Strong experience designing data platforms on
Google Cloud Platform (GCP)
- Experience with Data Lakes, data warehousing, and
large-scale migration programs
Data Lake Architecture & Storage
- Proven experience designing and implementing data
lake architectures (e.g., Bronze/Silver/Gold or layered models)
- Strong knowledge of Cloud Storage (GCS) design,
including bucket layout, naming conventions, lifecycle policies, and
access controls
- Experience with Hadoop/HDFS architecture,
distributed file systems, and data locality principles
- Hands-on experience with columnar data formats
(Parquet, Avro, ORC) and compression techniques
- Expertise in partitioning strategies, backfills,
and large-scale data organization
- Ability to design data models optimized for
analytics and BI consumption
Data Ingestion & Orchestration
- Experience building batch and streaming ingestion
pipelines using GCP-native services
- Knowledge of Pub/Sub-based streaming
architectures, event schema design, and versioning
- Strong understanding of incremental ingestion and
CDC patterns, including idempotency and deduplication
- Hands-on experience with workflow orchestration
tools (Cloud Composer / Airflow)
- Ability to design robust error handling, replay,
and backfill mechanisms
Data Processing & Transformation
- Experience developing scalable batch and
streaming pipelines using Dataflow (Apache Beam) and/or Spark (Dataproc)
- Strong proficiency in BigQuery SQL, including
query optimization, partitioning, clustering, and cost control
- Hands-on experience with Hadoop MapReduce and
ecosystem tools (Hive, Pig, Sqoop)
- Advanced Python programming skills for data
engineering, including testing and maintainable code design
- Experience managing schema evolution while
minimizing downstream impact
Analytics & Data Serving
- Expertise in BigQuery performance optimization
and data serving patterns
- Experience building semantic layers and governed
metrics for consistent analytics
- Familiarity with BI integration, access controls,
and dashboard standards
- Understanding of data exposure patterns via
views, APIs, or curated datasets
Data Governance, Quality & Metadata
- Experience implementing data catalogs, metadata
management, and ownership models
- Understanding of data lineage for auditability
and troubleshooting
- Strong focus on data quality frameworks,
including validation, freshness checks, and alerting
- Experience defining and enforcing data contracts,
schemas, and SLAs
Good to have
- Security, Privacy & Compliance
- Hands-on experience implementing fine-grained
access controls for BigQuery and GCS
- Experience with sprint planning and providing
technical guidance to the team
- Strong stakeholder communication and
solution-architecture skills
Qualifications
- Experience: 10-14+ years in DevOps and Data
Architecture, including 5+ years designing on PySpark/GCP/OCP at scale; prior
on-prem to cloud migration experience is a must.
- Education: Bachelor's/Master's in Computer
Science, Information Systems, or equivalent experience.
- Certifications: Google Cloud Professional Cloud
Architect/DevOps/OCP (required, or to be obtained within 3 months). A plus:
Professional Data Engineer, Security Engineer.
Thanks & Regards,
Maddula Venkateshwara Reddy | ICS Global Soft
Senior US IT Recruiter
venkatre...@gmail.com