GCP Data Engineer / PYSPARK Data Engineer
Dallas, TX (Onsite from Day 1)
12+ Months Contract
Job Description:
The GCP Data Engineer will create, deliver, and support custom data products and enhance/expand team capabilities. They will analyze and manipulate large datasets supporting the enterprise, activating data assets to support Enabling Platforms and analytics. Google Cloud Data Engineers will be responsible for designing transformation and modernization efforts on Google Cloud Platform using GCP services.
Responsibilities:
- Build data systems and pipelines on Google Cloud using Dataproc, Dataflow, Data Fusion, BigQuery, and Pub/Sub.
- Implement schedules/workflows and tasks in Cloud Composer/Apache Airflow (a minimal DAG sketch follows this list).
- Create and manage data storage solutions using GCP services such as BigQuery, Cloud Storage, and Cloud SQL.
- Monitor and troubleshoot data pipelines and storage solutions using GCP's Cloud Monitoring (formerly Stackdriver).
- Develop efficient ETL/ELT pipelines and orchestration using Dataprep and other Google Cloud services.
- Maintain data ingestion and transformation processes using Apache PySpark (see the PySpark sketch after this list).
- Automate data processing tasks using scripting languages such as Python or Bash.
- Ensure data security and compliance with industry standards by configuring IAM roles, service accounts, and access policies.
- Automate cloud deployments and infrastructure management using Infrastructure as Code (IaC) tools such as Terraform or Google Cloud Deployment Manager.
- Participate in code reviews, contribute to development best practices, and use Riper Assist tools to create robust, fail-safe data pipelines.
- Collaborate with Product Owners, Scrum Masters, and Data Analysts to deliver user stories and tasks and ensure deployment of pipelines.
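
For the Cloud Composer/Airflow item above, the following is a minimal DAG sketch rather than a prescribed design; the DAG id, schedule, project, dataset, and bucket names are all hypothetical placeholders.

from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.transfers.gcs_to_bigquery import GCSToBigQueryOperator
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

with DAG(
    dag_id="daily_sales_load",      # hypothetical pipeline name
    schedule_interval="0 6 * * *",  # daily at 06:00
    start_date=datetime(2024, 1, 1),
    catchup=False,
) as dag:
    # Load raw CSV files from Cloud Storage into a BigQuery staging table.
    load_raw = GCSToBigQueryOperator(
        task_id="load_raw_to_bq",
        bucket="example-raw-bucket",  # hypothetical bucket
        source_objects=["sales/*.csv"],
        destination_project_dataset_table="example_project.staging.sales_raw",
        source_format="CSV",
        skip_leading_rows=1,
        write_disposition="WRITE_TRUNCATE",
    )

    # Transform the staging table into a curated table with a SQL job.
    transform = BigQueryInsertJobOperator(
        task_id="transform_to_curated",
        configuration={
            "query": {
                "query": (
                    "CREATE OR REPLACE TABLE example_project.curated.sales AS "
                    "SELECT order_id, SAFE_CAST(amount AS NUMERIC) AS amount, order_date "
                    "FROM example_project.staging.sales_raw "
                    "WHERE order_id IS NOT NULL"
                ),
                "useLegacySql": False,
            }
        },
    )

    load_raw >> transform  # the staging load must finish before the transform runs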
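
Similarly, for the Apache PySpark item, here is a minimal ingestion/transformation sketch as it might run on Dataproc. It assumes the spark-bigquery connector is available on the cluster, and all bucket, dataset, and column names are illustrative assumptions.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("events-ingest").getOrCreate()

# Ingest raw JSON events from Cloud Storage.
events = spark.read.json("gs://example-raw-bucket/events/*.json")

# Basic cleaning and validation: drop records missing an id, normalize types.
clean = (
    events.filter(F.col("event_id").isNotNull())
    .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
    .withColumn("event_date", F.to_date("event_ts"))
)

# Write the curated result to BigQuery via the spark-bigquery connector.
(
    clean.write.format("bigquery")
    .option("table", "example_project.curated.events")
    .option("temporaryGcsBucket", "example-temp-bucket")  # staging bucket for the load
    .mode("overwrite")
    .save()
)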
Experience required:
- 5+ years of application development experience using one of the core cloud platforms (AWS, Azure, or GCP).
- Minimum of 1+ years of GCP experience, including work on GCP-based Big Data deployments (batch/real-time) leveraging BigQuery, Bigtable, Google Cloud Storage, Pub/Sub, Data Fusion, Dataflow, Dataproc, and Airflow/Cloud Composer.
- 2+ years of coding skills in Java/Python/PySpark and strong proficiency in SQL.
- Work with a data team to analyze data, build models, and integrate massive datasets from multiple data sources for data modeling.
- Extract, load, transform, clean, and validate data; design pipelines and architectures for data processing.
- Architect and implement next-generation data and analytics platforms on GCP.
- Experience working with Agile and Lean methodologies.
- Experience working with either a MapReduce or an MPP system at any size/scale.
- Experience working in a CI/CD model to ensure automated orchestration of pipelines.
Share resumes and the details below to my official email ID sek...@transreach.com only:
- Legal Name (First/Last):
- Phone (Primary and secondary):
- Candidate Email:
- Current Location (City, State):
- Work Authorization / Visa Status:
- Interview Availability:
- LinkedIn URL:
- Education Details (Bachelors/Masters, University Name, Location, Year of Passing):
- Availability once Confirmed:
- Total Years of Work Experience in USA:
- Overall Years of Work Experience:
- Open to Relocate (Yes/No):
- Expected Hourly Bill Rate on C2C: