Title: Data Engineer
Client: Cognizant
Location: Dallas, TX (onsite)
Rate: $52-55/hr (firm)
The engineer will join the Datastore Migration Factory team, which is responsible for the end-to-end migration of datastores from the on-prem Data Lake to the AWS-hosted Lakehouse.
Responsibilities of the engineer include:
· Pipeline Migration Logic & Scheduling: Refactoring and migrating extraction logic and job scheduling from legacy frameworks to the new Lakehouse environment.
· Data Transfer: Executing the physical migration of underlying datasets while ensuring data integrity.
· Stakeholder Engagement: Acting as a technical liaison to internal clients, facilitating "hand-off and sign-off" conversations with data owners to ensure migrated assets meet business requirements.
· Consumption Pattern Migration:
· Code Conversion: Translating and optimizing legacy SQL and Spark-based consumption patterns (raw and modeled) for compatibility with Snowflake and Iceberg (see the conversion sketch after this list).
· Usage Analysis: Understanding usage patterns to deliver the required data products.
· Data Reconciliation & Quality:
· A rigorous approach to data validation is required. Candidates must work with reconciliation frameworks to build confidence that migrated data is functionally equivalent to the data already used within production flows (see the reconciliation sketch after this list).
· The engineer will also work with the internal data management platforms team and must have an aptitude for learning new workflows and language constructs as necessary.
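For illustration only, a sketch of the kind of code conversion involved: a legacy Hive-style partition overwrite re-expressed as an idempotent MERGE into an Iceberg table. The table and column names are hypothetical, and the target syntax assumes Spark SQL running against an Iceberg catalog.

# Illustrative only: a legacy Hive-style overwrite and one possible Iceberg
# rewrite. Table/column names are hypothetical; the target assumes Spark SQL
# with an Iceberg catalog.

LEGACY_SQL = """
INSERT OVERWRITE TABLE analytics.daily_orders PARTITION (ds = '{ds}')
SELECT order_id, customer_id, amount
FROM   staging.orders
WHERE  ds = '{ds}'
"""

ICEBERG_SQL = """
MERGE INTO lakehouse.analytics.daily_orders AS t
USING (SELECT order_id, customer_id, amount, ds
       FROM   lakehouse.staging.orders
       WHERE  ds = '{ds}') AS s
ON t.order_id = s.order_id AND t.ds = s.ds
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *
"""

def backfill_partition(spark, ds: str) -> None:
    # Safe to re-run for any date: the MERGE upserts row by row rather
    # than relying on partition-level replacement.
    spark.sql(ICEBERG_SQL.format(ds=ds))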
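Likewise, a minimal sketch of the row-count/checksum style of reconciliation described above; run_query is a hypothetical stand-in for whatever client the reconciliation framework actually provides, and the checks shown are only examples.

# Minimal reconciliation sketch: compare row counts and a simple aggregate
# checksum between the legacy and migrated copies of a table.
# `run_query` is a hypothetical callable that executes SQL and returns a value.

CHECKS = [
    ("row_count",  "SELECT COUNT(*) FROM {table}"),
    ("amount_sum", "SELECT SUM(amount) FROM {table}"),
]

def reconcile(run_query, legacy_table: str, migrated_table: str) -> list:
    """Return human-readable mismatches; an empty list means the copies agree."""
    mismatches = []
    for name, template in CHECKS:
        legacy = run_query(template.format(table=legacy_table))
        migrated = run_query(template.format(table=migrated_table))
        if legacy != migrated:
            mismatches.append(f"{name}: legacy={legacy} migrated={migrated}")
    return mismatches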
Technical Skills:
Basic Qualifications
Education: Bachelor’s or Master’s in Computer Science, Applied Mathematics, Engineering, or a related quantitative field.
Experience: 3-5 years of professional "hands-on-keyboard" coding experience in a collaborative, team-based environment; must be able to troubleshoot SQL and have basic scripting experience.
Languages: Professional proficiency in Python or Java.
Methodology: Deep familiarity with the full Software Development Life Cycle (SDLC) and CI/CD best practices, plus Kubernetes (K8s) deployment experience.
Core Data Engineering Competencies: Candidates must demonstrate a sophisticated understanding of the following modeling concepts to ensure data correctness during reconciliation:
Temporal Data Modeling: Managing state changes over time (e.g., SCD Type 2; see the sketch after this list).
Schema Management: Expertise in schema evolution (e.g., as implemented in Apache Iceberg) and enforcement strategies.
Performance Optimization: Advanced knowledge of data partitioning and clustering.
Architectural Theory: Balancing Normalization vs. Denormalization and the strategic use of Natural vs. Surrogate Keys.
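To make the SCD Type 2 expectation concrete, a minimal illustration-only sketch in Python follows; the record layout (customer_id, valid_from, valid_to, is_current) and the tracked attributes are assumptions for the example, not the role's actual schema.

from datetime import date

# SCD Type 2 in miniature: a change to a tracked attribute closes the
# current version of a record and appends a new current version, so full
# history is preserved. Column names here are illustrative only.

def apply_scd2(history, incoming, today=None):
    """Apply a batch of incoming records to an SCD Type 2 history list."""
    today = today or date.today()
    current = {r["customer_id"]: r for r in history if r["is_current"]}
    tracked = ("email", "city")  # attributes whose changes create new versions

    for row in incoming:
        cur = current.get(row["customer_id"])
        if cur is not None and all(cur[c] == row[c] for c in tracked):
            continue                      # no change: keep the current version
        if cur is not None:
            cur["valid_to"] = today       # close out the old version
            cur["is_current"] = False
        history.append({**row,            # open the new current version
                        "valid_from": today,
                        "valid_to": None,
                        "is_current": True})
    return history

# Example: a move from Dallas to Austin yields two rows for customer 1,
# the closed-out Dallas version and a new current Austin version.
history = [{"customer_id": 1, "email": "a@x.com", "city": "Dallas",
            "valid_from": date(2024, 1, 1), "valid_to": None, "is_current": True}]
apply_scd2(history, [{"customer_id": 1, "email": "a@x.com", "city": "Austin"}],
           today=date(2025, 6, 1))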