Title: AWS Data Engineer
Location: Plano, Texas
Expected Duration: 12-month contract, with possible extension or conversion to a permanent role depending on the candidate's performance and client needs.
Client Project Description: This project involves building real-time and batch data pipelines in the AWS cloud to supply data to the online retirement application, and migrating the legacy Db2-based data system to the AWS cloud.
Skillset Requirements: More than 5 years of overall experience in data engineering with AWS.
Must have:
Strong knowledge of object-oriented design and programming, data structures, algorithms, databases, SQL, and relational design.
Demonstrable expertise with Python, Elasticsearch, and Spark, and with wrangling various data formats: CSV, XML, JSON, Parquet.
Experience with the following technologies is highly desirable: R, AWS cloud computing, Apache Airflow, Apache Kafka, Kibana, Node.js, Java, Python, AWS Lambda and Step Functions.
Experience with both the relational and NoSQL database design paradigms.
Experience with indexing and querying data in Elasticsearch.
Experience with large-scale data warehousing and analytics projects, including using AWS technologies: Redshift, S3, and EC2.
Working with various storage backends, possibly including Postgres, Redshift, DynamoDB, and Snowflake.
Contributing to Docker services in Node.js, Spring Boot, and Python.
Experience with Agile methodology, using test-driven development.
Excellent command of written and spoken English.
Self-driven problem solver.
Additional skills that add value:
Awareness of build and source code configuration management tools: Maven, Git.
Experience setting up CI/CD pipelines for large development teams.
Jenkins with plugins, CI/CD pipelines, Maven.
Distributed computing technologies, in particular Hadoop MapReduce, Spark/Spark SQL, and YARN/MR2.