Position: Hadoop and Kafka Developer
Location: Raleigh, NC
Duration: 6+ Months
Position Description:
The work is on the integration layer, which has multiple connection points, so it gives great exposure to modular architecture, the direction in which the industry is trending. The team also follows Agile practices, so the work will be well planned and the resource will have a good work-life balance.
Duties and Responsibilities:
7-8 years of Hadoop, Kafka, MongoDB and Spark experience.
- Hadoop, Kafka, MongoDB and Spark development and implementation.
- Loading from disparate data sets.
- Pre-processing using Hive and Pig.
- Designing, building, installing, configuring and supporting Hadoop.
- Translate complex functional and technical requirements into detailed design.
- Perform analysis of vast data stores and uncover insights.
- Maintain security and data privacy.
- Create scalable and high-performance web services for data tracking.
- High-speed querying.
- Support the effort to help build new Hadoop clusters.
- Test prototypes and oversee handover to operational teams.
- Propose best practices/standards.
Required qualifications:
- Experience using Hive, Pig, Sqoop, Impala, Spark, MapReduce, Flume, Avro, HDFS.
- Experience with Kafka
- Experience with MongoDB
- Experience with Python and OpenShift.
- Should have Java development background
- Experience with Data Warehouse/Data Integration
- Experience with CDH
- Oracle Database
- Design and Requirements gathering experience
- Agile
- Strong Database SQL, ETL and data analysis skills.
- Nice to have – Informatica experience
Required Qualifications:
- Java development background with Jenkins
- An understanding of the principles and values underlying lean and agile flow.
- Skilled at communicating with people of all skill levels and roles in the organization, including very technical engineers and cost-conscious executives
- Experience in automating workflow and process flows using CI/CD tools
- Provide application support, development, and maintenance of a portfolio of automation and productivity applications.
- Ability to define processes and implement them in tooling to enable efficient IT and business operations
- Extensive experience with source control management and its influence on the SDLC and DevOps processes.
Primary Skillset: Kafka, Hadoop – Hive/HDFS, Avro, Flume, Spark, MongoDB, Python, Java development background.
Nice to Have: OpenShift, Informatica ETL
Java: 6 years
Hadoop: 4 years
Kafka: 3 years
MongoDB: 3 years
Healthcare: 2 years
Bachelor's Degree