Dear Associate,
I hope you are doing well! We have an immediate opening for a Data
Engineer in NYC, NY.
The job description follows:
Position: Data Engineer
Location: NYC, NY
Duration: 6+ Months
Job Description:
Key Responsibilities:
- Lead implementation of the data strategy using an enterprise data
lake solution.
- Spark and Python expert. Hands-on coding with the ability to
outline solution design with coding best practices, perform code reviews,
and ensure good and secure code.
- Well-rounded technologist: hands-on experience with legacy
and modern integration methods (data virtualization, ETL) and various data
formats (e.g. JSON, XML, delimited).
- Broad knowledge of legal entities that engage in
financial transactions. Familiarity with building and mastering global
identifiers, counterparty, investment / asset types and agreements,
standard reference data, and corporate hierarchies.
- Must be able to execute technical strategy to achieve a
business vision, with input from multiple stakeholders (business,
technology, data consumers) and drive the incremental completion
- Communication skills: strong presentation skills and the
ability to articulate business, technical and project concepts to an
audience at all levels of the organization, verbally and in writing,
clearly, concisely and completely.
Desired Qualifications:
- 5+ years’ experience in Python or equivalent technologies
(Scala, Perl), with excellent knowledge of the pandas/NumPy libraries,
specializing in data integration from multiple data sources
using various ETL techniques and frameworks.
- 3+ years’ experience with Spark/PySpark or equivalent
distributed processing system.
- 3+ years’ experience with SQL/T-SQL development and
performance tuning.
- Experience developing real-time data pipelines
using Kafka.
- Experience creating data service APIs for data-consuming
applications.
- BS / MS in Computer Science or Information Technology –
equivalent experience accepted
Preferred additional skills:
- Amazon AWS EMR or equivalent cloud
architecture experience with a major cloud provider
- Experience with distributed
data stores that use parallel processing or columnar storage
(e.g. Presto, HBase, Druid)
- Ability to apply and explain MDM concepts, data lifecycle
management, etc.
- Software component mindset with a focus on quality, reuse,
and testability