New York Technology Partners Inc.

| Role: | Java Developer |
| Location: | Sunnyvale, CA |
| Interview: | Webex |
| Job Model: | Onsite |

Skills
- + years of experience in application development, with a strong background in Java and big data processing.
- Strong hands-on experience in Java, Apache Spark, and Spark SQL for distributed data processing.
- Proficiency in Cloudera Hadoop (CDH) components such as HDFS, Hive, Impala, HBase, Kafka, and Sqoop.
- Experience building and optimizing ETL pipelines for large-scale data workloads.
- Hands-on experience with SQL and NoSQL data stores such as HBase, Hive, and PostgreSQL.
- Strong knowledge of data warehousing concepts, dimensional modeling, and data lakes.
- Proven ability to troubleshoot and optimize Spark applications for high performance.
- Familiarity with version control tools (Git, Bitbucket) and CI/CD pipelines (Jenkins, GitLab).
- Exposure to data streaming, ingestion, and workflow tools such as Kafka, Flume, Oozie, and NiFi.
- Strong problem-solving skills, attention to detail, and the ability to work in a fast-paced environment.

Description
- We are seeking a Senior Java Spark Developer with expertise in Java, Apache Spark, and the Cloudera Hadoop ecosystem to design and develop large-scale data processing applications. The ideal candidate will have strong hands-on experience in Java-based Spark development, distributed computing, and performance optimization for big data workloads.

Responsibilities
- ✅ Java & Spark development:
  - Develop, test, and deploy Java-based Apache Spark applications for large-scale data processing.
  - Optimize and fine-tune Spark jobs for performance, scalability, and reliability.
  - Implement Java-based microservices and APIs for data integration.
- ✅ Big data & Cloudera ecosystem:
  - Work with Cloudera Hadoop components such as HDFS, Hive, Impala, HBase, Kafka, and Sqoop.
  - Design and implement high-performance data storage and retrieval solutions.
  - Troubleshoot and resolve performance bottlenecks in Spark and Cloudera platforms.
- ✅ Collaboration & data engineering:
  - Collaborate with data scientists, business analysts, and developers to understand data requirements.
  - Apply best practices for data integrity, accuracy, and security across all data processing tasks.
  - Work with Kafka, Flume, Oozie, and NiFi for real-time and batch data ingestion.
- ✅ Software development & deployment:
  - Use version control (Git) and CI/CD pipelines (Jenkins, GitLab) for Spark applications.
  - Deploy and maintain Spark applications in cloud or on-premises Cloudera environments.