Data pipeline and workflow management tools: Databricks Workflows, Airflow, Step Functions, etc.
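To make the orchestration requirement concrete, here is a minimal sketch of an Airflow DAG using the TaskFlow API (Airflow 2.4+); the DAG id, schedule, and task bodies are illustrative, not from this posting:

```python
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def example_etl():
    """Hypothetical two-step pipeline: extract rows, then load them."""

    @task
    def extract():
        # Stand-in for a real source query
        return [1, 2, 3]

    @task
    def load(rows):
        print(f"loaded {len(rows)} rows")

    load(extract())


example_etl()
```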
Object-oriented/functional scripting languages: PySpark/Python, Java, C++, Scala, etc.
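As a rough illustration of the PySpark work implied here, a minimal DataFrame transformation; the column names and data are made up:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("example").getOrCreate()

# Illustrative data: (key, value) pairs aggregated per key
df = spark.createDataFrame([("a", 1), ("a", 2), ("b", 3)], ["key", "value"])
totals = df.groupBy("key").agg(F.sum("value").alias("total"))
totals.show()
```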
Experience working with Data Lakehouse architecture and Delta Lake/Apache Iceberg.
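A minimal sketch of writing and reading a Delta table: on Databricks, Delta support is built in; elsewhere this assumes the delta-spark package and the two session configs shown, and the table path is illustrative:

```python
from pyspark.sql import SparkSession

# Assumes the delta-spark package; on Databricks these configs are unnecessary
spark = (
    SparkSession.builder.appName("delta-example")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

orders = spark.createDataFrame([(1, "open"), (2, "shipped")], ["id", "status"])

# Delta adds ACID writes and schema enforcement over a plain lake path
orders.write.format("delta").mode("overwrite").save("/tmp/orders_delta")
spark.read.format("delta").load("/tmp/orders_delta").show()
```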
Advanced SQL knowledge and experience with relational databases, including query authoring and optimization, as well as familiarity with a variety of database systems.
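For the query authoring and optimization piece, a short sketch running SQL through Spark and inspecting the plan with EXPLAIN; the table and column names are hypothetical:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sql-example").getOrCreate()

spark.createDataFrame(
    [(1, "open"), (2, "shipped"), (3, "open")], ["id", "status"]
).createOrReplaceTempView("orders")

result = spark.sql("""
    SELECT status, COUNT(*) AS n
    FROM orders
    GROUP BY status
""")
result.explain()  # inspect the physical plan before tuning
result.show()
```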
Experience manipulating, processing, and extracting value from large, disconnected datasets.
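One way to read "disconnected datasets" is sources with no shared lineage that must be linked on a key; a small hedged sketch, with both datasets and the key invented for illustration:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("join-example").getOrCreate()

# Two illustrative datasets with no shared lineage, linked on user_id
users = spark.createDataFrame([(1, "ana"), (2, "bo")], ["user_id", "name"])
purchases = spark.createDataFrame([(1, 9.99), (1, 4.50)], ["user_id", "amount"])

# Joining and aggregating turns the disconnected sources into one view
spend = (
    users.join(purchases, "user_id", "left")
    .groupBy("name")
    .agg(F.sum("amount").alias("total_spend"))
)
spend.show()
```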
Ability to inspect existing data pipelines, discern their purpose and functionality, and re-implement them efficiently in Databricks.
Experience manipulating structured and unstructured data.
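A brief sketch of the structured/unstructured distinction in Spark terms; the sample rows and the log format are assumptions for illustration:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("formats").getOrCreate()

# Structured: typed columns with an explicit schema
events = spark.createDataFrame([(1, "click")], ["id", "action"])

# Unstructured: free text that needs parsing before it has a schema
logs = spark.createDataFrame([("2024-01-01 ERROR disk full",)], ["value"])
parsed = logs.select(F.split("value", " ").getItem(1).alias("level"))
parsed.show()  # extracts "ERROR" from the raw line
```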
Experience architecting data systems, both transactional systems and data warehouses.
Experience with the SDLC, CI/CD, and operating in dev/test/prod environments.
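On the SDLC/CI/CD point, a small sketch of the kind of unit test a dev/test/prod workflow would run in CI; the transformation under test and the fixture data are hypothetical:

```python
import pytest
from pyspark.sql import SparkSession, functions as F


def add_total(df):
    # Hypothetical transformation under test
    return df.withColumn("total", F.col("price") * F.col("qty"))


@pytest.fixture(scope="session")
def spark():
    return SparkSession.builder.master("local[1]").appName("tests").getOrCreate()


def test_add_total(spark):
    df = spark.createDataFrame([(2.0, 3)], ["price", "qty"])
    assert add_total(df).first()["total"] == 6.0
```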