Erwin Data Modeler Academic License


Ardelle Abdullah

Aug 4, 2024, 10:16:04 PM
to rupgirawa
Purchasing software is a significant investment, not just in money but also in the time required to become proficient. Professional, focused training from Sandhill ensures that you become productive and effective in the areas you need, quickly and efficiently. It is the cost-effective way to maximise your return.

Across our range of products, Sandhill offers instruction with flexibly scheduled classes that can be mixed, matched and customised to meet your specific requirements. All Sandhill training material includes detailed step-by-step instructions and screenshots where possible.


erwin Data Modeler is the industry-leading data modelling solution that enables organisations to discover, design, visualise, standardise and deploy enterprise data through an intuitive, graphical facility built on industry standards and best practices.


As customers modernize their data estate to Databricks, they consolidate various data marts and EDWs into a single scalable lakehouse architecture that supports ETL, BI and AI. One of the first steps in this journey is taking stock of the existing data models of the legacy systems, then rationalizing and converting them into the Bronze, Silver and Gold zones of the Databricks Lakehouse architecture. A robust data modeling tool that can visualize, design, deploy and standardize the lakehouse data assets greatly simplifies the lakehouse design and migration journey and accelerates data governance.


We are pleased to announce our partnership and integration of erwin Data Modeler by Quest with the Databricks Lakehouse Platform to serve these needs. Data modelers can now model and visualize lakehouse data structures with erwin Data Modeler to build Logical and Physical data models to fast-track migration to Databricks. Data Modelers and architects can quickly re-engineer or reconstruct databases and their underlying tables and views on Databricks. You can now easily access erwin Data Modeler from Databricks Partner Connect!


Reverse engineering is the process of creating a data model from an existing database or DDL script. The modeling tool creates a graphical representation of the selected database objects and the relationships between them; this representation can be a logical or a physical model.
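To make the idea concrete, the sketch below parses table names and foreign-key relationships out of a small DDL script. This is a simplified illustration of the kind of metadata a reverse-engineering pass recovers, not erwin's actual parser, and the table definitions are hypothetical:

```python
import re

# Hypothetical DDL script standing in for an existing database schema
DDL = """
CREATE TABLE customer (
    customer_id INT PRIMARY KEY,
    name VARCHAR(100)
);
CREATE TABLE orders (
    order_id INT PRIMARY KEY,
    customer_id INT REFERENCES customer (customer_id)
);
"""

def extract_model(ddl: str):
    """Return table names and (child, parent) foreign-key pairs found in a DDL script."""
    tables = re.findall(r"CREATE TABLE (\w+)", ddl)
    fks = []
    for child in tables:
        # Grab the body of each CREATE TABLE and scan it for REFERENCES clauses
        body = re.search(rf"CREATE TABLE {child}\s*\((.*?)\);", ddl, re.S).group(1)
        for parent in re.findall(r"REFERENCES (\w+)", body):
            fks.append((child, parent))
    return tables, fks

tables, fks = extract_model(DDL)
print(tables)  # ['customer', 'orders']
print(fks)     # [('orders', 'customer')]
```

From metadata like this, a modeling tool can lay out entities and draw the relationship lines between them.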


Overall, reverse engineering is a valuable, foundational step for data modeling. It enables a deeper understanding of an existing system and its components, controlled access to the enterprise design process, full transparency throughout the modeling lifecycle, gains in efficiency, time and cost savings, and better documentation, which in turn supports governance objectives.


The above scenarios assume you are working with a single data source, but most enterprises have different data marts and EDWs to support their reporting needs. Imagine your enterprise fits this description and is now embarking on creating a Databricks Lakehouse to consolidate its data platforms into one unified cloud platform for BI and AI. In that situation, you can use erwin Data Modeler to convert your existing data models from a legacy EDW into a Databricks data model. In the example below, a data model built for an EDW such as SQL Server, Oracle or Teradata can be implemented in Databricks simply by changing the target database to Databricks.
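At the heart of such a target-database change is a dialect translation: each column type in the source model is mapped to its equivalent in the target. The sketch below shows a deliberately small, hypothetical subset of a SQL Server to Databricks SQL type mapping; erwin's actual conversion rules are far more extensive:

```python
# Hypothetical subset of a SQL Server -> Databricks SQL type mapping;
# a real conversion covers many more types, lengths and options.
TYPE_MAP = {
    "NVARCHAR": "STRING",
    "VARCHAR": "STRING",
    "DATETIME2": "TIMESTAMP",
    "BIT": "BOOLEAN",
    "MONEY": "DECIMAL(19,4)",
    "INT": "INT",
}

def convert_column(sqlserver_type: str) -> str:
    """Map a SQL Server column type to a Databricks equivalent, passing unknowns through."""
    base = sqlserver_type.split("(")[0].upper()  # strip the length spec, e.g. NVARCHAR(50)
    return TYPE_MAP.get(base, sqlserver_type)

print(convert_column("NVARCHAR(50)"))  # STRING
print(convert_column("DATETIME2"))     # TIMESTAMP
```

A modeling tool applies a mapping like this to every column when you switch the model's target server, so the physical model stays consistent with the new platform.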


As you can see in the circled area, this model is built for SQL Server. We will now convert this model and migrate its deployment to Databricks by changing the target server. This kind of straightforward conversion helps organizations quickly and safely migrate data models from legacy or on-premises databases to the cloud and govern those data sets throughout their lifecycle.


In the picture above, we converted a legacy SQL Server-based data model to Databricks in a few simple steps. This easy migration path helps organizations quickly and safely move their data and assets to Databricks, encourages remote collaboration, and enhances security.


Now let's move on to the final part: once the ER model is ready and approved by the data architecture team, you can quickly generate a .sql file from erwin DM, or connect to Databricks and forward engineer the model directly.
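The output of that forward-engineering step is ordinary DDL. The sketch below shows the kind of Databricks (Delta Lake) DDL that might be emitted for a simple entity and writes it to a .sql file; the schema, table and column names are hypothetical, and this is an illustration of the artifact, not erwin's template engine:

```python
from pathlib import Path

# Hypothetical Databricks (Delta Lake) DDL for one entity of the model
ddl = """\
CREATE TABLE IF NOT EXISTS silver.customer (
    customer_id INT,
    name        STRING,
    updated_at  TIMESTAMP
) USING DELTA;
"""

# Forward engineering to a file: the .sql script can then be run on Databricks
out = Path("customer.sql")
out.write_text(ddl)
print(out.read_text().splitlines()[0])  # CREATE TABLE IF NOT EXISTS silver.customer (
```

The same script can be checked into source control, which is where the Git integration discussed next comes in.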


erwin Data Modeler Mart also supports GitHub, so your DevOps team can keep generated scripts in your choice of enterprise source control repository. With Git support, you can easily collaborate with developers and follow version control workflows.


In this blog, we demonstrated how easy it is to create, reverse engineer and forward engineer data models using erwin Data Modeler, build visual data models for migrating your table definitions to Databricks, and reverse engineer data models for data governance and semantic layer creation.


erwin Data Modeler (stylized as erwin, formerly ERwin) is computer software for data modeling. Originally developed by Logic Works, erwin passed through a series of acquirers before the private equity firm Parallax Capital Partners acquired it and incorporated it as a separate entity, erwin, Inc., managed by CEO Adam Famularo.


In April 2016, Parallax Capital Partners, a private equity firm, acquired the software from CA Technologies[13] and appointed Adam Famularo as CEO.[14] The company now operates under a new name stylized as erwin, Inc.[15] In September 2016, erwin announced that it had acquired Corso, a British enterprise architecture service provider.[16] In December of the same year, erwin acquired the business process modeling software Casewise, with a plan to integrate the two.[17] In 2017, erwin released its Data Modeler NoSQL, an enterprise-class data modeling solution for MongoDB. In April 2018, NoSQL data modeling support for Couchbase was added.[18] Also that year, erwin launched a data governance solution with impact analysis and integrations to its business process, enterprise architecture and data modeling suites.[19][20] In January 2018, the company acquired data harvesting technology and data governance consulting services company A&P Consulting.[21]


Successfully implementing a Data Vault solution requires skilled resources and traditionally entails a lot of manual effort to define the Data Vault pipeline and create ETL (or ELT) code from scratch. The entire process can take months or even years, and it is often riddled with errors that slow down the data pipeline. Automating design changes and the code that processes data movement lets organizations accelerate development and deployment in a timely, cost-effective manner, speeding the time to value of the data.


Quest (the company behind erwin by Quest) and Snowflake formed a partnership to collaborate on developing and deploying an enterprise data platform within Snowflake using erwin data modeling, data governance, and automation tools. With that partnership, Quest has been able to create the automation necessary to build out a Data Vault architecture using the features and functionality of Snowflake.


The erwin/Snowflake Data Vault Automation Solution includes the erwin Data Intelligence Suite, erwin Data Modeler, and the Snowflake platform. The solution covers all aspects of the data warehouse, including entity generation, data lineage analysis, and data governance, plus DDL, DML, and ETL generation.


The erwin automation framework within erwin Data Intelligence generates Data Vault models, mappings, and procedural code for any ETL/ELT tool. erwin Data Modeler adds the capability to define a business-centric ontology or business data model (BDM) and use this to generate the Data Vault artifacts.


You can use erwin Data Modeler to create BDMs or take a conceptual data model and create a logical data model that is not dependent on a specific database technology, which is a massive benefit to data architects. You can forward-engineer the DDL required to instantiate the schema for a range of database management systems. The software includes features to graphically modify the model, including dialog boxes for specifying the number of entity relationships, database constraints, and data uniqueness.


Figure 3 details a mapping between a source (in this case a database table) to the BDM. The left side of the mapping defines the source and the right shows an individual BDM entity. The BDM contains the components necessary to identify the Data Vault objects to be generated. In this case, CUSTOMER contains a business key, foreign key relationships, and user-defined attributes that generate a stage object with hub and link hash keys as well as additional mapped attributes that drive satellite generation for the Data Vault model.


Figure 4 shows the automatically generated mapping detailing the physical load between the source table and the target stage table. erwin Smart Data Connectors automatically derive the physical lineage between the source fields and their target Data Vault 2.0 standard components: hub hash keys, link hash key, hash difference key, load date, and record source. The blue lineage flows show the transformations that take place. In this example, MD5 hashes, system timestamps, and record source hard rules are generated and will be detailed in the generated Snowflake SQL later in this post.
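The hash-key hard rules mentioned above can be sketched in a few lines. The example below illustrates the common Data Vault 2.0 convention (it is not erwin's generated code, and the source name is hypothetical): the business key is trimmed and upper-cased before hashing so that formatting differences do not produce different hub keys, and every staged row is stamped with a load date and record source:

```python
import hashlib
from datetime import datetime, timezone

def hub_hash_key(business_key: str) -> str:
    """Data Vault 2.0-style hub hash key: MD5 of the trimmed, upper-cased business key."""
    normalized = business_key.strip().upper()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()

# Hard-rule columns stamped on every staged row
record = {
    "hub_customer_hk": hub_hash_key("  cust-042 "),
    "load_date": datetime.now(timezone.utc).isoformat(),
    "record_source": "SQLSERVER.SALES.CUSTOMER",  # hypothetical record source name
}
print(record["hub_customer_hk"])
```

Because the normalization happens before hashing, "cust-042" and "  CUST-042 " yield the same hub hash key, which is what lets independently loaded sources land on the same hub row.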


Alternatively, a business-driven, top-down approach enables you to automate the Data Vault with a mapping from the data source to a BDM (see Figure 6). With this approach, you can map any metadata regardless of its structure or naming conventions to the BDM to drive the Data Vault generation, which enables you to easily integrate multiple data sources into existing Data Vault data warehouses without refactoring.


erwin offers Data Vault automation bundles that can include bottom-up or top-down automation, or even a combination of the two to meet acceleration needs. With proper tagging of well-defined data sources, you can apply bottom-up automation to accelerate delivery, or you can map less-defined data sources to the BDM to properly define the target Data Vault structures.
