Data Modeling

Data Modeling

Representing as a concept of data structure (tables) necessary for database, this is a very powerful model of communicating business requirements.

Overview

A data model visually describes the nature of data, business rules that govern the data, and how this is organized in the database. A data model basically consists of two parts - logical design and physical design.

Data model enables functional and technical team in helping with the design of the database. Whereas functional team has more Business Analyts, Business Managers, Smart Management Experts, End Users etc., and Technical Teams refer to having more DBAs, Data Modelers who are responsible for designing the data model and they communicate very effectively with the functional team to get the business requirements and technical teams to implement the database.

Data Modeling Tools

To transform your business, there many data modeling tools available and they change the business requirements into logical data model, and logical data model into physical data model. To generate SQL code for creating database, physical data model and other tools can be used.

Business Process

Tools: What to Learn?

Powerful data models are created with the help of Data modeling tools.

Following are the innumerable options to understand and learn about data modeling tools to build these.

DM Tools - Erwin

Erwin also called as All Fusion Erwin Data Modeler is a leading data modeling tool from Computer Associates, that delivers several softwares for enterprise management, storage management solutions, security solutions, application life cycle management, data management and business intelligence.

DM Tools - Xcase

Xcase, apart from being a powerful and rationale data modeling tool, it also helps in the automation of database creation and maintenance. It also makes the modeling process much easy, simple and visual. The tool provides an excellent support for the leading DBMS and also most impressive set of capabilities.

With the help of Xcase, data modelers create data models either from the beginning or from existing databases with the medium of Reverse Engineering module. As the model is created, XcaseForward Engineering module generates all the DDL scripts required to create a new database or update an existing one.

Development Cycle

Acquiring Business Requirements - First Phase

Data Modelers are required to interact with business analysts to achieve the functional requirements and with end users to find out the reporting.

Conceptual Data Modeling (CDM) - Second Phase

This data model has all major entities, relationships and it is likely not have more details about attributes and is often used in the INITIAL.

Planning Phase

Logical Data Modeling (LDM) - Third Phase

Represented as the actual implementation of a conceptual model in a logical data model, this is the version of the model that depicts all of the business requirements of an organization.

DM Standards

Standardization Needs & Modeling Data As many data modelers may be working on the different subject areas of a data model and all of them are required to use the same naming, writing definitions and business rules.

Now B2B transactions are most frequently occurring and standardization allows in understanding the business in a better manner. If there is any inconsistency in the column names and definition, it would create a chaos in the business.

For instance, as a data warehouse is designed, it receives data from many sources and each source has its own names, data types etc., These disturbances can be removed if a proper standardization is ensured across the organization.

Table Names Standardization:

A full name to the tables provides an idea about data and all that it is about. Abbreviation of table names is not required. However, it may differ depending on the organization's standards.

If the length of a table name exceeds the database standards, abbreviating the table names may be permitted. Some general guidelines are listed below used as either prefix or suffix for the table.

Create a Data Model

There are general guidelines to design a standard data model whereas in real time, a data model may not be created in the same manner as described here. It depends on the enterprise's requirements, and some steps may be removed or added to these:

Data Flow Modeling (DFD)

Create a Data Model

These are the general guidelines to create a standard data model and in real time, a data model may not be created in the same sequential manner as shown below. Based on the enterprise's requirements, some of the steps may be excluded or included in addition to these.

Data Modeler Role

Business Requirement Analysis:
  • Interact with Business Analysts to get the functional requirements.
  • Interact with end users and find out the reporting needs.
  • Conduct interviews, brain storming discussions with project team to get additional requirements
  • Gather accurate data by data analysis and functional analysis.

Modeling Reports

From Data Modeling tools, reports can be easily generated for technical and business needs. The reports that have been generated from logical data model and physical data model are called as business reports and technical reports respectively. Most of the data modeling tools provide default reports like subject area reports, entity reports, attribute reports, table reports, column reports, indexing reports, relationship reports etc. The advantage of these reports is, whether they are technical or non-technical, everybody would understand what is going on within the organization.

Conceptual DM

Conceptual data model includes all major entities and relationships and does not contain much detailed level of information about attributes and is often used in the INITIAL PLANNING PHASE.

Enterprise DM

The development of a common consistent view and understanding of data elements and their relationships across the enterprise is referred to as Enterprise Data Modeling. This type of data modeling provides access to information scattered throughout an enterprise under the control of different divisions or departments with different databases and data models.

Data Flow Modeling (DFD)

Logical DM

This is the actual implementation and extension of a conceptual data model. A Logical data model is the version of a data model that represents the business requirements(entire or part) of an organization and is developed before the physical data model.

Physical DM

Physical data model includes all required tables, columns, relationships, database properties for the physical implementation of databases. Database performance, indexing strategy, physical storage and denormalization are important parameters of a physical model.

Logical vs Physical

When a data modeler works with the client, his title may be a logical data modeler or a physical data modeler or combination of both. A logical data modeler designs the data model to suit business requirements, creates and maintains the lookup data, compares the versions of data model, maintains change log, generate reports from data model and whereas a physical data modeler has to know about the source and target databases properties.

A physical data modeler should know the technical-know-how to create data models from existing databases and to tune the data models with referential integrity, alternate keys, indexes and how to match indexes to SQL code. It would be good if the physical data modeler knows about replication, clustering and so on.

Data Flow Modeling (DFD)

Relational(OLTP) DM

Relational Data Model is a data model that views the real world as entities and relationships. Entities are concepts, real or abstract about which information is collected. Entities are associated with each other by relationship and attributes are properties of entities. Business rules would determine the relationship between each of entities in a data model.V

Dimensional DM

Dimensional Data Modeling comprises of one or more dimension tables and fact tables. Good examples of dimensions are location, product, time, promotion, organization etc. Dimension tables store records related to that particular dimension and no facts(measures) are stored in these tables. For example, Product dimension table will store information about products(Product Category, Product Sub Category, Product and Product Features) and location dimension table will store information about location( country, state, county, city, zip. A fact(measure) table contains measures(sales gross value, total units sold) and dimension columns. These dimension columns are actually foreign keys from the respective dimension tables.

Relational vs Dimensional

Relational Data Model is a data model that views the real world as entities and relationships. Entities are concepts, real or abstract about which information is collected. Entities are associated with each other by relationship and attributes are properties of entities. Business rules would determine the relationship between each of entities in a data model.v

Dimensions

Dimension Table

Dimension table is one that describe the business entities of an enterprise, represented as hierarchical, categorical information such as time, departments, locations, and products. Dimension tables are sometimes called lookup or reference tables.

Data Flow Modeling (DFD)

Location Dimension

In a relational data modeling, for normalization purposes, country lookup, state lookup, county lookup, and city lookups are not merged as a single table. In a dimensional data modeling(star schema), these tables would be merged as a single table called LOCATION DIMENSION for performance and slicing data requirements. This location dimension helps to compare the sales in one region with another region. We may see good sales profit in one region and loss in another region. If it is a loss, the reasons for that may be a new competitor in that area, or failure of our marketing strategy etc.

Slowly Changing Dimensions

Dimensions that change over time are called Slowly Changing Dimensions. For instance, a product price changes over time; People change their names for some reason; Country and State names may change over time. These are a few examples of Slowly Changing Dimensions since some changes are happening to them over a period of time.

Slowly Changing Dimensions are often categorized into three types namely Type1 , Type2 and Type3 . The following section deals with how to capture and handling these changes over time.

The "Product" table mentioned below contains a product named, Product1 with Product ID being the primary key. In the year 2004, the price of Product1 was $150 and over the time, Product1's price changes from $150 to $350. With this information, let us explain the three types of Slowly Changing Dimensions.

Business Process