Inmon vs Kimball – DWHanswers

Introduction:

In data warehousing, we often hear discussions about whether a person’s or organisation’s data warehouse follows Bill Inmon’s or Ralph Kimball’s way of thinking; we’ll call these ways of thinking “paradigms”. But what are they, and how do they differ? More importantly, why do they matter when considering how to build your data warehouse?

The Paradigms:

Inmon’s paradigm: A data warehouse is only one part of the overall business intelligence system. An enterprise has one (and only one) data warehouse, and data marts source their information from that data warehouse. In the Inmon paradigm, information is stored in normalised form, typically third normal form (3NF) or higher.

Kimball’s paradigm: The data warehouse is the conglomerate of all data marts within the enterprise. Data is stored in a dimensional model, with fact and dimension tables arranged in a star or snowflake schema.
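To make the difference in storage concrete, here is a minimal sketch of the same sales data laid out both ways, using Python's built-in sqlite3 module. The table names, column names and sample rows are all invented for illustration; neither paradigm prescribes them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalised (3NF) layout: each attribute is stored once and reached through keys.
CREATE TABLE category (category_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE product (
    product_id  INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    category_id INTEGER NOT NULL REFERENCES category(category_id)
);
CREATE TABLE sale (
    sale_id    INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL REFERENCES product(product_id),
    channel    TEXT NOT NULL,
    amount     INTEGER NOT NULL
);

-- Dimensional (star) layout: one fact table joined to flattened dimensions.
CREATE TABLE dim_product (
    product_key   INTEGER PRIMARY KEY,
    product_name  TEXT NOT NULL,
    category_name TEXT NOT NULL   -- category is denormalised into the dimension
);
CREATE TABLE dim_channel (channel_key INTEGER PRIMARY KEY, channel_name TEXT NOT NULL);
CREATE TABLE fact_sales (
    product_key INTEGER NOT NULL REFERENCES dim_product(product_key),
    channel_key INTEGER NOT NULL REFERENCES dim_channel(channel_key),
    amount      INTEGER NOT NULL
);

-- Hypothetical sample data, loaded identically into both layouts.
INSERT INTO category VALUES (1, 'Books'), (2, 'Games');
INSERT INTO product  VALUES (10, 'Atlas', 1), (11, 'Chess set', 2);
INSERT INTO sale     VALUES (100, 10, 'store', 20), (101, 11, 'website', 35), (102, 10, 'website', 20);

INSERT INTO dim_product VALUES (10, 'Atlas', 'Books'), (11, 'Chess set', 'Games');
INSERT INTO dim_channel VALUES (1, 'store'), (2, 'website');
INSERT INTO fact_sales  VALUES (10, 1, 20), (11, 2, 35), (10, 2, 20);
""")

# Sales by category from the normalised tables: two joins to reach the category name.
normalised_totals = conn.execute("""
    SELECT c.name, SUM(s.amount)
    FROM sale s
    JOIN product p  ON p.product_id = s.product_id
    JOIN category c ON c.category_id = p.category_id
    GROUP BY c.name ORDER BY c.name
""").fetchall()

# The same question against the star schema: one join from fact to dimension.
star_totals = conn.execute("""
    SELECT d.category_name, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product d ON d.product_key = f.product_key
    GROUP BY d.category_name ORDER BY d.category_name
""").fetchall()

print(normalised_totals)
print(star_totals)
```

Both queries answer the same question from the same data. The normalised layout avoids repeating the category name, while the star layout trades that repetition for simpler reporting queries.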

Case Study:

An example scenario might help to make the differences a little clearer: a business owner asks for a way to merge three different operational data stores (ODSs) of company data into one central pot, to allow the business to understand sales made either in the store or on the website, versus the amount of stock left in the warehouse. Reporting will be required to satisfy both the finance department and the board.

The question mark is where a decision to adopt an Inmon or Kimball paradigm comes in.

At a high level, the Inmon Data Warehouse might look something like this:

Data is sourced from the various ODSs and merged into a single repository, the EDW (Enterprise Data Warehouse). Data is then fed to data marts for consumption by the users. The Inmon method relies on a strong, principled foundation, as changes to the data warehouse will affect the information held in the adjoining data marts.
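The flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three source extracts, the field names, and the functions build_edw, finance_mart and stock_mart are all hypothetical, and a real EDW would be a normalised database rather than an in-memory dictionary.

```python
from collections import defaultdict

# Hypothetical extracts from the three operational sources in the case study.
store_sales = [{"sku": "A1", "qty": 2, "amount": 40}, {"sku": "B2", "qty": 1, "amount": 35}]
web_sales = [{"sku": "A1", "qty": 1, "amount": 20}]
warehouse_stock = [{"sku": "A1", "on_hand": 10}, {"sku": "B2", "on_hand": 4}]


def build_edw(store, web, stock):
    """Merge every source into one repository, tagging each sale with its channel."""
    sales = [dict(row, channel="store") for row in store]
    sales += [dict(row, channel="website") for row in web]
    return {"sales": sales, "stock": {row["sku"]: row["on_hand"] for row in stock}}


def finance_mart(edw):
    """Data mart for finance: revenue per sales channel, derived only from the EDW."""
    totals = defaultdict(int)
    for row in edw["sales"]:
        totals[row["channel"]] += row["amount"]
    return dict(totals)


def stock_mart(edw):
    """Data mart for the board: units sold versus stock on hand, per product."""
    sold = defaultdict(int)
    for row in edw["sales"]:
        sold[row["sku"]] += row["qty"]
    return {sku: {"sold": sold[sku], "on_hand": on_hand} for sku, on_hand in edw["stock"].items()}


edw = build_edw(store_sales, web_sales, warehouse_stock)
finance = finance_mart(edw)
stock_report = stock_mart(edw)
print(finance)
print(stock_report)
```

Because both marts read from the same edw object, they cannot disagree about the underlying facts, but any change to build_edw ripples into every mart that depends on it.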

The common phrase used by advocates of the Inmon paradigm is “one version of the truth”. That isn’t technically true, but it is, at least, “one version of the facts”.

Below is a high-level example of how a Kimball Data Warehouse might look:

Notice how the Kimball paradigm merges the role of the data mart into the data warehouse. The “Finance” and “Order” data marts would be stored in two separate schemas within the same data warehouse.

The common assumption is that the Kimball design requires fewer “moving parts” to arrive at the same answer, but that is an oversimplification. The Kimball design could house the same data within the two schemas in entirely different ways, or worse still, exclude data that the other includes; this could lead to confusion over the validity of the data.
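One way to catch this kind of divergence early is a simple reconciliation check between marts. The sketch below is an assumption-laden illustration, not part of Kimball's method: the two sample marts, the order_id and amount fields, and the reconcile function are all hypothetical.

```python
# Hypothetical rows from two data marts that both claim to describe the same orders.
finance_rows = [
    {"order_id": 1, "amount": 40},
    {"order_id": 2, "amount": 35},
    {"order_id": 3, "amount": 20},
]
order_rows = [
    {"order_id": 1, "amount": 40},
    {"order_id": 3, "amount": 18},  # stored differently, e.g. net of a discount
    # order 2 is missing from this mart altogether
]


def reconcile(left, right):
    """Report orders present in only one mart, and orders whose amounts differ."""
    left_by_id = {row["order_id"]: row["amount"] for row in left}
    right_by_id = {row["order_id"]: row["amount"] for row in right}
    shared = left_by_id.keys() & right_by_id.keys()
    return {
        "only_in_left": sorted(left_by_id.keys() - right_by_id.keys()),
        "only_in_right": sorted(right_by_id.keys() - left_by_id.keys()),
        "amount_mismatch": sorted(k for k in shared if left_by_id[k] != right_by_id[k]),
    }


report = reconcile(finance_rows, order_rows)
print(report)
```

A non-empty report does not say which mart is right; it only shows where the two schemas tell different stories and need an agreed definition.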

In reality, the data warehouse systems in most enterprises are closer to Ralph Kimball’s ideal; this is because most data warehouses start life as a departmental effort, and as a result only a single schema (or data mart) is required. Only when more data marts are built later do they evolve into a data warehouse.

Conclusion:

It’s important to understand that there is no right or wrong answer, as the two paradigms represent entirely different data warehousing methodologies, but one might suit your organisation or personal working habits better than the other. Inmon’s paradigm is more rigid (it is conventionally described as top-down: the enterprise data warehouse is designed first and the data marts are fed from it), and if a change is made, it could affect every level. Kimball’s paradigm is suited to swift development and change (it is conventionally described as bottom-up: individual data marts are built first); since the data marts are separated, a change is unlikely to affect other areas.

Both have pros and cons, though, and the choice often boils down to personal taste and time constraints.