Bill Inmon vs Ralph Kimball vs DataVault

Bill Inmon's data warehousing methodology:

https://www.linkedin.com/today/author/billinmon

versus

Ralph Kimball's data warehousing methodology:

Home

and versus DataVault:

https://en.wikipedia.org/wiki/Data_vault_modeling

Let’s look at all this in more detail.

Ralph Kimball's approach:

First we have the OLTP database:

Then the data marts/data warehouse (in fact, Ralph Kimball is best known for his star schema concept):

Ralph Kimball Star Schema
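To make the star schema idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (fact_sales, dim_date, dim_product), the columns, and the sample rows are all invented for illustration and do not come from Kimball's material. It only shows the general shape: one central fact table holding the measures, surrounded by dimension tables that describe them.

```python
import sqlite3


def build_star_schema(conn):
    """Create a tiny, hypothetical star schema and load a few sample rows."""
    conn.executescript(
        """
        -- Dimension tables: descriptive context for the measures.
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY,
            year     INTEGER,
            month    INTEGER
        );
        CREATE TABLE dim_product (
            product_key INTEGER PRIMARY KEY,
            name        TEXT,
            category    TEXT
        );
        -- Fact table: one row per sale, with foreign keys to each dimension.
        CREATE TABLE fact_sales (
            date_key    INTEGER REFERENCES dim_date(date_key),
            product_key INTEGER REFERENCES dim_product(product_key),
            quantity    INTEGER,
            amount      REAL
        );
        """
    )
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?)",
        [(20240115, 2024, 1), (20240210, 2024, 2)],
    )
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(1, "Laptop", "Hardware"), (2, "Mouse", "Hardware"), (3, "Antivirus", "Software")],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        [
            (20240115, 1, 1, 1000.0),
            (20240115, 2, 2, 40.0),
            (20240210, 3, 1, 50.0),
            (20240210, 1, 1, 1000.0),
        ],
    )


def sales_by_category(conn):
    """Join the fact table to one dimension and sum the amount per category."""
    return conn.execute(
        """
        SELECT p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.category
        ORDER BY p.category
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
print(sales_by_category(conn))
```

Every analytical query follows the same pattern: start from the fact table, join to whichever dimensions are needed, then group and aggregate.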

And finally, the whole data warehouse system looks something like this:

Ralph Kimball Data Warehouse

Source:

Bill Inmon explains why the data marts should not be connected directly to the OLTP database, in the chapter

“Building Independent Data Marts”

As long as there is only one data mart, there is no problem. But there is never only one data mart. Other users hear of the success of the independent data mart and start building their own data marts.

spiderweb1

So we finally have a spiderweb diagram that looks like this:

Spiderweb2

That is why the Enterprise Data Warehouse is needed between the OLTP databases and the data marts.

Something like this:

Bill Inmon Data Warehouse
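The spiderweb argument can be put in numbers. The sketch below assumes the worst case, in which every independent data mart extracts from every OLTP source, and compares it with a hub layout in which each source feeds the Enterprise Data Warehouse once and each mart reads from the warehouse once. The function names and the figures (5 sources, 8 marts) are hypothetical and chosen only for illustration.

```python
def point_to_point_extracts(n_sources, n_marts):
    """Worst case without a hub: each mart pulls from each source."""
    return n_sources * n_marts


def hub_extracts(n_sources, n_marts):
    """With a central warehouse: one feed per source plus one feed per mart."""
    return n_sources + n_marts


# Hypothetical organisation with 5 OLTP sources and 8 data marts.
sources, marts = 5, 8
print(point_to_point_extracts(sources, marts))  # 5 * 8 = 40 interfaces
print(hub_extracts(sources, marts))             # 5 + 8 = 13 interfaces
```

Under these assumptions the number of interfaces grows with the product of sources and marts without a hub, but only with their sum when a warehouse sits in between.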

But Ralph Kimball answers:

Myth 1: Dimensional models and data marts are for summary data.

This myth is the root cause of many ill-designed dimensional models. Because we cannot predict all the questions business users will ask, we need to give them queryable access to the most detailed data. Of course, our data marts will also contain summaries, but only as complementary information. The data marts can also keep historical data as needed by the business.

Myth 2: Dimensional models and data marts are departmental, not enterprise, solutions

Rather than drawing boundaries based on organizational departments, we maintain that data marts should be organized around business processes, such as orders, invoices, and service calls. Multiple business functions often want to analyze the same metrics resulting from a single business process. We strive to avoid duplicating the core measurements in multiple databases around the organization.

What about Bill Inmon’s spiderweb diagram above?

Ralph Kimball explains that this argument falls apart because no one advocates multiple extracts from the same source. The spiderweb diagrams fail to appreciate that the data marts are process-centric, not department-centric, and that data is extracted once from the operational source and presented in a single place.

Myth 3: Dimensional models and data marts are not scalable

Modern fact tables have many billions of rows in them. The dimensional models within our data marts are extremely scalable. Relational DBMS vendors have embraced data warehousing and incorporated numerous capabilities into their products to optimize scalability and performance.

Myth 4: Dimensional models and data marts are only appropriate when there is a predictable usage pattern

This is wrong because, when built at the most granular level, fact tables are extremely flexible.
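A small sketch illustrates why the grain matters. The rows, attribute names, and the total_by helper below are hypothetical. The point is only that facts stored at the most detailed level (here, one row per order line) can be rolled up along any attribute after the fact, including ones nobody planned for.

```python
from collections import defaultdict

# Hypothetical fact rows at the most granular level: one row per order line.
FACT_ORDER_LINES = [
    {"date": "2024-01-15", "customer": "Alice", "product": "Laptop", "amount": 1000.0},
    {"date": "2024-01-15", "customer": "Bob", "product": "Mouse", "amount": 20.0},
    {"date": "2024-02-10", "customer": "Alice", "product": "Mouse", "amount": 20.0},
]


def total_by(rows, attribute):
    """Sum the amount of all rows, grouped by the given attribute."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[attribute]] += row["amount"]
    return dict(totals)


# The same detailed rows answer several different questions.
print(total_by(FACT_ORDER_LINES, "customer"))
print(total_by(FACT_ORDER_LINES, "product"))
print(total_by(FACT_ORDER_LINES, "date"))
```

A table that stored only totals per date could answer the third question but neither of the first two, which is the inflexibility the myth wrongly attributes to dimensional models in general.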

OK, and now the very important question: which of the two authors better anticipated the big data world?

To be continued…

Source:
