Data warehouses are the central data repository that allows Enterprises to consolidate data, automate data operations, and use the central repository to support all reporting, business intelligence (BI), analytics, and decision-making throughout the enterprise.
But designing a data warehouse architecture can be quite challenging.
From questions of scaling, the total cost of ownership, flexibility, and accessibility, the design patterns involved in building and maintaining the data warehousing environment can be demanding.
The two main approaches to designing the data warehouse architecture are the Inmon approach vs the Kimball approach.
In this article, we will dive deeper into what each author proposed, showcase the relative tradeoffs of each approach, and finally contrast and compare the two approaches to make your data architect’s life easier.
#getsmarter
Oops! Something went wrong while submitting the form.
Run a 100% data-driven business without any extra hassle. Pay as you go, starting with our free tier.
First introduced by Ralph Kimball, Kimball’s approach focuses on a bottom-up procedure:
Start by understanding and documenting the most critical business processes, business needs, and business questions being asked.
Document all data sources available throughout the enterprise.
Build ETL pipelines that extract, transform, and load data from the data sources into a denormalized data model. The dimensional model is built as a star schema or a snowflake schema: central fact tables are surrounded by dimension tables.
The dimensional model is usually built around and within dimensional data marts for specific departments.
Advantages of Kimball’s architecture
Simplicity and speed. Kimball’s architecture is much simpler and faster to design and set up.
Understandable. The dimensional data model is easy to understand for non-technical and technical employees alike.
Relevancy. Unlike Inmon’s architecture, Kimball’s bottom-up approach makes all data integrations relevant to the business needs.
Engineering team needs. Kimball’s approach needs fewer engineers with less specialized technical skills to both set up and maintain the data warehouse than Inmon's approach.
Disadvantages of Kimball’s architecture
Data redundancy. Because data is loaded into a dimensional model, there is more data redundancy and therefore a higher chance for errors.
No single source of truth. Data within the data warehouse is designed and organized around data marts. Coupled with data redundancy, Kimball’s architecture does not keep a single source of truth throughout the enterprise.
Less flexible. As data needs change, business requirements alter, and incoming data sources modify their payloads, Kimball’s architecture is less flexible to changes and adapts more slowly.
Incomplete. Kimball's approach starts (and ends) at critical business processes. Therefore it does not offer a complete 360 view of enterprise data. Rather, it is useful for business reporting of specific subject areas.
Inmon’s approach
Bill Inmon, the father of the data warehouse, proposed a different, top-down approach to architecting data warehouses:
Start with the corporate data model. Identify all the different sources of data available to the enterprise. (This is why this approach is also called the Corporate Information Factory or CIF - it starts with the data available to an Enterprise)
From the data and understanding of business needs, identify the key entities (customer, product, order, etc.) and their mutual relationships.
Use the entity structure to build a detailed logical model. The logical model captures in great detail all the attributes of each entity, the mutual relationships, and codependencies between entities, etc. In data modeling terms, the logical model designs logical schemas of entity relationships.
From the logical model, build the physical model. Use ETL processes to extract data from different sources, transform data, and load it into a normalized data model. Each normalized data model keeps data in the 3rd normal form to avoid redundancy. The normalized data model is the core of the data warehouse.
Build data marts for specific departments. Data is accessed via data marts for all reporting needs, the data warehouse acts as a single source of truth.
Advantages of Inmon’s architecture
Flexibility. Inmon’s approach is much more flexible to changing business requirements and more flexible to modifications to the data sources. Because of the ETL process design that leads to normalized data, Inmon’s architecture is more adaptable. The architects change just a couple of normalized tables, and the change is propagated downstream.
Single source of truth. Because of the normalized data model, the data warehouse acts as a single source of truth throughout the enterprise.
Less prone to errors. Normalization avoids data redundancy so both the engineering processes and analytic processes are less prone to errors.
Completeness. Inmon's approach covers all the Enterprise data, so all the reporting needs are covered.
Disadvantages of Inmon’s architecture
Initial set-up and maintenance cost. Setting up and maintaining Inmon’s architecture is much more expensive than the time and investment needed for Kimball’s architecture. Normalized schemas exhibit greater complexity both in design and maintenance than their denormalized counterparts.
Skills needed. Inmon’s approach necessitates highly skilled engineers, which are harder to find and more expensive to keep on the payroll.
More ETL is needed. By separating data marts from the data warehouse, more ETL processes are needed to construct the data marts, therefore leading to more engineering overhead.
Comparison: Kimball vs Inmon
How to decide between Kimball and Inmon’s architectures?
It is all in the tradeoffs between the comparative advantages and disadvantages.
Kimball is the better choice if you want to see results faster, have a small team of engineers, and foresee little changes in the business requirements. Otherwise, the data redundancy could cause anomalies and maintenance costs down the line.
Inmon is the go-to for huge enterprises that wish to see a complete picture of their enterprise data, even if the deployment of the data warehouse is going to cost them more and take longer than Kimball’s counterpart.
Stop working on your data infrastructure, and start using it instead. Create a forever-free account and pay as you grow!
Data Warehousing Tools - Speed up your deployments
Irrespective of which methodology is closer to your heart, tap into Keboola to speed up your data integration engineering.
Keboola is the end-to-end data operations platform that allows you to automate your data pipelines.
It offers more than 250 integrations between sources and databases, data warehouses, and data marts. Use Keboola to speed up your data warehouse deployment. Simply automate the ETL jobs that lead to your Kimball or Inmon architecture of choice.
We use cookies to make Keboola's website a better place. Cookies help to provide a more personalized experience and relevant advertising for you, and web analytics for us. By clicking “Accept”, you agree to the storing of cookies on your device to enhance site navigation, analyze site usage and assist in our marketing efforts. To learn more about the different cookies we're using, check out our Cookie Policy
By clicking “Accept All Cookies”, you agree to the storing of cookies on your device to enhance site navigation, analyze site usage and assist in our marketing efforts. More info