In this blog series I will take you through the ins and outs of Data Vault. We will start right at the beginning with the fundamental concepts and then go a bit deeper into specifics, finishing with how Exasol and Data Vault integrate to give organizations flexibility and performance for their analytics.
What is Data Vault?
Let’s start with the basics. Data Vault is a detail-oriented data modeling approach that keeps track of data and its history. It enables organizations to be more agile than they can be with dimensional and normalized data modeling techniques. In their data and analytics environments, businesses are seeing growth in the variety of data, as well as in its volume and distributed nature. The flexibility of the Data Vault modeling technique enables them to adapt quickly to the changing context they operate in.
Dan Linstedt created the Data Vault approach in the 1990s before releasing it to the public in 2000. In 2013, Data Vault 2.0 was released, introducing enhancements around Big Data and NoSQL, as well as integrations for unstructured and semi-structured data.
With the Data Vault approach, Linstedt wanted to enable data architects and data engineers to build a Data Warehouse faster, i.e. with a shorter implementation timeframe, and in a way that more effectively addresses the needs of the business.
Shorter implementation cycles save not only time but also cost. They also help organizations ensure that the business requirements for the Data Warehouse (DWH) are still valid when the project is complete, rather than facing shifting goalposts that negatively impact timelines and budgets.
What are the benefits of using Data Vault?
Flexibility is a key factor for organizations choosing the Data Vault approach. The popular agile approach to project management aligns closely with the concepts behind Data Vault, and the result is a nimbleness that businesses can apply to their data strategy.
As a result, data models created using Data Vault can scale according to the requirements of an organization. Given the potential cost implications of significantly increasing data storage and processing, it is key to have a data model that responds flexibly to the necessary changes.
When it comes to loading data into the Data Warehouse, Data Vault allows for parallelization because the modeling approach has fewer points where data needs to be synchronized. This results in faster data loading processes, a key benefit especially for organizations dealing with very large data volumes and those handling real-time or near real-time data inserts.
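To make the parallelization point concrete, here is a minimal sketch in Python. It rests on assumptions the text above does not spell out: it uses the standard Data Vault table types of hubs (business keys) and links (relationships between keys), and it follows the Data Vault 2.0 practice of deriving keys by hashing business keys, so no loader has to wait for another one to hand out a key. The table names, the sample rows, and the choice of SHA-256 are all hypothetical.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def hash_key(*business_keys):
    """Derive a deterministic key from one or more business keys."""
    normalized = "|".join(key.strip().upper() for key in business_keys)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Hypothetical staged rows; in practice these would come from a staging table.
STAGED_ORDERS = [
    {"customer_no": "C-1", "order_no": "O-10"},
    {"customer_no": "C-2", "order_no": "O-11"},
    {"customer_no": "C-1", "order_no": "O-12"},
]


def load_customer_hub(rows):
    # One entry per distinct customer business key.
    return {hash_key(r["customer_no"]): r["customer_no"] for r in rows}


def load_order_hub(rows):
    # One entry per distinct order business key.
    return {hash_key(r["order_no"]): r["order_no"] for r in rows}


def load_customer_order_link(rows):
    # The link computes its parent keys itself instead of looking them up
    # in the hubs, so it does not depend on the hub loads having finished.
    return {
        hash_key(r["customer_no"], r["order_no"]): (
            hash_key(r["customer_no"]),
            hash_key(r["order_no"]),
        )
        for r in rows
    }


def load_all(rows):
    """Run all three loaders concurrently and return their results in order."""
    loaders = [load_customer_hub, load_order_hub, load_customer_order_link]
    with ThreadPoolExecutor(max_workers=len(loaders)) as pool:
        futures = [pool.submit(loader, rows) for loader in loaders]
        return [future.result() for future in futures]


customer_hub, order_hub, customer_order_link = load_all(STAGED_ORDERS)
```

Because every key is computed from the source data alone, the three loaders share no state and can run in any order. The in-memory dictionaries stand in for database tables; a real load would issue inserts against the warehouse instead.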
As Data Vault has a strong focus on the historical tracking of data, its data models can be audited easily and effectively. With data security regulations in place to protect people’s data, having an auditable data model supports compliance with those regulations.
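As an illustration of what historical tracking can look like, the following Python sketch models an insert-only table of descriptive attributes, which Data Vault calls a satellite. This is a simplified assumption rather than a full implementation: the class names, the in-memory storage, and the sample customer are hypothetical, and loads are assumed to arrive in chronological order. Every change is appended with a load timestamp and a record source, and nothing is ever overwritten.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SatelliteRow:
    business_key: str
    load_ts: datetime
    record_source: str
    attributes: tuple  # sorted (name, value) pairs, so versions compare easily


class CustomerSatellite:
    """Insert-only store: existing rows are never updated or deleted."""

    def __init__(self):
        self._rows = []

    def history(self, business_key):
        """Return every stored version for a key, oldest first."""
        rows = [r for r in self._rows if r.business_key == business_key]
        return sorted(rows, key=lambda r: r.load_ts)

    def as_of(self, business_key, ts):
        """Return the version that was current at time ts, or None."""
        versions = [r for r in self.history(business_key) if r.load_ts <= ts]
        return versions[-1] if versions else None

    def load(self, business_key, attributes, load_ts, record_source):
        """Append a new version only if the attributes changed."""
        payload = tuple(sorted(attributes.items()))
        versions = self.history(business_key)
        if versions and versions[-1].attributes == payload:
            return False  # nothing changed, so no new row is written
        self._rows.append(
            SatelliteRow(business_key, load_ts, record_source, payload)
        )
        return True


def day(n):
    # Helper for readable sample timestamps in January 2024 (UTC).
    return datetime(2024, 1, n, tzinfo=timezone.utc)


satellite = CustomerSatellite()
added = [
    satellite.load("C-1", {"city": "Berlin"}, day(1), "crm"),
    satellite.load("C-1", {"city": "Berlin"}, day(2), "crm"),  # unchanged
    satellite.load("C-1", {"city": "Munich"}, day(3), "crm"),
]
```

An auditor can then call `satellite.history("C-1")` to see each version with its load time and source, or `satellite.as_of("C-1", day(2))` to reconstruct what the warehouse knew on a given date.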
In summary, the key benefits of using Data Vault are flexibility, scalability, faster data loading through parallelization, and auditability.
So, that summarizes the advantages of working with Data Vault, but what about the challenges? In my next blog post, I’ll be looking at the limitations and how you can overcome them with Exasol.
Eva Murray, Technology Evangelist, Exasol