Many blogs and articles are devoted to data virtualization; it has been discussed (sometimes heatedly) at many events; the topic is explained in numerous webinars and webcasts; it is on the radar of nearly all analyst organizations; organizations of all types use the technology today; and the products have matured enough to handle large and complex data environments. Although the term data virtualization is not as popular as SQL, data warehouse or big data, the technology has clearly been accepted by the market. More and more organizations are deploying it to simplify access to their labyrinth of data sources.
But where exactly does data virtualization stand today? Is it hype or reality? This article evaluates the current status of data virtualization, using the results of several market studies to quantify it.
Data virtualization allows an organization to make its enterprise data easily available to business users. From a more technical standpoint, data virtualization makes all the enterprise data that has been dispersed over a multitude of IT systems look like one logical database—even if that data is buried deeply in IT systems. The effect of using data virtualization is that organizations can increase the return on their investment in data processing.
Data virtualization is relatively young. Exactly when the term was coined is not entirely clear; Eric Broughton appears to have been the first to use it, in a paper published in 2005.1 Although not as popular as terms such as big data and cloud, over the last five years data virtualization has slowly moved into the spotlight.
The history of data virtualization is strongly related to that of data federation, which has been around much longer. (For more information about that relationship, see my article, Clearly Defining Data Virtualization, Data Federation and Data Integration.) Data federation means combining a heterogeneous set of autonomous data stores so that they form one large data store. In principle, this is what data virtualization does as well, but where data federation stops, data virtualization continues. In addition to data federation technology, current data virtualization products also support data cleansing, data profiling, data modeling, impact and lineage analysis, and so on. Some products that started out as pure data federation products have evolved into data virtualization products.
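The federation idea described above can be sketched in a few lines of Python. This is a minimal illustration, not how any vendor product works internally: two autonomous data stores (a relational table and a CSV export, both invented here) are exposed behind a single access function, so a consumer sees one logical data set.

```python
import csv
import io
import sqlite3

# Source 1: customer records held in a relational database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
db.executemany("INSERT INTO customers VALUES (?, ?)",
               [(1, "Acme"), (2, "Globex")])

# Source 2: the same kind of records arriving as a CSV export.
csv_data = io.StringIO("id,name\n3,Initech\n4,Umbrella\n")

def federated_customers():
    """Present both autonomous sources as one logical table."""
    for cid, name in db.execute("SELECT id, name FROM customers"):
        yield {"id": cid, "name": name}
    for row in csv.DictReader(csv_data):
        yield {"id": int(row["id"]), "name": row["name"]}

# The consumer never sees that two different stores are involved.
all_customers = list(federated_customers())
```

A real data virtualization server adds the capabilities mentioned above (cleansing, profiling, modeling, lineage) on top of this basic federation pattern.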
The market of data virtualization products includes:
To summarize, data virtualization products and their forerunners have been around for some time. This technology has already matured.
The most recent study of the data virtualization market was performed by Wayne Eckerson of TechTarget in April 2013 (see Data Virtualization: Perceptions and Market Trends). It shows that 35% of the respondents have invested money in data virtualization, 27% have partially deployed the software and 18% have fully deployed it. Furthermore, almost one-third of the organizations have data virtualization under consideration.
These numbers are comparable to the ones coming from a TDWI (The Data Warehousing Institute) study.2 This study indicates that 19% of the organizations have data virtualization currently in use and 31% have plans to implement data virtualization. In July 2012, Ted Friedman of Gartner indicated that approximately 27% of the respondents of a study indicated that they were actively involved in or had plans for deployment of federated or virtualized views of data. Ventana Research indicated in an April 2012 study3 that data virtualization is an advancing priority in information management: 12% have completed data virtualization projects, 11% have initiated projects and 20% have planned a project in which data virtualization will be used. Finally, in a 2012 study, Forrester Research4 predicted that the total software revenues (licenses, maintenance and services) for data virtualization would grow to $8 billion by 2014.
These numbers are promising, but the data virtualization market is clearly not growing explosively. It is growing the way so much enterprise software grows: slowly and steadily. In addition, we have to remember that vendors only started pushing and promoting data virtualization heavily during the last five years, a period that coincided with a global economy under stress. As is generally known, in a poor economy organizations invest less in new technologies, even when a technology could solve real problems and lower the total cost of ownership (TCO).
It is noteworthy that, based on conversations with the dominant data virtualization vendors, Europe lags significantly behind the USA in adopting data virtualization.
Although some of the data virtualization products were initially designed to support ESB/SOA type systems, today most organizations use data virtualization in business intelligence (BI) environments. This was also shown by Wayne Eckerson’s previously mentioned study. These were the use cases for data virtualization:
In many cases, organizations are attracted to data virtualization because of its agility: the speed with which data sources can be integrated and the speed with which integrated data becomes available to end users. Eckerson’s study confirmed this: 66% of the respondents were interested in data virtualization because of agility.
In the long run, the fast-growing interest in self-service BI tools, such as QlikView, Spotfire and Tableau, will boost the adoption of data virtualization. The reason is that self-service BI tools can deliver only a certain level of agility: the moment users ask for a change to a data structure in a data mart or data warehouse, the IT department has to be involved. Developing a BI system with self-service BI tools on top of data virtualization servers leads to a much more agile result. In other words, data virtualization servers can deliver for the data storage side of a BI environment the same level of agility that self-service BI tools currently offer for reporting and analytics.
It is also noteworthy that those not too familiar with the technology assume it can be deployed in small environments only. According to Eckerson's study, this is not true. The study shows that a majority of the organizations (59%) that have deployed data virtualization have implemented it on an enterprise scale, one-quarter (25%) have deployed it at the business unit level and just 14% have deployed it at the departmental level for one or more departments. In other words, the majority deploy data virtualization on an enterprise scale, which indicates it is well suited to large-scale environments.
It is important that the data virtualization products continue to develop in the following three areas:
Data virtualization is an accepted technology whose adoption is growing steadily. Many IT specialists know what the technology has to offer and how it makes a data architecture more flexible. Studies show that 30-35% of organizations are studying, investing in, or deploying the technology today. The BI market will continue to push the deployment of data virtualization, and the fast adoption of NoSQL systems will also increase the demand for it.
In short, data virtualization is not hype; it’s a reality. Most vendors can show an impressive list of organizations deploying the technology today. This is supported by the studies done by renowned analyst organizations.
For those who have just started evaluating this technology, I wish you all the best on your data virtualization journey. End Notes: