Executives and decision makers want to get more data into the right hands to improve products, streamline operations, serve customers better, and discover new markets or opportunities. Two major sets of methods have evolved to enable this wide enterprise data availability, and the analytical power which acts on it.
Business Intelligence and Data Warehousing have both become established in their own right, critical for any organization to understand and implement. Though they are distinct concepts, their synergy is ultimately where businesses derive the greatest value.
From data warehousing to business intelligence
Data warehouses (DW) are centralized repositories exposing high-quality enterprise data to relevant users, and to downstream analytical or reporting processes. Together, these downstream processes and the software tools used by individuals accessing a DW make up business intelligence (BI).
DWs are integrated storage systems first and foremost, though some degree of aggregation, processing, analysis, and reporting can happen within their bounds. BI, however, encompasses all of the tools and strategies used to transform enterprise data into insights and business decisions.
These definitions overlap somewhat, so it’s good to remember that BI tools and applications gather their data from the warehouse, while the warehouse gathers data from raw or external sources, and from itself. Due to this commonality between both concepts, and their greater value when used together, they are increasingly referred to in combination (BI/DW).
What is data warehousing?
Data Warehousing evolved when organizations began moving away from purely functional data behind operations and in transactional systems. With the advent of cheaper hardware, then better software, then the power of cloud computing, processed enterprise data has quickly become the currency of analytics and decision support systems.
In a DW, stakeholders find available, clean, mastered, and useful data. This data sits immediately behind customer-facing applications, external partner systems, various internal BI mechanisms, and individuals from analysts to data scientists to decision makers.
What is business intelligence?
Business intelligence is the application of various tools and technologies for transforming enterprise data into actionable results and insights.
BI includes reporting processes (aggregating data along arbitrary dimensions to give stakeholders a better sense of an organization’s performance or status), visualization (creating static or interactive charts which make raw data more digestible to non-specialists), and an immensely wide variety of more sophisticated analyses (from anomaly detection to predictive systems utilizing machine learning to unsupervised pattern discovery thanks to narrow artificial intelligence).
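The reporting step described above, aggregating data along arbitrary dimensions, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ORDERS` records, their field names, and the `aggregate` helper are hypothetical and do not come from any particular BI tool.

```python
from collections import defaultdict

# Hypothetical order records, as a BI tool might read them from a warehouse table.
ORDERS = [
    {"region": "EMEA", "product": "widget", "revenue": 120.0},
    {"region": "EMEA", "product": "gadget", "revenue": 80.0},
    {"region": "APAC", "product": "widget", "revenue": 200.0},
    {"region": "APAC", "product": "widget", "revenue": 50.0},
]


def aggregate(rows, dimensions, measure):
    """Sum `measure` for each distinct combination of `dimensions`."""
    totals = defaultdict(float)
    for row in rows:
        # The grouping key is the tuple of dimension values for this row.
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


# The same data summarized along one dimension, then along two.
by_region = aggregate(ORDERS, ["region"], "revenue")
by_region_product = aggregate(ORDERS, ["region", "product"], "revenue")
```

Because the dimensions are passed in as a list, the same function can produce a coarse summary for an executive dashboard or a finer breakdown for an analyst without changing the underlying data.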
Data-driven organizations use BI to synthesize the massive, complex information they generate, transforming and summarizing it until all relevant conclusions have been gleaned and decision makers can feel comfortable that they are making the most informed choices for their business.
ELT and new paradigms in BI/DW
Traditionally, organizations feed a data warehouse with a rigid extract, transform, load (ETL) process. ETL takes raw records from internal and external sources, transforms them into established formats according to preset data models, and finally loads the resulting processed data into a central repository such as a DW.
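As a rough illustration of the ETL order of operations, the sketch below uses Python's built-in `sqlite3` module as a stand-in for a warehouse. The sample records, the `sales` table, and the function names are assumptions made for this example, not part of any specific product.

```python
import sqlite3

# Hypothetical raw records from source systems with inconsistent formatting.
RAW_RECORDS = [
    {"customer": " Alice ", "amount": "19.99", "currency": "usd"},
    {"customer": "BOB", "amount": "5", "currency": "USD"},
    {"customer": "", "amount": "12.50", "currency": "usd"},  # fails validation
]


def extract():
    """Extract: a real pipeline would read from files, APIs, or databases."""
    return list(RAW_RECORDS)


def transform(records):
    """Transform: clean and conform records to the preset model before loading."""
    cleaned = []
    for rec in records:
        name = rec["customer"].strip().title()
        if not name:
            continue  # records that do not fit the model never reach the warehouse
        amount_cents = round(float(rec["amount"]) * 100)
        cleaned.append((name, amount_cents, rec["currency"].upper()))
    return cleaned


def load(conn, rows):
    """Load: write only the processed rows into the central repository."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales ("
        "customer TEXT NOT NULL, amount_cents INTEGER NOT NULL, currency TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database standing in for a DW
load(conn, transform(extract()))
rows = conn.execute(
    "SELECT customer, amount_cents, currency FROM sales ORDER BY customer"
).fetchall()
```

Note that the rejected record is gone by the time the data reaches the warehouse. That is the rigidity of ETL: anything the preset model does not anticipate is unavailable to downstream users.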
But the volume, velocity, and variety of modern data, the rise of unstructured formats (i.e., non-relational data), the power of distributed cloud computing, and the democratization of analytics have all contributed to new paradigms beyond conventional BI/DW.
Load first, model and analyze later
As analysts and data scientists gain expertise and access to more versatile, convenient software tools, the idea of self-service data is gathering traction. This is where the remixed extract, load, transform (ELT) process emerges.
As individuals have access to more powerful analytical tools, the need for preprocessed and aggregated data lessens. Data scientists prefer accessing raw data directly, so that they may apply their own experience and models, and ensure beyond a doubt that their results hold water. It’s easier to completely validate conclusions and build personal conviction in governance processes with untouched original data at hand.
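The ELT ordering can be sketched the same way. In this minimal example, again using Python's built-in `sqlite3` as a stand-in for a cloud store, raw records are loaded untouched and the transformation is defined afterwards as a view. The `raw_sales` table, the `clean_sales` view, and the sample data are hypothetical.

```python
import sqlite3

# Hypothetical raw records, kept exactly as they arrived from the source.
RAW_RECORDS = [
    (" Alice ", "19.99", "usd"),
    ("BOB", "5", "USD"),
    ("", "12.50", "usd"),
]

conn = sqlite3.connect(":memory:")

# Extract and Load: land every record as plain text, with no cleaning or filtering.
conn.execute("CREATE TABLE raw_sales (customer TEXT, amount TEXT, currency TEXT)")
conn.executemany("INSERT INTO raw_sales VALUES (?, ?, ?)", RAW_RECORDS)
conn.commit()

# Transform (later, on demand): one analyst's model, expressed as a view
# over the raw table. Other users can define different views on the same data.
conn.execute(
    """
    CREATE VIEW clean_sales AS
    SELECT TRIM(customer)       AS customer,
           CAST(amount AS REAL) AS amount,
           UPPER(currency)      AS currency
    FROM raw_sales
    WHERE TRIM(customer) <> ''
    """
)

raw_count = conn.execute("SELECT COUNT(*) FROM raw_sales").fetchone()[0]
clean_count = conn.execute("SELECT COUNT(*) FROM clean_sales").fetchone()[0]
clean_total = conn.execute(
    "SELECT ROUND(SUM(amount), 2) FROM clean_sales"
).fetchone()[0]
```

Here the record with the missing customer name is still present in `raw_sales`, so a data scientist who disagrees with the filtering rule can inspect it or model it differently. The trade-off is that each consumer takes on more of the transformation work.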
BI has changed from the automated, standardized process it used to be, evolving to include a new set of tools allowing more flexible data exploration and analysis. This in turn has led to the increasing popularity of data lakes, alongside, and occasionally instead of, data warehouses.
Data lakes and business intelligence
Thanks to the growing ease and reduced cost of storing data, data lakes can economically store massive, unstructured, and unprocessed information.
These new centralized data stores can serve both as the ultimate archive or backup and as a hub for advanced self-service reporting and analysis.
The DW’s historical use as a source of truth is being displaced by the purity of information in data lakes. However, warehouses remain useful as a processed source of truth. That is, less advanced users can access data in DWs and trust that although the information is transformed and summarized, they can still achieve reasonable insight with less personal investment of time and effort.
How BI/DW is evolving
The advent of cloud storage and computing is the very reason that the compound term ‘BI/DW’ exists. Before distributed IT systems, both data warehousing and business intelligence were generic terms, distinct or synonymous depending on context. This is because both the storage and the analysis of information were done on-premises and by in-house IT teams, often with individual roles and responsibilities spanning both aspects of BI/DW.
In distributed, online systems, the separation of storage space and computing power has led to greater specialization, and an easier conceptual separation between data ingestion and integration vs. data processing and analysis.
Surveys have shown that employees and end-users are more likely to engage with and utilize BI tools in the cloud than traditional software. The availability of distributed storage and applications continues to grow, so regardless of how BI/DW evolves in the future, it will remain primarily a cloud-based set of technologies.
Getting started with data warehousing and business intelligence
BI/DW is a synergistic whole, composed of distinct methodologies and software, but integrating the storage and analysis of information into one powerful data pipeline. Businesses must understand the interplay between the two, then implement the robust data lakes and warehousing solutions which make valuable enterprise data available for generating insight.
Talend provides a complete, self-contained suite of apps for quickly achieving the high availability, quality, and security of massive data. Try Talend Data Fabric to begin your journey to powerful, integrated, and future-proof BI/DW.