We all know we are at the peak of the hype cycle for... wait for it... blockchain! We are also already aware of some of the benefits of blockchain, but can blockchain be applied to traditional data management? Though real-world blockchain implementations in the enterprise are few so far, I do believe there is a ton of potential to solve some of the problems that businesses face. But as implementations go up in the industry, can we, as data management practitioners, take advantage of the inherent qualities of a blockchain?
The answer is: maybe! Let me explain by going a little deeper into data management concepts in relation to blockchain.
Blockchain inherently validates the blocks of data it accepts. The data needs to fit a specific structure, and only then can the block be inserted into the blockchain. If the data block doesn’t fit the rigid requirements of the blockchain, it will be rejected. There are many types of blockchain, but overall this validation keeps the data blocks consistent. But consistency is just one dimension of data quality. What about accuracy? Blockchain data is only as accurate as the application allows it to be. In other words, there is no inherent check on the content of the data itself. The "garbage in, garbage out" problem still applies, so the application needs good checks to make sure inaccurate data doesn’t get into a block. What about the remaining data quality (DQ) dimensions, such as completeness and timeliness? Those issues still remain with data in the blockchain.
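To make the difference between consistency and accuracy concrete, here is a minimal sketch in Python. It is a toy hash-linked list, not any real blockchain platform, and the field names (trade_id, counterparty, amount) and helper functions are hypothetical, chosen only for illustration. It shows a malformed record being rejected while a well-formed but wrong record is accepted without complaint.

```python
import hashlib
import json

# Hypothetical block structure: every record must have exactly these typed fields.
REQUIRED_FIELDS = {"trade_id": str, "counterparty": str, "amount": float}


def is_structurally_valid(record):
    """Check only the shape of the record, not whether its values are true."""
    return set(record) == set(REQUIRED_FIELDS) and all(
        isinstance(record[name], expected)
        for name, expected in REQUIRED_FIELDS.items()
    )


def append_block(chain, record):
    """Append a record to the toy chain, rejecting anything that fails the shape check."""
    if not is_structurally_valid(record):
        raise ValueError("record rejected: does not match the block structure")
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    payload = json.dumps(record, sort_keys=True)
    block_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
    block = {"prev_hash": prev_hash, "record": record, "hash": block_hash}
    chain.append(block)
    return block


chain = []

# A correct, well-formed record is accepted.
append_block(chain, {"trade_id": "T-1", "counterparty": "ACME", "amount": 100.0})

# A well-formed record with a wrong amount is accepted too: garbage in, garbage out.
append_block(chain, {"trade_id": "T-2", "counterparty": "ACME", "amount": -999999.0})

# A record missing a required field is rejected by the structural check.
try:
    append_block(chain, {"trade_id": "T-3", "amount": 5.0})
    rejected = False
except ValueError:
    rejected = True
```

After this runs, the chain holds two blocks, one of them inaccurate, and only the incomplete record was turned away. Any accuracy rule, such as a sanity range on the amount, would have to live in the application that calls append_block.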
The distributed ledger paradigm of blockchain could actually be used to manage reference data. It would help collaboration between two non-competing parties who want to maintain shared contractual data. This applies particularly well to financial companies that have to share data with regulatory agencies. It could lead to accurate, automated reference data held on a blockchain, reducing costs and operational risks.
Blockchains will create more silos of data, complicating master data management processes even further. They are inherently used for transactional data, and as traditional apps are replaced by blockchains, there is a danger that the data becomes even more siloed. Moving blockchain node processes to the cloud doesn’t ease this concern either. Therefore there is an even bigger need to manage master data properly. Successful master data management could, in turn, support a successful blockchain implementation.
Due to the transactional nature of blockchain, we really cannot see it being used to store historical data, which is a requirement for building a Data Warehouse. In fact, there will be an even greater need to integrate the data from the private blockchains that IT organizations will develop. Also, blockchain is no replacement for databases. It addresses a completely different use case, and databases will still be needed for reporting and predictive analytics. Some enterprise blockchain platforms, such as Corda, are making a database available where users can actually run SQL statements. These databases will potentially become source systems for integration into a Data Lake or a Data Warehouse.
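As a rough illustration of what treating such a database as a source system might look like, here is a minimal sketch in Python. It uses an in-memory SQLite database as a stand-in for the SQL store a platform might expose. The table names (ledger_states, stg_ledger_states), the columns, and the sample rows are all hypothetical and are not taken from Corda or any other product. The sketch pulls rows recorded after a watermark and lands them in a warehouse staging table with a load timestamp.

```python
import sqlite3
from datetime import datetime, timezone

# Stand-in for the SQL database a blockchain platform might expose (hypothetical schema).
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE ledger_states ("
    "state_id TEXT PRIMARY KEY, party TEXT, amount REAL, recorded_at TEXT)"
)
source.executemany(
    "INSERT INTO ledger_states VALUES (?, ?, ?, ?)",
    [
        ("S-1", "BankA", 250.0, "2023-12-31T23:00:00"),
        ("S-2", "BankB", 975.5, "2024-01-02T09:30:00"),
        ("S-3", "BankA", 120.0, "2024-01-03T14:45:00"),
    ],
)
source.commit()

# Stand-in for the warehouse staging area (also hypothetical).
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE stg_ledger_states ("
    "state_id TEXT, party TEXT, amount REAL, recorded_at TEXT, loaded_at TEXT)"
)


def extract_since(conn, watermark):
    """Return the rows recorded strictly after the given ISO-8601 watermark."""
    cursor = conn.execute(
        "SELECT state_id, party, amount, recorded_at FROM ledger_states "
        "WHERE recorded_at > ? ORDER BY recorded_at",
        (watermark,),
    )
    return cursor.fetchall()


def load(conn, rows):
    """Insert the extracted rows into staging, stamping each with the load time."""
    loaded_at = datetime.now(timezone.utc).isoformat()
    conn.executemany(
        "INSERT INTO stg_ledger_states VALUES (?, ?, ?, ?, ?)",
        [row + (loaded_at,) for row in rows],
    )
    conn.commit()
    return len(rows)


rows = extract_since(source, "2024-01-01T00:00:00")
loaded = load(warehouse, rows)
```

The watermark keeps each run incremental, and the load timestamp lets the warehouse build up the history that the transactional source does not keep. A real integration would use the platform's documented schema and a proper driver instead of SQLite.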
Blockchain is a next-generation technology, and it needs to mature before corporations can use it for internal business operations, let alone for an asset as important as data. But it will be exciting to see how the technology evolves over the next decade. Hopefully, through this blog, readers can better understand the impact of blockchains on traditional data management practices.