Big data is a big deal, and the race to harness the promise of data for more profit is on in almost every industry. Many business leaders wonder how to dive into the big data pool without drowning.
To fully leverage big data, it's helpful to take a step back and understand the key features of modern data architecture and how that architecture can turn raw data into reliable, actionable insights. It's also good to know what a data architect is and what they do to make data useful.
What is data architecture?
Data architecture is the process of standardizing how organizations collect, store, transform, distribute, and use data. The goal is to deliver relevant data to people who need it, when they need it, and help them make sense of it.
For decades, if a business strategist wanted data, they would request it from IT, which would create a system to deliver it. This tedious, time-consuming process often resulted in something other than what the strategist expected or needed. In this environment, business strategy was limited by the difficulties of accessing the right data at the right time.
The availability and growth of real-time data from internal and external sources has pushed business strategists to demand more and faster insight from data.
The promise of modern data architecture design is that a well-designed process puts business strategists and technical expertise at the same table. Together, they can determine what data is needed to propel the business forward, how that data can be sourced, and how it can be distributed to provide actionable information for decision makers.
What has pushed big data into the real world is the growing influence of the cloud, which provides the fast, easy, and low-cost scalability that modern data architecture requires. The cloud also allows organizations to pool much or all of their data in one place where, ideally, one master version of the data is available to all who need it.
What is a data architect?
A data architect is the mastermind behind data architecture, translating business needs from various business units into data and system requirements. Starting with the business needs and objectives, the data architect creates a technology roadmap to meet the objectives. They create blueprints for data flows and processes that store and distribute data from multiple sources to the people who need it.
The data architect is the collaborator-in-chief who coordinates internal stakeholders spanning multiple departments, business partners, and external vendors around the organization's objectives to define a data strategy. They do this by:
- Defining the data vision by translating business requirements into technical requirements, which become the basis for internal data standards and policies.
- Defining the data architecture, including standards for data models, metadata, security, reference data such as product catalogs, and master data such as inventory and suppliers.
- Defining a structure that decision makers can use to create and improve data systems.
- Defining data flows that govern which parts of the organization generate data, which parts use the data, and how data flows are managed.
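To make the last point concrete, here is a minimal sketch of how documented data flows might be recorded and queried. It is an illustration only: the `DataFlow` class, the `FLOWS` catalog, and every dataset, unit, and steward name are hypothetical and do not come from any particular tool.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class DataFlow:
    """One documented flow: who produces a dataset and who consumes it."""
    dataset: str
    producer: str
    consumers: Tuple[str, ...]
    steward: str  # team responsible for managing the flow


# Hypothetical catalog; every name here is illustrative.
FLOWS = [
    DataFlow("orders", "ecommerce", ("finance", "marketing"), "sales-ops"),
    DataFlow("inventory", "warehouse", ("ecommerce", "finance"), "supply-chain"),
    DataFlow("campaigns", "marketing", ("finance",), "marketing-ops"),
]


def consumers_of(dataset: str) -> Tuple[str, ...]:
    """Return the units that use a dataset, or an empty tuple if it is unknown."""
    for flow in FLOWS:
        if flow.dataset == dataset:
            return flow.consumers
    return ()


def datasets_used_by(unit: str) -> List[str]:
    """List every dataset a business unit consumes, in alphabetical order."""
    return sorted(flow.dataset for flow in FLOWS if unit in flow.consumers)
```

Even a simple catalog like this lets an architect answer questions such as "who is affected if the orders feed changes?" before a change is made.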
Characteristics of effective data architecture.
Data architecture is "modern" if it's built around certain characteristics:
- User-driven: In the past, data was static and access was limited. Decision makers didn't necessarily get what they wanted or needed, but what was available. In modern data architecture, business users can confidently define the requirements, because data architects can pool data and create solutions to access it in ways that meet business objectives.
- Built on shared data: Effective data architecture is built on data structures that encourage collaboration. Good data architecture eliminates silos by combining data from all parts of the organization, along with external sources as needed, into one place to eliminate competing versions of the same data. In this environment, data is not bartered among business units or hoarded, but is seen as a shared, companywide asset.
- Automated: Automation removes the friction that made legacy data systems tedious to configure. Processes that took months to build can now be completed in hours or days using cloud-based tools. If a user wants access to different data, automation enables the architect to quickly design a pipeline to deliver it. As new data is sourced, data architects can quickly integrate it into the architecture.
- Driven by AI: Smart data architecture takes automation to a new level, using machine learning (ML) and artificial intelligence (AI) to adjust to new conditions, raise alerts, and recommend solutions. ML and AI can identify data types, identify and fix data quality errors, create structures for incoming data, identify relationships for fresh insights, and recommend related data sets and analytics.
- Elastic: Elasticity allows companies to scale up or down as needed. Here, the cloud is your best friend, as it allows on-demand scalability quickly and affordably. Elasticity allows administrators to focus on troubleshooting and problem solving rather than on exacting capacity calibration or overbuying hardware to keep up with demand.
- Simple: Simplicity trumps complexity in efficient data architecture. Do you need a show dog or a workhorse? Strive for simplicity in data movement, data platforms, data assembly frameworks, and analytic platforms.
- Secure: Security is built into modern data architecture, ensuring that data is available on a need-to-know basis as defined by the business. Good data architecture also recognizes existing and emerging threats to data security, and ensures compliance with regulations such as HIPAA and GDPR.
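As one illustration of the "Secure" characteristic, need-to-know access can be sketched as a policy that maps each role to the fields it may see. This is a simplified sketch under stated assumptions, not a complete security model: the `POLICY` table, the role names, and the field names are all hypothetical.

```python
from typing import Any, Dict, Set

# Hypothetical need-to-know policy: the fields each role is allowed to see.
POLICY: Dict[str, Set[str]] = {
    "analyst": {"order_id", "region", "amount"},
    "support": {"order_id", "customer_email"},
}


def visible_fields(record: Dict[str, Any], role: str) -> Dict[str, Any]:
    """Return only the fields the role may see; unknown roles see nothing."""
    allowed = POLICY.get(role, set())
    return {key: value for key, value in record.items() if key in allowed}


# Illustrative record with made-up values.
order = {
    "order_id": 1001,
    "region": "EU",
    "amount": 49.90,
    "customer_email": "jane@example.com",
}
analyst_view = visible_fields(order, "analyst")
guest_view = visible_fields(order, "guest")
```

Defaulting unknown roles to an empty view means access has to be granted explicitly by the business rather than assumed.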
Data architecture and the cloud.
Big data and variable workloads require organizations to have a scalable, elastic architecture to adapt to new requirements on demand. Fortunately, the cloud provides this scalability at affordable rates. The cloud's ability to efficiently allow administrators to scale up or down has led to new applications and use cases, such as on-demand development and test environments, as well as playgrounds for prototyping and analysis.
Another cloud advantage is affordable system resilience. Much of modern data architecture runs on large server farms in the cloud, and modern cloud providers offer redundancy, failover, and good service level agreements. The cloud also allows administrators to set up mirror images in geographically diverse locations for disaster recovery at a low cost.
Tipico, a German leader in sports betting, recently moved all of its data to the cloud to cut costs and to support real-time data gathering as part of its data architecture. This power and flexibility allows Tipico to understand customer interests in real time, so it can target customers with relevant offers, which has increased response rates. The cloud-based data architecture allows the company to be more data driven, have more confidence in the data it gets and uses, and make decisions faster.
Data architecture vs. information architecture.
While data architecture is all about sourcing and massaging raw data into a shareable format, information architecture is the process of turning the data into business intelligence. It's only when data is combined, correlated, and analyzed that information architecture begins to shed light. If data architecture is the power plant, then information architecture is the light fixture.
Yesterday's sales figures don't tell you much on their own. When they are put into historical context and compared with costs and customer retention rates, you can see not only how the data changes over time but also why it changed.
For example, as a marketing executive, you want to know if a recent sales uptick was because of a promotion or just a coincidence. Was it an unrelated spike in demand? Or a nervous sales team attacking its quotas? Was the promotion really successful? Information architecture delivers the deep insights that managers and executives need to make confident decisions on the next move, like whether to pivot to something new or move forward with the current plan.
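The promotion question above can be framed as a simple comparison against a historical baseline. The sketch below is only a starting point: the `uplift` function is hypothetical, the daily figures are made up for illustration, and a higher average during the promotion does not by itself prove that the promotion caused it.

```python
from statistics import mean
from typing import Sequence


def uplift(baseline: Sequence[float], promo: Sequence[float]) -> float:
    """Relative change in average daily sales during a promotion versus a baseline."""
    base_avg = mean(baseline)
    if base_avg == 0:
        raise ValueError("baseline average is zero; uplift is undefined")
    return (mean(promo) - base_avg) / base_avg


# Made-up daily sales figures, for illustration only.
baseline_days = [100.0, 110.0, 90.0, 100.0]   # before the promotion
promo_days = [120.0, 130.0, 110.0]            # during the promotion

change = uplift(baseline_days, promo_days)    # 0.2, i.e. 20% above baseline
```

Ruling out coincidence would also mean comparing this figure with past seasonal swings, costs, and retention rates, which is the kind of context information architecture is meant to supply.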
Three best practices for getting started.
In developing a data architecture strategy, business leaders should keep these considerations top of mind:
- Collaboration drives the process. Good data architecture ensures that the business and IT facets of an organization are collaborating on shared goals and outcomes. Decision makers define what data will have the highest business impact, and data architects build a path to sourcing that data and making it accessible.
- Make data governance a priority. Data must be high quality, highly relevant, and targeted to specific business needs. Use your internal experts as data stewards to verify and clean organizational data. Build a community of stewards who can enhance data quality for all.
- Adaptability enables agility. It's best not to be tied to a specific technology or solution. As new technologies come into the market, the architecture should be able to accommodate and adapt to them. Data types, tools, and platforms can all change, so good data architecture must be adaptable to these inevitable changes.
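As a small illustration of the governance practice above, data stewards could express their quality expectations as simple, reviewable rules. This is a sketch under stated assumptions, not a full data quality tool: the `RULES` table, the field names, and the checks themselves are hypothetical.

```python
from typing import Any, Callable, Dict, List

# Hypothetical steward-defined rules: field name -> check that returns True
# when the value is acceptable.
RULES: Dict[str, Callable[[Any], bool]] = {
    "customer_id": lambda v: isinstance(v, str) and v.strip() != "",
    "email": lambda v: isinstance(v, str) and "@" in v,
    "order_total": lambda v: isinstance(v, (int, float)) and v >= 0,
}


def failed_checks(record: Dict[str, Any]) -> List[str]:
    """Return the names of fields that are missing or fail their rule."""
    return [
        field
        for field, is_valid in RULES.items()
        if field not in record or not is_valid(record[field])
    ]


# Illustrative records with made-up values.
clean = {"customer_id": "C-17", "email": "sam@example.com", "order_total": 25.0}
dirty = {"customer_id": " ", "email": "sam@example.com"}

clean_issues = failed_checks(clean)   # no failures
dirty_issues = failed_checks(dirty)   # blank customer_id, missing order_total
```

Keeping the rules in one place gives stewards a shared, reviewable definition of "clean" that they can extend as business needs change.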
Data architecture and your organization.
Big data has exploded in the last decade, and the amount and rate of new data will continue to expand. Legacy methods of sourcing, storing, distributing, and using data have become outdated — too cumbersome and slow to meet modern business and customer demands. However, tools and techniques have evolved to give businesses an edge in how to collect and use data that's relevant to their needs.
Data architecture is the design platform for standardizing data collection and usage across the enterprise, giving all data users access to quality, relevant data quickly and relatively inexpensively. Data architecture bridges the traditional gap between business leaders and IT, giving them a platform to ensure that technology and business strategy align to power the business forward.
Talend helps data-driven businesses easily turn massive amounts of data into trusted insights at cloud scale. Our tools help you to quickly load, transform, and cleanse all your data in the cloud, so you can deliver fast and accurate insight to stakeholders. Talend allows you to take advantage of the full elasticity and cost benefits of the cloud, so IT departments can better manage the cost of cloud data warehousing while improving productivity and agility.
When you're ready to get started, download Talend Data Fabric, our industry-leading platform for modern data management.