We live and work in a world built on data. Every day we generate, collect and process more data than the day before, and every point holds useful information for how we live our lives. No matter what your business deals in, you won’t be able to deny the value of collecting and analysing data. But when you are being bombarded with huge amounts of data every second, how can you organise it into something you can learn from? That’s where data architecture comes in: the framework that allows data analysts to make sure their data works for them, not the other way around.
What is data architecture?
Data architecture is the framework that sets out how data will be collected, stored and processed. This includes everything from what data will be collected to how that data will be structured and which tools will be used to collect, store and process it. Data architecture is a fundamental stage in any data analysis project because it sets the path for the rest of the project and for anything you do with that data down the line. A data analysis project with sound data architecture will produce databases that are reliable, easy to use and easy to interpret for years to come.
Questions to ask
Before even starting to arrange your data architecture, it’s important to define what you want to achieve with your data analysis project. With that in mind, here are some key questions to ask yourself that will inform how you build your data architecture.
What is your goal?
Knowing the goals of a project is an essential step in creating an efficient road map to your destination. There are millions of ways to collect, store and analyse data, but not all achieve the same goals. Without defining your goals, you can quickly get buried under a mountain of unstructured data, which will take a huge amount of time to process and a significant amount of resources to store. In fact, if you fail to define your goals before the project, you may even end up with the wrong kind of data, making the whole process a waste of time and money.
Consider first what you are trying to learn from your project. Are you trying to learn more about your regular customers? Or figure out how effective your sales pipeline is? Are you trying to make your production line more streamlined? Be as specific as you can, as it will save you the most time and resources in the long run. Once you have your goal, you can figure out what data you can use.
What data will you use?
Every action we take generates data in some form or another. Businesses have access to huge swathes of data about everything from their employees and internal processes to their products, customers and stakeholders. There will be multiple ways to pursue your data analysis goal with the data you have access to. What data you use will ultimately affect the direction of your analysis and the usefulness of your findings.
The challenge here is to use neither too much nor too little. It may be tempting to analyse something from every angle, but bear in mind that the more data you take on board, the more complicated it is to analyse and store. Equally, not collecting enough data will make for less useful analysis. How much data you use will depend on how much capacity you have to store, analyse and process it. Also, bear in mind that you can analyse one part of your business using data from another. Be creative about how you go about collecting and processing your data, but always make sure your goal is front and centre in your project.
How will you collect, store and process data?
Once you know what data you are looking at, it’s time to consider the practical questions of data collection, storage and processing. Which tools you use depends on what you are trying to achieve and the types of data you will be using. There are a huge number of tools available for each of these three stages, so it helps to look at them one at a time:
Data collection tools
These tools help with extracting and organising raw data. Which you choose will depend on the platforms you are looking to analyse and what goal you have in mind. Examples include:
- Google Analytics
- Adobe Analytics
- Firebase
Data storage tools
The next stage is storing the data you have extracted. Data can be stored as either structured (using clearly defined data types and patterns, making it easily searchable) or unstructured (raw data that comes straight from a collection tool and is not easily searchable). Data storage tools can also help to aggregate data from various platforms, combining data from collection tools like those above with data from media platforms and CRM databases. Examples include:
- SQL Server
- Cloud SQL
- Oracle Database
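To make the difference between unstructured and structured storage more concrete, here is a minimal sketch in Python. It assumes raw events arrive as JSON text from a collection tool and loads them into a table with defined columns, using the built-in sqlite3 module as a stand-in for a database like those listed above. The event fields, the table name `events` and the function names are all hypothetical and chosen only for illustration.

```python
import json
import sqlite3

# Hypothetical raw events, as they might arrive from a collection tool.
# As plain text they are hard to search or aggregate.
raw_events = [
    '{"user": "u1", "event": "page_view", "ts": "2024-01-05T10:00:00"}',
    '{"user": "u2", "event": "purchase", "ts": "2024-01-05T10:05:00", "amount": 19.99}',
    '{"user": "u1", "event": "purchase", "ts": "2024-01-06T09:30:00", "amount": 5.00}',
]


def load_events(conn, lines):
    """Parse raw JSON lines and store them in a table with defined columns."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "user_id TEXT NOT NULL, event_type TEXT NOT NULL, "
        "occurred_at TEXT NOT NULL, amount REAL)"
    )
    rows = []
    for line in lines:
        record = json.loads(line)
        # "amount" is optional, so missing values are stored as NULL.
        rows.append(
            (record["user"], record["event"], record["ts"], record.get("amount"))
        )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def revenue_by_user(conn):
    """Once the data is structured, a question becomes a single query."""
    cursor = conn.execute(
        "SELECT user_id, SUM(amount) FROM events "
        "WHERE event_type = 'purchase' GROUP BY user_id ORDER BY user_id"
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
load_events(conn, raw_events)
print(revenue_by_user(conn))
```

The same question asked of the raw text would mean parsing every line each time, which is why structured storage tends to pay off as the volume of data grows.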
Data processing tools
These tools help to interpret the data you have stored. They usually include visualisation tools that can help develop your analysis and compile reports to demonstrate your findings. Examples include:
- Microsoft Excel
- Klipfolio
- Google Data Studio (now Looker Studio)
How will you organise your data?
When you are dealing with large amounts of data over an extended period of time, how you organise that data becomes incredibly important. If you want to keep analysing a certain trend over time or refer back to data you used for analysis in the past, having it organised in a logical and useful manner is going to save you huge amounts of time and energy. You may find that the standardised categories many companies use to label their data work best, or you may prefer to define your own categories based on your data analysis goals. If you are unsure which categories are best, data architecture consulting services are a great resource to help start the process.
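As a rough illustration of defining your own categories, the Python sketch below maps inconsistent raw labels onto a small, agreed set of category names so that records stay comparable over time. The labels, the category names and the helper functions are hypothetical; a real project would choose categories to match its own goals.

```python
from collections import defaultdict

# Hypothetical mapping from raw source labels to agreed category names.
CATEGORY_MAP = {
    "fb_ads": "paid_social",
    "facebook": "paid_social",
    "google_cpc": "paid_search",
    "adwords": "paid_search",
    "newsletter": "email",
}


def categorise(label):
    """Normalise a raw label and look up its category."""
    # Unknown labels are kept visible rather than silently dropped.
    return CATEGORY_MAP.get(label.strip().lower(), "uncategorised")


def group_by_category(records):
    """Group (label, value) pairs under their standardised category."""
    grouped = defaultdict(list)
    for label, value in records:
        grouped[categorise(label)].append(value)
    return dict(grouped)


records = [("FB_Ads", 120), ("adwords", 80), ("Newsletter ", 40), ("tiktok", 15)]
print(group_by_category(records))
```

Keeping the mapping in one place means a new label only has to be classified once, and anything that lands in "uncategorised" is a prompt to review the categories.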