June 3, 2021

The 8 principles of good data management

Here's what data providers can do to make life easier for customers.
Gregoire Haftman, Global Head of Data Sales, Macrobond
All opinions expressed in this content are those of the contributor(s) and do not reflect the views of Macrobond Financial AB.
All written and electronic communication from Macrobond Financial AB is for information or marketing purposes and does not qualify as substantive research.

“There is so much data out there, but I spend more time on reconciling, verifying and mapping it, rather than actually analysing it to enrich my decision-making process.”

Sound familiar? Then you’re not alone. It’s one of the most common complaints I hear when I speak to customers.

According to MongoDB, a leading cloud-based database software provider, up to 90 per cent[1] of data generated worldwide is ‘unstructured’, which makes it difficult to analyse. That may suggest ‘structured’ data should be easier to deal with, but that’s not necessarily true either.

A former mentor of mine once told me: “You will see every day a new data vendor coming up on the market, but when you start looking under the hood, there’s barely anyone who can take pride in the technology that helps users find that data.”

Whether data are structured, semi-structured or unstructured, there is no reason why they should be difficult to work with – so long as data providers follow eight principles for sourcing, integration, governance and usability:

  1. Data connectivity. A database is not the sum of multiple tables. Each data point is connected, directly or indirectly, by common attributes. By enriching raw data observations with an extensive set of metadata attributes, you allow users to connect the dots between one region or concept and another.
  2. True-to-source. Algorithm-driven strategies must rely on the data as it was originally published to the market. To validate true-to-source data, you need to apply multiple layers of consistency and control checks to time-series data.
  3. Unbiased / point-in-time information. For data to be true-to-source, it must also expose what was known at the time of publication to avoid any ex-post adjustment. The lack of point-in-time data can lead to look-ahead bias in a backtest or backcast process – the nemesis of any quant – potentially resulting in false-positive signals in an investment strategy. 
  4. Standard naming conventions. The industry is always seeking to standardise the taxonomies of the instruments issued on the market (e.g., ISIN, CUSIP, SEDOL, MIC or ISO) to facilitate communication between systems and create a common reference point or language. This has yet to be fully applied to macroeconomic concepts. Using an aliasing system and applying ISO conventions wherever possible eases data navigation.
  5. Hierarchical relationships. As the number of time series in a database grows almost exponentially, a data tree becomes nearly impossible to navigate without a hierarchical structure. The use of extensive metadata allows users to categorise each concept and connect one region to another. Also bear in mind that the data will be used and presented via business intelligence tools that offer a wide range of visualisation possibilities, so data connectivity (principle #1) and hierarchical relationships make it easier to tell a story.
  6. Aggregation / Sum of the parts. A macroeconomic concept can often be aggregated into a total representation or drilled down into a more granular sector or type-of-goods classification. Adding various levels to a concept enables quick aggregation or drill-down through the metadata described in the fifth principle.
  7. Data pass-through. In my experience, there is no data provider that can cover the full spectrum of requirements to feed an investment strategy. For instance, everyone provides ESG data, but some highly specialised providers have a competitive advantage over more generalist ones. Capturing alternative data has been in the air for quite a few years now, as investors and their providers seek not only to refine their investment strategies but also to constantly differentiate themselves from a given benchmark. There is much commitment to high-frequency (daily/weekly), granular (per city/state), sentiment (news feeds) and workflow-oriented (nowcast/forecast) initiatives. Having the ability to quickly integrate third-party data through add-ons significantly enlarges the scope of data one can collect.
  8. Data discovery. Offering extensive time series coverage is only beneficial if users can find that data easily and efficiently. Firms must make the data easily discoverable without concealing any part of the database. They should also offer online access to a dynamic catalogue and coverage statistics so customers can integrate the numbers into their internal portals.
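
To make the point-in-time idea in principle #3 concrete, here is a minimal Python sketch. It assumes each published value is stored together with its release date, so a backtest can ask what was known on a given day. The `Vintage` structure, the `value_as_of` helper and all the numbers are hypothetical and purely illustrative; they are not taken from any vendor's API or dataset.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Vintage:
    """One published value for an observation period (hypothetical structure)."""
    period: str      # observation period, e.g. "2020-Q4"
    released: date   # date on which this value was published
    value: float


# Made-up numbers for illustration only: a first print followed by a revision.
VINTAGES = [
    Vintage("2020-Q4", date(2021, 1, 28), 4.0),
    Vintage("2020-Q4", date(2021, 2, 25), 4.1),
    Vintage("2021-Q1", date(2021, 4, 29), 6.4),
]


def value_as_of(vintages: List[Vintage], period: str, as_of: date) -> Optional[float]:
    """Return the latest value for `period` published on or before `as_of`."""
    known = [v for v in vintages if v.period == period and v.released <= as_of]
    if not known:
        return None  # nothing had been published yet on that date
    return max(known, key=lambda v: v.released).value


# What a strategy could have known on different dates.
first_print = value_as_of(VINTAGES, "2020-Q4", date(2021, 2, 1))
revised = value_as_of(VINTAGES, "2020-Q4", date(2021, 3, 1))
not_yet = value_as_of(VINTAGES, "2021-Q1", date(2021, 3, 1))
print(first_print, revised, not_yet)
```

A backtest that only ever reads values through a lookup of this kind cannot see a revision before it was released, which is the look-ahead bias the principle warns about.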

Based on my experience, what users want is maximum flexibility to create their own macro view of the world for their investment processes, to consume feature-packed data, including point-in-time revisions, and to pay only for what is used in production (as opposed to unlimited search via the application). And of course, the delivery of the data is key, with Web APIs playing a crucial part in any data and tech architecture.

This blog was first published on Finextra on 3 June 2021.
