The importance of data quality
Data quality assessment is a primary concern of all national statistical offices and should be an integral part of any data-producing agency’s activities.
The Australian Bureau of Statistics’ Data Quality Framework (below) gives a simplified but effective view of the key quality elements.
The common concerns about vital statistics data quality are:
- How completely do the data cover all events, and do they cover the whole national territory?
- Are any elements or characteristics not collected, or only partially covered?
- Are standard definitions and classifications being used?
- What percentage of deaths are not assigned a useable cause of death?
- How timely are the data?
- Are the data accessible?
In this and the following modules, we will discuss these quality concerns in more detail, and introduce some important tools to improve the quality of vital statistics data.
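Two of the concerns listed above, completeness and the share of deaths without a useable cause, can be expressed as simple percentages. The following is a minimal sketch in Python, not a standard method. The function names are hypothetical, and the rule used for "useable" is a simplifying assumption made only for illustration: a cause is treated as not useable if the code is missing or falls in ICD-10 chapter XVIII (codes R00 to R99, symptoms and ill-defined conditions). A real assessment would use a fuller list of ill-defined codes, and the estimated number of events would come from an independent source such as a census or survey.

```python
from typing import Iterable, Optional


def is_usable_cause(code: Optional[str]) -> bool:
    """Return True if a recorded cause-of-death code counts as useable.

    Assumption for illustration only: missing codes and ICD-10 chapter
    XVIII codes (those starting with "R") are treated as not useable.
    """
    if code is None or not code.strip():
        return False
    return not code.strip().upper().startswith("R")


def percent_unusable(codes: Iterable[Optional[str]]) -> float:
    """Percentage of deaths not assigned a useable cause of death."""
    codes = list(codes)
    if not codes:
        raise ValueError("no death records supplied")
    unusable = sum(1 for code in codes if not is_usable_cause(code))
    return 100.0 * unusable / len(codes)


def completeness_percent(registered: int, estimated: int) -> float:
    """Registered events as a percentage of the estimated true number."""
    if estimated <= 0:
        raise ValueError("estimated number of events must be positive")
    return 100.0 * registered / estimated
```

For example, `percent_unusable(["I21", "R99", None, "C34", ""])` counts three of the five records as not useable, and `completeness_percent(800, 1000)` expresses 800 registered deaths against an estimated 1,000 actual deaths. All figures here are made up for illustration.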
Data quality frameworks
- Relevance – data meet the needs of users at different levels
- Accuracy – data correctly estimate or describe the quantities or characteristics being measured – in other words, the values obtained are close to the (unknown) true values
- Credibility – users have confidence in the statistics and trust the objectivity of the data, which are perceived to be professionally produced in accordance with appropriate standards, and transparent policies and practices
- Accessibility – data can be readily located and accessed in multiple dissemination formats that incorporate information on the types of data collected and how they were collected
- Interpretability – users can readily understand, use and analyse the data, assisted by clear definitions of concepts, target populations, variables and terminology, as well as by information describing the limitations of the data
- Coherence – statistical definitions and methods are consistent and any variations in methodology that might affect data values are made clear – for example, different household surveys should use similar wording to generate data on the same indicators
- Timeliness – delays between data collection and availability or publication are minimised, although not to the extent of compromising accuracy and reliability
- Periodicity – vital statistics are shared regularly so that they serve the ongoing needs of policy-makers for up-to-date information
- Representativeness – data adequately represent the whole population and relevant sub-populations
- Disaggregation – data can be stratified by sex, age and major geographical or administrative region
- Confidentiality – data-management practices are aligned with established confidentiality standards for data storage, backup, transfer (especially over the internet) and retrieval.
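Some of the dimensions above lend themselves to simple checks. As a rough sketch only, the Python below tabulates records by sex, age group and region (disaggregation) and computes the median delay between collection and publication (timeliness). The record layout, field names and age bands are hypothetical choices made for this example and are not taken from any of the frameworks cited.

```python
from collections import Counter
from datetime import date
from statistics import median
from typing import Dict, List, Tuple


def age_group(age: int) -> str:
    """Assign an age in years to a broad band (bands are illustrative)."""
    if age < 15:
        return "0-14"
    if age < 65:
        return "15-64"
    return "65+"


def disaggregate(records: List[Dict]) -> Dict[Tuple[str, str, str], int]:
    """Count records by (sex, age group, region)."""
    return dict(
        Counter((r["sex"], age_group(r["age"]), r["region"]) for r in records)
    )


def median_delay_days(records: List[Dict]) -> float:
    """Median number of days between data collection and publication."""
    if not records:
        raise ValueError("no records supplied")
    return median((r["published"] - r["collected"]).days for r in records)


# Made-up records, for illustration only.
example_records = [
    {"sex": "F", "age": 70, "region": "North",
     "collected": date(2021, 1, 1), "published": date(2021, 1, 31)},
    {"sex": "F", "age": 82, "region": "North",
     "collected": date(2021, 1, 1), "published": date(2021, 3, 2)},
    {"sex": "M", "age": 40, "region": "South",
     "collected": date(2021, 1, 1), "published": date(2021, 4, 1)},
]
```

Empty or very small cells in the output of `disaggregate(example_records)` would suggest that the data cannot support stratification at that level, and a long median delay would flag a timeliness problem.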
Australian Bureau of Statistics (2009). ABS data quality framework. ABS, Canberra.
Lopez A et al. (2012). Improving the quality and use of birth, death and cause-of-death information: guidance for a standards-based review of country practices. Resource kit. World Health Organization, Geneva.
United Nations Statistics Division (2004). Fundamental principles of official statistics.