Checking the accuracy of vital events records
Common issues and challenges: aggregated data
Even when the microdata have been gathered and summarised to a high standard, further analysis is usually needed before the compiled data can be shared with different audiences or used as the basis for policy-making. A step that is often missing at the compilation stage is a careful check that the totals are internally consistent (that is, that the sum of the data by age, cause or some other defining characteristic equals the total) and that the distribution of events in the aggregated table is consistent with expectations about male–female differences, causes of death, age at death, age of mother and so on.
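The internal consistency check described above can be sketched in a few lines of Python. This is only an illustration: the function name, the table layout and all counts are hypothetical, and it assumes that any "unknown" category is already included in each breakdown.

```python
def find_inconsistent_breakdowns(reported_total, breakdowns):
    """Return {breakdown name: sum of parts minus reported total} for every
    breakdown whose parts do not add up to the reported total."""
    discrepancies = {}
    for name, counts in breakdowns.items():
        difference = sum(counts.values()) - reported_total
        if difference != 0:
            discrepancies[name] = difference
    return discrepancies


# Hypothetical aggregated death counts for one area (illustrative numbers only).
reported_total = 30000
breakdowns = {
    "age": {"0-4": 1200, "5-14": 300, "15-49": 4100, "50-69": 9800, "70+": 14600},
    "sex": {"male": 15600, "female": 14350},
}

issues = find_inconsistent_breakdowns(reported_total, breakdowns)
print(issues)
```

In this made-up table the age groups add up to the reported total, while the breakdown by sex falls 50 deaths short, so only the latter is reported for follow-up.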
At the regional or central level, where data arrive from different registration points and where further aggregation and consolidation take place, all data received should be checked and analysed for plausibility by comparing them with previous time periods and/or similar local areas. An important characteristic of vital statistics is that they do not usually change dramatically from year to year; hence, the data should show only gradual changes. (This may not hold for countries with small populations, where year-to-year stochastic variation can be large.) More rapid changes in the underlying patterns should be treated with caution, as they may indicate errors in the data.
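A simple way to screen for changes that are not gradual is to compute the relative change between consecutive years and flag anything above a chosen threshold. The sketch below assumes a threshold of 10%, which is an arbitrary illustration and not a recommended value; the function name and the counts are likewise hypothetical. For small populations the threshold would need to be wider.

```python
def flag_abrupt_changes(counts_by_year, max_relative_change=0.10):
    """Return (year, relative change) pairs where the count differs from the
    previous year's count by more than max_relative_change.

    The 10% default is an assumed, illustrative threshold."""
    years = sorted(counts_by_year)
    flagged = []
    for previous_year, current_year in zip(years, years[1:]):
        previous_count = counts_by_year[previous_year]
        if previous_count == 0:
            # A relative change cannot be computed against a zero baseline.
            continue
        change = (counts_by_year[current_year] - previous_count) / previous_count
        if abs(change) > max_relative_change:
            flagged.append((current_year, round(change, 3)))
    return flagged


# Hypothetical annual death counts reported by one registration area.
series = {2018: 5100, 2019: 5180, 2020: 5150, 2021: 6900, 2022: 5230}
flagged = flag_abrupt_changes(series)
print(flagged)
```

Note that a single aberrant year is flagged twice in this approach, once for the jump and once for the return to the usual level, so flagged years should be reviewed together rather than treated as separate errors.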
Plausibility analysis involves evaluating whether the data are true or at least plausible. A final judgement can often be reached only by using a comparator: this year’s data are compared with data from previous years, with data from another source such as a census, or with general regional trends. The data are likely to be plausible if they show trends similar to those seen in past years or are in line with demographic indicators derived from census data. For example, if the current year’s vital statistics data show 120,000 deaths, the previous year’s showed 118,000 deaths and the national population census calculated 122,000 deaths, then the three figures are close enough for this year’s death count to be considered plausible.
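The worked example can be expressed as a comparison against each comparator in turn. The sketch below uses the three figures from the example; the 5% tolerance and the function names are assumptions made for illustration, since the text does not specify how close the figures must be.

```python
def relative_difference(value, comparator):
    """Absolute difference between value and comparator, as a share of the comparator."""
    return abs(value - comparator) / comparator


def is_plausible(current, comparators, tolerance=0.05):
    """True if the current count is within the tolerance of every comparator.

    The 5% default tolerance is an assumed value, not one taken from the text."""
    return all(
        relative_difference(current, value) <= tolerance
        for value in comparators.values()
    )


current_deaths = 120_000
comparators = {"previous_year": 118_000, "census": 122_000}

differences = {
    name: round(relative_difference(current_deaths, value), 4)
    for name, value in comparators.items()
}
plausible = is_plausible(current_deaths, comparators)
print(differences, plausible)
```

Both comparators differ from the current count by less than 2%, so under the assumed tolerance the count of 120,000 deaths is accepted as plausible.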