OPINION: Being missed in a national count is very painful, but the statistician-general and his team of experts have the medicine for that headache, writes Dr Pali Lehohla.
THE STATISTICIAN-General of South Africa, Risenga Maluleke, his team and the Statistics Council appeared before a parliamentary portfolio committee to discuss and explain the controversy that has engulfed Census 2022.
It is not a sweet experience to be left out, but it is a mournful event when a third of the population is left out. This was the crux of the debate prior to and during the hearing. It ended well, as I expected.
The two professors from the University of Cape Town, Tom Moultrie and Rob Dorrington, also correctly vetoed a DA suggestion for a re-run of the census. Indeed, it was a call that Lenin could have labelled an infantile disorder. The two professors nipped it in the bud.
But let us take a stroll down memory lane and enjoy the fruits of democracy and the qualitative differences it brought to the fore for all races, especially in the census. Censuses have long been conducted in South Africa, with the first one of the Union conducted in 1911.
Not much is documented on the evaluation of these pre-1996 censuses, save to say only the White minority population was reported in full, while on the other hand not much detail was provided in the census reports on the majority Black population. They did not exist except nominally or at the pleasure of the master.
This approach irked me to the core. When I was given the opportunity to run the Bophuthatswana Census of 1985, I ensured I not only increased the number of questions to make life behind the numbers visible, but reported these numbers by village. This was a major breakthrough on reportage for the Black population.
What was in the Bophuthatswana censuses of 1985 and 1991 then became the bedrock of all post-apartheid censuses. Both these censuses failed to return a full count, however. But it was in the 1991 Census of Bophuthatswana that we ran a comprehensive assessment, which included a post-enumeration survey (PES). It showed an undercount of about 15%, and I adjusted for that undercount. The census was conducted at the height of the political contestation towards democracy. When I was interviewed for the post of statistician-general, the story I always recounted was the PES of Bophuthatswana, which I titled “How not to do it: The Census 91 of Bophuthatswana”.
The next port of call was the national effort. I was assigned the task of running the inaugural census of a post-apartheid South Africa. The 1996 instrument or questionnaire of the census is a replica of the 1985 and 1991 Census of Bophuthatswana. It carries 75 questions. Fifty of these questions are individual questions and twenty-five are household questions. These questions have been repeated for all the censuses of South Africa, including the recent one, which is Census 2022.
All the post-apartheid censuses have had an undercount. On each occasion the statistician-general adjusted for the undercount, and so it has been in Census 2022.
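To make the arithmetic of such an adjustment concrete, here is a minimal sketch. It assumes the undercount rate is defined as the share of the true population that the enumeration missed, and it uses a hypothetical enumerated total with a 15% rate. The function name and figures are illustrative only; the adjustment StatsSA actually performs works group by group and is far more elaborate.

```python
def adjust_for_undercount(enumerated, undercount_rate):
    """Scale an enumerated total up to an estimate of the true total.

    Assumption: undercount_rate is the fraction of the TRUE population
    that was missed, so enumerated = true * (1 - undercount_rate).
    """
    if not 0 <= undercount_rate < 1:
        raise ValueError("undercount_rate must be in [0, 1)")
    return enumerated / (1 - undercount_rate)


# Hypothetical example: 850,000 people enumerated, 15% undercount.
adjusted = adjust_for_undercount(850_000, 0.15)
print(round(adjusted))  # estimated true population
```

The same formula shows why a larger undercount matters so much: the higher the rate, the more the final figure rests on the estimate rather than on the people actually counted.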
Over a period of 28 years, the statistician-general introduced survey tools that generate other statistics related to households, businesses and civil registration processes. These include, but are not limited to: the General Household Survey, the Quarterly Labour Force Survey, the Income and Expenditure Survey, the Consumer Price Index and the processing of vital statistics such as Death and Causes of Death, Marriages, Divorces and Births. This raft of data has adequately matured in time-series stability.
The individuals involved are highly professional and mostly possess a rare skill of a geographical perspective on the data.
This raft of data, such as that derived from the Income and Expenditure Survey, the Quarterly Labour Force Survey, and the records of deaths and births, provides adequate information and may render some of the 75 questions redundant. But the only value of data in a statistical agency is time series and that is why in the censuses these sets of data have to be kept.
When the undercount is as massive as it is at 31%, it is very painful to those who were missed in the count, but worse still, the local authorities see their rands and cents disappearing because of the undercount. But that need not be so, because over the past 30 years Statistics South Africa has mastered the art of assessing the extent of the undercount and estimating it, using a system that was discovered by Professor Gordon Kass of Wits in 1975 and published in 1980.
Professor Dawie Stoker of the HSRC, later of Statomet and, ultimately, StatsSA before he finally retired in 2011, was one of the first to apply this sterling discovery by Kass, who, in the 1980s, introduced a criterion for recursively splitting data on predictor variables based on Pearson chi-square statistics. It is known as CHAID, which stands for Chi-square Automatic Interaction Detection.
Stoker was one of the first to apply this technique to the large-scale data of Census 96 and improved it further in subsequent censuses of South Africa. He used the features of the Post-Enumeration Survey to predict the characteristics of those who were missed in the Census. This is part of what guides artificial intelligence and machine learning today.
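The core idea behind a CHAID-style split can be sketched in a few lines. This is a simplified, single-step illustration and not the procedure used by StatsSA: it omits the category merging and the adjusted p-values of full CHAID, and the predictor names and counts below are hypothetical. Because both hypothetical predictors have two categories, comparing the raw chi-square statistics is equivalent to comparing their p-values.

```python
def pearson_chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the predictor and the outcome were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def best_split(tables):
    """Return the predictor most strongly associated with the outcome."""
    return max(tables, key=lambda name: pearson_chi_square(tables[name]))


# Hypothetical PES cross-tabulations: rows are predictor categories,
# columns are [counted in census, missed in census].
pes_tables = {
    "settlement_type": [[400, 100], [300, 200]],
    "age_group": [[350, 150], [350, 150]],
}

for name, table in pes_tables.items():
    print(name, round(pearson_chi_square(table), 2))
print("split on:", best_split(pes_tables))
```

A full CHAID tree repeats this choice inside each resulting subgroup, which is how it builds up a profile of the people most likely to have been missed.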
StatsSA has, therefore, established a robust method of estimation, learned from the absolute best, and has applied it successfully. Second, based on the raft of data collected in household surveys, there should be no problem with five variables (water unavailability for two days, income, unemployment, mortality and fertility) being withheld from release for good reason.
The data on cause of death is published by the municipality annually. Estimates of fertility are based on birth registration that is almost 95% complete. There are plentiful data sources that can substitute for some of the 75 variables of the time-series data. The coefficient of variation, which is an instrument of quality assurance, shows that all the other variables that have been or are due to be released fall below the critical threshold of 5%, the value beyond which red flags are raised about the data.
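For readers who want to see how such a check works, here is a minimal sketch. It assumes the coefficient of variation of an estimate is its standard error divided by the estimate itself, and that an estimate passes when this ratio is at or below 5%. The function names and the figures are hypothetical and are not drawn from Census 2022.

```python
def coefficient_of_variation(estimate, standard_error):
    """Relative standard error: standard error as a fraction of the estimate."""
    if estimate == 0:
        raise ValueError("estimate must be non-zero")
    return standard_error / abs(estimate)


def passes_quality_check(estimate, standard_error, threshold=0.05):
    """True when the coefficient of variation is at or below the threshold."""
    return coefficient_of_variation(estimate, standard_error) <= threshold


# Hypothetical estimates (in percent) with their standard errors.
print(passes_quality_check(62.0, 1.24))  # CV of about 2%
print(passes_quality_check(10.0, 0.8))   # CV of about 8%
```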
Being missed in a national count is very painful, but the statistician-general and his team of experts have the medicine for that headache through PES and CHAID. One understands the source of the noise: it is less about the quality of the census data than about being personally missed in a national effort.
The noise, genuine as it is, is but the sound of a tree falling in the forest. The silent noise of expertise in StatsSA must worry any responsible leader. It is one of a growing forest, however, one exposed to marauding armies ready to poach from the mighty organisation.
It is this silent noise that a responsible leadership should lose sleep over, and not the howlers and doubting Thomases, whose ears are filled with broken sticks, and teary retinas squeezed out from sweat-heavy thumbs.
* Dr Pali Lehohla is a Professor of Practice at the University of Johannesburg, a Research Associate at Oxford University, a board member of the Institute for Economic Justice at Wits and a distinguished alumnus of the University of Ghana. He is the former Statistician-General of South Africa.
– BUSINESS REPORT