Dr. Ivan Fellegi (Former Chief Statistician of Canada, Statistics Canada, As an Individual) at the Status of Women Committee

November 16th, 2010 / 9:45 a.m.

Dr. Ivan Fellegi, Former Chief Statistician of Canada, Statistics Canada, As an Individual

Thank you, Madam Chair. It's good to be here. I'm very pleased that I've been invited.

I'll be talking about the national household survey. My views on that issue are well known. I just want to say why I've chosen what I have to say.

First of all, it is because of the certainty of serious biases affecting the resulting data. The percentage response rate to the traditional long-form census was in the mid- to high nineties. Statistics Canada's working assumption about the response rate to the voluntary national household survey is 50%.

This would not matter much if the lost responses were evenly distributed over all population groups, but we know this is not the case. Past experience from Canada and elsewhere shows that underprivileged groups, such as aboriginal people, new immigrants, visible minorities, and, generally, people with low incomes, will respond at a disproportionately low rate--and no extra sampling will compensate for this disproportion.
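[Editorial illustration, not part of the testimony.] The point that extra sampling cannot offset uneven non-response can be sketched with a small calculation. All figures below are hypothetical assumptions chosen only so that the overall response rate comes out at 50%: a group making up 20% of the population responds at 30%, everyone else at 55%. The function name `respondent_share` is likewise made up for this sketch.

```python
def respondent_share(pop_share, rr_group, rr_rest, n_sampled=1.0):
    """Expected share of a group among respondents.

    pop_share : the group's true share of the population (0..1)
    rr_group  : response rate inside the group (hypothetical)
    rr_rest   : response rate among everyone else (hypothetical)
    n_sampled : number of households sampled; it cancels out below
    """
    group_respondents = n_sampled * pop_share * rr_group
    rest_respondents = n_sampled * (1.0 - pop_share) * rr_rest
    return group_respondents / (group_respondents + rest_respondents)


if __name__ == "__main__":
    true_share = 0.20  # assumed true population share of the group
    for n in (10_000, 1_000_000):
        observed = respondent_share(true_share, 0.30, 0.55, n_sampled=n)
        print(f"sampled={n:>9,}  true={true_share:.2%}  observed={observed:.2%}")
```

Under these assumed rates the group appears as roughly 12% of respondents instead of 20%, an error of about eight percentage points, and the result is the same for every sample size because `n_sampled` cancels out of the ratio. A larger sample reduces random error, not this kind of bias.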

But these are not the only people likely to be under-counted. Youths generally are likely to be under-counted. So are working mothers with serious time pressures on them, and others about whom we can only speculate.

In fact, this is precisely the main problem. Bias is so pernicious because, in the overwhelming number of cases, neither its magnitude nor even its direction can be ascertained. Statistics Canada states--and they are right--that the results will be useful for “many purposes”. The trouble is that we don't know now, and we will not know after the survey, what are the cases for which they are safe to use and what are the ones for which they are not.

This leads me to my second point. Since we know that the data can be seriously biased, but we will not know which data are affected and by how much, we will regrettably, but quite appropriately, be suspicious of them all. That will be a tragic outcome, because up until now we were able to focus on the substantive issues of policy, having taken the data for granted. Following the national household survey, we can spend just as much time arguing about the data as we can debating the issues of concern.

Coming to my last point, with a 50% response rate, biases of five to ten percentage points can easily distort any estimate, which is serious enough if you want to know the number of people in a certain group, but it can be devastating when our focus is on how the number changed over the last five years.

Indeed, human populations evolve slowly. A change of two to three percentage points over five years is often regarded as major. But clearly, if the bias can be two to four times as big--that is, five to ten percentage points--the real change can be grossly under- or overestimated. Not only will we be in doubt about the magnitude of the estimated change, but even its direction can be reversed by the bias.
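[Editorial illustration, not part of the testimony.] The arithmetic behind this point can be written out as a minimal sketch. The numbers are hypothetical assumptions: a true rise from 20.0% to 22.5%, an earlier measurement taken as unbiased, and a later measurement biased by five percentage points in either direction. The function name `estimated_change` is invented for the example.

```python
def estimated_change(true_old, true_new, bias_old, bias_new):
    """Change between two measured values, in percentage points.

    Each measured value is the true value plus that survey's bias, so
    the estimated change equals the true change plus the difference
    between the two biases.
    """
    measured_old = true_old + bias_old
    measured_new = true_new + bias_new
    return measured_new - measured_old


if __name__ == "__main__":
    true_old, true_new = 20.0, 22.5  # hypothetical: a true rise of 2.5 points
    for bias_new in (-5.0, 0.0, 5.0):  # hypothetical bias of the later survey
        change = estimated_change(true_old, true_new, 0.0, bias_new)
        print(f"bias={bias_new:+.1f}  estimated change={change:+.1f} points")
```

With a bias of minus five points the true rise of 2.5 points is reported as a fall of 2.5 points, and with a bias of plus five points it is reported as a rise of 7.5 points. Since neither the size nor the sign of the bias is known in practice, the data alone cannot tell these cases apart.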

To give you a relevant example, I have no idea how, after 2011, we will estimate the change over the last five years in the earning differential between women and men doing similar work and having similar qualifications. The same applies to estimates of the change in the education gap between aboriginal and non-aboriginal groups, whether we are getting more or less successful in economically integrating our new immigrants, and so on.

The issues are significant, and I am concerned about the passing time.

Thank you for your attention.
