Evidence of meeting #137 for Industry, Science and Technology in the 42nd Parliament, 1st Session. (The original version is on Parliament’s site, as are the minutes.)

Anil Arora  Chief Statistician of Canada, Statistics Canada
Dan Albas  Central Okanagan—Similkameen—Nicola, CPC
David de Burgh Graham  Laurentides—Labelle, Lib.
Michael Chong  Wellington—Halton Hills, CPC

5:10 p.m.

Central Okanagan—Similkameen—Nicola, CPC

Dan Albas

Thank you again, Mr. Chair.

I want to get back to your opening statement. You said the chance a given address is selected as part of our sample is one in 28. The chance that the dwelling is used in the actual sample is one in 40.

To me that seems to say that you're actually oversampling. Is that the case? Do you really need 500,000 households?

5:10 p.m.

Chief Statistician of Canada, Statistics Canada

Anil Arora

One, we don't want the banks to even know which dwellings are going to be in the sample.

5:10 p.m.

Central Okanagan—Similkameen—Nicola, CPC

Dan Albas

By what ratio are you going to be oversampling? How many Canadians do you actually need for this project to go forward, and why are you using 500,000?

5:10 p.m.

Chief Statistician of Canada, Statistics Canada

Anil Arora

What we need in order to make sure that we have enough information for our census areas, essentially at the neighbourhood level, is about 350,000 dwellings. However, in the design of the project, even the providers of that data don't know which dwellings are going to be used in the sample.

5:10 p.m.

Central Okanagan—Similkameen—Nicola, CPC

Dan Albas

You're going to be asking for confidential information that people back home are upset about, sir. You only need 350,000 households, and you're going to be sampling 500,000. I appreciate your answering the question, but I'm disappointed to hear that.

When it comes to the actual retention of your data, you talked about having the two different areas, and how they're separate. Will Statistics Canada maintain a master key that can reidentify the information?

5:10 p.m.

Chief Statistician of Canada, Statistics Canada

Anil Arora

We will keep, as I said, the two files, as you explained. The identifiable information with the StatsCan number comes in. The actual financial transactions come in. Once we have associated a transaction, the expenditures that my family and I have—

5:10 p.m.

Central Okanagan—Similkameen—Nicola, CPC

Dan Albas

Will you be able to re-engineer it using a master key to reunite those files if you so choose, yes or no?

5:10 p.m.

Chief Statistician of Canada, Statistics Canada

Anil Arora

If there is a policy need for us to be able to do that, we have a very controlled process in place. It is only—

5:10 p.m.

Central Okanagan—Similkameen—Nicola, CPC

Dan Albas

It sounds like the answer is yes.

Will Statistics Canada maintain a technical capability to access the data with personal identifiers after anonymizing the data?

5:10 p.m.

Chief Statistician of Canada, Statistics Canada

Anil Arora

Sorry, could you repeat that?

5:10 p.m.

Central Okanagan—Similkameen—Nicola, CPC

Dan Albas

Will you maintain the technical capability to access data with personal identifiers after anonymizing the data?

If you have a master key, will you have the ability to have those files back together at some point in the future?

5:10 p.m.

Chief Statistician of Canada, Statistics Canada

Anil Arora

One, there are no fishing expeditions here. When we have a specific policy need, a case has to be made to be able to link that to another source. That case has to be looked at and agreed to that whatever other source is with that key is joined together and only the anonymized microdata is given to the area that—

5:10 p.m.

Central Okanagan—Similkameen—Nicola, CPC

Dan Albas

I understand the separation, but why would you not remove the records that have personal identifiers after anonymizing the data?

5:10 p.m.

Chief Statistician of Canada, Statistics Canada

Anil Arora

Well—

5:10 p.m.

Central Okanagan—Similkameen—Nicola, CPC

Dan Albas

Why not delete it?

5:10 p.m.

Chief Statistician of Canada, Statistics Canada

Anil Arora

Essentially, there are retention periods for files such as this. This is still a pilot project. We are trying to assess exactly what the short-term and long-term needs are for these data.

Again, to your point earlier about 500,000 dwellings, as I said, that is out of a universe of 14 million households that we have in Canada—

5:10 p.m.

Central Okanagan—Similkameen—Nicola, CPC

5:10 p.m.

Chief Statistician of Canada, Statistics Canada

Anil Arora

—and a fresh sample is selected every year so that we cannot create a file that keeps that transaction for even the selected households.
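Note on the figures: the ratios quoted earlier in this exchange can be checked against the numbers given in testimony. The short sketch below is only an arithmetic check, not a description of Statistics Canada's sample design. The variable and function names are made up for illustration, and pairing the 350,000 figure with the "one in 40" ratio is an assumption based on the numbers matching, not something stated outright in the testimony.

```python
# Figures as quoted in the testimony; names here are illustrative only.
HOUSEHOLDS_IN_CANADA = 14_000_000  # "a universe of 14 million households"
DWELLINGS_SELECTED = 500_000       # dwellings selected as part of the sample
DWELLINGS_NEEDED = 350_000         # dwellings said to be needed (assumed to match "one in 40")


def one_in(universe: int, sample: int) -> float:
    """Return n such that the chance of selection is 1 in n."""
    return universe / sample


selected_odds = one_in(HOUSEHOLDS_IN_CANADA, DWELLINGS_SELECTED)
used_odds = one_in(HOUSEHOLDS_IN_CANADA, DWELLINGS_NEEDED)

# Ratio of dwellings selected to dwellings said to be needed.
oversampling_ratio = DWELLINGS_SELECTED / DWELLINGS_NEEDED

print(f"Selected: 1 in {selected_odds:.0f}")
print(f"Used: 1 in {used_odds:.0f}")
print(f"Selected / needed: {oversampling_ratio:.2f}")
```

On these figures, 500,000 of 14 million works out to one in 28, 350,000 of 14 million to one in 40, and the selected sample is a little over 1.4 times the number of dwellings said to be needed.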

5:10 p.m.

Central Okanagan—Similkameen—Nicola, CPC

Dan Albas

To do this game, you're going to have to require the use of the banks. Obviously, the banks don't have the infrastructure to make that information available.

You've also talked about real time and frequency, higher frequency use of the information. Are you going to be seeking to port the information through an API directly to Statistics Canada from the banks themselves? Will this be required of other institutions? Will it also be put upon credit unions, ATB in Alberta, trust companies that do deposit-taking activities?

5:15 p.m.

Chief Statistician of Canada, Statistics Canada

Anil Arora

At the moment, the project is restricted to nine institutions. We have secure transfer protocols that meet all the Government of Canada standards for secure transfer, and that is how we're bringing the data into Statistics Canada.

As you can imagine, we have millions of transactions with Canadians where they're providing us with their really confidential and sensitive data. We use the same protocols that we get to bring that data into Statistics Canada, and we have processes within Statistics Canada that we've built over 100 years with input from the Office of the Privacy Commissioner, as well.

5:15 p.m.

Central Okanagan—Similkameen—Nicola, CPC

Dan Albas

You didn't build the infrastructure for this real time. Again, when someone fills out their census, sir, they know exactly what they're doing. In this case, you're not even advising them that this information may be taken. What if someone moves from one area to another—

5:15 p.m.

The Chair, Lib.

Dan Ruimy

I hate to—

5:15 p.m.

Central Okanagan—Similkameen—Nicola, CPC

Dan Albas

—and then ends up being sampled a second time?

5:15 p.m.

The Chair, Lib.

Dan Ruimy

Mr. Albas.

5:15 p.m.

Central Okanagan—Similkameen—Nicola, CPC

Dan Albas

This is an intrusion in someone's—