Evidence of meeting #40 for Access to Information, Privacy and Ethics in the 40th Parliament, 3rd Session. (The original version is on Parliament’s site, as are the minutes.)

A recording is available from Parliament.

Also speaking

David Eaves  As an Individual
Clerk of the Committee  Mr. Chad Mariage

4:40 p.m.

Bloc

Ève-Mary Thaï Thi Lac Bloc Saint-Hyacinthe—Bagot, QC

Okay, thanks.

4:40 p.m.

Liberal

The Chair Liberal Shawn Murphy

I should correct the last answer, I think. My understanding was that the document was translated by the Library of Parliament.

4:45 p.m.

As an Individual

David Eaves

Maybe it was. We can investigate more closely.

4:45 p.m.

Liberal

The Chair Liberal Shawn Murphy

Mr. Siksay, five minutes.

4:45 p.m.

NDP

Bill Siksay NDP Burnaby—Douglas, BC

Thank you, Chair.

Mr. Eaves, I wanted to come back to something. You mentioned that the British had proposed to centralize data in a central organization, a public data corporation you called it. Can you say a little more about that and maybe say how that's different from our Statistics Canada?

4:45 p.m.

As an Individual

David Eaves

Yes. Statistics Canada only has data that it collects, that it hosts. I think what the British are intending to do is significantly more radical than that, which is to say they want to look at data that any ministry collects and to centralize it and manage it from one agency. That's a much grander vision than what StatsCan does. It's actually a grander vision than what I'm aware of any government doing at the moment, but I do think it has real benefits.

One of the big benefits is that it's going to standardize the way we collect and manage information and data. And the second benefit is it's going to make it much easier to share that data with the public. Again, that's what Washington, D.C., did. Their IT department began to slowly, over time, through bilateral negotiations, host the data that different departments were collecting, and they actually host a huge amount of data now. One of the reasons they were able to share it so quickly was that it was located in one place. They could just flick a switch and start sharing with the public.

4:45 p.m.

NDP

Bill Siksay NDP Burnaby—Douglas, BC

So do the British plan on rolling their equivalent of Stats Canada into this?

4:45 p.m.

As an Individual

David Eaves

I cannot comment. I don't know. My understanding is that this corporation is still just a proposal, it's not actually policy.

4:45 p.m.

NDP

Bill Siksay NDP Burnaby—Douglas, BC

Okay.

Now, I know you've also written on your blog, and maybe in other places, about your concern around the long-form census, and how.... Can you say something about how you see that affecting the overall open data project?

4:45 p.m.

As an Individual

David Eaves

I'm not sure that the long-form.... I mean, I've been quite vocal about the long-form census, but I'm not really sure it falls under the purview of this committee or around the debate around open data.

What I would say is that the information collected by StatsCan is enormously valuable to not just the government but also a huge number of non-profits and companies. We need to be thinking about that data as an asset for making our economy stronger, our social sector stronger, and making our government more effective and more efficient. When we choose to limit the amount of information that we collect, we limit all of those sectors and how effective they can be.

So I think that's something that needs to have some real debate. Mostly, though, whether the long-form census data is included or not, more importantly, I think, we need to think about how we're going to get StatsCan data shared with the public in open formats--for free, because they've already paid for it.

4:45 p.m.

NDP

Bill Siksay NDP Burnaby—Douglas, BC

But you would agree that if there's any restriction on the kind of data that's collected by StatsCan, or any lessening of that, that this is an issue around what government data is available to be shared with people and used by--

4:45 p.m.

As an Individual

David Eaves

Yes. You know, if you apply any licence that restricts the use of data, then you have to expect that people are not going to use that data in the most creative or most innovative way. So there's a penalty that you pay whenever you do that. I just don't understand why you would ever limit the use of a public asset like that, especially one that is completely reusable.

4:45 p.m.

NDP

Bill Siksay NDP Burnaby—Douglas, BC

You say that one of the key aspects of this needs to be that the data is provided free to Canadians, to businesses, to use, that there be no charge for that. Does that amount to a subsidy to businesses for using a resource that Canadians have paid for, that Canadians have put together?

4:45 p.m.

As an Individual

David Eaves

I'd actually argue that right now it's the inverse, that ordinary Canadians are subsidizing corporations. The only reason StatsCan is able to collect this data is that it has access to the citizens' tax base and it can use that to finance the collection of the census and larger data statistics. Then it turns around and sells those to those who can afford it. So right now we have your and my tax dollars paying to collect data that then gets sold and that you and I may not actually be able to afford to buy.

So there are two things here. One, it means that any citizen who has an interesting new business idea now has a barrier to entry that their larger competitors can afford; they simply pay and keep them out. More importantly, it's....

Sorry, did someone just say it's not that expensive?

I think if you're a start-up, every cost is an expensive cost. If you're a non-profit, any dollar that you're spending on StatsCan data is a dollar that you're not spending on housing someone or on figuring out how to deliver a service more efficiently. If you're a city, every dollar you're spending on StatsCan data is money that you're not spending on helping citizens' lives get better.

We can debate whether the cost is relevant or not, but the really disturbing thing about the cost is that almost all academic research data out there shows that the amount of money you raise by charging for data.... The only thing it pays for is the system for charging for data. There's almost no money to be made in charging for data.

So what we really have is a system that simply feeds itself. We're charging for data to pay for people who can charge people for paying for data. We're not actually making a huge amount of money off of this. What we really have is citizens who are subsidizing the wealthier actors in our economy.

4:45 p.m.

Liberal

The Chair Liberal Shawn Murphy

Thank you, Bill.

4:50 p.m.

NDP

Bill Siksay NDP Burnaby—Douglas, BC

I guess I'm done.

4:50 p.m.

Liberal

The Chair Liberal Shawn Murphy

Ms. Davidson, you wanted a turn?

4:50 p.m.

Conservative

Patricia Davidson Conservative Sarnia—Lambton, ON

Yes, just briefly.

4:50 p.m.

Liberal

The Chair Liberal Shawn Murphy

Go ahead, then. You have five minutes.

4:50 p.m.

Conservative

Patricia Davidson Conservative Sarnia—Lambton, ON

Thank you.

I have a further question about the machine-readable format you were talking about. You were saying that the PDF form that a lot of our government data is available in could be a concern when it comes to machine-readable. Are we looking at huge costs or considerable expenses to redevelop the form in which we now provide this information?

As well, in the letter you sent to us, you said that starting in January the parliamentary website would begin releasing Hansard in XML. Is that a major change, or is that something that's fairly easily done?

4:50 p.m.

As an Individual

David Eaves

I have two responses to that. First, I was told that Parliament would start releasing Hansard in XML; I actually haven't been to that website in the last couple of days, but as far as I can tell, it still hasn't. That's a little bit of a disappointment for those of us who were looking forward to that.

4:50 p.m.

Conservative

Patricia Davidson Conservative Sarnia—Lambton, ON

Wouldn't today be the first day, though?

4:50 p.m.

As an Individual

David Eaves

Maybe. That's why--

4:50 p.m.

Conservative

Patricia Davidson Conservative Sarnia—Lambton, ON

I mean, today is the first day that the House has sat--

4:50 p.m.

As an Individual

David Eaves

I haven't actually been to the website, so if it's happened, I don't want to upset our good friends who I know are working to try to make this happen.

Will there be a cost? I don't want to sit here and say there wouldn't be a cost, because that would be untrue. But here are the two other ways I think you need to be thinking about this. One of these is going to become a little bit larger, so I'll stop if this gets too boring for people.

The first is that at some point you are going to have to upgrade the systems that are collecting this data anyway. If you're not collecting this data in a format that can be shared, then you're restricting its use to within the government.

One of the things I like about open data portals is that once you make the data available to me, you've made the data functionally available to anybody, no matter where they are, whether they're in government, whether they're in the non-profit or whether they're in the for-profit sector. So that in itself should drive some efficiencies. It should help cover any cost there is in that transition. But eventually you're going to have to make that expense anyway. At some point you are going to replace the system and you're going to have to spend that money.

So maybe we don't get all the data tomorrow, but we have a plan in place so that as we transition systems we also make sure they can always export the data in a machine-readable format that the public can use.

But the second part of this--and the one that I think is more interesting from a government expenditure perspective--is that once you have data in open formats, you really change the dynamic of the relationship that you have with a lot of IT vendors. Many IT vendors purposely create data in formats that are very, very closed--in fact so closed that they are the only company that knows how to use that data and can write software for that data. As a result, the Government of Canada is now stuck using that vendor until that vendor goes out of business or until it decides it's going to make a very painful transition out of that kind of data format and data structure.

One of the really powerful opportunities around open data is that it will open up the marketplace for competition in the IT sector in government. Other players now will be able to look and say, “Wow, if that's the data that you're collecting, we could actually collect that data for you using a system that would be much cheaper and we can share with the public in these ways that are much more interesting.”

So I think we can begin to change our relationship with the vendors and try to shrink some of the enormous contracts that we give out in the IT space.