I would like to thank the members of the committee for inviting me to participate on this important panel.
I am a computer science researcher, and I study the problems and the opportunities that open data presents to the science of computers. In particular I study the problem of data curation, which can briefly be defined as ensuring that data maintains its value over its lifetime and that this value can still be used by humans.
I would like to present three points. I'm going to try to reduce the geek level. I realize I am used to talking to computer science students so please let me know if you don't understand anything I'm saying here.
I have three points that I think can help Canada become a leader in the open data revolution. My first point is that I think the open data portal should adopt the principles of linked open data. When we put a file up on the web, we are using technology that's been around for more than 20 years. Since the beginning of the web we've been able to share data in files over the web. But state-of-the-art data sharing is not just sharing these static, inanimate files. When I say “we” I mean scientists like myself, academics but also industry leaders. When we share data we share data that's linked, and that means the objects we're referring to in the datasets are dereferenceable, which is a fancy geek term meaning I can click on them. When I click on an object I get important and interesting information about that object, and among that information I get its relationships to other objects and important information about them.
So let me give you a concrete example of that. The most downloaded file from the open data portal in February 2014 was a file about charities. It's a static file. It just has strings and it has numbers in it. So it has strings naming different charities and it has facts about those charities, but it's just a dead file. What I would like is that when I download that file and see, say, the Rideau Street Soup Kitchen, I can click on that link and get important information about the soup kitchen. For example, I'd like to get information about where it's located, what community it serves, how many people it serves, some of the facts, some of the data. How much federal money it gets is in that file, but other information, like whether it gets provincial funding or private funding, who those private funding agencies are and information about them, is not there.
But that information is very easy to provide with today's technology, by making the data linkable and using the principles of linked open data to enrich it. So that's my first point, to embrace the principles of linked open data.
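To make "dereferenceable" a little more concrete, here is a minimal sketch in Python of the idea, not of how the open data portal actually works. Every identifier is a URI, looking up a URI returns facts about that object, and some of those facts are themselves URIs that can be looked up in turn. The `example.org` URIs, the property names (`locatedIn`, `fundedBy`), the helper functions, and all of the values are hypothetical placeholders invented for illustration; only the charity's name comes from the example above.

```python
# A toy in-memory "web of data". In real linked open data each URI would be
# fetched over HTTP; here a dictionary lookup stands in for that fetch.
# All URIs, property names, and values are invented placeholders.
BASE = "https://example.org/id/"

GRAPH = {
    BASE + "charity/rideau-street-soup-kitchen": {
        "name": "Rideau Street Soup Kitchen",
        "locatedIn": BASE + "place/example-city",
        "fundedBy": [
            BASE + "funder/example-provincial-program",
            BASE + "funder/example-private-foundation",
        ],
    },
    BASE + "place/example-city": {"name": "Example City"},
    BASE + "funder/example-provincial-program": {
        "name": "Example Provincial Program",
        "kind": "provincial",
    },
    BASE + "funder/example-private-foundation": {
        "name": "Example Private Foundation",
        "kind": "private",
    },
}


def dereference(uri):
    """'Click on' a URI: return the facts recorded about it, or None."""
    return GRAPH.get(uri)


def follow(uri, prop):
    """Dereference `uri`, then dereference every object it links to via `prop`."""
    record = dereference(uri)
    if record is None:
        return []
    values = record.get(prop, [])
    if isinstance(values, str):
        values = [values]  # treat a single link like a one-element list
    # Keep only values that are themselves known URIs, i.e. real links.
    return [GRAPH[v] for v in values if v in GRAPH]


kitchen = BASE + "charity/rideau-street-soup-kitchen"
for funder in follow(kitchen, "fundedBy"):
    print(funder["name"], "-", funder["kind"])
```

The point of the sketch is the difference from a static file: the charity record does not just contain the string naming a funder, it contains a link, so whoever holds the record can follow it to the funder's own description without any extra lookup table.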
My second point, which is highly related, is that open data is about information flow and that information flow can't be unidirectional. If the flow of information is solely from the government to the public then there's no incentive for people to do interesting and creative things with that data. So if we just make the data available, take a data file and throw it over the transom, close our eyes and hope somebody is going to do something interesting with it, we're not creating the incentives to get people involved, to change lives with the data, to solve problems, improve government, and spur the economy.
Worse, I think it has the potential of creating this adversarial relationship. It gives the perception that the government's in control of the data, and is just handing it out. There is no ownership or investment in the data itself. So I think open data is fundamentally about creating participatory opportunities where people can become invested in that data and are incentivized to contribute to the data itself and incentivized to improve the data and to create new innovative ways of using the data. I think this investment creates trust and people will trust the data if they can contribute to the data. It also provides an information flowback into government—David was speaking to this as well—where the information itself is flowing back into the government improving government decision-making based on better data.
My third point is that opening up data is important, but it's equally important to create and curate participatory opportunities with this data. These are not just appathons. I think there are other ways in which the community can get involved in doing analysis over this data and improving this data.
I know the open data portal is already deeply involved in this. They have the Great Canadian Appathon, which is in its fourth year. It was just at the University of Toronto, so we're already doing quite a bit along these lines.
But I think there are two important outcomes of this. These activities are not just educational; you're not just teaching students how to use data. Rather, you're looking to find those visionary students—I call them students; everybody younger than me is a student—those visionary people who want to create new entrepreneurship opportunities with that data. I think that using open government data is an absolutely terrific way of getting those folks to stay in Canada. Too many of my entrepreneurial students go to Silicon Valley because that's where the air is filled with start-up culture. You can walk into a start-up and somebody will be there to help you with your start-up.
That kind of start-up culture just doesn't exist here in Canada, but if we have more people using community data, open government data, in their start-ups and building things around that, there is an incentive for them to stay within Canada. In addition to these visionaries who are going to spur our economy, I would encourage you to look to activate the power of the crowd; by that I mean creating datasets to which the community itself can contribute. These are things like allowing the community to comment on the number of open beds in homeless shelters, activities where the community gets involved in improving the data and gets invested in the data itself. The power of the crowd can be really important in leveraging government open data.
In conclusion, I'm absolutely thrilled to see this initiative in Canada. I think it is a tremendous opportunity here, and Canada has the potential to become a leader in the open data revolution, and I look forward to seeing much more.