Thank you very much for inviting me to speak to your committee. I'm going to focus my opening remarks on the Open Government Partnership, or the OGP, of which Canada is a member.
The OGP secures commitments from governments to improve transparency, accountability, and citizen engagement; to fight corruption; and to harness technology to strengthen governance. A requirement of membership in the initiative is that each country agree to have an independent review of its national action plan and of its progress every two years. This is called the independent reporting mechanism, and it's part of a system of checks and balances built into the OGP. I'm the independent researcher for Canada. Our first progress report was published in February of this year. The foundation for that report and for my remarks today is the stakeholder feedback that made up the bulk of the report.
Canada's national action plan to the Open Government Partnership focuses on more than just open data, but given the parameters of your study, I'm going to confine my comments to open data as much as possible.
There are a lot of different issues that I could talk about in relation to open data, but given the limited time, I thought I would focus this morning on some of the main areas of concern that users raised during my stakeholder interviews and meetings. So it's a bit of a critical analysis of our open data strategy. I'm not speaking as much to some of the positive things, but I'm certainly happy to speak to those later during the questions and answers.
I organized my comments today around seven main concerns or points that the majority of stakeholders that I spoke to during the course of my study raised as issues with regard to the open data strategy.
The first is the diversity of data sets. Currently the data.gc.ca portal is largely dominated by geospatial data. There are few to no data sets in many other areas, including employment insurance, health, and issues related to specific demographics such as seniors or aboriginal persons. A lot of the users that I spoke to during the course of my study found that quite limiting, just the nature of the data sets themselves.
The second point is the quality of data. A couple of the points relate to that. Quality of data was perhaps what the majority of stakeholders were most concerned about. There is a widespread belief that the quality of the data in the data portal will suffer, and will continue to suffer in the long term, as a result of steps that have been taken to cut data collection at its point of origin.
A prime example of this, and I have to say this was the example given by almost everybody I spoke to during the course of my study, is the cancellation of the mandatory long-form census. Those sorts of measures around data collection have led to concern about the availability of updated and comparable data sets at smaller units of geography in the future. We're already starting to see that happen. Even in the last few days there have been a few news stories and reports about the loss of data from the last census exercise.
Those are the first two points.
The third point is the fragmented nature of the data sets that are found on the data.gc.ca portal. Data users noted that it isn't uncommon for data sets to be released in what they called bits and pieces instead of in complete and wider-reaching data sets. They said the data sets are sometimes also separated from their methodology and their quality description. What data users were finding is that when they tried to work with the data, they had to spend quite a bit of time and needed quite a high level of expertise to combine data sets and make them really useful.
I think it came out during the course of the conversations I had that this problem might be a function of a bit of a difference in the definition of “data set” amongst data scientists and data users and government. I think there's a bit more conversation needed around the definition of a “data set”.
The fourth point, another problem related to quality and the nature of the data, is the format of the data sets on the portal.
In the past there have been some inconsistencies in the format of many of the data sets. I know that's an issue that the Treasury Board Secretariat has been working on. In the process of developing standards we really need to make sure that good metadata is included with the data sets. Missing and inconsistent metadata makes analysis really difficult for data users. The impression that I had from some of the users was that the standards for formatting are set a bit on the lower side, and that some of the metadata from certain data sets had potentially been removed in the name of standardization and consistency.
That brings me to my fifth point, which is the data portal itself. A lot of the people I spoke with had significant concerns about the data portal. I just came back from Open Government Partnership meetings in Dublin, which were European regional meetings, and heard many of the same concerns from civil society actors and assessors of action plans in other countries that either have a data portal or are considering starting one. Data.gc.ca, as you know, is managed out of TBS, the Treasury Board Secretariat, which has the responsibility for the open government file. That centralization of the portal means that the data on the portal is effectively removed from its creators and its curators. It's removed, then, from those who have the highest degrees of specialization and understanding of the data itself. That puts TBS in the perhaps unenviable position of being a middleman, managing relationships and queries between those who are using the data and those who collected the data.
Some thinking needs to go into that issue, and perhaps the location of the data portal should be thought about. Some people I spoke with indicated that NRCan would perhaps seem a more logical home for the data portal, given that the majority of the data sets do belong to them and they have a high degree of expertise in data collection, presentation, and analysis.
Another issue with the portal is the search function. Users quite widely indicated that it's not particularly user-friendly or well designed, and they thought that, at a minimum, the portal's search functionality should be improved.
My second-last point is that there is a growing data divide that's being created right now. Releasing data sets alone really doesn't have that much potential. It's not going to lead to any kind of significant change. You need people who can take the data and use the data. That requires expertise; it also requires resources. The raw format that the data sets are released in really does privilege data scientists, people who have high degrees of expertise in the use of raw data. Many others, non-governmental organizations, for example, would benefit greatly from the data sets and the information, but they're not able to use them because they lack the resources and they lack the expertise. If we in Canada widely acknowledge that open data is important, then we need to think about potentially developing a mechanism for addressing that data divide and making sure that the data is accessible to a wider range of people than just data scientists and others with a high degree of expertise.
The final point that I'll make today is that open data is not open government. There has been a lot going on with open data, including the important study that you are undertaking with this particular committee. It's where a lot of other governments as well have placed their energy. We're certainly not alone in Canada in focusing on open data.
While there is certainly room for improvement, we have done some good things when it comes to open data. To be focused and careful of time, I didn't go over all of those good things, but I'm happy to speak to them during the questions. My worry, after talking to a range of stakeholders and conducting the Canadian evaluation of our open government progress, is that open data is becoming privileged at the expense of other areas of open government and some of the other commitments that we have made in our OGP action plan to the international community and to Canadians.
I'll close there. As I said, I'm happy to answer questions. I've provided the clerk with the link to a full copy of the report, and I can provide any other research that you might find useful.
Thank you very much.