Evidence of meeting #21 for Government Operations and Estimates in the 41st Parliament, 2nd Session. (The original version is on Parliament’s site, as are the minutes.)

A recording is available from Parliament.

Also speaking

Ron McKerlie  Deputy Minister, Open Government, Ministry of Government Services, Government of Ontario
Robert Giggey  Open Data Lead, City of Ottawa
Harvey Low  Manager, Social Research Unit, Toronto Social Development, Finance and Administration Division, City of Toronto
Don Lenihan  Senior Associate, Public Policy Forum, As an Individual
Gordon O'Connor  Carleton—Mississippi Mills, CPC
Marc Foulon  Head, Open Government, Ministry of Government Services, Government of Ontario

9:40 a.m.

NDP

Anne-Marie Day NDP Charlesbourg—Haute-Saint-Charles, QC

Thank you, Mr. Chair.

I thank the witnesses very much for being here.

You know that according to the Charter and the G8, data must be free, universal and accessible. That is part of the basic tenets.

Setting up such a structure means there are initial costs. Can you tell us how much the City of Toronto and the Ontario government invested?

9:45 a.m.

Deputy Minister, Open Government, Ministry of Government Services, Government of Ontario

Ron McKerlie

I can start if you like.

On the question regarding how data has to be free and what the costs are of opening up data, our costs depend on the size of the data set and what has to be done in the conversion. They're relatively modest. They're being borne by the ministries and they're being absorbed as part of their normal operations. We're looking now at changing systems. New systems are being installed so that the data element can be stripped out, cells can be totally compressed if they need to be or they can be eliminated altogether if the sample sizes are too small. They can be auto-published to a portal. That's the vision. On the interim costs, is there any sense of the magnitude?

9:45 a.m.

Head, Open Government, Ministry of Government Services, Government of Ontario

Marc Foulon

As Ron McKerlie said, right now there are a lot of manual costs. That's why we did the voting process, to try to limit those costs and find out what people care about the most. Internal staff are manually releasing data that people find of high value. There is some cost associated with that, obviously, in time and resources. In the longer term, the aim is to build our net new IT applications and solutions so that they generate open data automatically, which will be very low cost for us in the future.

9:45 a.m.

NDP

Anne-Marie Day NDP Charlesbourg—Haute-Saint-Charles, QC

Is there some complementarity, some harmonization between the open data portal of a large city and that of the province or the federal level so as to maximize its use?

9:45 a.m.

Deputy Minister, Open Government, Ministry of Government Services, Government of Ontario

Ron McKerlie

Yes.

I think the biggest opportunity is around a common search feature. I have a slightly different view from what was expressed earlier. I don't think we should replicate the data all over the place. If we do that we're going to have huge storage costs, plus the complexity of trying to keep track of the original source of that data.

I think we should jointly develop a common search engine, so wherever that data resides we can search federal government data, provincial data, municipal data. There's no wrong front door into finding that data. I think that would give us some huge economies of scale.

I think there are some other savings we can create, if we can get to standards. It will make it easier for the research community and the developers if we have standard formats, and if we agree on meta tags, for example.

I think those are areas where we could save, perhaps not money for us immediately but money for the users down the road.

9:45 a.m.

NDP

Anne-Marie Day NDP Charlesbourg—Haute-Saint-Charles, QC

Do you have any statistics on the use made of open data? For instance, are the province's or city's data accessed on a regular basis? What type of data is being asked for? Are they mainly local data such as data concerning the weather, the subway, schedules, or are they data related to geomatics, for instance? Do you receive a lot of requests from businesses?

9:45 a.m.

Deputy Minister, Open Government, Ministry of Government Services, Government of Ontario

Ron McKerlie

The top voted data sets right now are transportation data, finance data, and health and education data. Those would be the top ones.

Am I missing anything?

9:45 a.m.

Head, Open Government, Ministry of Government Services, Government of Ontario

Marc Foulon

In addition to those would probably be general government types of services, so procurement and HR information. A couple of other higher ones are freedom of information statistics, as well as general information about the open government directory of staff. That is in the top 10.

9:45 a.m.

Deputy Minister, Open Government, Ministry of Government Services, Government of Ontario

Ron McKerlie

In terms of actually using the data, we've had a number of applications created with our data. One is called iamsick.ca. The developers took information on hospitals, emergency clinics, pharmacy locations, hours and language of service, and a host of other things. Some of it came from Stats Canada and some from the province, and some came from the City of Toronto. They created an application. If you're moving into a neighbourhood and need medical care or attention, you can find a pharmacy or an after-hours clinic or a doctor or an emergency room.

There are a number of other applications that have been created from our data.

I would say that the usage is still modest, though. It's not huge volumes; it's much smaller volumes. A lot of it is being used to answer questions.

One of the applications took water sampling data and source water data, Google mapping, and Stats Canada population data. Now you can click on it and find out the source of your drinking water anywhere in Ontario, what percentage of the population uses that water, and the recent test results. This is post-Walkerton and the problems we had with water quality in Ontario.

It's smaller volumes.

Thank you.

9:50 a.m.

NDP

The Chair NDP Pierre-Luc Dusseault

Thank you.

I will have to stop you.

Perhaps Mrs. Day can discuss cities a little later during the meeting.

We will now hear from Mr. Aspin for five minutes.

9:50 a.m.

Conservative

Jay Aspin Conservative Nipissing—Timiskaming, ON

Thank you, Mr. Chair.

Welcome, gentlemen, and thank you for assisting us with our study.

According to the information we've been given, the Province of Ontario and the City of Toronto, and I guess most of the larger municipalities, are working together on open data in a group called Public Sector Open Data.

I wonder, are the formats that are being developed by the province and participating municipalities in this forum consistent with those being offered by the federal government? I'd like each of you to comment on the level of consistency or where improvements need to be made, starting with Mr. McKerlie.

9:50 a.m.

Deputy Minister, Open Government, Ministry of Government Services, Government of Ontario

Ron McKerlie

I'm going to delegate that to Marc. I think he has that level of detail.

Thanks.

9:50 a.m.

Head, Open Government, Ministry of Government Services, Government of Ontario

Marc Foulon

Thank you.

Yes, as you mentioned, we sit on the PSOD, in the Province of Ontario, as well as some other committees with the federal government, other provinces and municipalities, and have some of those conversations. I'd say there's not a set standard or metadata that's out there right now that is being used across all the different levels of government. That's something that we do need to improve on and come together on to put something in place. Even within the Ontario government, with our various ministries, sometimes it is difficult to have some common standards in place.

Our catalogue is an example. We have five or six standard mandatory metadata categories, and a few others that are optional or that, depending on the data set, can use certain other characteristics. We have tried to put that in place for most of our data over the last few years, but it definitely needs to happen across different levels of government, not only within Canada but even, say, in North America. That's something we should look at, so that researchers using the data, whether internal to government or external, know they're using apples-to-apples data and can compare them, match them up, and use them as part of their evidence-based policy work.

9:50 a.m.

Open Data Lead, City of Ottawa

Robert Giggey

If I can add, it's difficult to get common data formats for particular topics across the levels of government simply because in many cases you're working with different types of data. There'll be a few circumstances where it's the same. It could be around transportation. If you're reporting on incidents or traffic on federal or provincial highways or municipal roads, you could do that. But in a lot of cases we're looking at topics separately, so this is difficult to do.

One of the biggest gaps is around all the cities themselves. For those using the data, one example is governance: people who are looking at how governments are run by looking at minutes, agendas and voting records. In the federal government you have one place to look at data formats. In the provinces, you have a few, and you can work on getting that data out so it's usable, but once you get down to the municipal level, if you're trying to show what's happening at all three levels of government, you now have thousands of cities to work with in trying to get a common format.

The cities themselves have a lot that we need to do to help with using common formats. One thing is to look at what the rest of the world is doing, because everybody's tackling this and trying to solve it. In terms of interoperability some global standards have developed, and we have to look toward that.

It's still early. I think municipalities have a lot of work to do to get common data, as well as to work with the other two levels to see how, even though there are different types of data on the same topic, we can get them to work together as cleanly as possible.

9:50 a.m.

Manager, Social Research Unit, Toronto Social Development, Finance and Administration Division, City of Toronto

Harvey Low

In my opinion, there are two types of standards. There's a technical standard, and then there's a policy standard. Technical standards deal with different formats of data, whether they're mapping-format data sets or Excel or that type of thing. I think those are probably less of an issue.

Of greater importance are the policy standards. For example, when we work with low-income groups, there are many different definitions and many different measures of poverty. When you have different levels of government, even ministries within governments, releasing data sets called poverty, there needs to be a consistent metadata set that defines what all those indicators mean. I would say that's probably more of the challenge.

Finally, the other thing that we haven't really considered, haven't spoken about, is information technology. When I mentioned earlier that we release data sets on numerous different portals at the city, we've used a centralized data warehouse and we've used centralized geographic spatial software to make sure that the data are consistent and of the same vintage throughout the city, so the other standard there is geography, and that one is probably the easier one.

9:55 a.m.

NDP

The Chair NDP Pierre-Luc Dusseault

Thank you. Mr. Aspin's speaking time has expired.

Mr. Dubourg now has the floor for five minutes.

9:55 a.m.

Liberal

Emmanuel Dubourg Liberal Bourassa, QC

Thank you, Mr. Chair.

It is now my turn to greet you and to thank you for the statements you made. They were very interesting. We have talked a lot about transparency and accessibility. Accessibility has been discussed because we know that data is for everyone, be they youths, children, experts or specialists.

I agree entirely with Mr. Low when he says that people do not necessarily pay attention to the various jurisdictions of government. They need information, and they look for that.

I would like to know if you have had any concerns regarding the accuracy of these data. There are many users, be they students, experts, researchers or historians. Did you have any concerns of that order when you worked on setting up your platform?

9:55 a.m.

NDP

The Chair NDP Pierre-Luc Dusseault

Fine, in the same order as usual.

9:55 a.m.

Deputy Minister, Open Government, Ministry of Government Services, Government of Ontario

Ron McKerlie

Thanks very much for the question. It's a great question.

What we know for sure is that all data has errors in it. Some of them were created, some of them occurred when the data was collected, and some of them were created and built in as the data was put together. What we've found, though, is that the more sets of eyes on the data, the higher the quality becomes. So it's actually improved as we've started to open it up, because as people, particularly public servants, start to look at it, they question missing data and they question anomalies that don't seem to make sense. So it's improved the quality of the data. Yes, we do have concerns, absolutely, that the data isn't perfect, but we understand that a lot of the people working with it are giving us some grace in terms of understanding that it won't be perfect, and the quality is improving as more sets of eyes look at it.

9:55 a.m.

Liberal

Emmanuel Dubourg Liberal Bourassa, QC

I have another question.

9:55 a.m.

NDP

The Chair NDP Pierre-Luc Dusseault

Just a minute, I think that Mr. Low would also like to reply to your first question.

9:55 a.m.

Liberal

Emmanuel Dubourg Liberal Bourassa, QC

Go ahead Mr. Low, please.

9:55 a.m.

Manager, Social Research Unit, Toronto Social Development, Finance and Administration Division, City of Toronto

Harvey Low

Yes, basically the issue that we have in quality of data.... Ron is completely correct. There is no perfect data. I think that the solution here is to provide equal resources and opportunities to support federal departments in explaining their data, providing a proper metadata base or definition and one that's written in English. Do not have statisticians write it. Having somebody who is aware of communications so that they can explain information in a user-friendly manner certainly goes a long way in increasing the public's trust of data.

The final comment is to begin to leverage and use those departments that understand the data, Statistics Canada being one of them. If people know that it's coming from the national census, we find that there's very little question on the reliability of the federal data. The National Household Survey is a different matter, but the census, certainly, is a reliable source of data and we've heard that loud and clear in the community.

9:55 a.m.

Liberal

Emmanuel Dubourg Liberal Bourassa, QC

I would also like to put a question especially to Mr. Giggey, since he works for the City of Ottawa.

Did the two official languages pose a problem when you worked on open data? Some information may be fully available in English, but only partially available for francophones, for instance. Was that a problem in your case?

10 a.m.

Open Data Lead, City of Ottawa

Robert Giggey

Thank you.

We are I think in a unique position. There aren't actually many bilingual cities or jurisdictions releasing data, so I've actually connected with staff in the Treasury Board on this topic to help try to solve it.

One of the key problems we have is that the operating language at the city, the language that most of our base systems are in, is English. So it can be difficult to translate data. The position we have at the City of Ottawa is that we've made all the metadata (all the information about the data itself, to help with discovery and access) bilingual, because that's somewhat easy to deal with. The data itself comes in whatever language the base systems are in. Many of our front-line systems, like recreation and culture, are all being translated by the operations anyway because they are being used for public information. But there is quite a bit of data that currently isn't translated. It's all available to be translated upon request. But in the interest of moving forward and trying to get the benefits out of open data, we've chosen this model. So far I believe it's worked well. No issues have been identified for users so far, and they are able to use technology to translate the information around the data itself.