Evidence of meeting #13 for Government Operations and Estimates in the 41st Parliament, 2nd Session. (The original version is on Parliament’s site, as are the minutes.)

A video is available from Parliament.

Also speaking

Corinne Charette  Chief Information Officer of the Government of Canada, Treasury Board Secretariat
Stephen Walker  Senior Director, Information Management Decision, Chief Information Officer Branch, Treasury Board Secretariat
Gordon O'Connor  Carleton—Mississippi Mills, CPC
Sylvain Latour  Director, Open Government Secretariat, Treasury Board Secretariat

8:45 a.m.

NDP

The Chair NDP Pierre-Luc Dusseault

Committee members, good morning. This is our 13th meeting. As you know, this is our first meeting on our study of the government's open data practices.

We have Treasury Board representatives with us this morning. We welcome Ms. Charette, Chief Information Officer of the Government of Canada, as well as Mr. Walker and Mr. Latour.

I will now give you the floor and I thank you for being here. A little later during the meeting, the members will ask you questions about your presentation.

Ms. Charette, you have the floor.

8:45 a.m.

Corinne Charette Chief Information Officer of the Government of Canada, Treasury Board Secretariat

Good morning and thank you very much, Mr. Chair.

It is a great honour to be here before the committee to speak about our success and our work on open data for the government.

I'm very pleased to be here with my two colleagues to talk about open data. I'll introduce Stephen Walker, who is the senior director for our information management policy sector as well as for open government at TB Secretariat. With him is Sylvain Latour, who is a director of our Open Government Secretariat at TBS.

The way we propose to cover the material this morning is that we have a presentation in two parts, and we propose to have a demo. We will go through the first part of our presentation.

You have in front of you a presentation which, I think, gives a good summary of the key concepts concerning open data.

We'll start off with essentially a primer on open data, what the key concepts are, and then we will stop and do a demonstration. You've noticed the screens in the room. We'll have a live demonstration. Stephen and Sylvain will go through our actual open data portal and show you some examples of the data and how the portal works. Then we'll revert to the presentation to give a summary of what different initiatives are going on within the federal government and with our colleagues across Canada in other jurisdictions, and in fact, on our initiatives internationally on the open data front. Of course, we'd be delighted to answer whatever questions the committee has.

That's what we propose by way of the three-section approach. Before I start into the first part of the presentation, I would like to say that we've just completed a very exciting weekend. On February 28, Minister Clement launched the Canadian Open Data Experience, which is an appathon challenge that brought together, finally, 927 registered participants from across Canada, from universities in all provinces across Canada, to try to see what kinds of applications they could develop using Canada's open data information published on our portal.

It was a very exciting weekend, and at the end of it, preliminary reports suggested that over 100 different apps were developed and will be validated and vetted and be the subject of tough competition. The finale of CODE will be March 28, in Toronto, where the 15 finalists will review their apps with the judges. The finalists will be awarded a prize.

This is very exciting because this would be our first national CODE appathon. Different provinces and cities have had a few, and there have been a number of efforts across Canada, but this is the first on a pan-Canadian basis. The success of CODE is a testimony to the enthusiasm and interest in Canada's open data portal and the information that we make available to Canadians.

With that, I will go into the presentation and hopefully help to demystify this. We'll be doing section 1.

Page 3 is titled “Open Data Fundamentals”. I apologize that some of you may be well aware of this, but we weren't sure so we thought we'd bring everyone to a certain level of knowledge.

So what is raw data?

Raw data is machine-readable data at the lowest level of integration that can be reused alone, or mashed up, as the term is, with other data in innovative ways. The government either generates or collects and aggregates a vast amount of raw data. The best example of raw data would be weather data that we collect through sensors and radar and a variety of other means. We turn that into raw data, numerical data that is available for further processing and manipulation.

So what is metadata? Metadata is data about data. Metadata is key to the potential of open data. Without metadata, the vast numbers of data sets and information that are available are not as useful.

It's very important to describe the contents of a data set and to describe the specific kinds of information in each field of a data set that is presented, so that when application developers go to the data set, they know they're finding the right data set with the right kind of data and they know how to interpret the different fields. That's an important part of using the data effectively. In Canada, making our data available in an open data portal first involves producing metadata in both official languages so that app developers can quickly understand what the nature of the data set is and can use it appropriately.

Finally, what is open data? Open data is the practice that takes the raw data and the metadata and makes it available through a portal, as is the case of data.gc.ca. It allows users to search through the portal for the right data sets and allows them to browse and then to download the data in machine-usable formats so they can develop programs and information systems that can manipulate it and produce other uses for it and greater advantages.

The open data movement is quite well developed today. In October 2013, McKinsey Global reported that the potential for open data to generate economic value is significant. This is McKinsey's view. Certainly, through open data efforts in the U.S., in the U.K., in Canada now, and all over the world, we've seen the rise of many, many businesses through the generation of apps that basically use open data and are now widely available through different online stores and so on. Certainly, all of the large consultancies, including Deloitte, speak to the fact that data is the new capital of the global economy, and the ability to harness the vast amounts of data that we do generate is really a large potential for Canada and for society as a whole.

Just to give you a recap of the history, in Canada we have long been aggregators and generators of data. In fact, the concept of open data started around 1995 with the important stores of geophysical and environmental data that we already collect and manipulate through NRCan and Environment Canada.

In California, of course, in the U.S. in 2007, open data started to become an important movement. In fact, in President Obama's first term, there was really the first important national foray, I guess, into open data, with his mandatory policy on the release of open data. The U.S. launch of that direction certainly stimulated open data movements in the U.K. and internationally. Certainly, we watched in Canada and also thought that this was a valuable movement to embrace. It's really a movement that has grown very quickly, and it is, certainly in Canada, stimulated quite a bit by the work by our cities—cities are very active in open data in Canada—as well as by the provinces and by us in the federal government.

Open data is certainly well established internationally. As you may know, the Open Government Partnership, first launched in 2011 by the U.S. and Brazil as co-chairs, was a strong platform for further developments in open data and making governments accountable, open, and responsive to citizens. Similarly, the World Bank has opened its data, knowledge, and research, and is a strong supporter of open data and of all our efforts.

The Open Knowledge Foundation is a civil society organization dedicated to promoting open data and open content. The OECD has also embraced open data and was present at the 2013 Open Government Partnership conference in the U.K.

Certainly the World Wide Web Foundation is, of course, a strong believer in open data.

Just to give you a capsule of open data in Canada, we're quite pleased with Canada's progress on this front. Four provinces—British Columbia, Alberta, Ontario, and Quebec—have open data portals, as do over 30 cities. In fact, certainly Vancouver was one of the first leaders in open data in Canada and continues to be very dedicated to that, but we're pleased with all of the municipal efforts, including the City of Ottawa, which is also working hard at open data.

Page 8 just contrasts what it was like, how civil society could access the data sets the government created and aggregated and made available, prior to open data. Before open data, the government was already publishing data, but in a different way and in a much smaller and less accessible way. Certainly weather data from Environment Canada has been available for some time, as were maps from Natural Resources Canada.

But what you see on the diagram on the right is one of the fundamental issues of the problem. Each individual department collected and prepared data and made it available on their own individual websites, but not always prominently, often without sufficient, or if you will, standard metadata that described the contents, and not always with appropriate search engines to access it.

So from a user's perspective, it wasn't easy to answer the question of what kind of data is available from the government on topic A or on topic B. The users would routinely have to go through multiple sites, go quite deep into the sites, and then the data was not necessarily in machine-readable format. So while they could visualize it, they couldn't really use it and create an information system.

Finally, an additional issue that users had to tackle at the time was that every individual website made the data available under slightly different licensing terms. The licensing terms are very critical to open data, and the ability to have an open licence that is recognized across Canada, that makes the data available for reuse without restriction on the same terms, is really key.

So that was the situation of data before open data.

Starting in 2009 we started to tackle these questions, and in fact started working on our first view of the licence and the first view of a portal that could potentially make this data available.

Why is open data important for the Government of Canada? Certainly we're strong believers that open data helps to reinforce accountability and the government's agenda. Certainly we are convinced that it does generate economic value for Canadians. It is aligned with our digital strategy, as we are working with our colleagues across government, and it is a key catalyst for innovation and science and technology. We are aligned with our international partners, and the success of CODE, I think, supports the fact that Canadians are equally aligned with it.

Just to highlight the key milestones from a government perspective on open data, in March 2011 the government announced our first open government initiative, and at the time, our first open data portal. That was our first pilot. We launched it with much fewer data sets and with the first version of the licence.

In April 2012 Canada joined the international Open Government Partnership formally. We published our first action plan on open government at that time. The action plan on open government includes, of course, a number of commitments on open data.

In June 2013 the Prime Minister formally adopted the Open Data Charter with other G-8 leaders at the Lough Erne Summit in Northern Ireland.

Just to recap on this part of the presentation, and before we go to the demo of the portal, I'll just say that we have continued to work hard on open data since our joining of the Open Government Partnership.

In fact, this June we launched the second generation open data platform. We now have about 200,000 data sets from 27 departments. We launched with six departments and their data sets. Our search capability is state-of-the-art and we have incorporated social media features onto the site, so we're very pleased with our new portal.

In terms of GC resource management data, the expenditure database was launched in April 2013 to provide Canadians with financial information on departmental spending over the last three years, and we continue to add data sets through all topic areas.

We are working hard right now on a directive on open government, so this will be policy that will help departments and agencies to create a better inventory of their data assets and the information to be published, and provide an implementation timeline for them to achieve this. That will be an important part of our open government action plan commitments, and we're hopeful to see that in the new fiscal year.

Finally, our new open government licence, the second version of which was issued last June, is aligned with a Creative Commons licence. It's plain language. It clearly states the conditions for the reuse of data and aligns with all international best practices.

That's a quick primer on open data. Before we go to the demonstration, would you like to ask any questions?

I am wondering if the committee members would like to ask questions.

9 a.m.

NDP

The Chair NDP Pierre-Luc Dusseault

Are there any questions from committee members before we move to the demonstration of the site?

9 a.m.

NDP

Anne-Marie Day NDP Charlesbourg—Haute-Saint-Charles, QC

I have just one question and it concerns a word that is used in the supplementary goal No. 3 for 2015. In French, the word “mappage” is used. What does this mean?

9 a.m.

Chief Information Officer of the Government of Canada, Treasury Board Secretariat

Corinne Charette

To what page are you referring?

9 a.m.

NDP

Anne-Marie Day NDP Charlesbourg—Haute-Saint-Charles, QC

I am looking at the appendix of the G8 Open Data Charter. At the third point for 2015, it reads as follows: “Contribuer à un exercice de mappage de métadonnées du G8”. The word “mappage” is often used in the presentation in French. I have never seen this word before.

9 a.m.

Chief Information Officer of the Government of Canada, Treasury Board Secretariat

Corinne Charette

It is a word that was used in the translation of the presentation. What it means here is that we need to assign metadata to different data sources and that international standards apply to the metadata that will be used.

Do you want to add to the concept of the G-8?

9:05 a.m.

Stephen Walker Senior Director, Information Management Decision, Chief Information Officer Branch, Treasury Board Secretariat

Further to Corinne's point, although metadata is very important to everybody who thinks that their goal is to provide metadata, especially from the public sector, many jurisdictions have developed their own approach to metadata.

If you're a user of open data or a developer, chances are you're going to want to bring data in from more than one jurisdiction. If we're all using different metadata, it can make that very complicated for the individual user.

Our goal is to work with other jurisdictions to map our metadata against each other in order to be able to provide potential users with easy-to-use tools that will make the data from different places more interoperable, more comparable to each other.
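
The mapping exercise Mr. Walker describes can be pictured with a minimal sketch. It assumes two jurisdictions that use different field names for the same concepts; every field name and value below is hypothetical and is not taken from any real portal's schema.

```python
# Hypothetical mappings from each jurisdiction's field names to a shared vocabulary.
FIELD_MAP_A = {"title": "title", "publisher": "provider", "issued": "release_date"}
FIELD_MAP_B = {"name": "title", "organization": "provider", "date_released": "release_date"}


def to_common(record: dict, field_map: dict) -> dict:
    """Rename a jurisdiction's metadata fields to the shared vocabulary.

    Fields that have no entry in the mapping are dropped.
    """
    return {field_map[key]: value for key, value in record.items() if key in field_map}


# Invented metadata records describing comparable data sets.
record_a = {"title": "Air quality", "publisher": "Jurisdiction A", "issued": "2013-06-01"}
record_b = {"name": "Air quality", "organization": "Jurisdiction B", "date_released": "2013-07-15"}

common_a = to_common(record_a, FIELD_MAP_A)
common_b = to_common(record_b, FIELD_MAP_B)

# After mapping, both records expose the same field names and can be compared directly.
same_fields = set(common_a) == set(common_b)
```

Once both records share one vocabulary, a user combining data from several jurisdictions can search and compare them with a single set of field names.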

9:05 a.m.

NDP

The Chair NDP Pierre-Luc Dusseault

Gordon O'Connor.

March 4th, 2014 / 9:05 a.m.

Gordon O'Connor Carleton—Mississippi Mills, CPC

You're using the term “metadata”, and I wonder if you can give an example or two of metadata.

9:05 a.m.

Senior Director, Information Management Decision, Chief Information Officer Branch, Treasury Board Secretariat

Stephen Walker

Absolutely.

Let's imagine we have a data set that was just crime statistics. The metadata would provide us with information on who the provider of that data was, so which department; whether or not there was a specific program or service within the Government of Canada that this data was created to support; the date of release of that data; a description of the data so you wouldn't necessarily have to go into the data to find out exactly what it contained, which is important because some of the data sets are very, very large; and the frequency of the data, so how often it is published and renewed.

This is the kind of information that goes into the metadata.
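
As an illustration of the fields Mr. Walker lists, a metadata record for a crime statistics data set might look like the following minimal sketch. The class name, field names, and all values are assumptions made for illustration; they do not reproduce the actual schema used on data.gc.ca.

```python
from dataclasses import dataclass, asdict


@dataclass
class DatasetMetadata:
    """Descriptive tags about a data set, not the data itself."""
    title: str
    provider: str          # which department released the data
    program: str           # program or service the data was created to support
    release_date: str      # date of release, written as an ISO 8601 string
    description: str       # summary, so users need not open a very large file
    update_frequency: str  # how often the data is published and renewed


# Hypothetical record; every value here is invented for illustration.
record = DatasetMetadata(
    title="Crime statistics",
    provider="Example Department",
    program="Example Program",
    release_date="2014-03-04",
    description="Illustrative description of a crime statistics data set.",
    update_frequency="annual",
)

# The record holds only descriptive fields; the statistics themselves live elsewhere.
metadata_fields = sorted(asdict(record).keys())
print(metadata_fields)
```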

9:05 a.m.

Carleton—Mississippi Mills, CPC

Gordon O'Connor

I'm not going to divert you into another topic, but there have been issues recently with CSE collecting metadata. What you're describing, basically, are all the details related to a certain subject area? Is that right? Metadata is basically the whole picture—everything.

9:05 a.m.

Senior Director, Information Management Decision, Chief Information Officer Branch, Treasury Board Secretariat

Stephen Walker

Metadata would be more of a set of descriptive tags to describe the data, but not the actual data. For example, it would more likely be a set of factors; so time, frequency, title, provider, not the actual data that was held within the data set itself. So it's where it comes from, who it comes from, but not necessarily what the information is.

9:05 a.m.

NDP

The Chair NDP Pierre-Luc Dusseault

Mr. Trottier, you have the floor.

9:05 a.m.

Conservative

Bernard Trottier Conservative Etobicoke—Lakeshore, ON

Thank you.

Madam Charette, you use the terms “open data” and “open government”. I can see examples where open data would have little to do with open government; for example, publishing weather data isn't really about open government. So could you describe the relationship between these two concepts, which in fact are actually two different initiatives even within the Government of Canada, around open government and open data?

9:05 a.m.

Chief Information Officer of the Government of Canada, Treasury Board Secretariat

Corinne Charette

To us they're certainly interrelated and in fact our open government action plan has three streams of activity that we committed to in April 2012. One of them is open data, one of them is open information, and one of them is open dialogue. In fact, the open government thrust is about making information and data widely available to Canadians and civil society so that they can use it for their own benefit and derive economic advantage, specifically in the case of open data, foster engagement, and generally contribute to society.

For us, open data is one of the three key streams of activity in our open government action plan. In fact, most of the other countries that are part of the OGP have similar open data streams of activity in their own open government action plan: the U.K., the U.S., and many others. In fact, most of the governments have published action plans that reflect those three streams of activity, because they are separate yet complementary.

9:05 a.m.

NDP

The Chair NDP Pierre-Luc Dusseault

Mr. Byrne, you have the floor.

9:05 a.m.

Liberal

Gerry Byrne Liberal Humber—St. Barbe—Baie Verte, NL

I think Mr. O'Connor touched on something, as did Mr. Trottier, about what is it exactly that we're talking about here? I think there is a presumption when people hear the term “open data”, that it implies the essence of open government. The two, as you clearly defined, are not the same. As I understand it, open data is that which is already available or should be available to the public through normal access to information channels or through that which is already published but is just opaque in the way people can access it and should be made available in a broader, more accessible format. I'll ask you to comment about that if you could.

In addition, there is an assumption about cost recovery of data. The government, especially through Statistics Canada, in particular, has been providing information on a fee-for-service basis, on a cost recovery basis. I'll ask you, will this impact that by eliminating those costs? Does our membership in the G-8 Open Data Charter imply that those costs will now be eliminated and the information made more accessible?

Finally, how far is this going to go? I was perusing your deck a little while ago and it says, to give one example, “What happened to the fish in my lake?”, and it lists a whole lot of environmental, habitat, and other ecosystem observations and reports that would be available. The Department of Fisheries and Oceans has access to databases of who catches fish and how much, but they don't publish those. It's available under the Access to Information Act if you constantly probe and ask them for it.

Is that the extent of this? Is this where this is going? Will there be a PCO or a Treasury Board mandate that says to the Department of Fisheries and Oceans, you have to start publishing this data on a quarterly basis or on an annual basis and do so in a transparent and predictable fashion?

There are a few questions there and I hope you've absorbed them.

9:10 a.m.

Chief Information Officer of the Government of Canada, Treasury Board Secretariat

Corinne Charette

I'll try to address your questions the best we can. First, on your question about being available free of charge, that's an important part of our open data portal. Our open data assets are available free of charge. In fact, Statistics Canada had formerly been charging for access to their data and with our work with Statistics Canada on open data, or data.gc.ca, they have eliminated their fees.

9:10 a.m.

Liberal

Gerry Byrne Liberal Humber—St. Barbe—Baie Verte, NL

It's across the board now.

9:10 a.m.

Chief Information Officer of the Government of Canada, Treasury Board Secretariat

Corinne Charette

All of the data on our portal is available free of charge, and that certainly is an important part of the open data construct internationally and certainly one that we support. So if our data is available, we make it available. Now of course not all of our data is available all at once simply because there is a lot of work by departments on preparing the metadata and preparing these data sets in the format that makes them reusable and easy to understand, compatible with the search tool on the portal, and so on.

So departments are going at it steadily. In fact, in preparation for the Canadian Open Data Experience appathon, departments really came together and made a great effort at preparing, under a very short timeframe, a lot of high-value data sets that had been requested by users of the portal. They went ahead and did the metadata and prepared them in the right formats and they were available to the appathon developers. So free is definitely an issue; it's definitely part of the construct.

The next point in your question is about privacy, and it's important to note that the open data portal does not present information that is personal in any way. There is no personal information. This is information that is completely impersonal, so to speak, and generally speaking is about topics, but it wouldn't tie a citizen's use to a particular topic or anything like that. We're very concerned, of course, with the protection of privacy across the government and we work very hard to ensure that our open data assets respect that commitment to the protection of personal information.

The last point that you raised (and I may not have gotten all of your questions) is that what is also important about open data is that, from a federal sense, we have a lot of very valuable data assets, but our assets grow in value when they are combined with data assets at the provincial and potentially at the municipal level. So the federal government only collects data in the realms of jurisdiction that we have programs and services in, and the provinces of course have their own programs and services. They collect information of a slightly different nature. To open data enthusiasts, the greatest value is when they can mash up data that is free and that is under a common licence from all levels of government, and potentially one day internationally as the result of our work on the G8 charter.

9:15 a.m.

Liberal

Gerry Byrne Liberal Humber—St. Barbe—Baie Verte, NL

I think I'll clarify that, Madam Charette. I used a poor example because I used the example of a lake, which is a provincial jurisdiction. I was asking about a similar example, however, totally in the federal jurisdiction, which would be offshore marine seacoast fisheries. Just as an example, that would be data that would be totally collected by the federal government and would actually have no provincial government participation. There is information that is collected by the Department of Fisheries and Oceans similar to information that would be collected by Environment Canada and other things.

What my question really is, without muddling the question through the relationship of the federal-provincial jurisdictional issue on that which is strictly a federal issue, is this: will there be a directive that will be given to departments as a result of Canada's participation in this initiative that says if you have data that would normally be collected and disseminated and made available through the Access to Information Act to applicants and you're not publishing it on the portal, you're not publishing it in a way that's transparent and clearly available, you have a responsibility and a mandate to publish it? Is that part of this initiative?

9:15 a.m.

Chief Information Officer of the Government of Canada, Treasury Board Secretariat

Corinne Charette

It is. In fact that is one of our commitments to our first action plan that was published, which is the directive. In fact we're working hard on that to essentially give departments guidance in how to conduct inventories of the data they could publish, how to identify what they have already published, and over what time periods they should be publishing the data sets that they're already collecting or working with as a result of different programs and services.

That's certainly part of it, but of course we have to do this in a way that respects departmental resource constraints and in fact their ability to maintain the integrity of the data, frequency of refresh, having the right curators, if you will, available to ensure they can respond to questions, and so on. Absolutely we are working hard in that regard to give guidance that will require departments to publish more of their data sets.

9:15 a.m.

Liberal

Gerry Byrne Liberal Humber—St. Barbe—Baie Verte, NL

That's very helpful. Thank you.

9:15 a.m.

NDP

The Chair NDP Pierre-Luc Dusseault

I will give the floor to Mr. Martin. Then we can move to the second part of the presentation which is about the demonstration.