Evidence of meeting #22 for Government Operations and Estimates in the 41st Parliament, 2nd Session. (The original version is on Parliament’s site, as are the minutes.)

A recording is available from Parliament.

Also speaking

Michael Chui  Partner, McKinsey Global Institute, McKinsey and Company
Paul Baker  Chief Executive Officer, Chicago Open Data Institute
Gordon O'Connor  Carleton—Mississippi Mills, CPC

8:45 a.m.

NDP

The Chair NDP Pierre-Luc Dusseault

We will begin our 22nd meeting.

Today, we are hearing from two witnesses by videoconference. We have a connection with Michael Chui, partner at the McKinsey Global Institute, joining us live from Miami, in the United States. Afterwards, we will establish a connection with the other witness, Mr. Baker, the Chief Executive Officer of the Chicago Open Data Institute.

As usual, we will let our witnesses make their presentation for our study on open data. Afterwards, the committee members will have an opportunity to put questions to the individual of their choice.

Mr. Chui, thank you for joining us.

You will have 10 minutes for your presentation.

Go ahead.

8:45 a.m.

Michael Chui Partner, McKinsey Global Institute, McKinsey and Company

Thank you for the honour of being able to spend some time with you. Even though I'm in Miami now and I live in San Francisco, I actually was privileged enough to have grown up in Burlington, Ontario. It is truly an honour to be able to interact with this committee. I did prepare a few brief remarks, which I'm happy to share with you, but I'm actually looking forward to the conversation.

As introduced, my name is Michael Chui. I'm a partner at the McKinsey Global Institute, which is McKinsey and Company's research arm. I lead some of our firm's research on the impact of long-term technology trends. Basically I'd like to share with you a few of the findings from some of the research we conducted.

We published a report in October entitled “Open data: Unlocking innovation and performance with more liquid information”. Clearly, as I think people on the panel are aware, open data has become an increasingly important trend around the world, with over 40 countries having implemented open data portals. While a lot has been written about the importance of open data to unlock transparency as well as accountability in government and public institutions, we really focused on the economic potential that could be unlocked using open data.

Just to explain what we meant when we did our research, we actually viewed open data as being defined or varying across four dimensions.

The first was accessibility, or simply the number of people or the number of entities with access to data. Where more people had access to data, we considered it to be more open.

Second, we also considered machine readability. Of course, almost all data in some form can be machine readable, but some forms are easier to use, easier to process, such as comma-delimited and other formats. That was another dimension that we considered to be important.

Third, we also considered cost. When information is made less expensive, or is free, it's more open. Again, sometimes governments and other institutions implement some sort of cost recovery. We didn't want to say that data was completely closed if a modicum of charge was associated with it.

Finally, the fourth dimension we described involved the rights to use that data, whether it could be redistributed, how it could be processed, etc. Data could be completely unencumbered in terms of legal rights to use, or there could be some restrictions on it. We think that varies along the continuum. We really think that data can be more closed or more open or more liquid, as we described it, rather than just open data and then everything else.
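
The four dimensions described above can be pictured as a small data structure. The following is only a minimal sketch: the 0-to-1 scale, the field names, the equal weighting, and the two example data sets are illustrative assumptions, not anything taken from the McKinsey report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Openness:
    """Hypothetical scores from 0.0 (closed) to 1.0 (fully open)."""
    accessibility: float        # how many people or entities can get the data
    machine_readability: float  # how easy the format is to process
    cost: float                 # 1.0 = free, lower = more expensive
    rights: float               # 1.0 = no restrictions on reuse

    def liquidity(self) -> float:
        # Assumption: a plain average of the four dimensions. The testimony
        # only says data varies along a continuum, not how to combine scores.
        scores = (self.accessibility, self.machine_readability,
                  self.cost, self.rights)
        if not all(0.0 <= s <= 1.0 for s in scores):
            raise ValueError("each score must be between 0.0 and 1.0")
        return sum(scores) / len(scores)


# Two made-up data sets for comparison.
free_csv_portal = Openness(1.0, 1.0, 1.0, 1.0)
paid_scanned_report = Openness(0.5, 0.25, 0.5, 0.25)

print(free_csv_portal.liquidity())
print(paid_scanned_report.liquidity())
```

The point of the sketch is that a data set with a modest charge or some usage restrictions still lands somewhere on the scale rather than being counted as completely closed.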

That being said, what did we find when we looked at the potential economic impact of open data? We looked across seven different sectors of the economy. The sectors include education, transportation, consumer products, electricity, oil and gas, health care, and then various aspects of consumer finance. When we looked across all of those different sectors of the economy and we looked globally, we found that an additional $3 trillion to $5 trillion in impact could be created using open data. These benefits include increasing efficiency, developing new products and services, and even consumer surplus, which is the type of benefit that individual citizens can obtain when they have access to more open data or to applications that use open data.

There are a few other findings. Open data also enhances the impact that big data can produce, which has been another area of study for us. Oftentimes, when you combine data from multiple sources, you can actually derive more value. Some of the ways in which you derive value include increasing transparency, exposing variability, enabling the ability to conduct experiments in the real world, segmenting populations to tailor actions, augmenting or automating human decision-making, and then defining new products and services. Really when we looked across the board, if you think about exposing variability and enabling experimentation, about one-third of all the impact we found came from the ability to benchmark, to compare yourself against others.

We also found that individual citizens stand to gain the most from open data. Over half of the impact we found—again, that's not separate from benchmarking, because you can do individual benchmarking as well—in terms of potential benefits would actually accrue to individual citizens or consumers. We found in fact a very closely related concept to open data, which we described as “my data”. That's where an individual citizen or person has access to data that a government or a company has about them. That was one of the sources of benefits that individuals could have, for instance, my ability to compare my health care outcomes with people who are similar to me.

Open data can also help businesses raise their productivity and create new products and services. Companies clearly benefit from the ability to benchmark both internally as well as externally. Open data can also be used to create more tailored products by providing more consumer insights. Of course, open data also creates new risks around reputation and potential loss of control over confidential information, whether it be personal information or corporate or organizational information.

We also think that governments have a truly central role to play: as a source of open data, where a number of governments have clearly been leading; as a catalyst for the use of open data; as a user of open data themselves; and also as a policy-maker. Clearly, government has a tremendous amount of data that it could make available, and increasingly does.

The other interesting thing is if you go back to the point that I just made, which is that a lot of the benefits actually can accrue to a diffuse set of consumers or individual citizens, if you believe that's true, then in fact government is one of the entities that has the potential to actually speak for that diffuse set of groups rather than any special interest group and thereby implement policies that make the benefits of open data more likely to be captured.

The last point I'd make is this. While making data more liquid, making it more open, is often a necessary action in order to capture some of this value, it's often not sufficient. Other things have to happen: you need to create a vibrant ecosystem of developers who actually use the data to create applications, because most people won't look at the raw data itself; they'll use applications that take advantage of the data. Open data, as a result, often has to be combined with other sources of data. You need thoughtful policies around intellectual property, privacy, and confidentiality. You'll need to invest in technology along with investment in skills. This is clearly one of those areas where we found a tremendous gap between the need for these skills and the actual supply of them.

Standards also have to be developed in order to make data comparable from multiple sources. Then actually releasing metadata, data about data, can make open data more usable.

In closing, the potential benefits of open data truly can be transformative—as we said, it's in the order of trillions of dollars annually on a global basis—but they can often be self-reinforcing. When open data is made available and applications that are useful are actually developed based on the open data, that often encourages more open data to be released and then that cycle continues.

Let me conclude with that. Hopefully that was a helpful tour of some of the research that we've conducted on open data.

8:50 a.m.

NDP

The Chair NDP Pierre-Luc Dusseault

Thank you very much for this presentation.

Since our next witness has not yet appeared, we will begin the question and answer period right away.

Mr. Ravignat, you have five minutes.

8:50 a.m.

NDP

Mathieu Ravignat NDP Pontiac, QC

Thank you for being here.

I would like to hear your thoughts on a few issues.

How do you think the government could ensure that the data it transfers or makes public is adapted for commercial use? As you know, that is one of the Canadian government's objectives. Some witnesses have told us that the data was not quite useful. However, this is the first opportunity we have to talk about the data's commercial aspect. So I would like to know whether you have an opinion on how to go about this.

Thank you.

8:55 a.m.

Partner, McKinsey Global Institute, McKinsey and Company

Michael Chui

There are a few things that we've learned from working with private sector clients. You can think about what you need to do with open data as almost being like marketing and the aspects of marketing which we think are applicable.

Number one is understanding the need that's out there. Just as a marketer wants to understand what product or service to deliver by deriving customer insight it's helpful for the government in this case to understand what the needs are for data out there. Ways to find out about that include convening groups, customer focus groups in this case. It would be commercial focus groups where you would actually ask companies what data would be most valuable to them. In fact, at our firm we've actually brought together groups of business executives and made them aware of some of the open data that was available from governments. They were actually surprised and said that this is something they can actually use. Being able to create that awareness which is the first part of marketing is incredibly important.

Then there's continuing to understand.... Right now open data efforts are often supply driven. You know, there's a bunch of data and we'll just throw it out there. Again, you need that demand signal to come back through an open dialogue with your “customers”.

8:55 a.m.

NDP

Mathieu Ravignat NDP Pontiac, QC

It's fair to say that one of the essential pillars of making data available is that dialogue piece. It's making sure that you connect with stakeholders at various levels to ensure the data is actually useful. It's something that unfortunately we've seen very little of in the open government strategy of this particular government, but that's neither here nor there.

The other question I wanted to ask you was whether or not you've had a chance to compare some of the efforts done by the Obama administration with the Canadian efforts, and whether or not you have an opinion with regard to the quality of both initiatives.

8:55 a.m.

Partner, McKinsey Global Institute, McKinsey and Company

Michael Chui

I really haven't had an opportunity to compare the two initiatives. I don't think I could provide an informed perspective on that.

8:55 a.m.

NDP

Mathieu Ravignat NDP Pontiac, QC

Thank you.

Coming back to the commercial use of data, one of the issues is making sure it's a fair level playing field, that those companies that perhaps get procurement contracts with the federal government don't get some kind of favouritism with regard to access to certain data. You obviously don't want to advantage one particular sector over another.

I wonder if you have any thoughts with regard to ensuring the fairness of the access to data.

8:55 a.m.

Partner, McKinsey Global Institute, McKinsey and Company

Michael Chui

I'm not an expert on procurement, so I'm not sure I could comment on that, but certainly when data is made available broadly through a portal, or what have you, in general, anyone who has access to the Internet or the web is able to access the data. Certainly from a pure access standpoint, I think that type of equality of access can be ensured.

8:55 a.m.

NDP

Mathieu Ravignat NDP Pontiac, QC

Thank you.

8:55 a.m.

NDP

The Chair NDP Pierre-Luc Dusseault

Thank you, Mr. Ravignat.

Mr. Baker just joined us live from Chicago. I think he can hear us.

8:55 a.m.

Paul Baker Chief Executive Officer, Chicago Open Data Institute

It's Paul Baker here. I hear you.

8:55 a.m.

NDP

The Chair NDP Pierre-Luc Dusseault

Good. We will start with your presentation.

8:55 a.m.

Chief Executive Officer, Chicago Open Data Institute

Paul Baker

Sorry I was late. A water main flooded the freeway in Chicago and all the traffic had to leave. It took an hour and 15 minutes for about a 20-minute trip.

9 a.m.

NDP

The Chair NDP Pierre-Luc Dusseault

It's all right. We just heard from Michael Chui, from McKinsey and Company, from Miami. So now we will hear from you, Mr. Baker, from Chicago.

You have 10 minutes, and then we will have questions from the members here at this committee.

9 a.m.

Chief Executive Officer, Chicago Open Data Institute

Paul Baker

I was sent a few questions, or things you'd like to know about. I have a bunch of notes about those. I don't know if you're going to ask questions related to what you sent.

As for my background, I've been active in the open data movement in Chicago for about six or seven years. Chicago has gained a reputation as the open data capital of the United States, and even when I travel internationally, people seem to know about Chicago's open data efforts. Government is responsible for some of that. There are a lot of independent designers and developers who've been pushing for open data, lawyers pushing for open data, organizations like Common Cause.

In the United States, the open data movement was initially about people looking for political transparency, wanting to know who was making political contributions. Much of the initial impetus was because, during the Nixon period, the Watergate burglaries were financed by secret corporate donations. The Nixon campaign had caused Common Cause to be formed as a bipartisan Republican and Democratic group that was in favour of transparency in political donations.

From that, government started collecting data. Computers got a lot more powerful. Data was released. The issue of open data is now not only political transparency, it's also efficiency. It's the idea of government as a platform. Much government data, released as open data, can be used to create businesses like Google Maps or weather reports. Some companies are aiding farmers trying to figure out when they should plant a crop, when it's going to rain, when it's not going to rain, whether they should irrigate, and these kinds of things. There's a lot of efficiency and economic benefits to open data that have come to the fore in the last three or four years as a lot more local and national data has been released.

A couple of weeks ago, the federal government released a bunch of Medicaid paid claim data. There are several other sets of data they're going to release, several other sets of data that have been released. This and other types of data, electronic medical record data, is going to be available to doctors and hospitals and clinics treating Medicaid patients. Much of it's already available. According to the Affordable Care Act, or Obamacare, patients and doctors treating Medicaid patients are required to be able to receive electronic medical records. That's going to change the whole way people are treated.

Genomic data now is being combined with electronic medical record data to do medical studies without actually having to devise an experiment and do blind tests with control groups and that type of thing. You can just look in the data and look for patterns in that data. So, maybe in women treated for breast cancer, some live and some die. You look at the ones who have lived and you look back through how they were treated. You look at their genomic structure, and you look for particular medicines that can treat different types of diseases based on genetic traits and particular drug regimens.

That's a very broad view of what's going on in open data at the federal level. We could maybe get into the city level and the state level a little bit later.

9 a.m.

NDP

The Chair NDP Pierre-Luc Dusseault

Thank you for your presentation.

We have already begun the question and answer period. We will continue with Mr. Trottier for five minutes.

I ask that you specify to whom you are putting your questions.

9 a.m.

Conservative

Bernard Trottier Conservative Etobicoke—Lakeshore, ON

Thank you, Mr. Chair.

Thank you, guests, for being here this morning.

Mr. Chui, I want to ask you some questions about the McKinsey report published in October 2013. You mentioned in your remarks that you focused on seven sectors. In the report, you identified an economic value of $3.2 trillion to $5.4 trillion annually worldwide just in those seven sectors.

How is it that McKinsey chose to focus on those seven sectors as opposed to others? Obviously, there are some other big sectors in a country like Canada: agriculture, fisheries, mining, and even tourism. I suppose the overall economic value could be much greater when you start considering those other sectors.

Also, if you look at only the United States, in the report you identified a value of $1.1 trillion. Given that the size of the American economy vis-à-vis the Canadian economy is about 11:1, would it be a reasonable assumption to say that there's an economic value of about $100 billion in those seven sectors in Canada?

9:05 a.m.

Partner, McKinsey Global Institute, McKinsey and Company

Michael Chui

There are a couple of things.

The first question was how we picked those sectors. Those sectors weren't picked because we thought those were the sectors where the most value would be created, but we wanted a number of sectors that varied across a number of dimensions. You notice some of them are B to C sectors, consumer focused. Some of them are B to B, and they're more business focused. Some of them are products. Some of them are services. Some of them are more public services, such as education, and some of them are very commercial.

Really what we wanted to do was to have a variety, and that's how we chose them really. It was really meant to give us a flavour for how open data could work in a number of different types of sectors.

Clearly, there are lots of sectors we weren't able to do as part of the research, and as you said, that suggests there will be even more value potential there.

In terms of trying to size the approximate potential for Canada, it's probably not unreasonable to use the sort of metric that if Canada's economy is this much smaller than the U.S. economy, then potentially Canada would be in that level of magnitude. Of course I wouldn't put any precision around it, but I think that's reasonable.

The other thing to keep in mind is that these are not GDP statistics, because they include consumer surplus, which is not captured in GDP. It's important to make sure, if you're trying to compare, that you wouldn't say this has this much GDP impact, because, as I said, over half of that impact would be a measure that is not captured by GDP. I think that's a flaw in the GDP statistic as it turns out.
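
The back-of-the-envelope scaling discussed in this exchange can be written out as follows. This is a rough sketch only: the $1.1 trillion figure and the 11:1 ratio are the numbers quoted in the question, the constant and function names are made up for illustration, and treating "over half" as exactly one half is an assumption used to get an upper bound on the part that GDP statistics could capture.

```python
# Figures quoted in the exchange above.
US_VALUE_USD = 1_100_000_000_000   # $1.1 trillion for the seven sectors
US_TO_CANADA_ECONOMY_RATIO = 11    # U.S. economy is about 11 times Canada's

# Assumption: "over half" is treated as exactly one half.
CONSUMER_SURPLUS_SHARE = 0.5


def scale_by_economy_size(value_usd: float, ratio: float) -> float:
    """Scale a value estimate down in proportion to the size of the economy."""
    return value_usd / ratio


canada_estimate = scale_by_economy_size(US_VALUE_USD, US_TO_CANADA_ECONOMY_RATIO)

# At least half of the value is consumer surplus, which GDP does not capture,
# so at most this much could show up in GDP statistics.
gdp_captured_upper_bound = canada_estimate * (1 - CONSUMER_SURPLUS_SHARE)

print(f"Canada, seven sectors: about ${canada_estimate / 1e9:.0f} billion")
print(f"Of which visible in GDP: at most ${gdp_captured_upper_bound / 1e9:.0f} billion")
```

As noted in the answer, the result is an order of magnitude, not a precise figure.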

9:05 a.m.

Conservative

Bernard Trottier Conservative Etobicoke—Lakeshore, ON

Thanks for that clarification.

Some witnesses we've had before this committee talked about the sources of value. There's something of a focus on development of new applications, but others have said that really the true source of value, when it comes to open data, is the sense of removing friction in interactions between different stakeholders in the economy.

You could go with the example of the old days, even within the same institution or within the same government, for example. In the old days of making a request for data, you had to wait several days for that data request to be processed, then receive the data, and then translate that data. There's lots of inefficiency built into the older models, and with open data, things are able to move that much more quickly.

Is that a way to capture the primary source of the value in the McKinsey report that was done last fall?

9:05 a.m.

Partner, McKinsey Global Institute, McKinsey and Company

Michael Chui

Yes. As we took a look at it, in fact, we found many different sources of value. Some of them are creating new products and services. Some of them are creating more efficient markets or more efficient ways to get to information.

One of our interesting findings was that while a lot of open data government efforts allow third parties to have access to government data, sometimes they actually allow other government agencies access to government data, which was more challenging.

I would say, as I said before, that about a third of the impact we found comes from the ability to compare, to do benchmarking, to compare your performance as either an individual or a company or a government against that of others, whether in procurement, in operations, or otherwise. There's a lot of benefit from that, and that really comes from just the ability to understand what best practices are in a different place.

9:05 a.m.

Conservative

Bernard Trottier Conservative Etobicoke—Lakeshore, ON

What about you, Mr. Baker? In some of the work you do, what would you say are the main sources of economic value that are generated by open data?

9:05 a.m.

Chief Executive Officer, Chicago Open Data Institute

Paul Baker

Well, to get back to the previous point, I was recently at a meeting in the City of Chicago with people who work on the open data portal there. I'll give you an example. The department of housing has data.... I mean, there's housing data in seven different departments. Often the right hand doesn't know what the left hand is doing. One of the biggest consumers of Chicago data is actually other departments within the City of Chicago, which is important.

Washington, D.C. was the first city to release substantial data from almost every department, in 2006. That was the first major effort. They studied the actual users of the data. Between 60% and 70% of the people who came to the website and downloaded the data were actually members of city departments. A lot of city departments were afraid to release data initially, but then they ended up finding out it was so much more efficient to be able to get data that they didn't know was related to what they wanted to do from other departments. That's definitely one of the most immediate benefits.

Once people within government who are reluctant to release data and make their data available see that it has a lot of benefits for them also, it reduces the resistance tremendously. So within government, it has a very immediate value.

In terms of efficiency, I'll give you one example. We were working with the housing department. They have section 8 housing vouchers where basically low-income people can get money to rent apartments, and landlords or people who build apartments can get deferred taxes for eight to ten years in return for renting to low-income people.

It was a paper-based system, so when someone would leave section 8 housing, it might take six months for a landlord to rent the apartment. That's not a very good incentive, if you know you're frequently going to have vacancies and it takes a long time to rent the apartment. As a result, you would have people looking for section 8 housing. It would take a long time for them to find it. At the same time, you would have landlords who would have vacant section 8 housing and couldn't find people to occupy that housing, largely because the different departments dealing with housing data and section 8 housing didn't talk to each other much. It was paper-based.

Now that they have it all machine readable, they can reduce that time to two weeks or a month. It really improves the situation for both the landlord and the person or family who wants housing.

9:10 a.m.

NDP

The Chair NDP Pierre-Luc Dusseault

Thank you. I have to stop you here to turn the floor over to Mrs. Day for five minutes.

9:10 a.m.

NDP

Anne-Marie Day NDP Charlesbourg—Haute-Saint-Charles, QC

Thank you, Mr. Chair.

I want to thank our guests for joining us.

The McKinsey Global Institute's mission is to help leaders in the commercial, public and social sectors make decisions on important management and policy issues.

Mr. Chui, earlier, you talked about the loss of confidentiality and privacy. We know that the public uses the Internet to obtain information. For instance, people may want to obtain all the information about breast cancer, find out whether land is available or for sale, or obtain information about a specific home. They love getting that type of information.

However, they are not as pleased when that data is passed on to random companies. We had a case recently where the personal information of nearly one million individuals was disclosed. A few companies were involved in that.

Has your organization developed any measures to protect citizens in such situations?