Thank you for the honour of being able to spend some time with you. Even though I'm in Miami now and I live in San Francisco, I actually was privileged enough to have grown up in Burlington, Ontario. It is truly an honour to be able to interact with this committee. I did prepare a few brief remarks, which I'm happy to share with you, but I'm actually looking forward to the conversation.
As introduced, my name is Michael Chui. I'm a partner at the McKinsey Global Institute, which is McKinsey and Company's research arm. I lead some of our firm's research on the impact of long-term technology trends. Basically I'd like to share with you a few of the findings from some of the research we conducted.
We published a report in October entitled “Open data: Unlocking innovation and performance with more liquid information”. Clearly, as I think people on the panel are aware, open data has become an increasingly important trend around the world, with over 40 countries having implemented open data portals. While a lot has been written about the importance of open data to unlock transparency as well as accountability in government and public institutions, we really focused on the economic potential that could be unlocked using open data.
Just to explain what we meant when we did our research, we actually viewed open data as being defined or varying across four dimensions.
The first was accessibility, or simply the number of people or the number of entities with access to data. Where more people had access to data, we considered it to be more open.
Second, we also considered machine readability. Of course, almost all data in some form can be machine readable, but some forms are easier to use, easier to process, such as comma-delimited and other formats. That was another dimension that we considered to be important.
Third, we also considered cost. When information is made less expensive, or is free, it's more open. Again, sometimes governments and other institutions implement some sort of cost recovery. We didn't want to say that data was completely closed if a modicum of charge was associated with it.
Finally, the fourth dimension we described involved the rights to use that data, whether it could be redistributed, how it could be processed, etc. Data could be completely unencumbered in terms of legal rights to use, or there could be some restrictions on it. We think each of these dimensions varies along a continuum. We really think that data can be more closed or more open, or more liquid, as we described it, rather than there being just open data and then everything else.
That being said, what did we find when we looked at the potential economic impact of open data? We looked across seven different sectors of the economy. The sectors include education, transportation, consumer products, electricity, oil and gas, health care, and then various aspects of consumer finance. When we looked across all of those different sectors of the economy and we looked globally, we found that an additional $3 trillion to $5 trillion in impact could be created using open data. These benefits include increasing efficiency, developing new products and services, and even consumer surplus, which is the type of benefit that individual citizens can obtain when they have access to more open data or to applications that use open data.
There are a few other findings. Open data also enhances the impact that big data can produce, which has been another area of study for us. Oftentimes, when you combine data from multiple sources, you can actually derive more value. Some of the ways in which you derive value include increasing transparency, exposing variability, enabling the ability to conduct experiments in the real world, segmenting populations to tailor actions, augmenting or automating human decision-making, and then defining new products and services. Really when we looked across the board, if you think about exposing variability and enabling experimentation, about one-third of all the impact we found came from the ability to benchmark, to compare yourself against others.
We also found that individual citizens stand to gain the most from open data. Over half of the potential benefits we found would actually accrue to individual citizens or consumers (again, that's not separate from benchmarking, because you can do individual benchmarking as well). We found in fact a concept very closely related to open data, which we described as “my data”. That's where an individual citizen or person has access to data that a government or a company has about them. That was one of the sources of benefits that individuals could have: for instance, my ability to compare my health care outcomes with those of people who are similar to me.
Open data can also help businesses raise their productivity and create new products and services. Companies clearly benefit from the ability to benchmark both internally as well as externally. Open data can also be used to create more tailored products by providing more consumer insights. Of course, open data also creates new risks around reputation and potential loss of control over confidential information, whether it be personal information or corporate or organizational information.
We also think that governments have a truly central role to play: as a source of open data, where clearly a number of governments have been leading; as a catalyst for the use of open data; as a user itself of open data; and also as a policy-maker. Clearly, government has a tremendous amount of data that it could make available, and increasingly does.
The other interesting thing goes back to the point that I just made, which is that a lot of the benefits actually can accrue to a diffuse set of consumers or individual citizens. If you believe that's true, then in fact government is one of the entities that has the potential to actually speak for that diffuse group, rather than for any special interest group, and thereby implement policies that make the benefits of open data more likely to be captured.
The last point I'd make is this. While making data more liquid, making it more open, is often a necessary action in order to capture some of this value, it's often not sufficient. Other things have to happen. You need to create a vibrant system or ecosystem of developers who actually use the data to create applications, because most people won't look at the raw data itself; they'll use applications that take advantage of the data. Open data also often has to be combined with other sources of data. You need thoughtful policies around intellectual property, privacy, and confidentiality. You'll need to invest in technology along with investment in skills. This is clearly one of those areas where we found a tremendous gap between the need for these skills and the actual supply of them.
Standards also have to be developed in order to make data comparable from multiple sources. Then actually releasing metadata, data about data, can make open data more usable.
In closing, the potential benefits of open data truly can be transformative (as we said, it's in the order of trillions of dollars annually on a global basis), and they can often be self-reinforcing as well. When open data is made available and useful applications are actually developed based on it, that often encourages more open data to be released, and then that cycle continues.
Let me conclude with that. Hopefully that was a helpful tour of some of the research that we've conducted on open data.