Thank you very much.
I'm very happy to be here this morning. I'm always interested and happy to talk to people about things that we're interested in as well.
I have brought along some slides, but unfortunately there wasn't time to have them translated into French, so you're not going to get the benefit of pictures that would probably make some of the things I'm going to say a little clearer. I'm going to have to take a slightly different approach to what I say because of that, but that's fine.
I looked at the questions of interest to the committee, and I thought the best way I could respond is to talk about where our government is going. We've recently announced an open government initiative, of which open data is one component. I'll talk about that and ultimately tell you how we're dealing with that and where we're going. I will also go through some work that we've done in the past with data sharing, which will tell you why we're taking the approach we're taking. From what I'm hearing Mary and others say, we've had the same experiences.
I will do that. I'll talk about our experience. I'm going to talk about the Newfoundland and Labrador community accounts data sharing initiative, from which we've learned an awful lot. It is the foundation for the way that we think about open data. I'm going to talk a bit about meeting user needs, and of course I'm going to end up with a little about where we're actually going.
We've had a long tradition of data sharing, and also supporting the users of our data. The Newfoundland and Labrador Statistics Agency has always been very interested in providing data to people in our 400 communities, scattered around 6,000 miles of coastline. We've given open access to a very wide range of information. We bought the data from Statistics Canada. We've developed it from internal sources, and we just put it out there. Statistics Canada has said that it's a peculiar thing to do; we're buying data and giving it away. However, we've always thought it was very important to do that.
To us, the open data initiative that we see across North America is essentially a focus on things we have always done. We're happy to see that focus. We're very engaged in the idea of open data, and very committed to it. However, it's not something that's absolutely new to us, by any stretch of the imagination.
Regarding the system of community accounts, on my slide I call it the “flagship” of Newfoundland data sharing, and it certainly is. We released it to the public in 2000. It has data for 400 communities and for 200 neighbourhoods in our larger communities. It's actually a fully developed data set, in the sense that everything is documented; you can get back to the source. We have applications there, mapping, and so on.
The other thing we've done.... Dr. Doug May of Memorial University and I have partnered in this, and we've been at it for many, many years. The way we've packaged our data in the system of community accounts is that we use a well-being framework. My slides will be available, I think, and you'll be able to see it. I have a schematic there that shows an overview of that well-being framework. The reason we did that is that we wanted to make the data meaningful to people. The idea of the well-being framework is that we present data that gives statistics and measurements of factors that contribute to well-being in people's lives.
When you look at this at a community level, it's very powerful. People very quickly become experts because they know their communities. If you give them a number, all of a sudden it starts putting a quantitative dimension to basically knowing themselves. We have found that to be very effective.
In working with people at the OECD and the Australian Bureau of Statistics, and places like that, we found that we're probably 10 years ahead of things because that's where things seem to be going right now. That was very gratifying. In the beginning I was afraid we were going up the wrong alley, but we weren't.
We found it very useful. When you use that framework, you look at income, employment, unemployment, demographics, and so on. It helps to give you a sense of what data you should put in your system. It also helps with your prioritizing. You get people coming and they're asking, when are we going to have this, and when are we going to have that? We found it to be very effective, and we found that our communities and our neighbourhood people really liked that approach. A lot of people just don't know what the possibilities are.
In terms of lessons learned, which is a driving force in terms of what we're doing with open data, we found that people came to us and said, “You're a statistics agency and we'd like some data”. We asked what they wanted and they asked what we had. That's a hard question for a statistics agency to answer. It's very hard, as you can imagine. You can think of it as a warehouse that's full of all kinds of wonderful data. Most people don't know what they want, and a lot of people don't know the possibilities. This is the power of the conceptual framework that we've put in there with well-defined objectives, and so on and so forth.
Our experience with open data versus more developed data sets is that the majority of users really are not coming to us looking for the open data type of data. We find that most people, as Mary said, who use these data, who are looking for these data, are academics or seasoned data users, and quite often it requires a lot of work to actually use them. Of course, we've always provided those kinds of data when people ask for them. I wouldn't want to give the impression that, just because people are not asking for the data right now, we think good open data initiatives that are well delivered and well structured can't develop an appetite and develop a lot of interest and a lot of usage of those types of data. But I think we have to be realistic about where we are today. The market for raw open data, if you want to think of it that way, at this point in time, is certainly not very well developed, and if it is, it's clustered in specific places.
We view the open data approach as really most simplistic—and I don't mean that in a negative way. It's pretty elemental, how the concept defines open data, and then, of course, there's the value-added data, which is the community accounts type of data, in terms of my example. We look at data in terms of a sort of spectrum. There's data, information, and knowledge. From our perspective, the open data, the raw data, would be sort of just the data end of it, but when you do things with that data to make it more useful, you turn data into information, and when more fully developed, you begin to turn it into knowledge.
We've always put a lot of emphasis on trying to provide information and knowledge data. I do believe, when I get into a visioning mood, that in the future we'll put a lot of effort into open data. We'll learn a lot about it, and eventually end up coming back to data that are better supported, better defined, and not simply dumped out of administrative data sets, because those were never designed for that kind of use. It will go full cycle. The market for open raw data will be there, and probably bigger than it is today, but I think most of the demand will not be for raw data over the medium to long term.
In my slides, which you won't see.... I had a couple of slides there that I refer to as repairing data usage. The behind-the-scenes challenge is a messy business, and it really is. I encourage you to take a look at the slides when Marc puts them up, because I'm not going to get into it now. It is by no means simple or straightforward to take a set of data that people would consider raw data, and even to do marginal work to turn it into something that's going to be useful to pretty much any user. It doesn't matter how technically strong and numerically literate an academic person or any seasoned data user is. Administrative data files are nightmares to deal with, and that's where a lot of the open data interest actually lies.
These data sets, to be useful, require a lot of support. This is one of the main reasons why our government has brought us into the data side of this, as a professional and well-developed statistics agency: to make sure the product is a good one. We want to be a leader in our province in providing a good data product. We don't want to get out there and just churn it out and have all our staff on the phone all the time trying to answer questions as to what this is.
We want to make sure that.... The value-added will vary across the spectrum, but the value needs to be there if this is going to be successful. I would argue, based on experience, that if we don't put effort into making the data sets, even in their rawest form, clean and well-defined so they can be used properly and efficiently, we are creating a resource nightmare for our organizations, which will have to deal with people coming to look for help with how to interpret and use the data: where do they come from, what do they mean, what can you do with them? Ultimately, I think this could be the foundation for the failure of open data initiatives, which I think are a very good way for governments to go.
In terms of what data can be shared, what we find at this particular point in time is that it's really a challenge to know which way people are going and which way people actually want to go. For many of the sites we look at, there is no obvious organizational framework. You see that the offerings are all over the place when you look across the different sites that are out there. The word I have on my slide is “spurious”, and in many cases the quality is questionable.
But as for the way we think of it, we think of data as answers to questions, so where we start.... We've been going with the open data, and we've been encouraging our stakeholders across government to do so, the people who are into open information but don't really understand the open data as well as we do because we spend our lives at it.
First, we have to decide what questions we actually want to answer. Once we know what kinds of questions we want to answer, that begins to give us some idea of what the objectives of the initiatives are going to be. Who is the target audience? Are they highly skilled? Are they less skilled? Do we know what they want? Then, based on all of that, what's the best way to provide it across the spectrum? That's from raw data to knowledge, if you want to think of it that way.
As for what we've done in the approach we've taken, our government is fully committed to open government, to open data. There's absolutely no question about that. What we've done is establish a preliminary website. It's almost a demonstration website, but it's not something that will be withdrawn. It's something that will be made bigger. There, our Office of Public Engagement is beginning to consult.
I'm finishing now, Mr. Chair, because I'm sure I must be close to 10 minutes.
They're doing a consultation to see if we can engage with people and see where their interests might lie. A big thing we're doing that's going to be very useful for a variety of reasons is that we're actually building an inventory of all data sets across government. That is not simple. It's a big job, but we do have it under way. Of course, we're making sure as we go that privacy, confidentiality, and all that sort of thing is appropriately dealt with.
Based on our consultations, and also on our judgment, because I sort of feel that we're not going to get an awful lot of feedback from our consultations based on experience.... As I said earlier, you ask them what they want, and they don't really know for sure—