Evidence of meeting #8 for Access to Information, Privacy and Ethics in the 44th Parliament, 1st Session. (The original version is on Parliament’s site, as are the minutes.)

A recording is available from Parliament.

Also speaking

Kamran Khan  Chief Executive Officer and Founder, Professor of Medicine and Public Health, University of Toronto, BlueDot
Alex Demarsh  Director, Data Science, BlueDot
Pamela Snively  Vice-President, Chief Data and Trust Officer, Telus Communications Inc.

3:35 p.m.

The Chair Conservative Pat Kelly

I call the meeting to order.

Welcome to meeting number eight of the House of Commons Standing Committee on Access to Information, Privacy and Ethics.

Pursuant to Standing Order 108(3)(h) and the motion adopted by the committee on Thursday, January 13, 2022, the committee has commenced its study on the collection and use of mobility data by the Government of Canada.

Today's meeting is taking place in a hybrid format, pursuant to the House order of November 25, 2021. Members are attending in person in the room and remotely by using the Zoom application. The proceedings will be made available via the House of Commons website. The webcast will always show the person speaking rather than the entirety of the committee.

Before we go to witnesses, a study budget was distributed to all of you. Are there any objections or questions?

I see none. Shall the budget be adopted?

(Motion agreed to [See Minutes of Proceedings])

Now we can proceed directly to hearing from our witnesses.

In the first panel, from BlueDot, we have Dr. Kamran Khan, chief executive officer and founder, and Mr. Alex Demarsh, director of data science.

You have five minutes for your opening statement. Please go ahead.

3:35 p.m.

Dr. Kamran Khan Chief Executive Officer and Founder, Professor of Medicine and Public Health, University of Toronto, BlueDot

Thank you, Mr. Chair.

Good afternoon, everyone. Thank you for the invitation to participate in today's session.

As you just heard, my name is Dr. Kamran Khan. I am BlueDot's founder and CEO. I'm joined by my colleague Alex Demarsh, who is BlueDot's director of data science.

I’d like to begin my opening remarks with some background information to help provide some important context for today’s conversation.

First, I'm an infectious disease physician and have been in clinical practice for the past 20 years. You may recall that 20 years ago a novel coronavirus that the world had never seen or heard of before emerged in Guangdong province in China and rapidly spread to more than two dozen countries around the world, including Canada. That virus was SARS-CoV. I started my career in the midst of that outbreak, and it is an experience I have never forgotten.

It has been the inspiration for everything I have done in the past 20 years of my career as a practising physician, including the past two years of this pandemic when I have been managing hospitalized and critically ill patients with COVID-19; as an epidemiologist and a professor studying outbreaks of emerging diseases and how they spread in our increasingly interconnected world; and as an entrepreneur who founded BlueDot eight years ago to harness the power of global data and modern digital technologies to strengthen our ability to respond to rapidly evolving outbreaks.

I’d like to be clear that BlueDot is an organization that produces infectious disease insights, not one that collects location data from mobile devices. Our sole purpose and reason for existence is to protect lives and livelihoods from the growing global threat posed by emerging infectious diseases.

To fulfill our mission, we procure and analyze diverse worldwide data from publicly and commercially available sources to better detect signals of outbreaks around the world at their earliest stages, to forecast their patterns of spread to cities around the world and to empower local responses that mitigate their health, economic and social consequences.

With COVID-19, we did just that. Our technology used publicly available data to detect a worrisome outbreak emerging in Wuhan back in late December 2019. We then accurately forecasted the global pathways of that outbreak through the worldwide network of flights, publishing our findings online in the world’s first peer-reviewed scientific study on COVID-19.

When COVID-19 began to spread here in our own country, we analyzed de-identified GPS location data that we procured from third party providers that we selected because they adhered to Canadian and other internationally stringent privacy laws and regulations and had strong data privacy practices in place.

These third party providers collect GPS data from mobile apps that have a logical need for location. The apps require express consent to use location data and provide users with the opportunity to withdraw such consent at any time. Note that any location data we receive from these third parties has been de-identified before it ever reaches our organization.

Some of these de-identified location data are also pre-aggregated before we receive them, while some data are delivered at the device level. We have never attempted to connect device-level data to an individual. We have no purpose for doing so and we are contractually prohibited from making any attempts to do so.

Working with the Public Health Agency during this pandemic, we have analyzed and transformed de-identified GPS location data into actionable public health insights to help anticipate epidemic surges, to inform where and when the utilization of finite resources will have the greatest impact on saving lives, and to understand the effectiveness of social distancing interventions, all under rapidly evolving emergency conditions.

Throughout our engagement, we have taken careful steps to ensure that any data or insights we have delivered to the Public Health Agency could not conceivably be associated with any individual.

I founded BlueDot because 20 years ago, as a frontline health care worker, I watched a virus cripple an entire city for four months. I understood then that more disruptive outbreaks would follow, and they have, with greater frequency, scale and impact.

Two years into this pandemic, I am certain that data, analytics and technology can help us stay ahead of outbreaks that we will inevitably face again and protect lives and our way of life. I am equally certain that we can continue to realize the value of such public health insights in a manner that fully respects and protects data privacy.

Thank you again for the invitation to be here today.

3:35 p.m.

The Chair Conservative Pat Kelly

Thank you for your impeccable timing on your five-minute statement.

We'll begin our rounds of questions with Mr. Kurek.

3:35 p.m.

Damien Kurek Conservative Battle River—Crowfoot, AB

Thank you very much, Doctor. I appreciate your testimony today.

To start off, I was interested when in your opening statement you talked about data that was procured from apps that required express consent to be given for location tracking. In regard to the data that was sent to the Public Health Agency of Canada, do you know how many mobile devices and/or individuals had data collected that was then sent to PHAC?

3:40 p.m.

Chief Executive Officer and Founder, Professor of Medicine and Public Health, University of Toronto, BlueDot

Dr. Kamran Khan

Through you, Mr. Chair, to the honourable member, on the data we collected in the context of the Canadian response to the COVID-19 pandemic, it was approximately five million devices in total.

3:40 p.m.

Damien Kurek Conservative Battle River—Crowfoot, AB

Thank you very much, Doctor.

This committee was provided with a slide deck that appeared to be a presentation that would have been given to the Public Health Agency of Canada. Along with that slide deck was a letter from the Parliamentary Secretary to the Minister of Health. In that slide deck, the maps and whatnot had very, very interesting information that I'm sure was helpful in developing policy, but the explanatory slides at the end of that document state, and I quote, “Weekly values of active device users at the province and health region level can be downloaded directly from the BlueDot mobility dashboard.”

Can you tell the committee what this data looks like before it's uploaded to the dashboard, and what information is accessible to the dashboard? First, though, did the Public Health Agency of Canada have access or subscribe to access to that dashboard?

3:40 p.m.

Chief Executive Officer and Founder, Professor of Medicine and Public Health, University of Toronto, BlueDot

Dr. Kamran Khan

Alex, do you want to take that question?

3:40 p.m.

Alex Demarsh Director, Data Science, BlueDot

Sure.

Through the chair to the honourable member, on the first question, we provide analytic reports of population-level mobility metrics via reports like the one you reviewed. We additionally make the same kind of metrics available through a dashboard that the agency can use to view the same kind of analysis directly themselves.

In no case is there individual device-level data shared with the Public Health Agency of Canada. It's additional summary metrics of the type that are outlined in that report. It's supporting data, but in a format that they can use to answer more dynamic questions rather than questions we've predetermined and included in our reports.

3:40 p.m.

Damien Kurek Conservative Battle River—Crowfoot, AB

Thank you, Mr. Demarsh. Just to be clear, the Public Health Agency of Canada did subscribe, or had access, to this dashboard that's referred to.

3:40 p.m.

Director, Data Science, BlueDot

Alex Demarsh

That's correct.

The dashboard, to be clear, includes only our summary metrics, not the original data, but yes, they do have access to that data via the dashboard.

3:40 p.m.

Damien Kurek Conservative Battle River—Crowfoot, AB

Certainly it's interesting. Part of the concern that's been highlighted by privacy experts is the ability to reidentify and to gain access, and the privacy concerns related to this information.

Is there any possibility that we can see that data?

3:40 p.m.

Director, Data Science, BlueDot

Alex Demarsh

Just to clarify, do you mean the contents of the dashboard we shared with the agency?

3:40 p.m.

Damien Kurek Conservative Battle River—Crowfoot, AB

Yes. Would the data available on that dashboard be available for this committee to see?

3:40 p.m.

Director, Data Science, BlueDot

Alex Demarsh

Certainly. Yes. We can follow up in writing with a sample that would inform you of the contents of the dashboard.

3:40 p.m.

Damien Kurek Conservative Battle River—Crowfoot, AB

Thank you very much. That's much appreciated.

With regard to the check-ins, the slide deck specifies “anonymized device movement in half-hour windows, at the bottom of the half hour”. As well, a device can have up to 48 check-ins per day, and devices with fewer than eight check-ins per day are removed from the sample.

Can you explain the context? That's a tremendous amount of information. Can you provide detail as to how BlueDot ensures that there is no way for that data to be reidentified?
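For readers following the figures quoted from the slide deck, the check-in rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not BlueDot's actual pipeline: the record layout of `(device_id, timestamp)` pairs, the function names, and the choice to count one check-in per half-hour window are all hypothetical. Only the two numbers, up to 48 check-ins per day and a minimum of eight, come from the testimony.

```python
from collections import defaultdict
from datetime import datetime

MIN_CHECKINS_PER_DAY = 8   # threshold quoted from the slide deck
WINDOWS_PER_DAY = 48       # 24 hours split into half-hour windows


def half_hour_window(ts: datetime) -> int:
    """Return the index (0 to 47) of the half-hour window containing ts."""
    return ts.hour * 2 + (1 if ts.minute >= 30 else 0)


def daily_checkins(pings):
    """Count distinct half-hour windows seen per device per day.

    `pings` is an iterable of (device_id, timestamp) pairs. This layout is
    an assumption made for illustration only.
    """
    windows = defaultdict(set)
    for device_id, ts in pings:
        windows[(device_id, ts.date())].add(half_hour_window(ts))
    # A set of window indices can never hold more than 48 entries per day.
    return {key: len(slots) for key, slots in windows.items()}


def filter_sparse_devices(checkin_counts, minimum=MIN_CHECKINS_PER_DAY):
    """Drop device-days with fewer than `minimum` check-ins."""
    return {key: n for key, n in checkin_counts.items() if n >= minimum}
```

Under this sketch, a device that reports in only three half-hour windows on a given day would be dropped from that day's sample, while one seen in ten windows would be kept.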

3:40 p.m.

Director, Data Science, BlueDot

Alex Demarsh

Dr. Khan, unless you'd like to jump in, I'd be happy to take this one.

3:40 p.m.

Chief Executive Officer and Founder, Professor of Medicine and Public Health, University of Toronto, BlueDot

Dr. Kamran Khan

Sure. Why don't you go ahead?

3:40 p.m.

Director, Data Science, BlueDot

Alex Demarsh

To start, every question we're seeking to answer is about populations. These data are only useful in so far as they inform us indirectly about average contact rates in populations. We have no interest in individual devices. The information's only useful in aggregate.

The data are de-identified, so in most cases they're pre-aggregated metrics and summary statistics about those populations. When we do receive individual device level data, there's no identifying information received. The contents of it are simply an approximate location and a time-stamp.

The description of half-hour reporting only pertains to that data we hold in extremely secure internal data processing platforms, with only a limited number of internal users having access. We have a number of reasons for using industry best practices for data security.

Beyond that, to your larger point about potential reidentification—

3:45 p.m.

The Chair Conservative Pat Kelly

Thank you, Mr. Demarsh. We are out of time for Mr. Kurek's round.

Just before we go to Ms. Hepfner, I would ask you, Dr. Khan, when speaking, to perhaps hold your microphone a little bit closer to your mouth. It doesn't appear that you have a boom. We'll see if we can get better audio for the interpreters.

With that, please go ahead, Ms. Hepfner, for six minutes.

3:45 p.m.

Lisa Hepfner Liberal Hamilton Mountain, ON

Thank you very much, Mr. Chair.

I also want to thank the witnesses for being here today and helping us refocus on the question that we're addressing in this motion, which is specifically about the data that public health received, in part through BlueDot.

I'd like to just keep you talking, if you don't mind, Alex, about this.

What specific data did public health get? You said approximate locations and time-stamps, so it's all general information. Is there no way that public health could look at this data, in any way reidentify it and know that Lisa Hepfner was shopping at Lime Ridge mall on the weekend?

3:45 p.m.

Director, Data Science, BlueDot

Alex Demarsh

Absolutely not. To clarify even more, even approximate locations and time-stamps are not a level of data we share with the agency. It's further aggregated, either by the geographic range and larger populations where the device was found, or over time periods of a minimum of 24 hours. It's still more generic and unidentifiable than you've described.

3:45 p.m.

Lisa Hepfner Liberal Hamilton Mountain, ON

Dr. Khan, maybe you could comment a little bit, from your expertise, on how valuable this data has been in helping the government fight the pandemic, and maybe what it would have looked like if we hadn't had this data.

3:45 p.m.

Chief Executive Officer and Founder, Professor of Medicine and Public Health, University of Toronto, BlueDot

Dr. Kamran Khan

To the honourable member, thank you for that question. It's a really important one.

With traditional public health data, we count things like cases, hospitalizations and deaths, but when we're dealing with a rapidly evolving outbreak, by the time we see a case, we're already too late. There are a whole bunch of things that have already transpired. There has been, at some point earlier, a contact, an exposure. The person exposed might develop symptoms and get tested. By the time they get their test results back, we're already very far behind.

The entire use for these types of data—again I want to highlight de-identified, anonymized data—is ultimately to estimate contact rates in the population. That's what this is all about. It's just estimating how much contact is occurring in the population, because contacts are a leading indicator of what is coming next. Cases tell you that something has already happened in the past. It's a shift from being reactive to being proactive and anticipatory.

What we don't want is to be behind an outbreak. We want to try to get in front of it. We want to try to change the course and trajectory. Pretty much everything we're talking about here really comes down to one thing: trying to inform public health about contact rates in the population and where they're increasing in a way that is a precursor to exposures, cases, hospitalizations and deaths, so that an intervention can happen.

I've been working in the field of emerging outbreaks for my entire career. We know that outbreaks spread quickly. It means that we have to be able to react, understand and move even more intelligently and in a better coordinated manner.

My sense is that, as a physician, I can take care of one patient at a time. These types of analytics can support the public health response that could be impacting not only lives but all of the economic and societal implications we've had to endure for two years.

3:50 p.m.

Lisa Hepfner Liberal Hamilton Mountain, ON

Is there any other way to get this data?

3:50 p.m.

Chief Executive Officer and Founder, Professor of Medicine and Public Health, University of Toronto, BlueDot

Dr. Kamran Khan

I don't believe so.

There have been discussions about things like the use of synthetic mobility data. I do want to highlight that many of those types of approaches are actually using empirical location data as a training dataset. Secondarily, in a very stable environment, that might make sense, but keep in mind the last two years have been anything but stable, with constantly changing conditions, new variants and new public health interventions and policies. This has been a very erratic two years, and empirical data are going to give us the best foresight into what is coming next so that we can make intelligent decisions about how to mitigate the health, economic and social consequences.

The last thing I would say is that BlueDot—and my work as a physician for the last 20 years—is about protecting lives but also protecting data privacy. This is something that we take very seriously and is really at the core of what we do as an organization and, candidly, why I founded BlueDot in the first place.