Thank you for inviting me to appear here today. I am an assistant professor of computer engineering and computer science at the University of Toronto, a faculty member at the Vector Institute, where I hold a Canada CIFAR AI Chair, and a faculty affiliate at the Schwartz Reisman Institute.
My area of expertise is at the intersection of computer security, privacy and artificial intelligence.
I will first comment on the Consumer Privacy Protection Act proposed in Bill C‑27. The arguments I'm going to present are the result of discussions with my colleagues, Professors Lisa Austin, David Lie and Aleksandar Nikolov.
I do not believe that the act in its current form creates the right incentives for the adoption of privacy-preserving data analysis standards. Specifically, the act's reliance on de-identification as a privacy protection tool is misplaced. For example, as you know, the act allows organizations to disclose personal information to certain other entities for socially beneficial purposes, provided the information is de-identified.
As a researcher in this field, I would say that de-identification creates a false sense of security. Indeed, we can use algorithms to find patterns in data, even when steps have been taken to hide those patterns.
For instance, the state of Victoria in Australia released public transit data that was de-identified by replacing each traveller's smart card ID with a unique random ID. The assumption was that removing the IDs removed the identities. However, researchers showed that by matching the stops and times where they themselves tapped on and off public transit, they could re-identify themselves in the data. Equipped with that knowledge, they then recovered the random IDs assigned to colleagues who had travelled with them. Once they knew their colleagues' random IDs, they could find out about any of their other trips, including weekend trips and doctor visits, all things that most would expect to be kept private.
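To make the mechanics of such a linkage attack concrete, here is a minimal sketch in Python. The records, station names and random IDs are hypothetical illustrations of my own invention, not the Victorian dataset or the researchers' actual code; it only shows the principle under those assumptions.

```python
from collections import Counter

# Hypothetical de-identified transit records: (random_id, station, time).
# The smart card numbers are gone, but the trip patterns remain.
records = [
    ("id_7f3a", "Station A", "2018-06-04 08:01"),
    ("id_9c21", "Station A", "2018-06-04 08:02"),
    ("id_7f3a", "Station B", "2018-06-04 08:19"),
    ("id_7f3a", "Station A", "2018-06-05 07:58"),
    ("id_9c21", "Station C", "2018-06-05 12:30"),
    # ... a real release would contain millions of rows
]

# Auxiliary knowledge: trips the attacker knows they took themselves.
my_known_trips = {
    ("Station A", "2018-06-04 08:01"),
    ("Station B", "2018-06-04 08:19"),
}

# Count, for each random ID, how many taps match the known trips.
matches = Counter(
    rid for rid, station, time in records if (station, time) in my_known_trips
)

# The ID consistent with all of the known trips is almost certainly
# the attacker's own.
my_id, _ = matches.most_common(1)[0]
print("Re-identified my own random ID:", my_id)

# From here, IDs whose taps repeatedly co-occur with my_id in time and
# place likely belong to colleagues who travelled with me, and each
# recovered ID exposes that person's entire travel history.
```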
As a researcher in this area, I am not surprised by this result.
Moreover, AI can automate finding these patterns.
With AI, such re-identification can happen for a large portion of the individuals in a dataset. This makes the act's reliance on de-identification problematic when trying to regulate privacy in an AI world.
Instead of de-identification, the technical community has embraced different approaches to privacy-preserving data analysis, such as differential privacy. Differential privacy has been shown to work well with AI, and its guarantees hold even when an adversary already knows some things about the data. It would have protected the colleagues' privacy in the example I gave earlier. Because differential privacy does not depend on modifying personal information, the act's focus on de-identification creates a mismatch between what it requires and emerging best technical practices.
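To illustrate the contrast, here is a minimal sketch of the Laplace mechanism, the textbook building block of differential privacy. The query, data and epsilon value are hypothetical choices of mine for illustration, not a production implementation.

```python
import numpy as np

def dp_count(values, predicate, epsilon):
    """Answer a counting query with epsilon-differential privacy.

    A count has sensitivity 1: adding or removing any one person changes
    the true answer by at most 1. Adding Laplace noise with scale
    sensitivity/epsilon makes the released answer almost equally likely
    whether or not any single individual's record is in the data.
    """
    true_count = sum(1 for v in values if predicate(v))
    return true_count + np.random.laplace(loc=0.0, scale=1.0 / epsilon)

# Hypothetical query: how many travellers took a weekend trip?
weekend_trip_flags = [True, False, True, True, False, False, True, False]
print(dp_count(weekend_trip_flags, lambda took_trip: took_trip, epsilon=0.5))
```

Smaller values of epsilon give stronger privacy at the cost of noisier answers; the 0.5 used here is purely illustrative. Note that the guarantee comes from noise added to the released answer, not from editing the underlying records.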
I will now comment on the part of Bill C‑27 that proposes an Artificial Intelligence and Data Act. The original text was ambiguous as to the definitions of an AI system and of a high-impact system. The amendments that were proposed in November seem to be moving in the right direction. However, the proposed legislation needs to be clearer with respect to data governance.
Currently, the act does not capture important aspects of data governance that can result in harmful AI systems. For example, improper care when curating data can lead to a non-representative dataset. My colleagues and I have illustrated this risk with synthetic data used to train AI systems that generate images or text. When the outputs of these AI systems are fed back to train new AI systems, the new systems perform poorly. The analogy one might use is a photocopy of a photocopy: each copy loses fidelity to the original.
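A toy simulation conveys the intuition. The vocabulary, probabilities and sample sizes below are made up for illustration; this is not the experimental setup from our research, only a sketch of how rare patterns vanish when each generation of a model is trained on the previous generation's outputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Real" data: a vocabulary of 100 words, the last ten of them rare.
vocab_size = 100
probs = np.ones(vocab_size)
probs[90:] = 0.05
probs /= probs.sum()

# A small sample size exaggerates the effect for demonstration purposes.
data = rng.choice(vocab_size, size=200, p=probs)

for generation in range(1, 11):
    # "Train" a model by estimating word frequencies from the data,
    # then generate the next training set from the model's own outputs.
    counts = np.bincount(data, minlength=vocab_size)
    model_probs = counts / counts.sum()
    data = rng.choice(vocab_size, size=200, p=model_probs)
    print(f"generation {generation}: distinct words = {np.unique(data).size}")

# Once a word fails to appear in one generation's output, no later
# generation can ever produce it again: the tails of the distribution
# are lost first, like detail in a photocopy of a photocopy.
```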
What's more, this phenomenon can disparately impact populations already at risk of being subjected to harmful AI biases, which can propagate discrimination. I would like to see broader considerations at the data curation stage captured in the act.
Coming back to the bill itself, I encourage you to think about producing supporting documents to help with its dissemination. AI is a very fast-paced field, and it is not an exaggeration to say that there are new developments every day. As a researcher, I consider it important to educate the future generation of AI talent on what it means to design responsible AI. In finalizing the bill, please consider plain-language documents that academics and others can use in the classroom or laboratory. That would go a long way.
Lastly, since the committee is working on regulating artificial intelligence, I would like to point out that the bill will have no impact if there is no longer an AI ecosystem in Canada to regulate.
When I chose Canada in 2018 over the other countries that tried to recruit me, I did so because Canada offered me the best possible research environment for my work on responsible AI, thanks to the Pan-Canadian AI Strategy. Seven years into the strategy, AI funding in Canada has not kept pace. Other countries now offer larger funding for students and better computing infrastructure, both of which are needed to stay at the forefront of responsible AI research.
Thank you for your work, which lays the foundation for responsible AI. I thought it was important to highlight these few areas for improvement in the interest of artificial intelligence in Canada.
I look forward to your questions.