Mr. Chair and members of the committee, thank you for the opportunity to address you this afternoon.
My name is Momin Malik. I am a researcher working in health care AI, a lecturer at the University of Pennsylvania and a senior investigator in the Institute in Critical Quantitative, Computational, & Mixed Methodologies.
I did my Ph.D. at Carnegie Mellon University's School of Computer Science, where I focused on connecting machine learning and social science. Following that, I did a post-doctoral fellowship at the Berkman Klein Center for Internet & Society at Harvard University on the ethics and governance of AI.
My current research involves statistically valid AI fairness auditing, reproducibility in machine learning and translation from health care research to clinical practice.
For comments specifically on the current form, content and issues of the AI and Data Act, I will defer to my colleague Christelle Tessono, who was the lead author of the report submitted to the committee last year, to which I contributed. I will be able to answer questions related to technical and definitional issues around AI, on which I will focus my comments here.
In my work, I argue for understanding AI not in terms of what it appears to do, nor what it aspires to do, but rather how it does what it does. Thus, I propose talking about AI as the instrumental use of statistical correlations. For example, language models are built on how words occur together in sequences. Such correlations between words are at the core of all of these technologies, including large language models.
We all know the adage “correlation is not causation”. The innovation of AI, going beyond what statistics has historically done, is not to use correlations for understanding and intervention, but instead to use them to try to automate processes. We now have models that can use these observed correlations between words to generate synthetic text.
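To make this concrete, here is a minimal sketch, entirely my own illustration rather than any production system, of how correlations between words in sequences can generate synthetic text: a toy bigram model that counts which words follow which in a corpus and then samples from those observed co-occurrences.

```python
import random
from collections import defaultdict

# A tiny toy corpus; real language models are trained on vastly more text.
corpus = (
    "the committee heard testimony about AI . "
    "the committee asked questions about regulation . "
    "AI systems use correlations between words ."
).split()

# Count bigram co-occurrences: for each word, which words follow it and how often.
following = defaultdict(lambda: defaultdict(int))
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def generate(start: str, length: int = 8) -> str:
    """Generate text purely from observed word-to-word correlations."""
    words = [start]
    for _ in range(length):
        options = following.get(words[-1])
        if not options:
            break  # no observed continuation, so generation stops
        choices = list(options)
        weights = [options[w] for w in choices]
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)

print(generate("the"))  # e.g., "the committee asked questions about AI ."
```

Nothing in this procedure understands the words; it only reproduces the statistical patterns it has seen, which is the point I want to stress.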
Incidentally, assembling the huge volumes of text needed to make such generation convincing requires enormous amounts of human curation, which companies have largely outsourced to poorly paid and exploitatively managed workers in the global south.
In this sense, AI systems can be like a stage illusion. They can impress us, as a stage magician might, by seemingly levitating, teleporting or conjuring a rabbit. However, if we look from a different angle, we see the support pole, the body double and the hidden compartment. If we look at AI models in extreme cases, on inputs far from the average, we similarly see them break down, fail to work and prove inappropriate for the task.
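As a concrete illustration of that breakdown far from the average, here is a toy sketch of my own, with made-up numbers: a straight line fitted to typical cases of a nonlinear process looks serviceable near the average but fails badly at the extremes.

```python
import numpy as np

rng = np.random.default_rng(0)

# The true process is nonlinear (quadratic), but we only observe typical cases.
x_train = rng.uniform(-1, 1, 200)                 # inputs near the average
y_train = x_train**2 + rng.normal(0, 0.05, 200)   # noisy observations

# Fit a straight line to those observed correlations.
slope, intercept = np.polyfit(x_train, y_train, deg=1)

def predict(x: float) -> float:
    return slope * x + intercept

# Near the average, the fitted correlation is roughly adequate...
print(predict(0.5), 0.5**2)     # prediction vs. truth: both small, similar scale
# ...but far from the average, the same model breaks down completely.
print(predict(10.0), 10.0**2)   # prediction stays near the training average; the truth is 100
```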
The harms from the instrumental use of correlations, as in AI, have an important historical precedent in insurance and credit. For more than a century, the actuarial side of the insurance industry has gathered huge amounts of data, dividing populations by age, gender, race, wealth, geography, marital status and so on, taking average lifespans and, on that basis, deciding whether to offer, for example, life insurance policies, and at what rates.
There is a long history here; I know the U.S. context best. For example, in the 1890s, insurance companies in Massachusetts refused to offer life insurance policies to Black citizens, citing shorter average lifespans. This was directly after emancipation. That practice was rejected at the time, and race later became illegal to use as a rating factor. However, correlates of race, like a postal code, remain legal to use in the U.S., and from what I understand in Canada as well, and thus end up disadvantaging people who can often least afford to pay.
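To illustrate the mechanism with a simulated sketch of my own, using entirely invented numbers: a pricing model that never sees race, but does see a postal code that residential segregation has made a proxy for race, will still reproduce the racial disparity.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Simulated population: the model is never shown race, but residential
# segregation makes postal area a strong proxy for it.
race = rng.choice(["A", "B"], size=n)
postal = np.where(race == "A",
                  rng.choice([0, 1], size=n, p=[0.9, 0.1]),   # group A mostly area 0
                  rng.choice([0, 1], size=n, p=[0.1, 0.9]))   # group B mostly area 1

# Historical outcomes are worse in area 1 (e.g., from past disinvestment).
claims = rng.normal(100, 10, n) + 40 * postal

# "Race-blind" pricing: charge each person the average claim in their postal area.
area_premium = np.array([claims[postal == 0].mean(), claims[postal == 1].mean()])
premium = area_premium[postal]

for group in ("A", "B"):
    print(group, round(float(premium[race == group].mean()), 1))
# Group B ends up charged substantially more, even though race never enters the model.
```

The disparity comes entirely through the correlated variable; dropping the protected attribute does not remove it.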
In general, those who are marginalized are most likely to have bad outcomes. We risk optimizing for a status quo that is unjust and further solidifying inequality when using correlations in this way.
Canada's health care system stands in distinct contrast to that of the U.S., something of which the country is justifiably proud. It is an example of collectivizing risk rather than, as private industry does, optimizing in ways that benefit the industry best but may not benefit the public at large.
I encourage the committee to take this historical perspective, to reason out the ways in which AI can fail and can cause harm and, on that basis, to plan for regulation.
As in other areas critical to life, dignity and happiness, such as health care and criminal justice, government regulation has a crucial role to play. Determining what problems exist and how regulation might address them will come best from listening to marginalized groups, consulting strongly with civil society and consulting adequately with technical experts who can make connections in ways that are meaningful for the work of the committee.
Thank you for your time. I welcome your questions.