Paige Dickie develops artificial intelligence (AI) and digital strategy for Canada’s banking sector at the Vector Institute for Artificial Intelligence in Toronto. She began her career in management consulting — much to the disappointment of her father, an engineer — despite having earned advanced engineering degrees in biomedical and mechanical engineering.
Dickie initially worked at McKinsey, the global consulting firm, helping multinational financial institutions with everything from data strategy and digital transformation to setting up innovation centers. She recently joined Vector to lead what she describes as “an exciting project with Canada’s banking industry. It’s an industry-wide, sector-wide, country-wide initiative where we have three different work streams — a consortium work stream, a regulatory work stream, and a research-based work stream.”
[email protected] interviewed Dickie at a recent conference on artificial intelligence and machine learning in the financial industry, organized in New York City by the SWIFT Institute in collaboration with Cornell’s SC Johnson College of Business.
According to Dickie, AI can have a significant impact in data-rich domains where prediction and pattern recognition play an important role. For instance, in areas such as risk assessment and fraud detection in the banking sector, AI can identify aberrations by analyzing past behaviors. But, of course, there are also concerns around issues such as fairness, interpretability, security and privacy. An edited transcript of the conversation follows.
[email protected]: Could you tell us about the Vector Institute and the work that you do?
Paige Dickie: Vector is an independent, not-for-profit corporation focused on advancing the field of AI through best-in-class research and applications. Our vision is centered on developing machine- and deep-learning experts. If I were to use an analogy, if Vector were a manufacturing company, what would come off the conveyor belt would be graduate students waving machine- and deep-learning degrees.
We’re funded through three channels. We have federal funding through CIFAR, which is the Canadian Institute for Advanced Research. We have provincial funding through the government of Ontario. We’re also funded by some sponsors.
[email protected]: Are they primarily banks?
Dickie: We have a lot of banks as sponsors, but we also have a number of other companies in industries like manufacturing and health care. One of the important things to recognize about these sponsors is that they’re all located in Canada. This was a deliberate decision on our part. For one, our public and private sectors recognize the economic potential of AI. Not only that, but for those who aren’t aware, Canada happens to be home to some of the world’s most prominent and influential thinkers within the field of AI, including Yoshua Bengio from Montreal; [Richard] Sutton from Edmonton; and Geoffrey Hinton, who’s a founding member of the Vector Institute. His Toronto research lab made major breakthroughs in the field of deep learning that revolutionized speech recognition and object classification.
But, despite all this, we were losing our talent. The data scientists and computer scientists that we produced in Canada would move to the U.S. and take on leading AI roles in major tech companies like Google, Microsoft and Apple. That’s the basis for how Vector was formed. We made a deliberate decision that our sponsors had to be located here, because studies have found that if you’re able to place a PhD researcher somewhere for around 18 months after they graduate, they typically make that place their home, because they’re at the age where they’re starting families and growing roots in the area. If we can keep our talent in Canada for 18 months by increasing the number of options — the tech companies that are in this space, the “cool” companies that they want to work for — then we might be able to reverse the brain drain we’ve been experiencing.

[email protected]: Have you seen that start to happen?
Dickie: Yes. If you look at the number of labs that have opened in the past couple of years, there are probably 15 or 20. We have Uber Advanced Technologies Group, Google Brain, NVIDIA … a lot of companies have opened up in the past couple of years, and many of them have directly attributed it to the Vector Institute. They’re all opening up in and around the same region. Obviously, we can’t claim a direct causal link in every case, but it’s a booming area.

[email protected]: What kind of projects are you tackling for the financial services industry? What are banks most concerned about in terms of AI?
Dickie: As I mentioned, I’m leading a project that has three work streams: a consortium work stream, a regulatory work stream, and a research work stream. In the first, we’re creating a consortium within the banking sector where we’re coming together to apply AI techniques to combat things that fall under the umbrella of financial crimes — like money laundering, fraud and cybersecurity threats. This work is currently under embargo, so I’m unable to share more.
Regarding the regulatory work stream, I’ll first set the context. AI is spreading like wildfire, not just across banking but across every industry. The McKinsey Global Institute estimates AI will generate $250 billion of value for the sector. It’s massive. But if you want to capitalize on this, there’s an issue. Models introduce risk within a bank. More specifically, the more advanced machine-learning models tend to amplify various elements of model risk. Banks have a number of model validation frameworks and practices in place that are intended to address these risks, but unfortunately, they often fall short and are insufficient to deal with the nuances of the more complex machine-learning models. Our working group believes that well-targeted modifications to existing model validation frameworks will be able to appropriately mitigate these risks. And so, we’ve come together as a country, as an industry-wide group, to develop refreshed principles of model risk management.
We’re working with various regulators in Canada in addition to a number of consultants. It has become quite a large endeavor. Apart from the principles of model governance, we’re also developing unified perspectives on fairness and interpretability, which is a massive concern. Fairness varies by individual, it varies by country, it varies by sector. Being able to unify our perspective will allow the technologists who are building these models to work in a single direction. There are 21 or so different definitions of fairness, and it shouldn’t fall to the people building the models to choose among them. This is an ethical, fairness-related question, so it needs to go to the policy-makers, the law-makers, the governments, the regulators. That’s the regulatory work stream.
In the research work stream, we’re exploring two topics. One is around adversarial attacks for voice authentication. For instance, banks are incorporating a number of voice authentication and identification capabilities into their phone channels, but these can, in theory, be fooled. A machine-learning model that mimics your voice can fraudulently gain access to your banking services. We are working towards making those capabilities more robust. We are trying to identify the weaknesses and fix them so that the model can withstand adversarial attacks.
The other opportunity we’re exploring is around data-privacy techniques. This feeds into the longer-term view of the consortium. If you want to have collaborations within a business, if you want to have collaborations between companies, there are data privacy concerns. We’re exploring things like homomorphic encryption, differential privacy, multi-party computation, and federated learning.

[email protected]: One hears a lot of hype around AI. Based on your own work, where would you say the real impact of AI is being felt?
Dickie: AI has an important role to play in data-rich domains where prediction and pattern recognition reign: for instance, areas such as risk assessment and fraud detection and management. By analyzing past behaviors, AI can identify anomalies. I’d say those are the two most fruitful areas for banks right now.
There are obviously concerns, but the concerns shift a little bit. There are concerns around fairness and interpretability of models. They’re known as “black box” models, which isn’t entirely accurate. But when you get to domains like risk assessment and fraud detection, it’s a different game. You are dealing with innocent people whose data is getting crunched, but at the end of it, you’re trying to actually capture criminals. You’re trying to capture people who are committing financial crimes. For example, if you were trying to identify people laundering money from drugs or guns or human trafficking, the public’s attitude towards leveraging AI techniques in those domains shifts immediately. It shifts away from, “You’re violating my data privacy,” and it moves towards, “You’re using my data for something good — a greater good — and I support that.” That’s an area where banks can be successful.
[email protected]: There have been stories about algorithms making decisions that seem to be biased against certain groups of people like minorities, immigrants, and women. What’s the thinking within Canadian institutions and at Vector on how we can tackle such issues?
Dickie: Fairness is top of mind for regulators and for banks. But there are a number of challenges. For one, there’s no standard definition. We can’t agree on what fairness means. There’s so much variance. That’s part of the reason we’re coming together to say, “In these contexts, what is our risk level for each domain?” Only then, with the proper controls and monitoring in those domains, do we deploy certain models. But with fairness, even if we were to agree on a standard definition, the problems don’t stop.
While Canadian banks are committed to being fair, the sensitive features on which they must be fair are often not collected, based on interpretations of what constitutes a “valid use.” In addition, there are barriers to collection (which can vary by industry), or the features are collected but not necessarily consistently, which leads to unrepresentative data. Where that data is unavailable, banks attempt to achieve fairness through an unawareness approach. The thinking is, “I didn’t know your gender, so how could I possibly be biased?”
The problem is that models — like humans, but on a massively amplified scale — are able to pick up on things. So a model you build may very well be basing everything on gender even if gender isn’t in the data. It might see that every Friday you do activities traditionally associated with being female (e.g., visiting a nail salon), or that you regularly pick up feminine hygiene products from the superstore. And so it can make these deductions, even though you’ve never once stated your gender anywhere.
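Dickie’s point about proxy features can be sketched with a toy example. The data and the proxy feature below are entirely fabricated for illustration; the idea is simply that a rule which never sees gender can still recover it from a correlated behavior, which is why fairness through unawareness falls short.

```python
# Toy illustration: gender is never given to the "model", but a proxy
# feature (salon visits per month, hypothetical) correlates with it,
# so a trivial threshold rule recovers it from the proxy alone.
records = [
    # (salon_visits_per_month, actual_gender) -- fabricated toy data
    (4, "F"), (3, "F"), (5, "F"), (0, "M"), (1, "M"), (0, "M"),
    (4, "F"), (0, "M"), (3, "F"), (1, "M"),
]

def infer_gender(salon_visits: int) -> str:
    """A 'model' with no access to gender, only the proxy feature."""
    return "F" if salon_visits >= 2 else "M"

accuracy = sum(infer_gender(v) == g for v, g in records) / len(records)
print(f"Gender recovered from proxy alone: {accuracy:.0%} accuracy")
```

On this contrived data the proxy reconstructs gender perfectly; real models do the same thing statistically across many weaker proxies.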
That’s the big problem with fairness. And if banks don’t have that data, they can’t test their models. One of the founding members of Vector, Rich Zemel, does a lot of research in fairness through awareness — which means you do have the sensitive variables, so you can validate the model and attempt to remediate any evident biases. There’s group fairness and individual fairness, and there are a number of different ways you can obtain parity within those areas. But again, you can’t test for it or correct it if you don’t have the sensitive variables on hand.
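To make the testing point concrete, here is a minimal sketch (with made-up decisions and group labels) of one common group-fairness check, the demographic parity gap. Note that the group column is the sensitive variable itself: without it, the metric simply cannot be computed.

```python
def demographic_parity_gap(decisions, groups):
    """Difference between the highest and lowest approval rate across
    groups. A gap of 0 means perfect demographic parity."""
    counts = {}
    for decision, group in zip(decisions, groups):
        approved, total = counts.get(group, (0, 0))
        counts[group] = (approved + decision, total + 1)
    rates = {g: approved / total for g, (approved, total) in counts.items()}
    return max(rates.values()) - min(rates.values())

# 1 = approved, 0 = declined; groups are the sensitive variable
decisions = [1, 1, 0, 1, 0, 0, 1, 0]
groups    = ["A", "A", "A", "A", "B", "B", "B", "B"]
print(demographic_parity_gap(decisions, groups))  # 0.75 - 0.25 = 0.5
```

This is only one of the many fairness definitions Dickie mentions; individual-fairness criteria require different, and often richer, data still.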
So a decision needs to be made here. You either use the models – which may be biased – and risk reputational damage or fines from the regulators. Or, you start collecting sensitive variables. This has a number of implications around the security and sensitivity of data, and you need to make sure that it’s housed […]