AI and Data Science at Hawk: Powering the Future of Anti-Money Laundering

The AI and Data Science teams at HAWK:AI are building the Anti-Money Laundering tools of the future. Current rule-based systems have been used, sometimes for decades, to detect and flag suspicious behavior that can indicate money laundering, fraud, or other forms of financial crime. However, in recent years these systems have proved inadequate in two key areas.

The first is correctly identifying instances of financial crime, which is becoming harder as consumer behavior changes with new and innovative transaction methods. This leads to high false-positive rates (in some cases as many as 95% of the transactions flagged as suspicious turn out to be legitimate), and ever-growing compliance teams handle these alerts inefficiently and ineffectively.

The second is effectively detecting genuinely suspicious behavior. Analyzing this behavior lets compliance teams spot previously unknown methods of financial crime and act before these become commonplace.

Modern financial institutions (FIs) need to solve these key challenges in real time and in a manner that is fully explainable, both to internal stakeholders and to regulators. Read on to discover how we support FIs in the fight against financial crime.

False Positive Reduction

Rule-based systems are limited in their ability to separate signal from noise and detect “true positive” cases of financial crime. The issue with inflexible rules is that criminals work them out through trial and error and then adapt their behavior to circumvent them. Worse still, the changing behavior of regular consumers, spurred by innovation, produces patterns that are wrongly flagged, creating “false positives”. The resulting high workload has increased compliance team headcounts across the industry, with valuable working hours often wasted on these false positives.

A solution is to use AI models that look at past accumulated behavior, derive patterns that update as consumer behavior does, and work in combination with rules to close cases automatically. These models can reduce false-positive rates by 70% to 90% and can deliver value from day one, even if there are no past operator decisions to help guide the model. That’s not to say operator feedback isn’t valuable: the models recognize and group compliance operator behavior, spotting patterns and “learning” how a human would react to certain situations.
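As a rough illustration of how a model score might work alongside rules to close cases automatically, here is a minimal sketch. The `Alert` fields, the threshold value, and the rule names are all hypothetical and chosen only for this example; they do not describe the production system.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    alert_id: str
    rule_id: str        # the rule that raised the alert
    model_score: float  # hypothetical model output: estimated probability of a true positive


# Hypothetical settings, chosen only for illustration.
AUTO_CLOSE_THRESHOLD = 0.05
NEVER_AUTO_CLOSE_RULES = {"SANCTIONS_HIT"}


def triage(alert: Alert) -> str:
    """Return 'auto_close' or 'review' for a rule-generated alert."""
    # Some rules may always require a human, whatever the model says.
    if alert.rule_id in NEVER_AUTO_CLOSE_RULES:
        return "review"
    # Close only when the model considers a true positive very unlikely.
    if alert.model_score < AUTO_CLOSE_THRESHOLD:
        return "auto_close"
    return "review"


alerts = [
    Alert("A1", "HIGH_VOLUME", 0.01),
    Alert("A2", "HIGH_VOLUME", 0.40),
    Alert("A3", "SANCTIONS_HIT", 0.01),
]
decisions = {a.alert_id: triage(a) for a in alerts}
print(decisions)
```

In this sketch only alert A1 is closed automatically; A2 scores too high and A3 comes from a rule that is excluded from auto-closing.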

Labeled and Unlabeled learning

Anyone familiar with AI-powered models knows that labeled data is key to training models and improving both accuracy and precision. In the world of financial crime, confirmed fraud and AML cases are rare, which means our teams often have to work with low volumes of sparsely labeled data.

We mitigate this problem in two ways. The first is through Variational Auto-Encoders, which embed customer behavior into latent space, allowing deep learning algorithms to identify and analyze common groupings that indicate relationships or similar behavior. This allows our customers to benefit from anomaly detection and false-positive reduction from day one, even if we don’t have labeled data or past operator behavior.
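For readers unfamiliar with the technique, the textbook training objective of a Variational Auto-Encoder (the evidence lower bound) is shown below. This is the standard formulation, not necessarily the exact loss used in our models. Here x stands for a vector of customer behavior features (an assumption made for this illustration), z for its latent representation, q_φ for the encoder, p_θ for the decoder, and p(z) for the prior over the latent space.

```latex
\mathcal{L}(\theta, \phi; x) =
  \mathbb{E}_{q_\phi(z \mid x)}\left[ \log p_\theta(x \mid z) \right]
  - D_{\mathrm{KL}}\left( q_\phi(z \mid x) \,\|\, p(z) \right)
```

The first term rewards accurate reconstruction of the observed behavior and the second keeps the latent space well organized. Neither term needs a label, which is why the approach can work before any labeled cases exist.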

The second way is through semi-supervised learning. We review some of the most interesting cases with domain experts, who flag suspicious behavior as the models train to help guide them towards correct processing. More importantly, we build the application in a way that allows our customers themselves to feed information back into the system without even consciously recognizing it. For example, when operators escalate a case (true positive) or close it (false positive), this feedback is taken into account and helps improve the model.
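The implicit feedback described above can be pictured as a mapping from operator actions to training labels. The sketch below is illustrative only: the action names and the shape of the case log are hypothetical, and a real case-management system will have its own.

```python
from typing import Optional

# Hypothetical action names: escalating marks a true positive (1),
# closing marks a false positive (0).
ACTION_TO_LABEL = {"escalate": 1, "close": 0}


def label_from_action(action: str) -> Optional[int]:
    """Return a training label for an operator action, or None if it carries no signal."""
    return ACTION_TO_LABEL.get(action)


def collect_labels(case_log):
    """Turn (case_id, action) pairs into {case_id: label}; later actions override earlier ones."""
    labels = {}
    for case_id, action in case_log:
        label = label_from_action(action)
        if label is not None:
            labels[case_id] = label
    return labels


case_log = [("C1", "escalate"), ("C2", "close"), ("C3", "assign"), ("C2", "escalate")]
labels = collect_labels(case_log)
print(labels)
```

Cases without a decisive action, such as C3 here, simply stay unlabeled, which is the situation semi-supervised methods are designed for.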

As a result, we need significantly less data to train our models to high levels of performance (which only improves when deployed in live environments).

Behavioral analytics and anomaly detection

Behavioral analytics are key to detecting anomalies and spotting known money laundering patterns, such as Money Mules (which can be a fraud, money laundering, and sanctions problem all at once) or “Fan-in, Fan-out” scenarios. Rules do not detect these scenarios effectively, which is where flexible AI models help. By comparing actual transactions to clusters of what we consider “normal client behavior”, we’re able to spot unknown typologies during transaction monitoring. Deviations could include unexpected transactions for a particular customer segment. Take the ridesharing industry as an example: a driver and a customer could collude to book rides, paid in cash, that never take place. Our models can detect this scheme by uncovering unusually high transaction volumes for a driver or end customer, unusual timing of the rides, frequent split transactions, and more.
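To make the idea of comparing one customer against “normal” behavior for their segment concrete, here is a deliberately simple sketch. It swaps the deep learning models for a robust z-score (median and median absolute deviation) on a single feature, and the ride counts and the cut-off of 3.5 are made up for illustration.

```python
from statistics import median


def robust_z(value, peers):
    """Modified z-score of `value` against peer values, using the median and MAD."""
    med = median(peers)
    mad = median([abs(p - med) for p in peers])
    if mad == 0:
        # All peers are (almost) identical: any difference is infinitely unusual.
        return 0.0 if value == med else float("inf")
    return 0.6745 * (value - med) / mad


def is_anomalous(value, peers, threshold=3.5):
    """Flag values whose modified z-score exceeds a (hypothetical) threshold."""
    return abs(robust_z(value, peers)) > threshold


# Made-up weekly ride counts for drivers in the same segment.
segment_rides = [38, 42, 40, 45, 41, 39, 43]

print(is_anomalous(120, segment_rides))  # a driver with far more rides than peers
print(is_anomalous(44, segment_rides))   # a driver within the normal range
```

A production model would look at many features at once (volume, timing, split transactions), but the principle is the same: measure how far a customer sits from the peers they should resemble.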

Another example could be a complete change in behavior that doesn’t fit a typical customer segment. Consider a store that uses the merchant code for a bakery but whose transaction behavior is more typical of a very different type of business, such as an automotive store or an escort service. This is clearly suspicious behavior that our models would detect.
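One simple way to picture the merchant code example is a nearest-centroid check: find the segment whose typical behavior is closest to the observed behavior and compare it with the declared segment. The sketch below is an assumption-laden toy, not our detection logic; the two features, the centroid values, and the segment names are all hypothetical.

```python
import math

# Hypothetical segment centroids over two features already scaled to [0, 1]:
# (average ticket size, share of transactions made at night).
CENTROIDS = {
    "bakery": (0.05, 0.02),
    "automotive": (0.70, 0.05),
    "restaurant": (0.15, 0.30),
}


def nearest_segment(features):
    """Return the segment whose centroid is closest (Euclidean distance)."""
    return min(CENTROIDS, key=lambda seg: math.dist(features, CENTROIDS[seg]))


def segment_mismatch(declared, features):
    """True when observed behavior looks like a different segment than the declared one."""
    return nearest_segment(features) != declared


# A merchant registered as a bakery whose tickets look like an automotive store's.
print(segment_mismatch("bakery", (0.65, 0.10)))
# A merchant registered as a bakery that behaves like one.
print(segment_mismatch("bakery", (0.06, 0.03)))
```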

Once again, through Variational Auto-Encoders we are able to represent this behavior in latent space, mapping out expected behavior and allowing deep learning algorithms to highlight any deviations.

To ensure the AI models adapt to changing behavior (a key benefit of AI over a rule-based approach), segmentation analysis is repeated regularly to identify new structures and new bad actors within those structures.

Information Sharing – the future of global AML

Secure information sharing leading to Joint Network Analysis has long been touted as the future of improved global AML efforts. Much has been said on this topic over the last decade, and initiatives like TMNL have seen success that will drive industry adoption. Having this future vision is not enough, however. It’s important to build the systems that deliver value today while preparing for the ideal future scenario. We see four distinct levels of information sharing, each improving detection accuracy over the last:

Support for digital Request for Information (RFI) – A method for subsidiaries or external institutions to ask for and respond to information requests.
A central repository of Typologies & AI Models – A continuously growing store of specialized AI models detecting specific AML and Fraud typologies (such as money mules, fan-in, etc.) that can be used by all customers.
Decentralized Learning (also known as Federated Learning) – Refinement of model performance based on decentralized data supplied by different external institutions.
Joint Network Analysis – The highest level of information sharing, allowing identification of new cross-institutional typologies/networks, ideally in real time.
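To give a feel for the third level, Decentralized (Federated) Learning, the sketch below shows federated averaging, a common way to combine locally trained models without moving the underlying data. The text above does not say which algorithm is used, so treat this as an assumed example; the weights and sample counts are invented.

```python
def federated_average(updates):
    """Combine local model weights into global weights.

    `updates` is a list of (weights, n_samples) pairs, one per institution.
    Each institution's weights count in proportion to its number of samples.
    Only weights and counts are shared, never the raw transactions.
    """
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[i] * n for w, n in updates) / total for i in range(dim)]


# Invented local results from two institutions with different data volumes.
updates = [
    ([1.0, 2.0], 100),
    ([3.0, 4.0], 300),
]
global_weights = federated_average(updates)
print(global_weights)
```

The institution with three times as much data pulls the combined weights three times as strongly, which is the design choice that distinguishes this from a plain average.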

We’re always happy to discuss the science powering the future of AML and anti-fraud.

Get in touch with the HAWK:AI team