Smartnumbers Investigate
A real-time fraud investigation platform that turned a passive intelligence tool into the primary workflow for fraud teams — weekly active users up 120% from a low base, ML accuracy up 78% through a decision feedback loop, and detection time down from two weeks to same-day, sometimes live.
The Problem
When I took this on, the product was search-and-explore only: investigators could look at data, but not act on it. There was no feedback loop, so the ML model never improved from real decisions. Investigators logged in a couple of times a week at most, duplicating work across other tools because there was nowhere to record an outcome. The product was positioned as "intelligence": supplementary, not primary.
My Role
I owned discovery, UX strategy, interaction design, and the decision-feedback-loop architecture, embedded with product and engineering as the only designer on the team. I designed the three-way classifier, the alert configuration layer, the notification batching system, and the denylist interaction pattern.
Discovery
I started by shadowing investigators in their offices, sitting at their shoulders while they worked rather than hearing a cleaned-up version of the process in a research call.
The insight: they already used historical calls to judge whether something was fraud or genuine. Time told them. But that judgement was trapped in their heads or scattered across other tools, while the product held intelligence no other platform had, and gave them nowhere to apply their expertise to it.
Usage data backed this up: engagement was flat. Users came in, searched, left. No state, no progress, no reason to return. If the product could capture that investigator judgement — fraud, genuine, suspicious — it would do three things at once: track real usage, feed the ML model, and let investigators skip work they'd already done.
Design Decisions
A three-way classifier, not true/false
Instead of a binary call, I designed a three-state decision (fraud / genuine / suspicious) so investigators could record confidence and partial signals. This gave the model richer training data and let users build a history of what they'd already reviewed.
The core mechanism the whole feedback loop runs on.
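As a rough illustration of the three-state decision, here is a minimal Python sketch. It is not the production data model: the names `Decision`, `DecisionLog`, `unreviewed`, and `training_labels` are hypothetical, and the sketch assumes each call keeps only its latest decision.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    """The three states an investigator can record for a call."""
    FRAUD = "fraud"
    GENUINE = "genuine"
    SUSPICIOUS = "suspicious"


@dataclass
class DecisionLog:
    # Hypothetical store: call id -> latest decision recorded for that call.
    decisions: dict[str, Decision] = field(default_factory=dict)

    def record(self, call_id: str, decision: Decision) -> None:
        """Save (or overwrite) the decision for a call."""
        self.decisions[call_id] = decision

    def unreviewed(self, call_ids: list[str]) -> list[str]:
        """Return only the calls nobody has classified yet, in input order."""
        return [c for c in call_ids if c not in self.decisions]

    def training_labels(self) -> list[tuple[str, str]]:
        """Export (call id, label) pairs a training pipeline could consume."""
        return [(c, d.value) for c, d in self.decisions.items()]
```

The same record serves both purposes described above: `unreviewed` lets investigators skip work already done, and `training_labels` is the labelled data the model learns from.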
From "explore" to "work through"
The original interface was built for search. I redesigned the narrative around action — review calls, mark them, move on — turning a search box investigators had to populate themselves into a queue grouping connections no other platform could surface.
An investigator adding intelligence linked to a call.
Alert conditions investigators controlled
We initially set standard alert conditions ourselves — logical, and wrong (more on that below). The fix was a configurable layer giving investigators control over what triggered an alert, especially where their own organisation already automated certain conditions without exposing the underlying model logic or IP. I ran offline experiments with users to shape how much of that logic could be surfaced safely.
The fix for the default-feed mistake: control without exposing the model.
Notification batching that respected attention
To cut noise, the system sent one alert the first time a configured condition was met, then batched further alerts into 30-minute, hourly, or daily intervals depending on user preference. Before any batch went out, we checked whether those calls had already been worked by anyone in the organisation. If so, the batch was suppressed. No duplicate alerts, no wasted attention.
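The batching rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the shipped implementation: `ConditionNotifier`, `on_match`, and `flush` are hypothetical names, the scheduler that calls `flush` when the interval elapses is left out, and the sketch assumes already-worked calls are dropped from a batch, with the batch suppressed entirely when nothing is left.

```python
from dataclasses import dataclass, field

# User-selectable batch intervals, in minutes (keys are illustrative).
BATCH_INTERVALS_MINUTES = {"30min": 30, "hourly": 60, "daily": 24 * 60}


@dataclass
class ConditionNotifier:
    interval: str  # one of the BATCH_INTERVALS_MINUTES keys
    first_alert_sent: bool = False
    pending: list[str] = field(default_factory=list)

    def on_match(self, call_id: str) -> list[str]:
        """Return call ids to alert on right now (empty when batched)."""
        if not self.first_alert_sent:
            # The first time the condition is met, alert immediately.
            self.first_alert_sent = True
            return [call_id]
        # Every later match waits for the next batch.
        self.pending.append(call_id)
        return []

    def flush(self, worked_call_ids: set[str]) -> list[str]:
        """Called when the interval elapses; returns the batch to send.

        Calls already worked by anyone in the organisation are dropped.
        An empty result means the batch is suppressed.
        """
        batch, self.pending = self.pending, []
        return [c for c in batch if c not in worked_call_ids]
```

For example, after one immediate alert, two further matches wait in `pending`; if a colleague has already worked one of them, `flush` returns only the other, and a batch made up entirely of worked calls is never sent.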
A denylist as a human override
The ML model was a weak link: fraudsters adapt to evade detection. I added a local, tenant-level denylist: when an investigator (or an automation rule) added a number to it, that number immediately took the highest risk score, regardless of what the model said. Fast, human-controlled, and immediately effective: a safety valve the model itself couldn't provide.
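A minimal Python sketch of the override, with assumptions flagged: `TenantDenylist`, `effective_risk`, and the 0 to 100 score scale are hypothetical and chosen only for illustration.

```python
MAX_RISK_SCORE = 100  # assumption: risk scores run from 0 to 100


class TenantDenylist:
    """Per-tenant set of phone numbers that are always treated as top risk."""

    def __init__(self) -> None:
        self._numbers: dict[str, set[str]] = {}

    def add(self, tenant_id: str, number: str) -> None:
        """Add a number to one tenant's denylist (investigator or rule)."""
        self._numbers.setdefault(tenant_id, set()).add(number)

    def contains(self, tenant_id: str, number: str) -> bool:
        return number in self._numbers.get(tenant_id, set())


def effective_risk(
    tenant_id: str, number: str, model_score: int, denylist: TenantDenylist
) -> int:
    """Denylisted numbers take the maximum score; others keep the model's."""
    if denylist.contains(tenant_id, number):
        return MAX_RISK_SCORE
    return model_score
```

Because the list is keyed by tenant, one organisation's override does not change the score another organisation sees for the same number.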
Every decision feeds the model
Each fraud/genuine/suspicious classification fed back into the ML pipeline. The 78% accuracy improvement came from this loop: not a better algorithm alone, but better-labelled data from investigators using the product as part of their daily work.
What I Got Wrong
I assumed investigators would want to work from a standardised alert feed, so we built a sensible default condition set covering common scenarios. It was logical, and it was wrong.
Investigators already had alerting elsewhere and only came to Smartnumbers to check for intelligence; our feed sat largely unused, even though daily logins stayed steady.
Digging into support tickets and user calls gave a consistent message: the default alerts were too noisy, duplicating signals investigators had already filtered elsewhere. I worked with the PM to pivot from "here's your alert feed" to "here's your configured queue", designing the configurable condition layer and exposing controls so users shaped what landed in front of them. That shift moved the product from occasional passive use to a daily active workflow.
Outcome
- 120% increase in weekly active users, from a low base
- 78% improvement in ML accuracy via the decision feedback loop
- Detection time dropped from two weeks to same-day, sometimes live
- The product's narrative shifted from "intelligence tool" to investigators' primary platform
What I'd Do Differently
- Design for configurability from the start, rather than recovering into it after low engagement
- Map the alert ecosystem investigators already live in before building a first default feed
- Make the feedback loop visible to users, so they understand how their judgements improve the model over time
From Reactive to Predictive
Smartnumbers Investigate transformed how fraud teams operate, from chasing yesterday's fraud to stopping today's. By combining real-time flagging with intelligent decision capture, we built a learning system that gets smarter with every investigation.