Our biggest customer is fake (2024)

Our biggest customer is fake (1)Once upon the summer of 2019, we had a problem. And no, it wasn’t just that our name was still the far nerdier “Dataminer” — thankfully, we quickly fixed that.

You see, at Salv we had just pivoted from a fraud platform to an AML (anti-money laundering) platform, but we had the same problem every new startup had. We didn’t have customers. And, well, to find customers we needed to show them our product. And to show them our product, we needed data. And where the heck were we going to find data just lying around?

That’s when, as data scientists, we had a dream. A dream to have a large enough, diverse enough, complex enough, and fresh enough dataset that we could demo our product to potential customers and, ultimately, find the kinds of criminal networks that would enable us to test our software and try out real-life transaction monitoring scenarios.

And, well, to do that, we needed data somehow.

Our biggest customer is fake (2)

How did you get data?

To get data to show prospective customers, we had a few options:

  1. Use real bank data. While this would have been a dream, in real life, it’s never an option. Sad for us data nerds.
  2. Use the prospective customer’s data in hashed format. That would require a bunch of work on behalf of the customer upfront just to see our product, and we didn’t think they’d be likely to agree. Plus, if it was a small fintech that hadn’t launched, they wouldn’t even have data to offer. And, regardless, hashing is hard work.
  3. Use real-life non-bank data and make it into bank transactions. This could be possible. For example, I found a huge set of real life New York taxi data with persons, transaction amounts, start times, end times, etc. That could work. We could use that data as the base and then build bank-type transaction information on top. While not a bad idea, the problem came in when we started thinking about the fact that all the datasets, including taxi data, were finite. If we wanted to scale the data from 10,000 rows to 3,000,000 rows, it would be tough. Sure, we could do a little swapping and duplicating magic, but it wouldn’t be the same.
  4. Create bank-like data ourselves. As our team of data scientists talked more and more, we realized there were some basic algorithms we could use to generate life-like bank-like transactions. A base set of naming conventions. Behavioural rules. A population registry that resembled the real world. Once we set it up, it would continue generating life-like bank data on its own. It could scale.

Which option did you pick?

Well, the title probably gave our choice away.

In order to protect the innocent, in R, we created a realistic world of people, businesses, transactions, and crooks — none of which ever existed. Well, except for the occasional sanctioned entity. And that’s how Fakebank was born. Full of all kinds of fake transactions, people, companies, and crime, Fakebank is exactly what we data scientists at Salv were dreaming of.

Officially, Fakebank is a database of artificially-generated financial transactions. Unofficially, if I do say so myself, it’s far more interesting.

Last summer, when we launched Fakebank, it had 0 customers. As I write today, Fakebank now has 102,000 — and growing. But we’ve built is so that we can expand to even something like 3 million in a matter of days. We just haven’t seen the need. Yet.

If you’re a data geek like me, then get excited, because I’m going to tell you all the nitty gritty details of how we created it. If you’re not a data geek, well, I think you still may find the process interesting. Or, at the very least, you can impress your math nerd friends the next time you have lunch.

Our biggest customer is fake (3)

What services does Fakebank offer?

To be frank, Fakebank isn’t exactly a high street bank — its offerings are a bit sparse. At the moment, Fakebank has the following financial services implemented:

  • international wire transfer
  • national wire transfer (with a given list of banks)
  • internal wire transfer (within Fakebank)
  • cash withdrawal (using credit/debit card via ATM)
  • cash deposit (using credit/debit card via ATM)

But, if you’re a prospective client worried that Fakebank is too simple, we can easily add a laundry list of Fakebank services like debit card payments and loans if needed. At least, we can add services way easier than you can launch them in real life. No offense.

Our biggest customer is fake (4)

Who uses Fakebank?

As you might have suspected, Fakebank operates in Fakenation as a part of Fakeworld. As an official member state, Fakenation has an individual population registry and a business registry. And it is to this set of fake people and fake businesses in its fake nation that Fakebank markets its financial products and services.

The good news is that Fakebank has some pretty good marketing. Which is why, once an hour, a bunch of people and businesses from those population and business registries are so impressed by Fakebank’s adverts that they become Fakebank customers.

Voila! Converted customers. Wish it was that easy in real life, eh?

Fakenation’s individual and business registries

If we pull back the curtains for a second, you’ll see that when we started this project, we combined our data science nerd power, a lot of googling, and our nearest friends in the financial industry. Which is how we came up with a base list of 44 individuals and 23 business types.

Fun fact, if you look through the names and addresses, you’ll see they often have an Estonian flavour. Or have a very not Estonian name but happen to live in Estonia in a town that doesn’t actually exist. Go figure for a team of 5 Estonian data scientists.

Our biggest customer is fake (5)Apparently “Kacy Wunsch”, definitely not an Estonian name, happily lives in Estonia.

From there, we added a really long list of various characteristics, and then options for each characteristic. Top it off with a random name generator and, boom, you have the basis of Fakenation’s individual and business registries.

As mentioned before, it’s from these registries that Fakebank “converts” entities into customers.

The reason we decided to start our dataset with a with registries as our base is many-fold:

  1. We can set up complex shareholder networks for businesses.
  2. We can use the registries to populate data for multiple banks.
  3. We can generate a network of transactions between banks.
  4. We can create user networks between banks, because banks X and Y can also be set up to convert entities from the same register.

Fakebank’s customer profiles

Just like in real life, each person and business from those Fakenation registries are unique. Thus, each Fakebank customer has their own set of unique data that makes up that person or business. For example — Sly Therin, private person, 30-39 years old, employee, averages 40 transactions per month, minimum outgoing 1€, maximum outgoing 3000€, min incoming 1600€, maximum incoming 2000€.

Fakebank customer profiles have data on:

  • Average monthly transaction amounts, along with minimum & maximum of a single transaction
  • Average monthly transaction count
  • Age range of user (individuals)
  • Founding year (businesses)
  • Shareholder list (businesses)

Fakebank customer bank accounts

Once an individual or business succumbs to the marketing ploys of Fakebank and signs up to become a Fakebank customer, we automatically assign them anywhere from 1 to 5 bank accounts inside Fakebank, along with a corresponding currency tied to each specific account. Just like a real human.

Today, Fakebank converts around 300 new customers per day, but that will increase gradually over time.

Generating the data

As mentioned, impressively, Fakebank manages to convert more customers — individuals and businesses alike — once per hour. And these customers then, obviously, generate their own transactions. Transactions data is set to generate every minute.

Currently as I write, Fakebank has around 9,000 transactions per hour and 80,000 transactions a day. Not bad growth for a startup bank, eh?

Our biggest customer is fake (6)

How do you spot suspicious activity and entities?

If all of Fakebank’s customers were completely innocent, then you and I probably wouldn’t care. And, well, that would be pretty boring. Especially if you’re testing an AML platform like our potential customers are.

So, good news! Or, rather, bad news — just like in real life, Fakebank has a bunch of shady characters and transactions happening all the time. At random intervals, shady transactions are generated inside Fakebank. Usually every day a few SAR (suspicious activity report) cases are created.

Defining suspicious activity

To define what was suspicious, inside Salv, a bunch of people sat together and put together a list of set scenarios that would be considered suspicious activity. There’s a long list of these actions, but, for example it might mean quickly emptying an account after a large sum is received. Or a customer who normally transacts in smaller amounts suddenly sends a massive amount. Or even if someone just receives an unusually high amount from someone in a high-risk country.

The more fun part is that we made sure that around 10% of the cases would set off some sort of alert to check, but then, when you dig deeper, you’ll find the activity is quite normal. The real criminals are less than 1%.

Sanctioned counterparties

Sanctions is the one scenario in which Fakebank doesn’t use fake data — we deemed it vital in order to test the efficacy of KYC procedures. That’s why, at random intervals, Fakebank customers, counterparties and business shareholders are drawn from the Dow Jones’ Sanctioned Persons database to check against.

Counterparties

As mentioned, Fakebank is a part of Fakenation. And Fakebank customers all come from Fakenation’s individual and business registries.

However, the same isn’t necessarily true of customer counterparties. If the Fakebank customer is involved with a counterparty from outside of the Fakenation registry, then it is generated from a separate registry of Fake-ForeignNation Counterparties. And, as you might imagine, different fake foreign nations also hold different levels of risk. Smart, right?

Currently, around 90% of Fakebank transactions are generated by using connections with entities listed in the Fakenation’s individuals’ register. About 9% are generated randomly with a random counterparty selected from the counterparty database.

About 1% of transactions have completely random counterparty (generated at the moment of transaction).

Built-in complexities and networks

Life is complex — even in Fakeworld. Which is why, in order to mimic real life situations, we built some complexities into the data:

  • Built-in connections. Generated Fakeworld data has randomly generated, built-in connections between users. Like employer-employee relationships for salary payments, regular individuals paying their monthly utility bills, friends who split a pub bill, and even family members — but without the drama, and, sadly, without the same last names. Which is an important note. These connections are still basic. You and your sister won’t have the same last name. And your references won’t indicate the nature of your connection with the other entity.
  • Daily transaction rhythms. In real life and in Fakebank, most transactions are sent on workdays between 8-18. On the weekends and off-business hours the volumes are lower. Smart, right?
  • Transaction amount patterns. While the transactions amounts are generated randomly, they do have pre-given parameters taken from that particular user and their connections’ profiles. Mostly exponential distribution is used to generate amounts. Which, in plain English, means that amounts are mostly small, but with occasional higher-end amounts.

Our biggest customer is fake (7)A few small transactions and then… a bit bigger one.

  • Customers with multiple accounts. Customers might have a regular account in GBP, a savings account in EUR, and a USD account for investments. Fakebank customers have anywhere from 1-5 accounts, but only one currency per account.
  • Randomly assigned SICs. Businesses have a Standard Industrial Classification (SIC) codes randomly assigned to them so you can tell if it’s food service, retail, IT, finance, or whatever else comes up. Although, as a note, this is also a fairly basic function. It won’t correlate to shareholders, transaction amounts, or any other data.
  • Everyday references — with an occasional surprise. For around 90% of Fakebank customer transactions, the references are generated based on connection parameters. I.e., an individual person sending money to a business might have a reference like “utility bill.” For the remaining 10%, it gets a little more interesting. The reference will be anywhere from 1 to 3 randomly generated words. Which means you might find yourself chuckling at some of the outcomes — we do.
    Our biggest customer is fake (8)Spot that unusual reference.

Our biggest customer is fake (9)

What’s in a transaction?

Finally, you’ll probably want to know about the actual transactions. We did our best to make them resemble real ones — along with auxiliary tables like account details, account balance, credit card details etc.

But, as we mentioned earlier, Fakebank doesn’t have a complex product offering. So these transactions are modeled off of a commercial bank that offers these types of limited services.

Included transaction information

For every transaction you’ll find in Fakebank, you’ll also find all of this basic information:

  • Financial details — currency, amount, exchange rate (taken from live source), and reference text.
  • Transaction type — i.e. whether it was a deposit, a withdrawal, or a wire transfer
  • Money direction — whether the transaction was incoming or outgoing
  • Status — pending, completed, cancelled, or suspended
  • Counterparty details — name, account, bank’s SWIFT code

Our biggest customer is fake (10)

How do you generate individual and business user datasets?

Non-data people, this is a little boring, so you can probably skip to the next section. But if you want the details, here’s the order:

Generating individual user transactions

Salv’s seed data for individuals → Data is then generated into Fakenation’s individuals registry → Becomes a Fakebank customer → Transactions are generated according to customer’s individual profiles or the customer’s connections.

Generating business transactions

Salv’s seed data for businesses → Data is then generated into Fakenation’s business registry (with some UBOs from the individuals registry) → Business then becomes a Fakebank customer → Then transactions are generated according to that business profile

Our biggest customer is fake (11)

How have you benefited from generating your own data?

When we began building Fakebank our use case was rather narrow, as time went on we realized there were some massive benefits to having our very own fake customer.

  • Data for live demos. This, of course, helped enormously. Instead of showing potential customers a set of static screenshots, we could show them our product in action.
  • Product improvements. It’s far better to learn on a Fakebank than on a real customer. One time, a bug in our data-generation code brought down the creation of data for an entire week. Once we finally got it back up, it flooded our platform with a massive amount of data — which really tested our limits. That helped us expose some of our platform’s weaknesses and fix them.
  • Performance testing. With the previous example, when all the data came in one huge wave, it had the happy side effect of testing our platform’s ability to handle a large load. Afterwards, our product team used the experience to make some much needed improvements. And make sure it was even faster.
  • User experience testing. AML agents need a lot of info to make tough decisions — Crook? Or no crook? With Fakebank, we have a massive amount of data on these Fake customers, but it then became quite clear that an investigator would need to see the data in certain arrangements. Having synthetic data really helped our UX designers manage what AML agents saw in a single view, without first failing on real customers.Scenario and rules testing.* Our product team and data scientists constantly want to test new languages and rules to make sure they’re performing as intended. But without data, it’s impossible. Now that they have Fakebank and its band of Fakecustomers, it’s easy.
  • Platform monitoring. Without any data flowing to the platform, it’s much harder to monitor the nuances of uptime and downtime. Now, with Fakebank, we can.
  • Teaching Salvers more AML. Some of our Salvers don’t have an AML background but really want to learn a lot more. That’s why we organized a hackathon — in a bog, of course — where Salvers learned how to write a scenario in SQL and then find out how many criminals they could catch using their new rules. Each team had a different “shady” scenario and a data scientist as part of their team to help write the rules. Unanimously, Salvers felt great to spending time in the shoes of everyday AML investigators— even for a few hours.

Our biggest customer is fake (12)Several teams of Salvers working on being an AML agent for a day.

  • Testing AML agent skills. An idea we’ve had for a while is to eventually use the platform to help teams check their skills of telling suspicious behaviour from criminal behaviour. We love the idea of giving back to AML teams that are already working so hard, but it’ll probably be awhile before we get to this one. But someday, we hope!

Our biggest customer is fake (13)

That’s just an overview

All right all right. Fascinating details over.

But, believe it or not, this really is just a quick overview. If you’re interested in how you can try out Fakebank, learning more about how it works, or think you might want to use the dataset, then get in touch directly with me: [emailprotected]

This is what gets me up in the morning.

Our biggest customer is fake (2024)

FAQs

Are survey rewards legit? ›

Real surveys are a way for a company to gather information from their customers. They may offer a small discount to encourage participation, but the reward is not the focus. Watch for survey questions or gifts that don't match the store.

How to know if a survey site is legit? ›

8 Ways to Spot a Survey Scam
  1. You're asked to pay to participate in a survey. ...
  2. You're asked to share sensitive information before you can take the survey. ...
  3. They advertise on Craigslist and similar sites asking for your email address. ...
  4. They offer too much money. ...
  5. You're directed to download attachments. ...
  6. They advertise aggressively.

Are phone surveys safe? ›

Some time into the call, the scammer will start a high- pressure sales pitch. Second, information not readily known to the public is valuable to scammers. The scammer may use a phony survey to get the citizen to give up personal information. The personal information is then used against the citizen to commit fraud.

Is scamming people illegal? ›

Fraud is a criminal offense, but a person alleged to have committed fraud can also be taken to civil court. A prosecutor brings criminal charges in criminal court. A fraud victim may file a civil lawsuit against the alleged scammer to recover money.

Is there a survey that actually pay you? ›

Nobody gets rich from online surveys, but they can be fun if you want to earn a small amount of money without doing much work. Swagbucks insists that you can earn $1 to $5 per day by just taking several of its surveys. If you jump around among multiple survey sites, you can probably increase that.

What is a legit survey site to earn money? ›

Which paid survey sites are legitimate? Online surveys are a legitimate way for brands to get consumer feedback on their products and services. Some of the best survey sites include Branded Surveys, Toluna, Swagbucks, LifePoints, OnePoll, i-Say (IPSOS), InboxPounds, PopulusLive, Opinion Outpost and Valued Opinions.

Is there a catch to online surveys? ›

You are asked for confidential or personal information

All reputable online paid surveys websites will ask for some amount of information for the purposes of setting up your account. If you are required to provide confidential information in order to complete the paid surveys online, it's likely a scam.

How do I make sure my survey is valid? ›

The questions must be clear and unambiguous and must reflect the research idea. The survey must be pretested to ensure the questions are clear, unambiguous, and are clearly linked to the research idea.

Are there any legit survey apps? ›

Qualtrics is one of the powerful survey apps known for its AI-guided customer experience software, offering conversational surveys for deeper insights and tailored solutions. It empowers users to monitor and optimize experiences in real time, whether at home, in the office, or on the go.

Should I answer phone surveys? ›

If you are asked to participate in a telephone survey and are interested in participating, you don't have to answer their questions straight away. If you are in any doubt at all, ask the caller which organisation they are from and arrange a time for them to call you back.

Should I trust online surveys? ›

While you can earn extra money through surveys, it's unlikely to make a fortune. Legitimate sites offer reasonable compensation for your participation. Avoid Paying: Steer clear of websites that charge a fee to join. Legitimate survey websites don't ask for payment.

What is a major problem with using telephone surveys? ›

Telephone surveys tend to have a significant risk of coverage error, which would result in a defective frame and biased estimates when using survey data. This bias occurs when you gather survey data from individuals with access to telephones.

Do banks refund scammed money? ›

While getting a refund after losing money to scammers is possible, the outcome depends on factors like bank policy, the type of scam, the amount lost, and how quickly the scam was reported. According to US federal law, banks must reimburse you for unauthorized transactions.

Can a scammer be traced? ›

Tracing a scammer demands persistence, leveraging online resources, and often legal assistance. Initiating with digital footprints like email addresses or phone numbers, employing reverse search tools, and reporting to authorities can unearth valuable leads.

How to catch a scammer? ›

5 tips to help you recognize an online scam
  1. They say you've won a huge prize. ...
  2. They want you to pay in a certain way. ...
  3. They say it's an emergency. ...
  4. They say they're from a government organization or company. ...
  5. The email is littered with grammatical errors.

Are taking surveys for money legit? ›

Online paid surveys are a legitimate business, as many different companies utilize them to conduct market research. However, amidst the many reputable online paid surveys, there are other survey sites that are actually sites created by scammers or cybercriminals.

Does surveys on the go actually pay you? ›

It's anonymized and paid for—you earn cash. That's important because many survey companies get data in ways that they shouldn't. That's why you have our promise to always treat you fairly. Surveys On The Go® will always pay cash for your opinions.

Are paid surveys worth it? ›

Some surveys have dozens of questions, so be prepared to put significant time in. How much money are you looking to make? Some people can make a hefty sum by taking surveys, but that isn't the norm. Don't expect to get rich quickly or replace your full-time income by taking paid surveys.

Are survey money apps real? ›

Yes, they do. Paid survey apps most likely won't make you a millionaire overnight but it's a legitimate business model where some companies provide cash, gift cards, coupons or other small rewards in return for completing a number of surveys.

Top Articles
Latest Posts
Article information

Author: Kelle Weber

Last Updated:

Views: 5731

Rating: 4.2 / 5 (53 voted)

Reviews: 92% of readers found this page helpful

Author information

Name: Kelle Weber

Birthday: 2000-08-05

Address: 6796 Juan Square, Markfort, MN 58988

Phone: +8215934114615

Job: Hospitality Director

Hobby: tabletop games, Foreign language learning, Leather crafting, Horseback riding, Swimming, Knapping, Handball

Introduction: My name is Kelle Weber, I am a magnificent, enchanting, fair, joyous, light, determined, joyous person who loves writing and wants to share my knowledge and understanding with you.