5 ways to sniff out online fakers


The problem of online fraud, fake reviews and sock puppetry is only going to get worse, according to recent research. But there are ways to identify likely perpetrators and that’s what Sift Science aims to do.


The eight-person San Francisco startup uses machine learning to analyze user interaction with web sites and create a digital profile of who will likely perpetrate online fraud, said company co-founder Brandon Ballinger, an ex-Google software engineer.

Companies can use the service, which is built on Hadoop, HBase, Avro and MongoDB, by adding some JavaScript code to their sites and then using JSON (JavaScript Object Notation) APIs “to track transactions, bans, chargebacks, or custom event types,” according to the company.
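To give a rough idea of what a JSON event of this kind might look like, here is a minimal sketch in Python. The article does not document the real schema, so every field name here (`type`, `user_id`, `time`, `properties`) and the `build_event` helper are hypothetical, and nothing is sent over the network.

```python
import json
from datetime import datetime, timezone

# Event categories named in the article; the string labels are assumptions.
EVENT_TYPES = {"transaction", "ban", "chargeback", "custom"}


def build_event(event_type, user_id, properties=None, when=None):
    """Serialize one tracked event as a JSON string (hypothetical schema)."""
    if event_type not in EVENT_TYPES:
        raise ValueError(f"unknown event type: {event_type}")
    when = when or datetime.now(timezone.utc)
    payload = {
        "type": event_type,
        "user_id": user_id,
        "time": when.isoformat(),        # ISO 8601 timestamp
        "properties": properties or {},  # free-form details for the event
    }
    return json.dumps(payload, sort_keys=True)


if __name__ == "__main__":
    print(build_event("chargeback", "user-123", {"amount_usd": 25}))
```

A site would presumably emit one such record per transaction, ban or chargeback, which is what lets a history of events accumulate for each user.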

Here are some early findings based on the private beta of the service:

1: Fraudsters tend to be night owls. Most fake accounts are created late at night local time: 3:00 a.m. is apparently the witching hour.

2: Bad guys stick with old technology.  People using Chrome on Windows XP are four times more likely to create a fake ID than the average user. (Firefox users are 50 percent more likely than average to create a faux account.)

On the other side of the same coin:

3: Fakers don’t update. An account created on Safari running on Mac OS X is about 30 percent less likely to be fake. Those running IE9 on Windows 7 are 33 percent less likely than average to be fake.

4: Yahoo email is big. Folks with Yahoo.com email accounts are five times more likely to create a fake account than someone using Gmail.com or Comcast.com addresses.

5: Geography is key. Most traffic coming from Nigeria is fraudulent but also goes through a proxy to disguise its point of origin.

“Based on user actions, we build a model of what a normal user would do on a site versus what a fraudulent user would do. We look at the time of account creation, the sequence of pages viewed. If they’re browsing around, they’re probably normal. If they set up an account and jump straight to a transaction, probably not,” Ballinger told me by phone. But then again, they’re tricky. Sift Science found that someone who opens an account, then waits an hour before transacting is 7 times more likely to be fraudulent than the average user.

The process is similar to Google Analytics in that Sift Science creates a history of user events and comes up with a score for each user that rates the likelihood that he or she is involved in fraud, he said.
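As an illustration only, the idea of turning observed signals into a per-user score can be sketched in a few lines of Python. This is not Sift Science's model, which the article describes only as machine learning. The sketch takes the multipliers quoted above, treats each one as an odds ratio, and assumes the signals are independent; both are simplifications. The 1 percent base rate and the signal names are hypothetical.

```python
# Multipliers quoted in the article, each relative to the average user.
RISK_MULTIPLIERS = {
    "chrome_on_windows_xp": 4.0,               # four times more likely
    "firefox": 1.5,                            # 50 percent more likely
    "safari_on_mac_os_x": 0.7,                 # about 30 percent less likely
    "ie9_on_windows_7": 0.67,                  # 33 percent less likely
    "waited_an_hour_before_transacting": 7.0,  # 7 times more likely
}


def fraud_score(signals, base_rate=0.01):
    """Naive fraud probability for a user showing the given signals.

    base_rate is a hypothetical share of fake accounts among all users.
    """
    odds = base_rate / (1.0 - base_rate)
    for signal in signals:
        odds *= RISK_MULTIPLIERS[signal]  # assumes independent signals
    return odds / (1.0 + odds)


if __name__ == "__main__":
    risky = ["chrome_on_windows_xp", "waited_an_hour_before_transacting"]
    print(f"risky user:   {fraud_score(risky):.3f}")
    print(f"average user: {fraud_score([]):.3f}")
```

Working in odds rather than raw probabilities keeps the result between 0 and 1 no matter how many multipliers are stacked.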

Sift Science is heavy on former Googlers: six employees are ex-Google engineers. Jason Tan was formerly CTO of BuzzLabs and Fred Sadaghiani was CTO of Teachstreet.



This article was by a poser for other posers

Perhaps an email that begins

“Consulate wants to transfer 1 million dollars to you” is the only legitimate statement you could have made.

Late night? There’s this thing called earth that is round and you have overseas orders – complex I know.


The article says “Most fake accounts are created late at night local time: 3:00 a.m “. You obviously don’t know what ‘localtime’ means.


“People using Chrome on Windows XP are four times more likely to create a fake ID than the average user”

I don’t see how this tells us much unless we know the relevant percentages, i.e., if 1% of all users are fake but 4% of people using Chrome on Windows XP are fake, so what? (The overwhelming majority are still real users. Any business flagging people on Chrome on Windows XP as probably fake would therefore be driving away mostly legitimate users by making things tough for them.)
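The base-rate point in this comment can be made concrete with a short calculation. The sketch below uses the commenter's own figures (1% of all users fake, a fourfold multiplier for Chrome on Windows XP); the group size of 1,000 users is a hypothetical round number chosen for illustration.

```python
def flagged_breakdown(group_size, base_rate, multiplier):
    """Split a flagged group into expected fake and legitimate users."""
    fake_rate = base_rate * multiplier  # e.g. 1% * 4 = 4%
    fake = group_size * fake_rate
    legit = group_size - fake
    return fake, legit


if __name__ == "__main__":
    fake, legit = flagged_breakdown(group_size=1000, base_rate=0.01, multiplier=4)
    print(f"expected fake users:       {fake:.0f}")
    print(f"expected legitimate users: {legit:.0f}")
```

With these figures, about 40 of the 1,000 flagged users would be fake and about 960 would be legitimate, which is the commenter's objection in numbers.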


Ulthman, (or is it Barrister Ulthman?) there is a reason most internet fraud is referred to as a “419”. This so called “Advance Fee Fraud” has become an industry in Nigeria, and the surrounding countries. Another common name is “419 fraud” after Section 419 of the Nigerian Penal code, the section that specifically prohibits this type of crime. While the scam is not limited to Nigeria, the nation has become associated with this fraud and it has earned an unenviable reputation for being an epicenter of email scam crimes.
Check out: https://en.wikipedia.org/wiki/Nigerian_scam


OMFG… It’s 3:36am and I’m reading this in Chrome on XP. At least I’m in Oklahoma, but hey there are them proxies! If people like these have their way, it’s all going to end in tears.


As someone who deals with Nigeria on a daily basis I can say fraud is a major issue and even though there is a problem in most countries Nigeria is famous for its culture of corruption. Anyone who deals with Africa will note this and though the 100% claim is absurd – so is the claim that most Americans have no idea about the rest of the world.

Getting back to the actual topic … the examples shown are crude and anyone depending on this type of analysis should hire those from the finance world to get more insight – especially anyone from the money transfer (remittance) world.

Anyways .. hope to see more metrics.

A friend

This is nothing but bs, fraudsters will adapt quickly. Simply setting my user agent to Safari running on Mac, clicking a few pages and not logging in from Nigeria will defeat your screening procedure? How long do you think it takes for them to catch on?


Hi, speaking for my company, Sock Puppets R Us, we have taken a keen interest in this post. Your article is a very good start, but would you please provide another, more detailed list of all the steps we should be taking to appear more real?

John Gunn

Was thinking the same thing socky. Security companies rarely give the bad guys a roadmap to the way they get screened out


“Most traffic coming from Nigeria is fraudulent”

Do you have evidence of your claim? I think it is pure prejudice.


This reminds me of nefrology (I believe it was called) in which cranium measurements were taken as predictors of criminal nature. In the end all you will have are inter-historically (because things change) and internally contradicting statistics that prove both that chrome users are the best and the worst computer-frauds in the world.


That would be “Phrenology”. Some of Terry Pratchett’s books also featured the opposite, retrophrenology, where the practitioner would change the shape of the patient’s cranium in order to influence their criminal tendencies (or willingness to use Chrome or XP). Each retrophrenologist came equipped with a big mallet… I suppose the lart would be a similar technology.


What proof do you have to justify that most fake traffic comes from Nigeria?

Fake traffic can come from anywhere on the globe, due to proxy functionality. Erase that perception you are trying to inject.


“Most traffic coming from Nigeria is fraudulent”

Can you or Sift Science prove that patently absurd and frankly insulting claim?


I think what they mean is that most traffic that “appears” to come from Nigeria is fraudulent…
Yet the only ever royal family member to contact me was from Nigeria! So… You can’t really blame their statement..

Dominic Amann

Hard for any one individual to prove, but I can state that 100% of the traffic in my inbox from Nigeria is fraudulent. Would you like me to forward you the contents of my junk box?


All well and good. I see the bottom line value in impacting efficiencies – but perhaps quite marginally. It will be a commercial success though for sure because of the strength of the team.
All you have to do these days is plug in some ML to achieve the insights of accepted wisdoms and you’ve got a high tech start up. I expect similar results could be achieved without ML – by smart application of simple heuristics. After all the problem space here is not complex.
Still, this is probably a good entry point to the market and will open up other opportunities as they go. Industry begets opportunity.
It’s at least refreshing to hear about a start up that’s a bit above the TechCrunch dross. Good luck to them.

David Mytton

This is a great example of a good (big) data product. It’s tracking very specific activity e.g. conversions or signups and the characteristics which call out those which later turn out to be false/fraudulent in order to do some advance screening. This is a very close connection between the start and end of the flow which makes it easier (though not easy) than the much more generic “big data” claims that are trendy right now.


not to burst your bubble, but do you seriously have no clue, or are you just in it for the startup badge?
