Why facial recognition software isn’t ready for prime time


In the wake of the manhunt for the Boston bombers, opinions are divided on whether facial recognition technology helped or hindered the search. Headlines like “Why Facial Recognition Failed” (Salon.com) are echoed in a statement from the Boston police commissioner, who told The Washington Post that the technology “came up empty.”

The opposite interpretation can be found at Technorati (“Facial Recognition Technology Helps Identify Boston Marathon Bombing Suspects”). So who is right, and were today’s facial recognition techniques up to the task?

The high-tech video intelligence methods hyped in the media during the manhunt may be available for use by investigators, but that doesn’t mean they’re effective or actually used by law enforcement. Neither San Francisco nor San Jose police use facial recognition, for example, and an FBI biometric system planned for introduction in California and eight other states next year apparently only makes exploratory use of face recognition, relying instead mostly on the trusty fingerprint.

Jim Wayman, director of the National Biometric Test Center at San Jose State University, said automated facial recognition didn’t fail in the Boston case: it simply wasn’t used. Contrary to reports like that of San Francisco’s ABC7, Wayman said video intelligence company 3VR’s products were not used to find the Boston bombing suspects.

3VR did not respond to our request for comment. The FBI also has no large-scale automated face recognition system, according to Wayman.

The essential problem with face recognition is getting an algorithm to correctly match degraded cell phone or surveillance images with well-lit, head-on photos of faces. While this is effortless for the human brain (unless you have prosopagnosia), hair, hats, sunglasses, and facial expressions can throw off automated recognition methods. Of course, before you can even get to the matching stage, you have to identify a suspect, and hope their face is included in driver’s license, mugshot, or other databases.
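To make the matching stage concrete: many modern face-recognition systems reduce each face image to a feature vector and compare vectors by cosine similarity, so a degraded cell phone crop simply yields a noisier vector that may fall below the match threshold. The sketch below uses made-up vectors, names, and a made-up threshold purely for illustration; it is not any vendor's actual pipeline.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

MATCH_THRESHOLD = 0.8  # arbitrary for this sketch; real systems tune this on test data

# Hypothetical gallery of enrolled faces (e.g., driver's license photos),
# each already reduced to a feature vector by some upstream model.
gallery = {
    "license_photo_1": [0.9, 0.1, 0.4],
    "license_photo_2": [0.2, 0.8, 0.3],
}
probe = [0.85, 0.15, 0.45]  # stands in for a blurry surveillance crop

best_id, best_score = max(
    ((name, cosine_similarity(vec, probe)) for name, vec in gallery.items()),
    key=lambda pair: pair[1],
)
print(best_id if best_score >= MATCH_THRESHOLD else "no match")
```

Hats, sunglasses, and off-angle poses hurt precisely because they perturb the probe vector before this comparison ever happens.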


What video surveillance more broadly was useful for in the Boston case was tracking the movements of the suspects. This still required a considerable human effort: the Post reports one agent watching the same video clip 400 times.

The next development step for facial recognition, both academically and commercially, is 3D, using shadows and facial landmarks to create best-guess models of faces. Face recognition challenges organized by the National Institute of Standards and Technology have expedited improvements at a Moore’s law-like pace, but the nuances that impede computers, like image alignment, occlusion, and face angle, remain a problem.

Better and cheaper (and more ubiquitous) cameras should address the issues of grainy and blurry images; an international standard requires a resolution of 90 pixels between the eyes for facial recognition algorithms to work, says Wayman, whereas the images released of the Boston suspects had 12-20 pixels. A database to compare against is still required, however; being able to identify and track a single face across video streams would be much more useful.
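The 90-pixel figure Wayman cites translates into a simple screening check: measure the distance between the eye centers in a face crop and reject images below the threshold. In practice the eye coordinates would come from a landmark detector; the helper functions and coordinates below are illustrative only.

```python
import math

MIN_INTEROCULAR_PX = 90  # threshold Wayman cites for reliable matching

def interocular_distance(left_eye, right_eye):
    """Euclidean distance in pixels between the two eye centers."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.hypot(dx, dy)

def usable_for_matching(left_eye, right_eye):
    return interocular_distance(left_eye, right_eye) >= MIN_INTEROCULAR_PX

# A well-framed booking photo easily clears the bar...
print(usable_for_matching((210, 300), (330, 300)))  # 120 px apart -> True
# ...while a crop like the released Boston images (12-20 px) falls far short.
print(usable_for_matching((100, 100), (115, 100)))  # 15 px apart -> False
```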

And even when facial recognition technologies improve and mature, the question still remains: should they be ready for prime time, in a way reminiscent of Minority Report? Wayman said currently employed systems that compare live people to their passport photos at airports still have a false negative rate of about 15 percent. If performance in such controlled situations is so fickle, it seems there is still a lot of work to do before these systems can automatically, and accurately, pick out faces of interest from surveillance footage.

Image via Wikimedia Commons user Mrazvan22




First of all, thanks for this article; FR is a really interesting subject. You could have done something really good with it, but here's why I think you didn't.

And it’s sad because some of your other articles were pretty good and interesting (the one on finding individuals on social networks for instance).

The first thing I'd like to say, which is less important than the rest, is that I really don't see the point of putting a picture in the middle of your article without even talking about it or at least saying what it is (apart from giving the source). It makes it look like you're trying to sound like you know what you're talking about when actually you don't. I'm not saying that's what you're doing, but that's what it looks like.

So, as it says when you click on the image, it's a representation produced by a neural network architecture called a Hopfield network. It models the way our memory works and is able to remember and create internal representations of what it sees. This is why it is useful for FR: it can model a face based on a higher-level representation of its features.

Now, for the important stuff:

The fact that implementations of specific FR methods and algorithms didn’t work as expected or even weren’t used at all has nothing to do with what FR can or cannot do.

You talk about "today's facial recognition techniques," but it would be better to talk about "FR techniques that are in use." The techniques in use are just a small subset of all available techniques, and some of the unused ones are far better than those people actually deploy. So "today's FR techniques" would in fact be "up to the task," as you put it, if only the people acquiring such software put real effort into understanding what they are using, what their needs are, and what the state-of-the-art methods really are.

Some people do amazing things with poor-quality images; you just have to look at the published papers to see it. They are also able to identify partially hidden objects or faces, even ones wearing hats or sunglasses, as you were suggesting.

@brownox: The question is not what the software can do but what information it is given. If Picasa recognizes you that easily, it's because it's searching a very small dataset. You can't expect Google or anyone else to "instantly" recognize anyone just because of that; the dataset to be searched is much bigger in this case. Have you ever heard of computation time or complexity? Depending on the algorithm, scalability can be a very important issue, and you can't just extrapolate the performance and accuracy a method yields on a small dataset to what it would yield on a much bigger one.
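One way to see why small-gallery performance doesn't scale: if each one-to-one comparison has some small false-match rate f, the chance of at least one false match against a gallery of N enrolled faces is 1 - (1 - f)^N, which climbs quickly as N grows. The f value and gallery sizes below are illustrative, not measurements of any real system.

```python
def false_match_probability(f, n):
    """Probability of at least one false match in n independent comparisons,
    each with per-comparison false-match rate f."""
    return 1 - (1 - f) ** n

per_comparison_fmr = 1e-5  # hypothetical per-comparison false-match rate

# A family photo album vs. a DMV-scale database.
for gallery_size in (100, 10_000, 10_000_000):
    p = false_match_probability(per_comparison_fmr, gallery_size)
    print(f"N={gallery_size:>10,}: P(at least one false match) = {p:.3f}")
```

At N = 100 the false-match probability is around a tenth of a percent; at ten million it is effectively certain, which is roughly the gap between Picasa tagging your relatives and a nationwide search.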

@Mary Haskett: It's not only a question of pixels, and what you're saying is wrong: a computer does not need "a lot of resolution," as you say. There is a lot of research on bio-mimetic algorithms such as artificial neural networks, plenty of image-processing methods, and many machine learning models and algorithms that do a really good job of recognizing patterns, objects, and even human faces in poor-quality images. The post you mention says that a computer can't "fill in the details" the way we humans can, but this is wrong too; several models and algorithms have been designed and implemented for exactly this purpose. Of course, I agree the human eye is far more powerful than what computers are currently capable of, but the gap is getting smaller and smaller.

I'm really sorry if I'm being somewhat harsh, but it really annoys me when people write about a subject without seeming to put the slightest effort into understanding what it actually means. Worse, you're giving false information to many readers, because they probably won't look into it either. That's how you end up with a huge number of people saying incredibly wrong things and believing they're true just because they read them in some poorly referenced article like this one…

Mary Haskett

I think part of the confusion is caused by the fact that computers see in a very different way from humans. We are hard-wired from birth to recognize faces – it’s a survival instinct for babies to be able to recognize mom. Computers, not so much.

For computers, it’s all about the pixels. You need a lot of resolution for computer face matching to work – 90 pixels between the eyes is borderline. Crowd shots from surveillance cameras just don’t have the resolution (yet) to be useful.

Here is a great post with pictures that illustrates the point:


Picasa can identify me from grainy, crappy photos that my mom couldn't recognize me in.

It is also able to identify me in photos from when I was a very small child.

I find it hard to believe that Google couldn't instantly identify the bombers from the photos provided, assuming they were given access to DMV photos as a reference.

Either they are using crappy software or they didn't use the software at all.


Got me thinking… if we had more stereoscopic cameras in surveillance products, could the algorithms score higher?

Whipsmart McCoy

No, you don’t get to use the Boston Marathon bombing to push your point on facial recognition software.


Stephen Wilson

Face Recognition didn’t fail because it wasn’t used??
Let’s look at the broader context.
Dozens of ambulance chasing biometrics advocates raced with appalling haste (within 24 hours of the bombings) to claim that facial recognition would be KEY to the investigations. A senior executive of NEC was even on network TV hawking their technology.
But after all that, it turns out to not even be fit for the purpose.
That must be so incredibly disappointing to any intellectually honest biometrics advocates that they might as well call it a failure.

Nathan H

I would offer alternative considerations for FR in this situation, none correlated, but an out-of-the-box conjecture.

1) It was used, it has worked, and it did work, but we are not meant to know about it, or even have an inkling that it is currently in practice; only those they want to know about it, use it, and operate it do. Ask: why so much extensive 'drill' preparation at the finish line, as well as a conglomerate of 'civilians' with khaki cargo pants, big black backpacks, and earbuds at that one point, and even intercom announcements telling people to 'remain calm' and 'not be alarmed' by the high level of detection teams in that one area? What better way to downplay the fact that an FR alert had been tripped than to let the journalists tasked with producing a story on it harp on the fact that it wasn't used and/or was ineffective?

2) It wasn't used because it requires a substantial level of human resources to validate alerts and provide near-real-time effectiveness. It's not like Eagle Eye, where a single entity monitors, obtains, identifies, searches, validates, and produces alerts; there must be a high level of human interaction with the system to achieve operational practicability, and that is just for a FEW cameras. Was it even possible for such a system to do so here? And if so, with whose cameras and systems?

3) The claims of its ineffectiveness are the plan… It helps to pass over an era of FR being the next major biometric after fingerprints, and allows other companies like Applied Digital Solutions and VeriChip, and over-archingly the US government, to go straight to a more 'controllable' method of implanted chips that not only ID you without a camera or physical contact, but provide a means of geo-location, purchasing control, access control, and more.

4) He was meant to be there… This takes a lot of settling to even read into, to have to consider that alerts were somehow ignored or suppressed by 'someone', but consider the over-oppressive actions of the CIA, FBI, and DHS, and even the hands-off approach of the media when it comes to reviewing and providing to the public imagery that had not been blessed off by those same alphabet agencies. Once the media got hold of pictures of the suspects and began running them, the FBI asked the public to help identify who these men were… claiming they had no idea. Then the story breaks that these men were already known as suspicious persons and had taken what should have been restricted trips to other eastern countries, amid reports and even an official inquiry from Russia as to why they were allowed to fly. At the very least, even if FR doesn't work full time, the nearly crystal-clear photos posted all over the media could have provided enough potential to muster a match via the DoS passport photo repository, which scans all pictures into an FR database. So how could they state "we don't know who they are," unless they were trying to downplay prior knowledge and maintain plausible deniability?

Furthermore, the description of FR given in the article above is very antiquated compared with modern-day capabilities, which is probably why 3VR didn't want to respond. And the FBI's claim not to have its own repository… well, that is partially true: it is not its OWN repository, because it is shared. But to say that it doesn't use one is like hearing the government say that it is unable to track your phone, which hopefully every American realizes is possible, and ever constant.
