Session Name: 2 The Golden Needle in the Haystack
S1 Stacey Higginbotham S2 Robert Frohwein S3 Douglas Merrill
STACEY H 00:00
What is with the weird circus music? That was kind of creepy to come into.
ROBERT F 00:04
Had a modern Addams Family feel to it.
DOUGLAS M 00:07
It did, I think it’s probably appropriate for you.
STACEY H 00:09
There we go. So, we’re going to get you started on that creepy circus Addams Family note. On this panel, we’re going to talk a lot about data in the financial world, with two innovators in the financial world. Let’s start with Robert. You want to introduce yourself?
ROBERT F 00:09
Sure, Rob Frohwein, founder and CEO of Kabbage. We provide working capital to small businesses all across the US, and we just launched in the UK in February. There are millions of these businesses, transacting tens of billions of dollars a year, even hundreds of billions of dollars, but with no access to working capital. So, we leverage an unbelievable number of data points to really understand these businesses at a very granular level, so that we can provide them capital, unlike most banks today, who don’t provide working capital to small business.
STACEY H 00:57
Alright, and Douglas?
DOUGLAS M 00:59
Sixty million Americans last year took out a payday loan. A payday loan is a 14-day loan, balloon payment at the end, for a fee. So, it’s not an interest-based loan. It’s a fee-based loan. The average store payday loan costs you about 17 bucks per hundred. So, for the privilege of borrowing $500 you’ll pay $1400 on average to pay the loan back. I think that’s not fair, but it is an important part of credit. People need credit; they’re not bad people, they have bad credit. Things happen: they had a flat tire, and all kinds of things just go wrong. And you need credit. I founded ZestFinance to provide fair, transparent, and lower-cost credit to those 60 million Americans who aren’t bad people, but have bad credit.
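The fee arithmetic described here can be worked through in a few lines. This is only an illustrative sketch: the function names are hypothetical, the annualized rate is a simple non-compounding conversion, and the idea that the $1400 figure comes from renewing the loan at the same fee is an assumption, not something stated on the panel.

```python
def payday_fees(principal, fee_per_100, periods=1):
    """Total fees when the same flat fee is charged for each 14-day period."""
    return principal / 100 * fee_per_100 * periods


def simple_annual_rate(fee_per_100, term_days=14):
    """Annualize the per-term fee without compounding (an assumption)."""
    return fee_per_100 / 100 * 365 / term_days


# One 14-day term on $500 at $17 per $100 borrowed.
fee_one_term = payday_fees(500, 17)

# Simple annualized rate implied by a 14-day, $17-per-$100 fee.
annual_rate = simple_annual_rate(17)

# Assumption: if the $1400 total reflects repeated renewals at the same fee,
# this is roughly how many 14-day terms of fees it would take.
terms_to_reach_1400 = (1400 - 500) / fee_one_term

print(f"fee for one term: ${fee_one_term:.2f}")
print(f"simple annualized rate: {annual_rate:.0%}")
print(f"terms of fees to reach $1400 total: {terms_to_reach_1400:.1f}")
```

Under these assumptions a single term costs $85, so the gap between that and the quoted $1400 would have to come from many renewals or additional charges.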
STACEY H 01:42
So, what these guys have in common is they’ve taken data and they’ve both created not just products, but companies around kind of massive streams of data and applying it to underwriting standards. So, kind of with that in mind, let’s actually first meet some of your customers, if you can kind of bring the point home. And you guys do this, we’ll start with Douglas.
DOUGLAS M 02:08
So I already threw out the 60 million number, which I will repeat again. Sixty million is twice the population of Canada, it’s four times the population of Australia, it’s a pretty big chunk of folks. They are not bad people. On average they work at least 30 hours a week. They typically have at least one full-time job in a household. There are a lot of single parents in the mix. They make between 25 and 35 thousand dollars a year. These are your nurses, your bus drivers, your nannies. Good people, good employed people. I founded the company for a personal reason. Namely, my sister-in-law, in fact, is one of those people. She’s a single mother of three, she holds a full-time job, and is a full-time student. It’s just not possible to do that in the real world. The only credit she had was from payday loans. I wanted to build a company that could give people like her a better deal, using a lot more data, and we’re seeing that.
STACEY H 03:01
And you use how many data points?
DOUGLAS M 03:03
We use 70,000 signals, which we then pass through 10 parallel machine learning algorithms using different kinds of techniques. Each machine learning algorithm focuses on a slightly different idea; maybe they care about default, maybe they care about collectability. They each have different opinions, and then we take those 10 models and we ensemble them, which is more or less taking your 10 best friends and asking them about their restaurant recommendations. One person likes the service, one person the tablecloths, whatever. They each have opinions, and that makes your opinion better. We take those 10 machine learning models and ensemble them together and come up with the one final answer. And the whole process, from getting the signals, cleaning them, passing them to the 10 machine learning algorithms, and ensembling them, takes about three seconds from stem to stern.
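The "ask your 10 best friends" idea can be sketched as a simple averaging ensemble. This is a minimal illustration, not ZestFinance's method: the three toy models, their signal names, the scores they return, and the choice of a plain average are all assumptions made for the example.

```python
from statistics import mean


# Hypothetical toy models: each looks at the same applicant signals but
# cares about a different idea, and returns a risk score between 0 and 1.
def default_model(signals):
    return 0.25 if signals["months_at_address"] >= 12 else 0.75


def collectability_model(signals):
    return 0.5 if signals["has_bank_account"] else 0.75


def stability_model(signals):
    return 0.75 if signals["job_changes_last_year"] > 1 else 0.25


MODELS = [default_model, collectability_model, stability_model]


def ensemble_score(signals, models=MODELS):
    """Combine the individual opinions into one final answer by averaging."""
    return mean(model(signals) for model in models)


applicant = {
    "months_at_address": 30,
    "has_bank_account": True,
    "job_changes_last_year": 3,
}
final_score = ensemble_score(applicant)
print(f"ensemble risk score: {final_score:.2f}")
```

Averaging is only one way to combine models; weighted votes or a second model trained on the individual outputs are common alternatives, and the panel does not say which is used.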
ROBERT F 11:30
You think they’re faster?
STACEY H 03:54
But, Robert tell me about one of your customers.
ROBERT F 03:59
Sure. Francisco Tovar came to us two years ago. He had created a business called TheLatinProducts.com, and also sold on places like eBay and Amazon. He never had credit since he came to the country, went to a bank for a line of capital, and they basically said to him, literally, this is a little over two years ago, “We think this internet thing is a fad.” Because he sold his products over the internet. The underlying theme was also–
STACEY H 04:26
I think it’s a fad.
DOUGLAS M 04:26
I think it’s a fad too.
ROBERT F 04:27
I think it’s a really long fad. And the other underlying reason was he was going into a bank with an old white guy, and he was a young Hispanic guy. And he wasn’t going to give him a loan. He didn’t play football with him 20 years ago in high school. So, we gave him just a few thousand dollars to start. He was doing about $40,000 in top-line revenue at that point. Fast forward to today: this year, he’s projecting 2.8 million in revenue. He did 2.3 million in revenue last year, and we have north of $100,000 available to him for growing his business.
STACEY H 04:27
That’s awesome. Okay. So, let’s kind of get into the data, because both of you guys use your access to data to customize products and kind of see the whole person, or more of the person, more of the applicant. And right now we use FICO mostly; when I go apply for any sort of credit, I just bought a house, FICO score. And we’re actually seeing that used in more and more places: employers, rentals, all kinds of things. So, will something that’s more in-depth like this ever replace something like FICO, and what would that look like? How would it happen?
DOUGLAS M 05:44
So, to shamelessly steal from Shakespeare, “I come not to bury FICO, but to praise it.” Before about 1950, if you wanted to get credit, you would go to a bank, you would sit across the table from a white guy in a suit – not like my friend here. But–
ROBERT F 06:01
DOUGLAS M 06:01
And that person would have a red tie, and they would always be white. And you would say, “Please give me some money.” And they would say, “Oh, right. Your kid’s in Sunday school with my kid, so you’re a good person, so I’ll give you credit.” There was just not a lot of credit available, but that credit decisioning was based on a really holistic picture of the borrower. Fast forward to the late ’50s, ’60s time frame, and the credit bureaus standardized their information. So, all of a sudden there’s kind of a way to look at bits of credit data about Stacey. And Fair Isaac figures out that they can do a really quite simple logistic regression to produce a score off of that standardized data. And now instead of getting to know Stacey, I just need to know her FICO score. Suddenly there’s this massive credit availability. FICO changed the game in terms of making way more credit available. What we’re trying to do is, in some sense, take the next step beyond FICO but step back in time, by using thousands of data points to try and build a holistic picture of someone. You can hopefully make another step forward in credit availability for people who FICO doesn’t handle well, and provide an alternative to FICO for other kinds of purposes.
DOUGLAS M 07:11
FICO’s been around for fifty years, and has done great things for the credit market. I don’t think that there’ll be a big data replacement score for quite some time. But I think there is value in having a more holistic picture of a person as a way of going into a financial transaction. In insurance, FICO helps a lot for underwriting. Our techniques will help much more. Eventually, I think in 20 or 50 years, someone will be up on stage saying, “Oh, you know, those big data guys had a credit availability explosion, but they did all of this stuff wrong.” But we’re not there yet.
STACEY H 07:45
ROBERT F 07:47
I think FICO is–it’s basically putting all your eggs in one basket, and I think we saw through the financial crisis that when you do that you run a lot of risk if something underlying should change, in that case in the economy and real estate and other industries. I think it is a tool, one of many in your belt. And I believe the important thing to do in the future is to look at a wealth of information, with potentially FICO or a variant of FICO or the underlying data for FICO included within it. But I don’t think we’re going to be able to rely on one factor. Michael Milken talks about precision medicine, and how that’s really decreased mortality rates and improved quality of life, because now you’re able to specifically target [inaudible]. You’re able to specifically target a particular type of cancer based on the genetics and the DNA of the individual, and based on the progress of the disease, and all sorts of other factors. We believe in sort of fingerprint underwriting for our customers. So, we believe you can design a specific product and understand risk on a very granular basis. And banks can’t do that when you’re dealing with tens of thousands of customers coming in, looking for anywhere from 500 bucks to $100,000. That doesn’t work in the existing bricks-and-mortar bank world.
STACEY H 09:09
When we were talking earlier, you actually talked about making your lines of credit adaptable to or–or adaptive, adaptable to–
ROBERT F 09:17
STACEY H 09:17
Yeah. To how your clients’ businesses are actually doing in real time, or maybe not in real time, but in a couple of weeks. So, if they’re doing well you can offer greater lines of credit, or if their business is down a bit, then you can account for that. But there is a big question when you start looking at things like that. I don’t know, is there an opportunity to misread the signals and pull back capital just when someone might need it most? What are kind of the downsides to this?
ROBERT F 09:55
Look, any time–I actually don’t see a huge number of downsides if you’re doing it well. If you’re not doing it well, then there are big downsides, right? If you’re not reading the signals well. And guess what? We screw up, right? We make mistakes. And making mistakes early is really important in our business, right? I’d rather do it with the portfolio we have today than the portfolio we’ll have a year from now or two years from now. I think signals are important, and it also depends on how you interact with the customer, right? So, if we see something trending the wrong way with a small business: what we do is we actually have them connect all these data sources to their Kabbage account. Then we track that business on a second-by-second basis. So, we know exactly what’s happening. So, we’ll pull in the UPS data and all of a sudden we see, “Hey, they’re receiving fewer packages today than they did last week, or a month ago, or a year ago.” Or during this week they’ve received fewer packages. How does that translate into what their revenue’s going to look like in three months? That’s a big question. And that’s something that can’t be done on that level. And yes, we might misread that signal. It might be they went on vacation. But that’s all part of how you follow up and communicate with them. If you can put a good touch on it. We have an NPS, that’s net promoter score, of 76. In the financial services world, you’ve got most companies at 10 to 20. You’ve got AMEX at 41, and we’re sort of double that. And the reason is because we apply great touch to the data that we pull in, and that’s important. You can’t lose sight of that.
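The package-volume comparison described here can be sketched as a week-over-week check that triggers a conversation with the customer rather than an automatic decision. This is a hypothetical illustration: the function names, the 7-day window, and the 30% drop threshold are assumptions, not Kabbage's actual rules.

```python
def volume_change(current, baseline):
    """Fractional change versus the baseline; None when the baseline is zero."""
    if baseline == 0:
        return None
    return (current - baseline) / baseline


def flag_for_follow_up(daily_counts, window=7, threshold=-0.3):
    """Compare the latest window of daily package counts with the one before it.

    Returns True when volume fell by at least the threshold. A flag here means
    "reach out and ask what happened" (it might just be a vacation), not
    "pull back capital".
    """
    if len(daily_counts) < 2 * window:
        return False  # not enough history to compare two full windows
    recent = sum(daily_counts[-window:])
    prior = sum(daily_counts[-2 * window:-window])
    change = volume_change(recent, prior)
    return change is not None and change <= threshold


# Two weeks of made-up daily package counts: steady, then a sharp drop.
history = [10, 12, 11, 9, 10, 13, 10] + [5, 6, 4, 5, 6, 5, 4]
print("follow up with customer:", flag_for_follow_up(history))
```

The same comparison could be run against a month-ago or year-ago baseline, as mentioned in the discussion, by changing which slice of history serves as the prior window.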
STACEY H 11:17
And that kind of brings up the human element in data. And I think a lot of people don’t think about this, but it’s really important, when you’re starting to amass and deal with lots of this data, to either talk to the end person, like, “Hey, what happened?” But also to–like you’re talking about your voting on the 10 – what are they?
DOUGLAS M 11:38
The 10 models.
STACEY H 11:41
There you go. That’s not people, but I assume there are people somewhere in your process that make it–fine-tune it to make it better, or no? You may disagree.
DOUGLAS M 11:52
I think actually one of the lessons of knowing how to do machine learning in the real world is to recognize it as art as well as science. There’s a bunch of math and a bunch of code, but there’s also a bunch of places where people have to be engaged to help the machine learn the right stuff. A trivial example is if you look at any number of data sources, including whether you are alive or dead. It turns out that about 10% of the customers on our book are listed as dead.
ROBERT F 12:25
Our dead people pay better I think.
DOUGLAS M 12:26
And dead people in fact pay better, which means when the zombie apocalypse comes my company is going to win. [laughter] Leaving the humor aside, at which you all did not laugh enough, I know it’s early.
STACEY H 12:38
Wake up you all.
DOUGLAS M 12:40
But this is going to be a long time for you all to laugh.
ROBERT F 12:41
That is fine finance humor. That’s excellent finance humor.
DOUGLAS M 12:45
Thanks for that. [laughter] But you have to know, what does a distribution of a variable mean, what is the likelihood of a variable. Those are things that humans understand, not machines. And so a very powerful signal for us is, it turns out, if you go read my founder’s letter on the site you’re a much better credit risk. A machine can’t arbitrarily learn that; some human had to say, “Oh, I wonder if there’s a signal there.” Doing big data for real is a mixture of art and science, and the people are the art.
STACEY H 13:14
And that brings me to my next question, which is what kind of skill sets are we going to need in finance? I’m thinking you’ve got the quant jockeys with their Excel spreadsheets, and they’re running these models and it is what it is. And you have the machine-learning people. There’s a very vast gulf between “I know Excel and I can perform linear regression” and machine learning. So, what kind of skill sets do people need to develop to kind of progress and do well in this kind of new big data world, in finance? You can go beyond finance if you want.
ROBERT F 13:55
You go ahead. I agree with him by the way, whatever he’s going to say. [laughter]
DOUGLAS M 13:59
Or he is going to make fun of me, whichever, or better actually. So, we hire PhDs in math, physics, pure science; relatively few people who would describe themselves as machine learning people. And we built the core team around a few of us who were at Google and commonly call ourselves ex-Googlers, because we’re all that guy. So, there was a bunch of raw horsepower, particularly in math, surrounded by some folks who had done this for a living. And what’s interesting is they all came in with one set of skills or another: hey, we have Systat or some small stats package, or we have one of the larger ones, SAS or whatever. We moved everyone to R, including our business analytics, which are all in R, and so we don’t have Excel jockeys any more. We have people doing R work, including our financial planning, because it’s a much stronger language for doing math. There are a lot of classes now beginning in ML; actually, you’ve got PhDs in ML.
STACEY H 15:02
ML, machine learning.
DOUGLAS M 15:03
Machine learning, sorry. But again, I believe schools should begin turning out people who understand how to do data manipulation in R, who understand how to do graphics in R, who understand how to do data distribution in R. But that’s not enough. It’s like saying, “Okay, I gave Stacey a pen, therefore she’s a journalist.” And that’s not how it works. And saying, “Oh, hey look, you can program in R, you’re a big data person,” isn’t right either.
ROBERT F 15:30
Yeah. I think the only thing I’d add to it – and by the way I thought you did a great job on that. That was awesome – is that it also takes somebody to give them direction, right? So, what we’re talking about is data that nobody’s ever dealt with. So, a lot of these folks are great at crunching the data and putting it through the techniques they’ve learned in school or before. But what I found is you actually have to have somebody come from outside the box and say, this is our goal, this is our mission. These are the types of things I want you to go look for. And so for us, we have a deal with UPS. And we are now about to start underwriting solely on UPS data that we get. When I say solely, that’s a primary underwriting source. There’ll be lots of other data, but we’re looking at that data because we now understand that the signals coming from shipping data are relevant to whether somebody’s going to be a good, or not such a great, credit risk. And that wouldn’t necessarily be obvious to someone that’s coming in from a traditional underwriting background, or a traditional sort of data-crunching background. It’s what kind of data can you bring to bear, how can you look at the world a little bit differently and say there’s actually some value in a data set that nobody ever thought there was. And that paints sort of the 360-degree picture. Because we pull in that data, we pull in social data, which is relevant for small business, probably not as much for a consumer. We pull in Google Analytics, we pull in all sorts of transaction data. So, all of that helps us get a full 360-degree picture of the small business.
STACEY H 16:56
Can you walk me through how you guys came up with the UPS underwriting metric? I mean, that’s…
ROBERT F 17:08
So, what we did initially is we basically had tens of thousands of customers and we said, we have this deal with UPS. Add your UPS data to your account. And we were able to pull in a tremendous amount of data from the UPS feeds that thousands of customers gave us. And we were able to correlate that back to payment behavior. And then we were able to build models off of that. So, think about things like the size of the package, how often–no, that’s a real bad thing to say. How often the packages are shipped, how long they’ve been at an address, how long they’ve been sending to it, how many customers do they have? How long has their account been established, whether they’ve had any payment issues? Or anything along those lines. All of those data signals, as well as understanding them on an ongoing basis, are what we used to really understand whether that business is going to work for us or not.
STACEY H 17:59
So, how did you recognize that? I guess again, I’m kind of wondering how, who recognized that, or…?
ROBERT F 18:05
It seemed obvious to us. I mean, they work with millions of small business shippers every single day around the world.
STACEY H 18:13
So, it’s just intuition plus–
ROBERT F 18:14
STACEY H 18:15
–pass to the [inaudible]
ROBERT F 18:16
That’s probably a pretty interesting data set. And guess what, it turned out it was. So, sometimes you have to take a stab. It might have turned out it wasn’t, but most people are unwilling to try. If you don’t try and put interesting data into the mix, you’re not going to figure these things out. And banks are afraid to do that.
STACEY H 18:32
Alright, so now we have this much more adaptive, personalized finance world based on people using data, intuition, and checking that intuition against actual results, yes. And zombies.
ROBERT F 18:45
STACEY H 18:45
Alright, thank you guys–
DOUGLAS M 18:47
STACEY H 18:47
–very much. This was a great talk.