You believe that automated machine learning is going to transform data science. What major changes do you expect to see in your lifetime?
I can easily see automated machine learning transforming the field of data science within the next 5 years. In essence, most recent advancements in data science, AI, and machine learning (for example, AlphaGo dominating the world’s best Go players) have been made by large teams of researchers and engineers meticulously designing and optimizing complex machine learning systems that specialize in one particular problem. Automated machine learning seeks to augment and perhaps one day replace those teams of researchers with AI systems that follow the same design and optimization process, but aren’t constrained by human biases. As an added bonus, automated machine learning systems won’t be limited by human needs such as food and sleep, so they can work as long as there’s computer hardware and electricity.
What are the inherent dangers of machine learning? Do you believe that humans should be afraid of machines “outsmarting” us?
I believe we should worry about machines outsmarting us, but not in the way that people typically think. The usual worry is a “super AI” that comes out of nowhere, bent on world domination, with Skynet as the classic example. I believe such stories are nonsense, and are spun by storytellers who want to sell books and generate clicks for their web site.
I believe we need to worry about the simple intelligent machines that we will be building and using in everyday life. There’s a great story on this topic about an AI system that the US Army tried to build back in the day:
Once upon a time, the US Army wanted to use neural networks to automatically detect camouflaged enemy tanks. The researchers trained a neural net on 50 photos of camouflaged tanks in trees, and 50 photos of trees without tanks. Using standard techniques for supervised learning, the researchers trained the neural network to a weighting that correctly loaded the training set – output “yes” for the 50 photos of camouflaged tanks, and output “no” for the 50 photos of forest. This did not ensure, or even imply, that new examples would be classified correctly. The neural network might have “learned” 100 special cases that would not generalize to any new problem. Wisely, the researchers had originally taken 200 photos, 100 photos of tanks and 100 photos of trees. They had used only 50 of each for the training set. The researchers ran the neural network on the remaining 100 photos, and without further training the neural network classified all remaining photos correctly. Success confirmed! The researchers handed the finished work to the Pentagon, which soon handed it back, complaining that in their own tests the neural network did no better than chance at discriminating photos.
It turned out that in the researchers’ data set, photos of camouflaged tanks had been taken on cloudy days, while photos of plain forest had been taken on sunny days. The neural network had learned to distinguish cloudy days from sunny days, instead of distinguishing camouflaged tanks from empty forest.
You see, machines are brilliant in their own way, but that doesn’t mean they’re smart in the way we want them to be smart. We should be careful with how we design and train our machines, and always make sure that the machine learned what we wanted it to learn.
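The failure in the tank story can be reproduced in a few lines of Python. The sketch below is a toy simulation, not the system from the story: the function names, the brightness values, and the one-feature "classifier" are all assumptions made up for illustration. It shows why a held-out set drawn from the same photo shoot cannot reveal the problem, while photos taken under independent weather can.

```python
import random


def make_photos(n, rng, confounded):
    """Return (brightness, has_tank) pairs for n simulated photos.

    If confounded is True, every tank photo is taken on a cloudy (dark) day
    and every empty-forest photo on a sunny (bright) day, as in the story.
    Otherwise the weather is independent of whether a tank is present.
    """
    photos = []
    for i in range(n):
        has_tank = i % 2 == 0  # alternate so both classes are equally common
        cloudy = has_tank if confounded else rng.random() < 0.5
        brightness = rng.gauss(0.3 if cloudy else 0.7, 0.05)
        photos.append((brightness, has_tank))
    return photos


def fit_threshold(photos):
    """'Train' by placing a cutoff midway between the two class means."""
    tank = [b for b, t in photos if t]
    empty = [b for b, t in photos if not t]
    return (sum(tank) / len(tank) + sum(empty) / len(empty)) / 2


def accuracy(threshold, photos):
    """Predict 'tank' for dark photos and report the fraction correct."""
    return sum((b < threshold) == t for b, t in photos) / len(photos)


rng = random.Random(0)

# 200 photos from the original shoot: 100 for training, 100 held out.
collected = make_photos(200, rng, confounded=True)
train, holdout = collected[:100], collected[100:]

# Photos from the field, where weather has nothing to do with tanks.
field = make_photos(1000, rng, confounded=False)

threshold = fit_threshold(train)
holdout_acc = accuracy(threshold, holdout)
field_acc = accuracy(threshold, field)

print(f"held-out accuracy: {holdout_acc:.2f}")
print(f"field accuracy:    {field_acc:.2f}")
```

In this simulation the held-out accuracy comes out near perfect while the field accuracy hovers around chance, because the model only ever learned to tell dark photos from bright ones. A practical takeaway is to validate on data collected under different conditions from the training data, not just on a random slice of the same collection.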
What problems in automated machine learning are holding industries back from using it more rapidly?
Simply put, most automated machine learning technologies are still in a nascent stage at the moment. Commercial tools such as the Automated Statistician can offer basic automated data analysis services, but they’re far from automating the entire process of designing machine learning systems. There are many research prototypes that attempt to automate a large portion of the machine learning design process—TPOT, for example—but there is still a long way to go toward distilling the collective knowledge of thousands of machine learning experts into a single AI system. Needless to say, automated machine learning technologies are a prime sector to invest in at the moment, and we will undoubtedly see many more companies offering automated machine learning services in the near future.
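To make the idea of automating part of the design process concrete, here is a minimal sketch in plain Python. It is not how TPOT or the Automated Statistician work internally (TPOT, for instance, uses genetic programming over scikit-learn pipelines); it swaps in a simple exhaustive search, and every name in it is hypothetical. The program tries each combination of feature subset and neighbor count for a k-nearest-neighbors classifier, scores each with cross-validation, and keeps the best, which is the kind of trial-and-error a human analyst would otherwise do by hand.

```python
import itertools
import random


def make_data(n, rng):
    """Toy dataset: feature 0 is informative, feature 1 is large-scale noise."""
    data = []
    for i in range(n):
        label = i % 2
        informative = rng.gauss(label * 2.0, 1.0)
        noise = rng.gauss(0.0, 100.0)
        data.append(((informative, noise), label))
    return data


def knn_predict(train, point, k, features):
    """Majority vote among the k nearest training rows on the chosen features."""
    def dist(a, b):
        return sum((a[f] - b[f]) ** 2 for f in features)

    nearest = sorted(train, key=lambda row: dist(row[0], point))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0


def cross_val_accuracy(data, k, features, folds=5):
    """Average accuracy over interleaved folds, each held out once."""
    correct = 0
    for fold in range(folds):
        test = data[fold::folds]
        train = [row for i, row in enumerate(data) if i % folds != fold]
        correct += sum(knn_predict(train, x, k, features) == y for x, y in test)
    return correct / len(data)


rng = random.Random(0)
data = make_data(200, rng)

# The "design space": which features to use, and how many neighbors to poll.
search_space = itertools.product([(0,), (1,), (0, 1)], [1, 5, 15])

best_score, best_features, best_k = max(
    (cross_val_accuracy(data, k, features), features, k)
    for features, k in search_space
)

print(f"best features: {best_features}, best k: {best_k}, "
      f"cross-validated accuracy: {best_score:.2f}")
```

On this toy data the search settles on the informative feature alone, since the noisy feature swamps the distance calculation whenever it is included. Real automated machine learning systems search far larger spaces (preprocessing steps, model families, hyperparameters) and need smarter strategies than brute force, which is part of why the field is still maturing.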
You basically have three camps about AI and jobs: 1) AI will eventually do anything we can do better, and all jobs will be done by AIs. 2) AI will never take away any NET jobs because jobs are made by people who figure out new things that need doing. 3) We will lose some net jobs, most likely near the bottom of the skill ladder. Where do you fall?
In our lifetime, we have already seen and will continue to see expert AI systems replacing humans in jobs all along the skill ladder. Every time we visit the grocery store, we see basic AI systems replacing cashiers in the form of automated checkout lanes. Every time we use Uber to request a driver, we’re using an AI that replaced human taxi dispatchers. Even self-driving cars are poised to replace human drivers in the near future, and promise to provide safer and more efficient driving services than humans.
These are jobs that won’t be easily replaced by the AI that supplanted them, so we’re likely facing a net loss in jobs in the near future as we continue to develop these expert AI systems. In fact, in our lifetime I see it as a very real possibility that we simply won’t have enough jobs to employ everyone. Perhaps it’s time to consider what life will be like in a post-work economy.
Randy Olson is a Senior Data Scientist working with Professor Jason H. Moore at the University of Pennsylvania Institute for Biomedical Informatics, developing state-of-the-art machine learning algorithms to solve biomedical problems. He specializes in artificial intelligence, machine learning, and data visualization. He works tirelessly to promote open and reproducible science, leading by example and openly publishing his work on GitHub and in open access journals. He’s passionate about training the next generation of data scientists to be more efficient, effective, and collaborative in their work.