01 Prepare: Introduction
Welcome to Machine Learning and Data Mining! This is an exciting field that is in great demand in the world today.
This week we will introduce the concepts and technologies we will be using throughout the semester.
Reading
Machine Learning, Chapter 1
Machine Learning Introduction
After completing this reading, you should be able to:
Explain the difference between artificial intelligence, machine learning, and data science.
Discuss Moravec's paradox, and its implications for AI research.
Understand how machine learning is used for both prediction and inference.
Machine Learning Definitions
Everyone seems to have a slightly different take on the differences between Artificial Intelligence, Machine Learning, and Data Science. The following four articles cover some of the most common definitions.
As you read them, think about the differences and similarities of the definitions. Given the backgrounds of the various authors, whose opinions might you give more weight to?
Of particular note is this quote from the Granville article:
Earlier in my career (circa 1990) I worked on image remote sensing technology, among other things to identify patterns (or shapes or features, for instance lakes) in satellite images and to perform image segmentation: at that time my research was labeled as computational statistics, but the people doing the exact same thing in the computer science department next door in my home university, called their research artificial intelligence. Today, it would be called data science or artificial intelligence, the sub-domains being signal processing, computer vision or IoT.
As with most things in the realm of science, there tends to be a wide gap between how the media, government, and business sectors view a particular technology compared to how it's viewed by the engineers and scientists using that technology.
For our purposes in this course, we'll define these terms as follows:
Artificial Intelligence: The study of man-made "agents" that perceive their environment and take actions that maximize their chances of success at some goal.[1]
Machine Learning: A subfield within Artificial Intelligence that gives "computers the ability to learn without being explicitly programmed."[2]
Data Science: The study and use of the techniques, statistics, algorithms, and tools needed to extract knowledge and insights from data.[3]
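To make the phrase "learn without being explicitly programmed" concrete, here is a minimal sketch contrasting a rule whose cutoff a programmer chose by hand with a rule whose cutoff is chosen from labeled examples. The data (hours studied versus passing an exam) and the function names are hypothetical, invented purely for illustration.

```python
# Hypothetical toy data, invented for illustration: (hours_studied, passed_exam).
EXAMPLES = [(1.0, False), (2.0, False), (3.0, False),
            (4.0, True), (5.0, True), (6.0, True)]


def explicit_rule(hours):
    """Explicitly programmed: a human picked the cutoff of 2.5 by hand."""
    return hours >= 2.5


def learn_threshold(data):
    """Learned: pick the cutoff that misclassifies the fewest examples."""
    values = sorted(x for x, _ in data)
    # Candidate cutoffs are the midpoints between consecutive observed values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(cutoff):
        return sum((x >= cutoff) != label for x, label in data)

    return min(candidates, key=errors)


THRESHOLD = learn_threshold(EXAMPLES)


def learned_rule(hours):
    """Same form as explicit_rule, but the cutoff came from the data."""
    return hours >= THRESHOLD


if __name__ == "__main__":
    print("learned cutoff:", THRESHOLD)
    print("explicit rule on 3.0 hours:", explicit_rule(3.0))
    print("learned rule on 3.0 hours:", learned_rule(3.0))
```

If the examples change, the learned cutoff changes with them, while the hand-written rule stays the same until someone edits the code.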
Moravec's Paradox
In the 1980s, Hans Moravec made the following observation, which came to be known as Moravec's Paradox:
...as the number of demonstrations has mounted, it has become clear that it is comparatively easy to make computers exhibit adult-level performance in solving problems on intelligence tests or playing checkers, and difficult or impossible to give them the skills of a one-year-old when it comes to perception and mobility.[4]
So, while AI and machine learning algorithms can accomplish many tasks much better than humans can, any toddler can outperform even a state-of-the-art neural network at picking out photos of their parents or pet cat.
Even though Moravec wrote about this over thirty years ago, the same sentiment persists in AI research today. In a 2016 interview, Dr. Sean Holden, an AI researcher at Cambridge University, discussed the differences between human intelligence and artificial intelligence:
“Most AI researchers don’t try to solve the whole problem because it’s too hard. They take some specific problem and do it better. That’s not to say that the way humans think isn’t useful to AI, but working out how brains do things is hard. And there’s a difference in scale. Brains are doing things that are in some senses quite different from what AI researchers are currently attacking – I’d be ecstatic, for example, if I could build a robot that could put on a duvet cover.”[6]
Dr. Fumiya Iida, from the Machine Intelligence Lab at Cambridge, adds:
“We have hundreds of thousands of muscles in our body, so how can the brain control this? A computer can’t. Every fraction of a second you have to co-ordinate hundreds of muscles just to grab a cup, for example.”[6]
Prediction vs. Inference
In machine learning, we are typically interested in doing one of two things: making inferences, or making predictions.
Inference: Given a set of data, you want to infer how the output is generated as a function of the data.
Prediction: Given a new measurement, you want to use an existing data set to build a model that reliably chooses the correct identifier from a set of outcomes.[7]
This example explains the differences between those two goals:
Inference: You want to find out what effect Age, Passenger Class, and Gender have on surviving the Titanic disaster. You can fit a logistic regression and infer the effect each passenger characteristic has on survival rates.
Prediction: Given some information on a Titanic passenger, you want to choose from the set {lives,dies} and be correct as often as possible.[7]
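The two goals can be sketched with a single model. The example below is a minimal illustration, not a definitive implementation: it fits a logistic regression by plain gradient descent using only the Python standard library. The passenger rows are hypothetical and invented for this sketch (they are not the real Titanic data), and the function names, step count, and learning rate are assumptions. For inference, we read the fitted coefficients; for prediction, we ask the same model to choose from {lives, dies}.

```python
import math

# Hypothetical passengers, invented for illustration (NOT the real Titanic data).
# Columns: age in years, passenger class (1-3), is_female (0/1), survived (0/1).
PASSENGERS = [
    (29, 1, 1, 1), (29, 1, 0, 0),
    (50, 1, 1, 1), (50, 1, 0, 0),
    (35, 2, 1, 1), (35, 2, 0, 0),
    (19, 2, 1, 1), (19, 2, 0, 0),
    (6, 2, 1, 1), (6, 2, 0, 1),
    (22, 3, 1, 1), (22, 3, 0, 0),
    (4, 3, 1, 0), (4, 3, 0, 1),
    (60, 3, 1, 0), (60, 3, 0, 0),
]
FEATURES = ["age", "pclass", "is_female"]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def column_stats(rows):
    """Mean and standard deviation of each feature column."""
    n = len(rows)
    means = [sum(r[j] for r in rows) / n for j in range(len(FEATURES))]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n)
            for j in range(len(FEATURES))]
    return means, stds


MEANS, STDS = column_stats(PASSENGERS)


def standardize(age, pclass, is_female):
    """Rescale raw features so each has mean 0 and standard deviation 1."""
    raw = (age, pclass, is_female)
    return [(raw[j] - MEANS[j]) / STDS[j] for j in range(len(FEATURES))]


def fit_logistic(rows, steps=5000, lr=0.5):
    """Fit an intercept and weights by gradient descent on the mean log-loss."""
    xs = [standardize(*r[:3]) for r in rows]
    ys = [r[3] for r in rows]
    bias, weights = 0.0, [0.0] * len(FEATURES)
    for _ in range(steps):
        grad_b, grad_w = 0.0, [0.0] * len(FEATURES)
        for x, y in zip(xs, ys):
            err = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            grad_b += err
            for j, v in enumerate(x):
                grad_w[j] += err * v
        bias -= lr * grad_b / len(rows)
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
    return bias, weights


BIAS, WEIGHTS = fit_logistic(PASSENGERS)


def survival_probability(age, pclass, is_female):
    x = standardize(age, pclass, is_female)
    return sigmoid(BIAS + sum(w * v for w, v in zip(WEIGHTS, x)))


# Inference: inspect the fitted weights to see how each characteristic
# relates to survival (in log-odds per standard deviation of the feature).
coefficients = dict(zip(FEATURES, WEIGHTS))


# Prediction: choose from {lives, dies} for a passenger.
def predict(age, pclass, is_female):
    return "lives" if survival_probability(age, pclass, is_female) >= 0.5 else "dies"


if __name__ == "__main__":
    for name, value in coefficients.items():
        print(f"{name}: {value:+.2f}")
    print(predict(30, 2, 1))
```

The same fitted model serves both purposes: the coefficients answer the inference question, while predict answers the prediction question. With this invented data the coefficient for is_female comes out positive, but the signs and sizes of all coefficients reflect only the made-up rows above, not history.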
References
1. Artificial Intelligence: A Modern Approach, by Russell and Norvig (Prentice Hall, 2009).
2. Some Studies in Machine Learning Using the Game of Checkers, by Arthur L. Samuel (IBM Journal, Vol 3, No 3, 1959).
3. Wikipedia article on Data Science.
4. Mind Children, by Hans Moravec (Harvard University Press, 1988).