Today's webinar is entitled, "How deep learning is poised to change your life forever." Our presenter is alumnus Dr. Raymond Ptucha. He holds two RIT degrees: a master's in imaging science and a PhD in computer science. He also holds a BS in computer science and in electrical engineering. He currently serves as an assistant professor, specializing in machine learning, computer vision, robotics, and embedded control. Prior to joining RIT's faculty ranks, he was a research scientist with the Kodak company for 20 years, working on computational imaging algorithms, and he was awarded 31 U.S. patents with another 19 applications on file. In 2010, he was awarded a fellowship for advanced studies. His research earned the 2014 best doctoral dissertation award. He is an active member of his IEEE chapter. >> Good afternoon, everybody.
Welcome. I'm going to be talking about interesting changes in our technology, and in particular, I love new technology. This morning I was driving into work, and I'm thinking, boy, every day I do this, 20 minutes to work, 25 minutes or so going back. Wouldn't it be awesome if I could sit in the car and have the car do it for me? If the car could pull up to my office door, let me out, and go park itself? As I have conversations with people, I'm a firm believer that in the future, we'll have these devices with us at all times.
This particular device, call it an agent. I'll have a conversation with it when I'm done, and it will tell me some of the good things I said, bad things, interesting comments from listeners, and so on. Perhaps I could type questions about a schedule, similar to the way you do OK Google or ask Siri. Perhaps I'll share intimate secrets with this agent and it will give me frank advice, and thus it may become one of my best friends in the future. Further, as devices affect our lifestyle, I can't wait until I wake up in the morning and my house somehow automatically monitors my health and lets me know if there are problems by looking in the mirror. And simple tasks such as cutting my lawn all happen automatically. As we talk about machine learning, first of all, machine learning would be the ability of machines to generalize and think and behave, react, and do things just like humans do. Within this field of machine learning, I'm going to talk specifically about deep learning. It's really transforming so many different aspects of life, from financial markets to international security. Really, it's making our lives so much better, and perhaps you will have a much better lifestyle than all prior generations. Now, if we just take a look at the popularity of machine learning over the past couple of years, people Googling machine learning, and the popularity of Googling that particular phrase, you can see it's really gotten very popular, with about a 4X increase. Contrast that to deep learning, which is what we're going to talk about today. It was really an unknown word.
Very, very few people Googled that five years ago, so there's well over a 25X increase in the past five years. Further, look at Stanford University, which is the number one most popular university in the world, other than RIT. When you look at the popularity of specific courses, they used to have a course on machine learning, and now that's finally been superseded. The initial offering was 150 students, then 350, and then it jumped to 750. This shows how interested students who are studying technology are in this particular field. I would like to think we're entering the fourth industrial revolution. It started with water and steam in the late 1700s, and building machines to do work that humans had been doing. Then in the 1900s, the discovery of electricity and feeding electricity to plants and homes enabled automation and afforded mass production. That's what we call the second industrial revolution. Then came the 1950s, with the discovery of silicon and transistors, electronics, and manufacturing lines becoming more automated. With each revolution, our lifestyle got increasingly better. I'd like to think that starting about ten or 15 years ago, we began the cyber‑physical industrial revolution, where the boundaries between electronics and humans are starting to get blurred.
Examples like some sort of agent which goes around with me and gives me advice, and perhaps one day there will be a chip that plugs into my brain, and on that chip is the Wikipedia of all information. Every time I look at someone, it will give me detailed information about who that person is and how I know them: first and last name, names of kids, and detailed information, so maybe we can have richer conversations with people. So, I'd like to take a moment and talk about the point of singularity. Let's think about humans, and in particular about evolution. Our ancestors go back a couple million years, humans maybe only 200,000 years, and civilization five to six thousand years.
If we look at evolution over the past couple thousand years, we have a time line in blue over here showing that intelligence has been getting greater, so the Y axis would be amount of intelligence and X would be time. Because this time line is so long, for all intents and purposes, it's almost as if we can't tell that humans are getting smarter. Can you say that you're smarter than your parents or grandparents? It's hardly noticeable. But if we compare ourselves to our ancestors from 100,000 years ago, we probably are smarter.
Let's contrast that to the right, which is essentially computers, machines. There were no machines, really, so zero intelligence. But over the past 50 years or so, that intelligence has started to increase. And in the past 10 or 15 years, it's been increasing even more. Now we're at the point where computers are still not quite as smart as humans, but there's definitely a pattern here. We see how the computers are getting increasingly smarter. It's a matter of when. At some point in time, there will be a crossover, which we call the point of singularity. Ray Kurzweil is a futurist, and in this chart, he's showing time on the X axis versus computation on the Y axis, and how computers are getting smarter and smarter. Back in the 1990s, he made a prediction that by the year 2000, we'd have a single computer which is just as smart as an insect brain.
Sure enough, that came true. He also made a prediction that by the year 2010, there would be a single computer just as smart as a single mouse brain. That is true as well. The next prediction: he said, well, I'll predict that by the year 2029, there will be one single computer that is just as smart as a human brain. Remember, this is his prediction, not truth or anything. By the year 2049, there will be one single computer which is just as smart as all of our collective brains in this room and on the phone, and all students and faculty at RIT; if we all put our brains together, there will be one computer just as smart as us. Then he got a little bit more bullish on it. He said if you took every human on planet earth, if you took 8 billion brains and put them all together, working on one single problem, there will be one single computer just as smart as those 8 billion people working on that one problem, and he's predicting this will happen by the year 2049.
I'm very jealous of a lot of my students because they will probably live to see a lot of things that I won't see, which I'm excited about. As machines are getting smarter and smarter, eventually they'll match and then surpass human capabilities. Today, let's say I'm driving down the highway with my wife and kids and we're on a vacation, and one of the kids in the back says, mom, dad, take a look, there's one of those cool looking cars with the spinning thing on the top. That's an autonomous car, and there's no driver. Everybody turns and says, wow, cool, that's awesome. When they're my age, let's say they're driving with their own kids in the backseat. They'll be driving along for vacation, you know, doing what kids do, probably picking on each other and what have you. And then: hey, mom and dad? Look out the window. There's one of the really cool cars. There's a human driving it! And everybody will be astonished.
Because a human? Humans don't drive cars. Their reactions are slow. They get distracted. Maybe they drink and drive. They can't see in all directions. They can't communicate with all objects all around them.
There are many things that they can't do, which computers can. >> Let's see if we can get the brains on the webinar here to work on a question for you. We have our first poll question up. If you are in full screen mode, you'll need to return to the traditional screen to see the poll questions on the right‑hand side. We're asking for your views: when do you think computers will become as capable as human beings? Take a couple moments and answer that question. I think you can jump into the next slide. >> Andrew Ng is one of my heroes.
He left Baidu and started his own company back in Silicon Valley. So, he has this quote: AI is the new electricity. Just like electricity transformed the world as we know it in the 1900s, AI is transforming our lifestyle right now, today. If we take popular companies like Google and Facebook and Apple, they're hiring hordes of deep learning experts every single day. Baidu alone has over 1300 deep learning engineers, and Google has over 100 deep learning groups as well. This is a technique that is transforming our lifestyle. Can I see the results of the poll? >> Looks like most are thinking closer than farther. Some of you are into the 50-year range. >> This is an interesting question.
There is no real answer to this particular question. Keep in mind, we really don't know when this will happen, but we do know it will happen. What I have here on the next slide is I'm showing some other futurists who make predictions about when this might happen. And we start in the upper left‑hand corner, Ray Kurzweil is predicting 2029. If we take Robin Hanson, he's saying this could take one to 400 centuries. We have Ruchir Puri from IBM Watson, he didn't want to make the prediction. Other famous folks, Nick Bostrom, he's thinking that it could happen in a couple of decades. Rodney Brooks has several start‑ups, and he's saying all you guys thinking this is going to happen, you know, it's not. So, save your breath. We'll never see it in our lifetimes.
And then there's Gary Marcus. He's actually a psychology professor, and he's also a futurist, thinking 20 to 50 years. For those of you who thought it would happen in those particular time frames, we can say you all got it right. Let's look at AI trends happening right now. First of all, supervised learning, where you train computer models using ground truth examples that come from humans themselves. The humans are manually training these models. You hear lots of hype about AI, some saying it's a matter of years before they take over.
We're a long way away from that. You say anything that a human could do in less than one second of thought, yes, it can probably be done now with today's technology with AI. But advanced reasoning is something that we just don't know how to do yet.
>> We've got a couple questions that are sort of in the same vein, and that is looking for some kind of sense of how you are defining intelligence. Are you talking computation or talking about something more? >> When I talk about intelligence, it is more than computation. Computation is more like a calculator. Computers are very good at number crunching or step by step algorithms following the code. So, intelligence is more reasoning. Think of the way a human would step back and they would think about something and they would apply prior knowledge to a particular problem before coming up with a solution. >> Okay.
Thank you. >> So why is AI just now becoming so practical? The first reason is availability of data. It is a big thing to collect these huge data sets, and we train these algorithms to read in that particular data and make inferences about new things they have never seen before. We have also seen sustained advances in hardware, GPUs, as well as algorithmic learning techniques. And all of this is omnipresent. If you take a look at the right‑hand side, we have a plot showing that the more data you have, on the X axis, the better performance you're generally going to get.
So, traditional machine learning techniques would say, yeah, the more data you give it, the better it's going to perform. Over the past ten years or so, these algorithms have been getting better and better, working smarter and smarter with hardware, and now with these GPUs we say, wow, we can get really good performance, and even better performance on today's supercomputers. Keep in mind that today's supercomputers are tomorrow's cell phones or smart phones. This is just evolving in front of us. If we take a look at some AI jobs, what I mean by this is: in what aspects of life is AI really replacing humans? Well, they are already replacing humans for credit cards. Who do we give credit to or not give credit to? For many years we have been using algorithms to do this.
We don't need bank tellers anymore. When you go to the store, think about the automated check‑out lines in your local grocery store or home improvement store; it's becoming more and more common to go to lines which don't need a cashier. Same thing with insurance claims. Even litigation and lawyers: it's almost impossible for a human to read hundreds and hundreds of cases, thousands and thousands of cases, perhaps millions of cases, memorize them, and then hear your new, particularly unique lawsuit and make inferences, remembering all of those tens of thousands of cases. Humans have a difficult time doing that, but computers are very good at it. Same thing with conversation bots, law enforcement, and health care.
We're seeing lots of influence from AI. Now, deep learning, which is what this particular webinar is about, is really the hottest topic in pattern recognition. When I say that, computer vision would mean the ability to look at visual stimulus and make inferences. This would be facial recognition. And same thing with speech recognition. We can process speech: for this particular webinar, there are automated captioning techniques that listen to what I'm saying, and today a human has to check for errors. We won't need that in the future.
So, here's an example. On the left‑hand side, I'm showing an image, which is a low-resolution image of some text. Pretty difficult to read. But you apply deep‑learning techniques, and you can get the image on the right, and it's so much easier to see. Another example, let's talk about computer vision. In this particular task, you give the computer an image and say what is it? It says that's a cat. And then you may say, here's an image.
Find all of these objects. It would have to draw a box around all the objects and label them appropriately. In this case it would find one object, draw a bounding box around it, and say that particular object is a cat. You can extend this to multiple objects. You can say, hey, square bounding boxes? That's old school. Technology is evolving so fast that now you see techniques that label every single pixel in the image as background, cat, dog, or duck, in this particular case. You can do single and multiple object recognition. Take autonomous driving. What we're seeing here is a dashboard cam from an autonomous car, the camera looking out. You can see a couple of things happening here. There's one deep learning algorithm which is finding all the objects in the scene and labeling them.
For example, it's finding all the cars, and there are magenta numbers on top of each one, giving a unique identity to each particular car. So, one deep network's only job is to find all the objects and track them over time. The next deep learning model generates these blue lines, and its job is to find the lane markers and identify them in the images it's seeing. And the third deep learning algorithm running in this particular car is the red line. The red line says, hey, what is my path now and into the future? It's actually controlling the steering as it's going through. This is a video, courtesy of NVIDIA, showing an autonomous car going through the streets and identifying objects, people, cars, and street lights, and now there's technology that will extend this to identify every single pixel in the image: is it the roadway, sidewalk, cars, street signs, people, motorcycles? Each object gets a different color. Now, not only has AI been influencing specific technologies, but it's also good at doing things like art.
Say you take a really awesome picture from Germany, and let's say your favorite artist of all time is JMW Turner. What would it look like if JMW Turner had painted that picture you took? Using modern machine learning techniques, we can mimic the brush strokes, and that's what your photo would have looked like had JMW Turner been alive and painted your photograph. Same thing with other artists as well. For example, you can change your profile picture to look as if van Gogh were alive today and had painted your portrait. Now you can use it as a profile picture. Every single smart phone on the market today has speech recognition, and each of those speech recognition systems uses deep learning. Captioning: we're already seeing how you can take an image or a frame and recognize objects.
On the right‑hand side, you can see that you can take a picture or a frame of a video and give it to the computer and say, hey, what is going on in that picture or video? It would kick out: a group of people shopping in an outdoor market. Interesting.
It's never seen this picture before, and it generates that sentence. You say, tell me more. It says there are many vegetables at the food stand. Keep in mind the computer has never seen this picture before.
Now, there have been other studies showing that students especially love tech and they're so good at it, so much better than me. New studies have come in and said that speech recognition is so good that it's up to three times faster to communicate with your friends using speech recognition than by typing on your phone. The study covered English and Mandarin. I love AI assistants: Google Home, Alexa, the new fad.
Last year during Christmas these things were selling like hot cakes and I imagine this coming year they'll be one of the hottest selling products. We'll see more and more okay Google or ask Siri questions or Alexa. They could be Wikipedia or your calendar or giving you advice. This that and the other thing. All right, let's take a step back and talk about the human brain.
Because the human brain is such a marvel of the universe. The human brain is made up of what I call billions and billions of cells called neurons. Each one is connected to 10,000 other neurons, forming a vast, intricate web of connections, maybe 100 trillion connections.
There's nothing else like it that we know of in the universe today. It's such a marvel of the universe. We've done so much study on it. We've learned more in the last five years than in all prior years, but we still know very little. How does the brain store all of my memories? I can remember when I was 4 years old, sitting on the stool and my parents singing happy birthday. How do I have the ability to love and hate and have emotion, or to learn new concepts, or generalize? It's just amazing. Well, to take an example: what we have here are 30 images on the left‑hand side, each of a different class, airplane, car, cat, or dog. Because you have one of these marvels of the universe in your head, you can look at any one of the images and classify it as one of the four classes, yet they're all so different.
The cars are completely different from one another. How is it done? The one thing we do know is there are many layers of abstract hierarchy. You take the visual cortex here.
[ Captioner was disconnected. Reconnecting now ]. ... they all look the same, and you can recognize them as the same cat. You can recognize this picture of a cat, and finally, you can easily detect all of these objects in this particular scene of cats, despite their huge class variation. Okay. So, what we want to do is somehow teach computers to see. How do we do that? Well, evolution maybe took a million years or so.
The eye and brain developed over all those years. Now let's say a child in today's era collects a new image every 200 milliseconds. By age 3, this child will have processed over 200 million images, or 200 million training samples. It's more like 236 million, if you do the math. But it took that child's brain three years to process and learn those 236 million images. With modern day computers, we can do that in just a few days. That's why we're able to teach such amazing things to computers. Now, the idea here is that traditional machine learning and computer vision were really built off of, you know, smart engineers in the lab saying, hey, I think I know what makes me able to tell the difference between a cat and a dog and a motorcycle and a boat.
We would code up these cues that helped the machine learning generalize. What we do today in deep learning, though, we say, hey, we finally realized that computers are better at this than humans are, so we feed them all the information, in terms of computer vision, and we say, you go ahead and try to figure out what the good features are. All right. So, I'm going to spend a couple minutes talking about what we call an artificial neuron. We have biological neurons in the human brain, and we loosely mimic them with an artificial neuron. What an artificial neuron does is take a weighted linear summation of a bunch of inputs and then pass the result out. There are a couple of representations you see here. In the bottom-most one, we're showing a simple mathematical representation, which is a simple dot product.
The g function is what we call an activation function. We pass the weighted linear input sum through it and pass the result on to many other neurons. Now, if we take a whole bunch of those artificial neurons and glue them together, as seen on this particular slide, we form what we call an artificial neural network. When you start taking, say, hundreds of these and gluing them together, intelligence starts to emerge. This is similar to the way we think intelligence may emerge from the human brain. So, on this particular slide, I'll give you an example of how something like this may work. We would take an input stimulus.
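To make that weighted-sum-plus-activation idea concrete, here is a minimal sketch of a single artificial neuron in Python. The sigmoid activation and the toy input and weight values are illustrative assumptions, not taken from the slides.

```python
import math

def neuron(inputs, weights, bias):
    # Weighted linear summation of the inputs: a simple dot product plus a bias.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Pass the sum through the activation function g; here, a sigmoid.
    return 1.0 / (1.0 + math.exp(-z))

# Toy example: three inputs with hand-picked (assumed) weights.
out = neuron([0.5, -1.0, 2.0], [0.1, 0.4, 0.3], bias=0.0)
print(out)  # a value between 0 and 1 that could feed many other neurons
```

The output of one neuron becomes one of the inputs to neurons in the next layer, which is all that "gluing them together" into a network means.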
We'll use images as an example. Say we have 20 by 20 pixel images; that's 400 pixels. So we have 400 inputs to our neural network, and each would be one of the pixels.
We take that input and pass it into our neural network, and each neuron ‑‑ remember, it's a linear combination of its inputs ‑‑ passes its output to the next layer. Ultimately, when we get to the very last layer, if the particular image was a human, we'd like the red dot, which corresponds to humans, to light up, and the rest to stay dark.
That's what we mean in computer lingo: we hope the red dot is a high value, and we are hopeful the others are low values. This particular neural network has three different classes, meaning it is capable of recognizing three different types of objects. Only one of those is people. So, we want the people neuron to be hot and the rest of them to stay at zero. Initially, when we compare the output to the ground truth, you'll see that we have errors. The green dots on the right‑hand side represent the ground truth answer.
For this particular input, which is a person, the green dots are all zeros except for the third green dot down, which is meant for humans or pedestrians, and that one would be high. We're going to have errors. What we do is propagate that error backwards, updating all the weights in that particular network.
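That predict-compare-update loop can be sketched for the simplest possible network, a single neuron: each pass computes an output, compares it to the ground-truth label, and nudges every weight in proportion to the error. The toy two-class data set, the learning rate, and the epoch count here are all illustrative assumptions.

```python
import math
import random

# Toy training set: 2-D points labeled 1.0 if they lie above the line y = x, else 0.0.
random.seed(0)
points = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(200)]
data = [((x, y), 1.0 if y > x else 0.0) for x, y in points]

w1, w2, b = 0.0, 0.0, 0.0   # the weights we will fine tune
lr = 0.5                    # learning rate (assumed)

def forward(x1, x2):
    # Forward pass: weighted sum plus bias, squashed by a sigmoid.
    return 1.0 / (1.0 + math.exp(-(w1 * x1 + w2 * x2 + b)))

for epoch in range(50):          # one epoch = one pass through every training sample
    for (x1, x2), target in data:
        pred = forward(x1, x2)
        err = pred - target      # compare the output to the ground truth
        # Propagate the error backwards, nudging every weight: backpropagation,
        # reduced to its simplest single-neuron form.
        w1 -= lr * err * x1
        w2 -= lr * err * x2
        b -= lr * err

accuracy = sum((forward(x1, x2) > 0.5) == (t == 1.0)
               for (x1, x2), t in data) / len(data)
print(accuracy)  # high accuracy after repeated epochs
```

A real deep network repeats the same idea across many layers and millions of weights; the chain rule carries the error backward through each layer in turn.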
Then we'd take a second stimulus and repeat the whole thing again. And then a third and fourth, and maybe 100,000, maybe a million different samples, each fine tuning and adjusting these weights. Then we repeat that whole process over and over again. Each time we go through every sample in the training set, we call that one epoch. And we use something called backpropagation to update the weights. I'm not going to get into the details of that. >> Let's take a second, because I think our second poll question is up, so if people want to weigh in on what Ray was talking about, go ahead and take a few moments and read the responses. I think this is one where there are multiple answers that are possible. >> You can check more than one answer. >> You can pick more than one.
In the interim, we do have one question that was asked, and that was, do you believe that machines will be able to develop ‑‑ I'm sorry. Let me read it again. Do you think it is possible for machines to have emotional intelligence? >> That's the long‑term goal, and where we're going.
Certainly, machines do not have the ability to do that today. We have something called affective computing, which is the ability of machines to detect and interpret human emotion, and certainly they are getting good at that. A machine can look at your face, you know, find the features in your face, and make pretty accurate guesses about your mood, whether you're happy or sad or angry, for example. >> And they are actually working on that here at RIT in the school for behavioral health. >> Yes. >> As methods of treating and working with people with mental health issues; is that correct? >> Yes. Especially ‑‑ we call these things micro expressions. Machines are better at detecting micro expressions than humans themselves.
There is no question that a machine will be able to detect your emotion, or perhaps whether you're lying or not, better than a human would be able to do. Some machines are arguably better than humans today. >> And one other question is with regard to self‑driving cars and the sensing function of that. Can you pull the questions up there? Great. Similar questions: how do you think automation across civil society, such as self‑driving cars, will be regulated, and can a self‑driving car be processed and void of ramifications from a law enforcement perspective? >> Right now the biggest impediment is probably legal ramifications. That's being sorted out right now. Ironically, New York is the only state in the union where you have to have at least one hand on the wheel at all times. Other states are much more free about that.
For example, states like Nevada and California allow autonomous driving, as long as there is a human driver in the car who is able to take over at any point in time. As we get more experience ‑‑ I think in the very near future, within the next ten years, we will be at a point where machines are so much better than humans at driving that there will be a quick switch to, wow, why would you ever drive the car yourself when a machine is so much safer? And when that happens, that's when we will have very swift legislation and laws will be enacted. >> All right.
So, we have the poll question answers? >> Yes, we do. >> All right. So, you can see... we're trying to open up the screen so we can read those questions. They both have the concept of neurons, which produce a weighted sum of inputs.
Several people selected that they are hierarchical in nature and abstract representations of an object. And one said they have a similar number of connections. >> The first two are true. And of the second two, I would say the last one definitely is not true. You can have a hundred trillion connections in the human brain, and deep neural networks may only have a couple hundred million connections inside them. So, I guess the hierarchical one is subjective. There was no real sense of hierarchy in what we had before.
When we talk about deep learning, there is definitely this notion of hierarchical abstraction, where edges combine into parts and parts into objects. This is one of the key things that separates deep learning from other things we have seen so far. Let's take a quick look at what we call the convolutional neural network, probably the most famous architecture. Let's see how these work. First of all, this is a picture of a convolutional neural network, circa 1980. If you just look at this, you may not understand what it means yet, but it takes an input, with many layers, and there's a classifier at the output. If you look at CNNs today, they're all very inspired by some of this, you know, futuristic thinking of the 1980s. As you'll see shortly, the past five years have seen most of the discovery in deep learning.
If you contrast deep learning with traditional learning, traditional learning was really centered around coming up with what we call human hand‑crafted features, and using those hand‑crafted features in conjunction with neural networks or other classifiers to do image or audio or pattern recognition on top of that. In contrast, a deep neural network does all of that itself. It learns the best features and the best classifier, and it learns both simultaneously. The reason we couldn't do this in the 1980s or 1990s, or even ten years ago, is that we didn't have enough computational power. The other thing these need is lots of training samples. Now, with the sharing of data, we have the training samples. If you have 100 million weights that you're trying to solve for, you might think you need at least 100 million training samples to solve for them.
Not exactly; there are some shortcuts we use. This slide is showing GPUs, and the quote in the upper right is interesting: in 2016, a single high-end NVIDIA gaming card would have been classified as the most powerful supercomputer in the world before the year 2002.
Since then, the technology has again gotten twice as fast. Now, deep learning is really predicated on this concept of convolution filters. I'm talking about the same exact filters that you may have learned from signal processing. A simple vertical edge filter is shown at the top; its job is to find all of the vertical edges. So, we're going to use that same concept in a CNN, and what's going to happen is the network is going to learn which of those convolution filters are good for feature extraction.
And so, that's what is shown by the filters on the right‑hand side of this particular slide. Those nine filters will take the original image and create nine filtered versions of it. Then we'll resample those nine to a lower resolution and repeat the process, such that we get a hierarchy of images.
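As a rough illustration of what one of those filters does, here is a hand-coded vertical-edge filter convolved over a toy image, followed by the resample-to-lower-resolution step (max pooling). In a real CNN the filter values are learned rather than hand-picked; the kernel and the tiny 6 by 6 image here are illustrative assumptions.

```python
def convolve2d(image, kernel):
    # Valid 2-D convolution (technically cross-correlation, as in most deep
    # learning libraries): slide the kernel over the image and sum the products.
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(kernel[a][b] * image[i + a][j + b]
                 for a in range(kh) for b in range(kw))
             for j in range(len(image[0]) - kw + 1)]
            for i in range(len(image) - kh + 1)]

def max_pool(image, size=2):
    # Resample to a lower resolution by keeping the maximum in each region.
    return [[max(image[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, len(image[0]) - size + 1, size)]
            for i in range(0, len(image) - size + 1, size)]

# A classic vertical-edge filter, straight out of signal processing.
vertical_edge = [[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]]

# Toy 6x6 image: dark on the left half, bright on the right half.
image = [[0, 0, 0, 9, 9, 9] for _ in range(6)]

response = convolve2d(image, vertical_edge)   # responds strongly only at the edge
pooled = max_pool(response)                   # 4x4 response resampled to 2x2
print(response[0])  # [0, 27, 27, 0]
print(pooled)       # [[27, 27], [27, 27]]
```

Stacking filter-then-pool stages like this is exactly what produces the hierarchy of progressively smaller, more abstract images.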
That's what we're showing over here. Now, that first diagram we saw? It was actually published in 1998, although the original concepts came ten years or so earlier. It takes an original image, learns filters, and filters the image, so now we have filtered versions of that image. Then we create more filtered images, and eventually we get the stripes, those sort of extended vertical stripes, which are what we call the fully connected layers. That's analogous to a support vector machine, for lack of a better analogy. The idea is that a convolutional neural network follows each convolution up with pooling. We repeat that step multiple times and then have fully connected layers.
And a classifier at the end of it. Each of those layers has weights that we're going to be learning. So, specifically, here's an image which is 28 by 28 pixels, and maybe we'll learn 5 by 5 filters across the top. Since we've got 32 filters, we'll get 32 filtered images. You can do the same thing for color, and then the filters are 5 by 5 by 3 instead of 5 by 5 by 1.
But we still get 32 filtered images. The other concept is pooling. Pooling gives you some freedom in the exact location of an object, and it gives you the notion of an abstract hierarchy. The most popular type is max pooling, and it works like this: how do you reduce the image on the left, which is four by four pixels, to the two-by-two image on the right? You split it into regions and take the maximum pixel value in each region. >> Let's pull up our last polling question, if people want to take a look at that and give us your thoughts. We do have one question, also, while the polling question is up. Much of this seems to be centered around images and visual stimuli. Are you familiar with other developments using other senses, like sound or touch? >> The only reason we use visual stimuli is because it's easy for humans to relate to.
Everything we're doing here, it doesn't matter the input modality. Whether it's sound or touch or audio samples or signatures from the stock market or signatures we're getting from different sensors on your body or medical images. It doesn't matter.
The deep learning techniques still would use an abstract representation and still would pair it with a classifier at the end. >> Okay. >> All right. People can keep finishing up that poll question: how are deep neural networks different from traditional neural networks? >> So, the first option is that they use convolution filters to learn spatially invariant features.
Certainly, yes. Convolution filters do that, but traditional neural networks do not. >> The next option was, not much, just more layers. >> Yeah.
So really, I would say that's sort of true, because there are more layers. But traditional networks use what we call fully connected connections, while deep networks use convolution connections between layers. Is that a subtlety? One could argue back and forth, but more layers is probably the biggest differentiator. >> And the third option is, simultaneously learn features and classifiers? >> Yeah. So, deep neural networks do exactly that, and this is why the other machine learning techniques can't compete with deep learning techniques.
And I'll throw traditional neural networks in with the traditional machine learning classifiers. How can you compete with something that is learning feature extraction and classification at the same time? This gives the edge to the deep neural network. To give you a quick visualization, this is a multilayer deep neural network. Layer one is learning these filters, which are akin to what I call simple features. The gray shows the filters themselves, and the colors show the portions of images which are triggered by those filters.
If you go to layer two, you say okay. I'm doing a visualization of the type of things that the second layer triggers on. And you can see that second layer is now triggering on edges and very simple concepts.
If you go further down to layer four and layer five, in layer four we're clearly starting to identify parts of objects, and in layer five the filters are clearly starting to identify the objects themselves. And by the way, we don't tell the network to do that. It just automatically does it. These things are so good that deep neural networks are winning recognition challenges all of the time, whether it's image classification or simpler things like traffic sign recognition.
They are better than humans in all of these types of contests. Let me take a case study, a computer vision challenge. The idea is that you want to recognize different objects, and there are a thousand different types of objects that you would have to recognize. Let's just say that these are simple object categories; here is one called hammers, and you can see all the different variations of hammers that we're asking this network to learn. If we look at the error in 2010 and 2011, those errors were in the 20-percent range.
Well, 2012 was the introduction of deep learning, and we had over a 30% reduction in error by introducing it. That took the machine learning community by storm. Only one method used deep learning in 2012. In 2013, all except for two entries used deep learning. And in 2014 and beyond, only deep learning techniques are used for computer vision challenges as difficult as this. You wonder how humans stack up to the machines? There's a guy from Stanford, what I would call a genius intellect, one of the smartest guys on the planet, and he trained himself on this challenge. He said, look, I can get about 5% error, so at that point humans were still better than the machines. Then Google and Baidu came out with results on par with his, saying computers are just as good as you are. And in 2016, the computers cut his error rate in half.
Now we can say unequivocally that computers are better than humans at this task. I think I'm going to have to wrap things up due to time, but I want to say one thing about artificial intelligence versus intelligence augmentation. Artificial intelligence is really giving computers intelligence itself: as one of you mentioned, emotion, the ability to think and reason and to independently act on its own. What we have today is what we call intelligence augmentation: doing simple tasks, each of which makes your life better, faster, easier, more efficient.
Both artificial intelligence and intelligence augmentation are hot topics. Here are some examples on this slide; I'll just cover one of them, enterprise. Intelligence augmentation would be the ability to automate mundane tasks: automated tellers and virtual assistants. With artificial intelligence, you would have machines which do all of those tasks themselves, without any assistance from a human, and maybe do them better than humans.
Now here's a quote: an engineer often overestimates what she can do in five years and underestimates what she can do in 20 years. That was true of my own predictions, and we see two different curves on the plot: one is intelligence augmentation and the other is artificial intelligence. Most people think we're riding that steep artificial intelligence increase, when really we're only on the intelligence augmentation curve. We're really just starting artificial intelligence, and it will be many years from now before we start seeing agents just as smart as us, interacting with us in our everyday environment. All right. So, I'd just like to conclude and say thank you very much, and I wish you all the best.
>> Perfect. Thank you very much. I'm going to take one question, and then I will jump into closing. The question is: are there any core issues with CNNs that are currently holding them back? For example, the amount of data needed for training, or sensitivity to transformations? >> So, ironically, you in your garage or basement can buy an NVIDIA GPU for less than $1,000 and pair it up with a desktop. For a little over $2,000, you can build a piece of hardware, and then you can download the identical software versions that the smartest companies in the world are using, and your $2,000 computer can compete with the supercomputers that Baidu and Google and Microsoft are using.
Computing power is getting so cheap that it is probably not holding us back right now. I think we're getting very good at the techniques and are about to peak in capability. To get to the next level, to get away from intelligence augmentation and toward true artificial intelligence, we need a new invention to happen. What is that? We don't know. Just like 50 years ago, we would never have envisioned people walking down the street, everybody looking at smartphones and texting, doing live video chats wirelessly with people around the world. No way; that's not going to happen.
That's probably what we would have said 50 years ago. We don't know when the next breakthrough in technology is going to happen. I'm on the side of: within the next 40 years or so, we'll probably see machines which are smarter than humans, and they will be pretty common. >> Okay. Well, that is all the time that we have. If you have additional questions, you can e‑mail them to ritalum@rit.edu or tweet them to @RIT_Alumni.
So, we thank Dr. Ptucha for being our panelist today. Please consider joining us on June 28th for the very last RIT webinar of this academic year. We will offer a provocative presentation, "Ten Countries That Don't Want to Defeat ISIS." Robert Gerace is a 27-year veteran in support of the Defense Intelligence Agency. Again, you will receive an e‑mail with a link to the recording of this webinar in a few days, and I believe we will also attach ‑‑ thank you all for joining us, and please exit by closing the WebEx window. Please let us know what you thought of the webinar through a survey that will pop up when you exit.
Have a great day.