# Gradient descent, how neural networks learn | Deep learning, chapter 2

### 100 Replies to “Gradient descent, how neural networks learn | Deep learning, chapter 2”

1. 3Blue1Brown says:

Part 3 will be on backpropagation. I had originally planned to include it here, but the more I wanted to dig into a proper walk-through for what it's really doing, the more deserving it became of its own video. Stay tuned!

2. U Cus says:

Why are your values slightly off at 4:16? For example, the first number is (0.43 - 0)² = 0.1849, not 0.1863.
Every number is slightly off in your example.
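For what it's worth, the commenter's arithmetic checks out; a quick check of the first squared-error term, assuming the on-screen activation 0.43 and target 0:

```python
# One term of the cost: (activation - target)^2 for a single output neuron.
activation, target = 0.43, 0.0
cost_term = (activation - target) ** 2
print(round(cost_term, 4))  # 0.1849
```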

3. Hassan Umari says:

No, bad YouTube! This video should have been recommended to me way earlier

4. Nemo D123 says:

One guy will "create" Terminator. 7.5 billion people will "try" to kill it. Hasta la vista, baby.

5. Hakim R says:

15:19 seems interesting, just like you have to train your own (biological) NN to draw a human face, even though you've seen millions of them

6. Blue Jay says:

but can machines deep understand?

7. Blue Jay says:

The average male brain has 86 billion neurons; the human body is made up of 37 trillion cells.
Neurons have multiple dendrites and multiple axon terminals.
Does deep learning use ophthalmology theory to tell a computer how badly it's doing?
Is it possible to embed an AI chatbot inside an AI to give it an internal monologue, in order to simulate thinking in words?
The thoughts could then be displayed on a screen. Making their thoughts visible would make AI less scary for people.

8. Anil Kumar Sharma says:

Just use an equation for every character, so we get the similarity between the given data as it exists.

9. Syed A. Salam says:

How did you make these visuals? I need a similar presentation for my project. Thanks

10. Сергей Белоногов says:

Thank you for your work!

11. Mahoney Technologies says:

Hmmm, this seems similar to fuzzy logic.

12. Mahoney Technologies says:

No, just consider it a new start!

13. Luiz Henrique says:

I'm sorry, MIT and Harvard, but this is what TEACHING is supposed to be like. I watched some Stanford/Harvard/MIT classes, and no one comes close to this

14. c gn says:

Here, the pixels are grayscale, but what if the image is colored? Should a layer then be added for each color (RGB)?
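For what it's worth, color images are usually handled by widening the input layer (one value per channel per pixel) rather than by adding layers. A minimal sketch, assuming a 28×28 RGB image:

```python
import numpy as np

# A hypothetical 28x28 RGB image: three color channels per pixel.
rgb_image = np.random.rand(28, 28, 3)

# Grayscale MNIST gives 28*28 = 784 input activations; an RGB image is
# usually flattened into 28*28*3 = 2352 inputs instead of extra layers.
input_vector = rgb_image.reshape(-1)
print(input_vector.shape)  # (2352,)
```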

15. Yasmine Guemouria says:

Thank you 3Blue1Brown for these videos, they're extremely instructive!
However, there's one point I didn't understand: why does the gradient descent algorithm struggle to find local minima with swapped labels? If anyone could answer my question, I'd be very grateful.
Thank you.

16. timothy norman says:

@18:10
… Nah, looks pretty much what I see on the media nowadays. Everything and everyone improperly labeled as seen fit. Lol.

17. 0x2661 says:

If I want to start doing deep learning and AI, should I start by learning calculus? And if so, where should I start?

18. Snehal Bhartiya says:

How much time did you invest in each video? Also, how big is the team?

19. Osip Vayner says:

3:41 This is the loudest I've heard Grant talk, and it's almost adorable.

20. Arjun Kalidas says:

Your animations are so damn intuitive and amazing. You have developed one of the best teaching strategies! Thanks a lot.

21. Syed Babar says:

22. Stian Fisk says:

so this is how the brain works… 😀

23. 0x2661 says:

Fun Fact:
If the whole equation for the cost function were written on paper in a straight line, with all of the variables plugged in, it would be approximately 115,696,496 characters long. If each character, on average, were 6 mm long, then the whole equation written out would be 694.179 km (431.343 miles) long! That's more than halfway from NYC to Chicago! And a computer can solve that equation in less than a second!

I did the math.
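The arithmetic in the fun fact is easy to double-check, taking the comment's character count and the assumed 6 mm average character width as given:

```python
chars = 115_696_496      # character count claimed in the comment
mm_per_char = 6          # assumed average character width in millimeters
total_km = chars * mm_per_char / 1_000_000
total_miles = total_km / 1.609344  # exact km-per-mile conversion factor
print(f"{total_km:.3f} km = {total_miles:.3f} miles")  # 694.179 km = 431.343 miles
```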

24. William says:

The human brain has 100 billion neurons; the largest neural network has 16 million. WOW

25. mrsquiiid says:

I'm sitting here wondering how godlike this man must be, since he somehow made a potato understand this stuff

26. Não Sou Um Robô says:

"It's just calculus" "Even worse!" LOL! I fell off my chair here!

27. calvinthedestroyer says:

Mike Nelson? From MST3K? haha!

28. Lemuel Uhuru says:

29. mooselessness says:

waaait wait wait wait – Ryan Dahl is one of your patreon supporters?

30. umair alvi says:

The more times I watch it, the more it makes sense.

31. HZER555 says:

5:15 HEY, I'M STILL LEARNING 🙁 leave me alone to train.

32. Cody Darst says:

I wish someone had introduced this to me at a young age back in the 90s. I had no idea neural networks had existed for so long

33. Chinmay Rath says:

I had always been daunted by neural networks because of the sheer complexity of the equations surrounding them; this video series did the magic. I'm so thankful and so glad I found this that I want to support the channel. I'm an undergrad now, but I'll definitely support the channel on Patreon when I start earning.

34. kaj kajsen says:

This is one crazy video

35. Dinesh says:

Absolutely incredible…i am in awe….never seen such a beautiful video..omg

36. Zhouyao Xie says:

This is the best introductory video on neural networks I have seen so far. Thank you so much for these fantastic videos!

37. Swastik Pathak says:

Amazing explanation. Excellent video quality!! Keep it up sir

38. Razvan Craciun says:

Thank you for your work man 🙂

39. Jin Shikami says:

Verify your electrical circuits on the go! Observe: androidcircuitsolver/app.html

40. Prasanth Suresh says:

It's so heartwarming to listen to a teacher that actually gives you epiphanies!!!! <3

41. MdMDmD says:

Because the sigmoid function has a range of (0, 1), doesn't adding more nodes tend to saturate the output? Everything will always be at 1, unless the weights handle that during learning, idk
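The worry above can be made concrete: summing many positive weighted inputs does saturate a sigmoid, and it is indeed the learned weights (and careful initialization) that keep the pre-activation in a sensible range. A small sketch with made-up numbers:

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

n = 1000
inputs = [0.5] * n          # many moderately active upstream neurons

# Naive weights of 1.0: the weighted sum is huge, the sigmoid saturates.
saturated = sigmoid(sum(1.0 * a for a in inputs))
print(saturated)            # effectively 1.0

# Scaling the weights by 1/n keeps the pre-activation in the sigmoid's
# sensitive region, which is what learning effectively has to do.
balanced = sigmoid(sum((1.0 / n) * a for a in inputs))
print(round(balanced, 3))   # sigmoid(0.5), about 0.622
```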

42. HunterWoelfchen says:

I wonder what would happen if you put in a 1 for the output neuron and calculated all the way back to the pixel grid using the weights…

43. Bob Kastner says:

Squishification.

44. Jane Doe says:

45. Nopzu says:

15:19

46. dleddy14 says:

Can you please define "activation"? If you already did, I missed it. Thanks, and thanks for making this great channel.

47. Dustin Jackson says:

You should make a video on this topic that doesn't require advanced calc and geometry knowledge lol

48. Master Hans says:

Thanks, I love how you explain; it's really clear, better than any other video. Thank you! I'm going to study deep learning, so I'm starting to learn by myself.

49. John Byrne says:

18:04 Can anyone explain what they're talking about at this timestamp, re. mixing up the labels? Why would the network care about that? Doesn't it just mean that it'll identify everything with whatever false label you used in training?

50. Abbes abbas says:

Is the cost function's output (the cost) the same thing as the "loss" in TensorFlow when compiling the network?

51. Nobody says:

13:34 yeah that 6 guess is as good as any. It looks more like an H than any number I know of.

52. Jared Roussel says:

You all are phenomenal presenters. You really have a knack for presentations.

53. BigMeanyVids says:

Really well done. I had never looked into neural nets, but they make sense as just an extension of the control system theory I learned many years ago during my master's in EE. They will be fun to play with.

54. Shirish Bajpai says:

One of the best video series I've ever seen. Love it, man!

55. Salamanka says:

Thanks

56. Joan Ferran says:

96% correct: what happens when the 4% incorrect means an accident (autonomous cars, Tesla)? If the model is 4% wrong and this causes a fatal accident, you will be 100% dead 🙂

57. Joan Ferran says:

My Tesla autopilot was 98% correct; it only ran over 2 pedestrians out of 100. I am so proud of my Tesla

58. 물살 says:

The Korean subtitles are such a mess that it might be better to have none at all; I think they need an update.

59. Jannick Bremm says:

This is sooo coool!

60. Jack McGaughey says:

Instead of squaring the difference between the correct answers and the given answers, why would it not be more accurate to take the absolute value to get a positive value?

61. Alexei Savitsky says:

How I'd improve the program:
1. First, I'd also give it some trash as input and train it to say that it is trash (so all end neurons would be zeros).
2. Second, if I wanted it to work the way it's meant to, namely by finding edges and shapes, I'd make separate programs: one searching for edges, a second searching for shapes made of these edges, and a third searching for digits made of these shapes.

62. Cesar Gomes says:

Your videos, with such wonderful LaTeX animations, are on the level of award-winning BBC documentaries. Very impressive, to say the least.

63. Escape Felicity says:

Please, Get rid of the background noise

64. Steven Smith says:

The dislikes are unemployed professors

65. Zuhair Mehdee says:

Why not take the modulus of the difference between the desired output and the actual output instead of squaring it? Wouldn't squaring magnify the big differences and shrink the small ones, instead of keeping things linear? Taking the modulus feels more intuitive, but maybe this is a case where something works better just because.
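One common answer to this recurring question: the squared error is smooth, so its gradient shrinks as the output approaches the target, while the absolute value has a constant-magnitude gradient and a kink at zero. A tiny sketch of the difference for a single output neuron:

```python
# Gradient of the squared error vs. the absolute error for one output.
def grad_squared(activation, target):
    # d/da (a - t)^2 = 2(a - t): shrinks as the error shrinks, smooth at zero.
    return 2 * (activation - target)

def grad_absolute(activation, target):
    # d/da |a - t| = +/-1: constant magnitude, undefined right at a == t.
    return 1 if activation > target else -1

# Far from the target the two behave similarly, but near the target the
# squared error's gradient fades out, giving gentler, more stable steps.
for a in (0.9, 0.6, 0.51):
    print(a, grad_squared(a, 0.5), grad_absolute(a, 0.5))
```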

66. Tim Meschke says:

Hey there. Thanks for the videos; I am taking your multivariable calculus course on Khan Academy right now. Quick question: what software do you use for making those vids? In particular, how do you visualize functions and write freely on them? Thanks.

67. Nick Springer says:

1:49 , did we just get HTML’d?

68. root1657 says:

Your videos have been key in teaching just enough of the topic to my non-technical clients to let them see past the marketing speak of ML vendors that either don't have real ML or have garbage ML. This has actually led to savings in real dollars and spared us several years of buying the wrong products and having them fail us. Thank you!

69. Victor Oliveira says:

What if, instead of using 10 neurons (one for each digit), you used 11, with the new one representing "not a digit"? Wouldn't that "solve" the random-input problem?

"Now, if you're unfamiliar with multi-variable calculus and you want to learn more…"

Nah, I'm good, man.

71. morphman86 says:

"Imagine a single input and a single output. That can be seen as a slope. Now imagine being a ball rolling down that slope."
Ok… fine…
"Now imagine 2 inputs. That can be seen as a hill. Imagine the same ball rolling down that hill."
Yeah, yeah, I can see it. I'm getting what you're saying!
"Now imagine the same thing, but with 13,002 inputs!"
Wait, what? I have to imagine being a ball on a 13,003-dimensional surface now? You promised neural networks wouldn't be that hard if I only saw them as "a bit of calculus"… I might have to go raid the medicine cabinet to be able to finish this visualisation!
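The one-input "ball on a slope" picture runs fine in code, with no 13,002 dimensions required. A minimal sketch using a made-up cost C(w) = (w - 3)², whose minimum sits at w = 3:

```python
# Gradient descent on a one-parameter "slope": C(w) = (w - 3)^2.
def cost_gradient(w):
    return 2 * (w - 3)      # derivative of (w - 3)^2

w = 0.0                     # starting position of the "ball"
learning_rate = 0.1
for _ in range(100):
    w -= learning_rate * cost_gradient(w)   # roll a step downhill

print(round(w, 4))          # 3.0, the ball settles at the minimum
```

The same update rule works unchanged in 13,002 dimensions; only the picture, not the math, gets harder to imagine.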

72. Navjot Singh says:

Imagine using a neural network to make a self-learning polymorphic computer virus 😳 I might have to test that out now that I have a better idea of how neural networks function…

73. steve1978ger says:

https://en.wikipedia.org/wiki/Regional_handwriting_variation#Arabic_numerals
drat.

74. Ahmed Boussabeh says:

18:28, this is right, Adele is a lion, I love her haha
Great efforts, good job <3 <3

75. Olesya Bondar says:

The graphics of this video are absolutely stunning! Thank you for your work ♡

76. Tick Tock World says:

At 3:15, bias is mainly used for shifting the weighted sum, not for dropout. Correct me if I'm wrong; I'm kind of rewatching all the neural net videos again.

77. Joey George says:

This stuff is complicated*x.

78. Luxurious 03 says:

13:35
Tf is an H doing there?

79. Intergalactic Gamer says:

7:00 interesting how Newtonian methods predict the application of neural networks

80. John Wick says:

81. Mahela Munasinghe says:

Thank you!! I just can't describe how valuable your videos are!

82. Crazylalalalala says:

Wow, I am both really sad that I didn't know about any of this and really excited about it at the same time.

Also, who knew that linear algebra wasn't a waste of time… I guess they put it there for a reason.

83. raphael barengo says:

Great, great, great explanation and visualization…

84. NightSkyOutLaw90 says:

Longest 21 minutes of my life lol

85. Rohit Chaudhary says:

It's the Newton-Raphson method, isn't it?
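Close, but not quite: Newton-Raphson uses the second derivative (curvature), while plain gradient descent uses only the first. A sketch on the toy cost f(x) = (x - 2)², where the difference is stark:

```python
# Gradient descent vs. Newton's method on f(x) = (x - 2)^2.
def grad(x):
    return 2 * (x - 2)      # first derivative

def curvature(x):
    return 2.0              # second derivative, constant for a quadratic

x_gd, x_newton = 10.0, 10.0

x_gd -= 0.1 * grad(x_gd)                        # gradient step: small move downhill
x_newton -= grad(x_newton) / curvature(x_newton)  # Newton step: rescaled by curvature

print(round(x_gd, 1))   # 8.4, still far from the minimum at 2
print(x_newton)         # 2.0, a quadratic is minimized in one Newton step
```

Newton's method is rarely used for training large networks because computing and inverting the matrix of second derivatives is far too expensive at millions of parameters.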

86. Ruy Perez Callejas says:

Thank you for such great content.

87. loopsovershit says:

To make this algorithm more trustworthy, I'd suggest introducing some kind of variability into the first tweaking solutions (the parameter adjustments proposed by the AI) and comparing the results of the different versions of the algorithm. Then one could create a new kind of gradient vector to measure the effects of the different kinds of variability that were introduced, and therefore "deep learn" the optimal ways of (more or less randomly) adjusting the initial results of the AI to improve these already self-improving tweakings.

I think one could repeat that process (a new level of randomly introduced tweakings, a new kind of gradient vector or matrix, a new optimization through the "deep learning" phase), but I don't know what to think of that; it's fractal design 😁

88. fio fio says:

great!

89. Rimo A says:

biological neurons are not binary (simply active or inactive)…

90. Hitsuga Aorusaki says:

Amazing visuals! Can't imagine how you set it up.

91. José RF Junior says:

Fantastic!!

92. Icterine Tech says:

I read an article in which a researcher achieved 99.8% on the MNIST data set

93. Josh Beyer says:

I've seen thousands of lizards fall on people today, frozen to death, wtf

94. Yuan Yuan says:

9:36 Not gonna lie, it was kind of a slap in the face to see 3b1b introduce a complex concept and NOT explain it.
Anyway, as per usual, this was amazingly good.

95. Yuan Yuan says:

Also, a question: how do the different ways of learning (reinforcement, unsupervised) affect the cost function? Obviously the cost dictates how the weights and biases change, but how does reinforcement learning change the inherent structure of the function? Or, how is something like reinforcement learning programmed? I hope you come back to this series and go deeper; it's fascinating, and I love the way you teach.

YouTube should pay creators like you irrespective of the ad revenue. I mean a genuine monthly salary for providing such first-of-its-kind material. I have watched most of your videos, and you are the reason I love mathematics now.

97. yksnimus says:

I love how you explain basic math concepts through different logical ways of interpreting them, especially visually. That's the main problem with math education: too much monkey-see-monkey-do. People don't learn to walk on their own legs, and it's all convoluted jargon.

98. Leelahs Playhouse says:

Try out my circuit simulator? Thanks! Stumble Upon: androidcircuitsolver/app.html

99. Ashkan Sheikhi says:

you are absolutely a brilliant teacher

100. Bigyan Chapagain says:

Please name a book from which I can get lucid concepts by doing some math.