- Optimization algorithm
- Trains machine learning models
- Minimizes the error between predicted and actual values
Iterative process:
- Trains on the data over time
- Updates parameters using the loss function
- Repeats until the loss is close or equal to zero, i.e. the smallest possible error
Goal of gradient descent: MINIMIZE the cost function (error between predicted and actual value)
Requirements: a learning rate and a direction in which to move (the negative gradient).
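The core update these notes describe can be sketched in a few lines (a minimal illustration; the function and variable names are my own, not from any library):

```python
# One gradient descent step: move the parameter against the gradient,
# scaled by the learning rate.
def step(w, gradient, learning_rate=0.1):
    return w - learning_rate * gradient

# Example: minimize f(w) = w**2, whose gradient is 2*w.
w = 4.0
for _ in range(50):
    w = step(w, 2 * w)
# w is now very close to 0, the minimum of f.
```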
Terms:
- Cost function: average error across whole dataset
- Loss function: error for a single sample of the dataset
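The distinction between the two terms can be shown with squared error (a sketch with made-up numbers; the function names are my own):

```python
# Loss: error for one sample.
def loss(y_pred, y_true):
    return (y_pred - y_true) ** 2

# Cost: average loss across the whole dataset.
def cost(preds, targets):
    return sum(loss(p, t) for p, t in zip(preds, targets)) / len(targets)

print(cost([2.0, 3.0], [1.0, 3.0]))  # average of 1.0 and 0.0 -> 0.5
```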
Types of Gradient Descents:
- Batch
- Stochastic
- Mini-Batch
https://www.ibm.com/think/topics/gradient-descent
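The three variants differ mainly in how many samples feed each parameter update, which can be sketched as (hypothetical data; this is my own illustration, not code from the IBM article):

```python
import random

data = list(range(100))  # stand-in for 100 training samples

# Split the shuffled dataset into chunks of the given size.
def batches(samples, batch_size):
    shuffled = samples[:]
    random.shuffle(shuffled)
    return [shuffled[i:i + batch_size] for i in range(0, len(shuffled), batch_size)]

full_batch = batches(data, len(data))  # batch GD: 1 update per epoch
stochastic = batches(data, 1)          # stochastic GD: 100 updates per epoch
mini_batch = batches(data, 32)         # mini-batch GD: 4 updates per epoch
```

Mini-batch is the usual compromise: more updates per epoch than batch, less noisy gradients than stochastic.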
Linear Regression: Gradient Descent (one of the best sources)
https://developers.google.com/machine-learning/crash-course/linear-regression/gradient-descent
Math behind gradient descent:
- Initialize the model weights
- Calculate the MSE loss with the current model parameters
- Compute the gradient of the loss with respect to each weight
- Update the weights by moving against the gradient, scaled by the learning rate
- Repeat for a set number of iterations
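The steps above can be sketched end to end for simple linear regression y = w*x + b (MSE loss; the toy data and hyperparameters are my own choices, not from the linked sources):

```python
# Gradient descent for y = w*x + b, following the steps listed above.
def fit(xs, ys, learning_rate=0.05, iterations=2000):
    w, b = 0.0, 0.0                      # initialize the model weights
    n = len(xs)
    for _ in range(iterations):          # repeat for a set number of iterations
        preds = [w * x + b for x in xs]
        # Gradients of MSE = (1/n) * sum((pred - y)^2) w.r.t. w and b
        dw = (2 / n) * sum((p - y) * x for p, y, x in zip(preds, ys, xs))
        db = (2 / n) * sum(p - y for p, y in zip(preds, ys))
        w -= learning_rate * dw          # move against the gradient,
        b -= learning_rate * db          # scaled by the learning rate
    return w, b

w, b = fit([1, 2, 3, 4], [3, 5, 7, 9])  # data generated by y = 2x + 1
```

With this data the fitted w and b land close to 2 and 1, the parameters that generated the targets.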
https://sebastianraschka.com/faq/docs/gradient-optimization.html
https://www.geeksforgeeks.org/gradient-descent-algorithm-and-its-variants