Deep Reinforcement Learning

Resources: 🛒 The Supermarket Analogy [Plaat et al., page 25]. Imagine you have just moved to a new city, you are hungry, and you want to buy some groceries. There is an unrealistic catch: you have no map and no smartphone. After some random exploration, you find a supermarket. You carefully note the route in ...

Machine Learning Applications – Learning Associations

References: Learning Associations. Association rule learning is a rule-based method for discovering interesting relations, or “hidden patterns”, between variables in large databases. It is primarily an unsupervised learning technique because it does not require pre-labeled data; it simply looks for items that frequently occur together. The most famous application is Market Basket Analysis, which retailers ...
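To make "items that frequently occur together" concrete, here is a minimal sketch of two standard measures used in market basket analysis, support and confidence. The excerpt does not name these measures, and the function names and the toy baskets are assumptions made purely for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(transactions, antecedent)
    both = support(transactions, set(antecedent) | set(consequent))
    return both / base if base > 0 else 0.0


# Hypothetical toy baskets, not taken from any real dataset.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

bread_milk_support = support(baskets, {"bread", "milk"})
bread_to_milk_confidence = confidence(baskets, {"bread"}, {"milk"})
```

A rule such as bread -> milk is usually kept only when both values exceed thresholds chosen by the analyst.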

Various Bounds in Optimization

Regret Bound:
https://www.emergentmind.com/topics/cumulative-regret-analysis
http://sbubeck.com/LecturesALL_Bubeck.pdf
https://www.jmlr.org/papers/volume11/jaksch10a/jaksch10a.pdf
https://datascience.stackexchange.com/questions/62141/what-are-regret-bounds
https://arxiv.org/pdf/1403.5556
Lecture 1: Introduction to regret analysis. Sébastien Bubeck, Machine Learning and Optimization group, MSR AI

Assumptions and Their Meaning in Optimization Problems

1. Lipschitz Continuity: The “Speed Limit”. This assumption prevents the function from changing too rapidly over a certain distance.
2. 𝐿-Smoothness: The Curvature Ceiling. Smoothness ensures the gradient (slope) of the function doesn’t change abruptly. A function is 𝐿-smooth if its gradient is Lipschitz continuous.
3. 𝜇-Strong Convexity: The Curvature Floor (Bowl Shape). Strong convexity guarantees that ...
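For reference, the three assumptions are usually written as the inequalities below. This is a sketch of the standard textbook forms for a differentiable function f, holding for all x and y; the constant name G for the Lipschitz constant is an assumption, since the excerpt does not name it.

```latex
% 1. G-Lipschitz continuity: function values cannot change faster than G per unit distance
|f(x) - f(y)| \le G \,\|x - y\|

% 2. L-smoothness: the gradient is L-Lipschitz
\|\nabla f(x) - \nabla f(y)\| \le L \,\|x - y\|

% 3. \mu-strong convexity: f lies above a quadratic lower bound at every point
f(y) \ge f(x) + \nabla f(x)^{\top}(y - x) + \frac{\mu}{2}\,\|y - x\|^{2}
```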

Federated Learning Algorithm Averaging Methods FedAvgM, FedProx, FedAvg

FedAvg: https://arxiv.org/abs/1602.05629

FedAvgM: https://arxiv.org/pdf/1909.06335
FedAvgM stands for Federated Averaging with Server Momentum. It is an upgrade to the original FedAvg algorithm designed specifically to address the “client drift” problem, where the global model is pulled in conflicting directions because different users have very different (non-IID) data. The Formula: How it works mathematically. The FedAvgM update happens in three steps on ...
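Since the excerpt is cut off before the formula, the following is only a sketch of one common way to write a server-side momentum update, not necessarily the three steps the post goes on to describe. It assumes model weights are flat lists of floats; the function name `fedavgm_server_step` and the parameters `beta` and `server_lr` are hypothetical names chosen for this example.

```python
def fedavgm_server_step(global_w, velocity, client_ws, client_sizes,
                        beta=0.9, server_lr=1.0):
    """One server round: average client deltas, apply momentum, update weights."""
    total = float(sum(client_sizes))
    dim = len(global_w)

    # Pseudo-gradient: size-weighted average of (global - client) weights.
    delta = [0.0] * dim
    for w, n in zip(client_ws, client_sizes):
        for i in range(dim):
            delta[i] += (n / total) * (global_w[i] - w[i])

    # Momentum buffer accumulates past pseudo-gradients.
    new_velocity = [beta * v + d for v, d in zip(velocity, delta)]

    # Move the global model along the momentum direction.
    new_global = [g - server_lr * v for g, v in zip(global_w, new_velocity)]
    return new_global, new_velocity


# Hypothetical two-client round with equal data sizes.
clients = [[1.0, 1.0], [3.0, 3.0]]
w1, v1 = fedavgm_server_step([0.0, 0.0], [0.0, 0.0], clients, [1, 1])
w2, v2 = fedavgm_server_step(w1, v1, clients, [1, 1])
```

With `beta=0` and `server_lr=1` this sketch reduces to plain weighted averaging of the client models, which is the FedAvg aggregation step.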

Deep Reinforcement Learning Parameters

Epsilon (𝜖): This is a hyperparameter (a value between 0 and 1) that defines the probability of an agent taking a random action (exploration) instead of the action believed to have the highest reward (exploitation).
Epsilon Decay: This is the process of gradually reducing the value of 𝜖 over time or a number of ...
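The two definitions above can be sketched in a few lines. This is an illustrative example only: the exponential schedule, the default values, and the names `epsilon_greedy` and `decayed_epsilon` are assumptions, and other decay schedules (for example, linear) are equally common.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a random action, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit


def decayed_epsilon(step, eps_start=1.0, eps_end=0.05, decay_rate=0.995):
    """Exponential decay of epsilon per step, floored at eps_end (assumed schedule)."""
    return max(eps_end, eps_start * decay_rate ** step)


# Hypothetical action-value estimates for three actions.
q = [0.1, 0.9, 0.3]
greedy_action = epsilon_greedy(q, epsilon=0.0)
```

Early in training 𝜖 is close to 1, so the agent mostly explores; as the step count grows, 𝜖 approaches the floor and the agent mostly exploits its current estimates.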

All Machine Learning Models

https://www.ibm.com/think/topics/machine-learning-types
Machine learning models are grouped into one of five types: supervised, unsupervised, self-supervised, semi-supervised, and reinforcement learning. Machine learning libraries, useful.

MNIST PCA

https://medium.com/@azimkhan8018/a-beginners-guide-to-deep-learning-with-mnist-dataset-0894f7183344
https://ranasinghiitkgp.medium.com/principal-component-analysis-pca-with-code-on-mnist-dataset-da7de0d07c22
https://github.com/mkosaka1/MNIST_PCA_CNN/blob/main/PCA%20%2B%20CNN.ipynb
https://gist.github.com/tommct/1490cdf856d745ba41c1ac99ada2b579

Reinforcement Learning

https://au.mathworks.com/content/dam/mathworks/ebook/gated/reinforcement-learning-ebook-all-chapters.pdf
Reinforcement Learning: A type of machine learning in which a computer agent learns to perform a task through repeated trial and error within a dynamic environment. The goal is to learn what to do, that is, how to map situations to actions, so as to maximize a numerical reward signal. E.g., an RL-trained computer gains ...