Shannon Entropy
Intermediate
Measures the average information (surprise) in a random variable: the minimum average number of bits needed to encode its outcomes.
Formal Sciences
Information Theory
Entropy, optimization, and the math behind intelligence.
Update parameters by stepping opposite to the gradient of the loss—learning by hill descent.
Measures the difference between true labels and predicted probabilities in classification.
Each token attends to all others, weighted by query-key similarity divided by the square root of the key dimension.
Ranks web pages by importance based on the quality and quantity of links pointing to them.
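
To make the Shannon entropy definition at the top of this page concrete, here is a minimal sketch that evaluates H(X) = -sum over x of p(x) * log2 p(x) for a discrete distribution. The function name shannon_entropy and the example distributions are illustrative assumptions, not something taken from this page.

```python
import math


def shannon_entropy(probs):
    """Return the entropy in bits of a discrete distribution given as probabilities.

    Hypothetical helper for illustration: `probs` must be non-negative and sum to 1.
    """
    if any(p < 0 for p in probs):
        raise ValueError("probabilities must be non-negative")
    if not math.isclose(sum(probs), 1.0, rel_tol=0.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    # Outcomes with p == 0 contribute nothing (the limit of p * log2(p) is 0).
    return -sum(p * math.log2(p) for p in probs if p > 0)


fair_coin = shannon_entropy([0.5, 0.5])      # 1 bit per flip
biased_coin = shannon_entropy([0.9, 0.1])    # roughly 0.469 bits per flip
eight_sided = shannon_entropy([1 / 8] * 8)   # 3 bits per roll

print(fair_coin, biased_coin, eight_sided)
```

The biased coin has lower entropy than the fair one because its outcome is less surprising on average, so fewer bits per flip are needed to encode a long sequence of its results.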