Entropy measures the amount of disorder or uncertainty in a dataset. It tells us how mixed or unpredictable the class labels are in our data.
Origin of Entropy
Borrowed from thermodynamics, entropy was introduced to information theory by Claude Shannon. In machine learning, it quantifies the expected "surprise" of an outcome: rare outcomes carry more information than common ones.
High vs Low Entropy
High entropy means more randomness: the class labels are hard to predict. Low entropy means the data is more homogeneous and easier for models to classify. For example, a 50/50 class split has the maximum entropy of 1 bit, while a nearly pure split has entropy close to 0.
Entropy Formula
For a dataset, entropy is calculated as:

H = −∑ p(i) log₂ p(i)

where p(i) is the probability of class i.
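As a minimal sketch of how this formula translates into code (the entropy helper below is an illustrative name, not from any particular library):

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()          # probability p(i) of each class
    return -np.sum(p * np.log2(p))     # -sum of p(i) * log2 p(i)

# A 50/50 split is maximally uncertain (1 bit); a 90/10 split is much less so.
print(entropy(["yes"] * 5 + ["no"] * 5))   # 1.0
print(entropy(["yes"] * 9 + ["no"] * 1))   # ~0.47
```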
Entropy in Decision Trees
Decision trees use entropy to decide splits. The algorithm picks features that reduce entropy the most, making classes purer at each step.
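For instance, scikit-learn's DecisionTreeClassifier accepts criterion="entropy" to score splits by entropy instead of its default Gini impurity; a minimal sketch on a toy dataset:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# criterion="entropy" makes each split minimize the children's entropy,
# i.e. maximize information gain.
tree = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
tree.fit(X, y)
print(tree.score(X, y))  # training accuracy on this toy dataset
```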
Information Gain
Information gain measures the reduction in entropy after a split: the parent node's entropy minus the size-weighted average entropy of the child nodes. Higher information gain means the split leaves the data more organized and predictable.
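A minimal sketch of this computation, assuming a split is represented as the parent's labels plus the label groups of each child node (the function names are illustrative, not from a library):

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels (same helper as above)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(parent_labels, child_label_groups):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    n = len(parent_labels)
    child_entropy = sum(len(g) / n * entropy(g) for g in child_label_groups)
    return entropy(parent_labels) - child_entropy

# Splitting a perfectly mixed node into two pure children gains the full 1 bit.
parent = ["yes"] * 5 + ["no"] * 5
print(information_gain(parent, [["yes"] * 5, ["no"] * 5]))  # 1.0
```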
Why Entropy Matters
Understanding entropy helps build accurate models, select features, and evaluate predictions—making it a key concept in machine learning.