Hamming distance

The Hamming distance of two equal-length vectors $a, b \in \mathrm{F}^n$ is the number of positions at which $a$ and $b$ differ.

In Python it can be computed as:

def hamming_distance(a, b):
    # Count the positions at which a and b differ
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    return sum(x != y for x, y in zip(a, b))
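As a quick check (an illustrative example, not from the original text), the function works on both strings and lists, since it only relies on `len` and `zip`:

```python
def hamming_distance(a, b):
    # Count the positions at which a and b differ
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    return sum(x != y for x, y in zip(a, b))

# "karolin" and "kathrin" differ in three positions (r/t, o/h, l/r)
print(hamming_distance("karolin", "kathrin"))  # 3

# The two bit vectors differ in positions 1 and 2
print(hamming_distance([1, 0, 1, 1], [1, 1, 0, 1]))  # 2
```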

Information theory and statistics

Probability mass function

Suppose $X: \Omega \to A \subseteq \mathbb{R}$ is a discrete random variable. We call $P$ the probability mass function of $X$, defined by:

P(x) = P[X = x]

The mass function is the discrete analog of the density function of a continuous random variable.
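As a small illustration (a sketch added here, with a fair six-sided die as the assumed example), a probability mass function assigns a probability to each outcome, and these probabilities sum to 1:

```python
from fractions import Fraction

def pmf_die(x):
    # Mass function of a fair six-sided die:
    # P(X = x) = 1/6 for x in {1, ..., 6}, and 0 otherwise
    return Fraction(1, 6) if x in range(1, 7) else Fraction(0)

# Each outcome has probability 1/6
print(pmf_die(3))  # 1/6

# The total probability over all outcomes is 1
print(sum(pmf_die(x) for x in range(1, 7)))  # 1
```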


Entropy in information theory (also called Shannon entropy) is a generalization of thermodynamic entropy (Boltzmann entropy).

The entropy $H$ of a discrete random variable $X$ with possible values $\{ x_{1}, x_{2}, \dots, x_{n} \}$ and probability mass function $P$ is defined by:

H(X) = E(-\log(P(X)))

More explicitly:

H(X) = - \sum_{i=1}^{n} P(x_{i}) \log (P(x_{i}))

Cross Entropy