Table 1 Mathematical expressions for several similarity and dissimilarity measures

From: Estimating similarity and distance using FracMinHash

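| Metric name | Notation | Expression |
| --- | --- | --- |
| Jaccard similarity | \(J(A,B)\) | \(\frac{\lvert A \cap B \rvert}{\lvert A \cup B \rvert}\) |
| Containment index | \(C(A,B)\) | \(\frac{\lvert A \cap B \rvert}{\lvert A \rvert}\) |
| Cosine similarity (also known as Otsuka-Ochiai) | \(\cos \theta\) | \(\frac{\lvert A \cap B \rvert}{\sqrt{\lvert A \rvert \cdot \lvert B \rvert}}\) |
| Kulczynski 1 | \(K_1(A,B)\) | \(\frac{\lvert A \cap B \rvert}{\lvert A \Delta B \rvert}\) |
| Kulczynski 2 | \(K_2(A,B)\) | \(\frac{1}{2} \Big ( \frac{\lvert A \cap B \rvert}{\lvert A \rvert} + \frac{\lvert A \cap B \rvert}{\lvert B \rvert} \Big )\) |
| Whittaker distance | \(W(A,B)\) | \(1 - \frac{1}{2} \Big ( \frac{\lvert A \cap B \rvert}{\lvert A \rvert} + \frac{\lvert A \cap B \rvert}{\lvert B \rvert} \Big )\) |
| Sorensen index | \(S(A,B)\) | \(2 \frac{\lvert A \cap B \rvert}{\lvert A \rvert + \lvert B \rvert}\) |
| Bray-Curtis dissimilarity | \(BC(A,B)\) | \(1 - 2 \frac{\lvert A \cap B \rvert}{\lvert A \rvert + \lvert B \rvert}\) |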
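
As an illustration only, the expressions in Table 1 can be evaluated directly on small finite sets. The sketch below is not taken from the article: the function name `similarity_measures`, the dictionary keys, and the example sets are hypothetical. It assumes both sets are non-empty, and it returns infinity for Kulczynski 1 when the symmetric difference is empty, which is a convention chosen here rather than something the table specifies.

```python
from math import sqrt


def similarity_measures(a: set, b: set) -> dict:
    """Evaluate each Table 1 expression for two non-empty finite sets.

    Hypothetical helper for illustration; assumes len(a) > 0 and len(b) > 0.
    """
    inter = len(a & b)      # |A ∩ B|
    union = len(a | b)      # |A ∪ B|
    sym_diff = len(a ^ b)   # |A Δ B|

    # Kulczynski 2 and the Sorensen index are reused by their complements below.
    k2 = 0.5 * (inter / len(a) + inter / len(b))
    sorensen = 2 * inter / (len(a) + len(b))

    return {
        "jaccard": inter / union,
        "containment": inter / len(a),  # not symmetric in A and B
        "cosine": inter / sqrt(len(a) * len(b)),
        # Assumption: report infinity when A == B, where the ratio is undefined.
        "kulczynski_1": inter / sym_diff if sym_diff else float("inf"),
        "kulczynski_2": k2,
        "whittaker": 1 - k2,
        "sorensen": sorensen,
        "bray_curtis": 1 - sorensen,
    }


# Example sets, chosen arbitrarily: |A| = 4, |B| = 6, |A ∩ B| = 2.
A = {1, 2, 3, 4}
B = {3, 4, 5, 6, 7, 8}
m = similarity_measures(A, B)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

For these example sets the Jaccard similarity is 2/8 = 0.25 while the containment index is 2/4 = 0.5, which shows how the two diverge when the sets differ in size. The Whittaker distance and the Bray-Curtis dissimilarity are simply one minus Kulczynski 2 and one minus the Sorensen index, respectively.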