- Source: Tversky index
The Tversky index, named after Amos Tversky, is an asymmetric similarity measure on sets that compares a variant to a prototype. The Tversky index can be seen as a generalization of the Sørensen–Dice coefficient and the Jaccard index.
For sets X and Y the Tversky index is a number between 0 and 1 given by
$$S(X,Y)=\frac{|X\cap Y|}{|X\cap Y|+\alpha |X\setminus Y|+\beta |Y\setminus X|}$$
Here, $X\setminus Y$ denotes the relative complement of Y in X.
Further, $\alpha ,\beta \geq 0$ are parameters of the Tversky index. Setting $\alpha =\beta =1$ produces the Jaccard index; setting $\alpha =\beta =0.5$ produces the Sørensen–Dice coefficient.
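As a minimal sketch of the definition, the following Python function computes the Tversky index for finite sets and checks the two special cases numerically. The function name `tversky`, the example sets, and the choice to return 0.0 when the denominator is zero are illustrative assumptions, not part of the definition.

```python
def tversky(x: set, y: set, alpha: float, beta: float) -> float:
    """Tversky index S(X, Y) for finite sets, with alpha, beta >= 0."""
    if alpha < 0 or beta < 0:
        raise ValueError("alpha and beta must be non-negative")
    common = len(x & y)   # |X ∩ Y|
    only_x = len(x - y)   # |X \ Y|
    only_y = len(y - x)   # |Y \ X|
    denom = common + alpha * only_x + beta * only_y
    # Assumption: the formula is undefined for a zero denominator;
    # this sketch returns 0.0 in that case.
    return common / denom if denom else 0.0


X = {1, 2, 3, 4}
Y = {3, 4, 5}

# alpha = beta = 1 reduces to the Jaccard index |X ∩ Y| / |X ∪ Y|.
jaccard = tversky(X, Y, 1.0, 1.0)
# alpha = beta = 0.5 reduces to the Sørensen–Dice coefficient 2|X ∩ Y| / (|X| + |Y|).
dice = tversky(X, Y, 0.5, 0.5)

print(jaccard, len(X & Y) / len(X | Y))
print(dice, 2 * len(X & Y) / (len(X) + len(Y)))
```

For these example sets the intersection has 2 elements and the two set differences have 2 and 1 elements, so the Jaccard case evaluates to 2/5 and the Dice case to 4/7.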
If we consider X to be the prototype and Y to be the variant, then $\alpha$ corresponds to the weight of the prototype and $\beta$ corresponds to the weight of the variant. Tversky measures with $\alpha +\beta =1$ are of special interest.
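To illustrate how unequal weights make the index depend on argument order, here is a small sketch using $\alpha = 1$, $\beta = 0$ (one choice with $\alpha + \beta = 1$), for which the formula reduces to $|X\cap Y|/|X|$. The helper `tversky`, the example sets, and the zero-denominator convention are assumptions made for illustration.

```python
def tversky(x: set, y: set, alpha: float, beta: float) -> float:
    """Tversky index S(X, Y); x is treated as the prototype, y as the variant."""
    common = len(x & y)
    denom = common + alpha * len(x - y) + beta * len(y - x)
    # Assumption: return 0.0 when the denominator is zero.
    return common / denom if denom else 0.0


prototype = {"a", "b", "c", "d"}
variant = {"c", "d", "e"}

# With alpha = 1 and beta = 0, only elements unique to the first argument are penalized.
forward = tversky(prototype, variant, alpha=1.0, beta=0.0)   # 2 / (2 + 2)
backward = tversky(variant, prototype, alpha=1.0, beta=0.0)  # 2 / (2 + 1)

print(forward, backward)
```

Swapping the arguments changes the result from 1/2 to 2/3, because the set whose unique elements are weighted changes.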
Because of the inherent asymmetry, the Tversky index does not meet the criteria for a similarity metric. However, if symmetry is needed, a variant of the original formulation has been proposed that uses max and min functions:
$$S(X,Y)=\frac{|X\cap Y|}{|X\cap Y|+\beta \left(\alpha a+(1-\alpha )b\right)}$$
where
$$a=\min \left(|X\setminus Y|,|Y\setminus X|\right),\qquad b=\max \left(|X\setminus Y|,|Y\setminus X|\right).$$
This formulation also re-arranges the parameters $\alpha$ and $\beta$. Thus, $\alpha$ controls the balance between $|X\setminus Y|$ and $|Y\setminus X|$ in the denominator. Similarly, $\beta$ controls the effect of the symmetric difference $|X\,\triangle \,Y|$ versus $|X\cap Y|$ in the denominator.
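The symmetric variant can be sketched in Python as follows. The function name `symmetric_tversky`, the example sets, and the zero-denominator convention are illustrative assumptions. The example also checks a consequence of the formula: with $\alpha = 0.5$ the weighted term equals half the size of the symmetric difference, so $\beta = 1$ gives the Sørensen–Dice coefficient and $\beta = 2$ gives the Jaccard index.

```python
def symmetric_tversky(x: set, y: set, alpha: float, beta: float) -> float:
    """Symmetric Tversky variant using min and max of the two set differences."""
    common = len(x & y)
    only_x = len(x - y)
    only_y = len(y - x)
    a = min(only_x, only_y)
    b = max(only_x, only_y)
    denom = common + beta * (alpha * a + (1 - alpha) * b)
    # Assumption: return 0.0 when the denominator is zero.
    return common / denom if denom else 0.0


X = {1, 2, 3, 4}
Y = {3, 4, 5}

# Argument order no longer matters, because min and max are symmetric.
s_xy = symmetric_tversky(X, Y, alpha=0.2, beta=1.0)
s_yx = symmetric_tversky(Y, X, alpha=0.2, beta=1.0)

# alpha = 0.5 weights both differences equally.
dice_like = symmetric_tversky(X, Y, alpha=0.5, beta=1.0)     # 2 / (2 + 1.5)
jaccard_like = symmetric_tversky(X, Y, alpha=0.5, beta=2.0)  # 2 / (2 + 3)

print(s_xy == s_yx, dice_like, jaccard_like)
```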