Search Results for “mixture distribution”

Source: Mixture distribution

In probability and statistics, a mixture distribution is the probability distribution of a random variable that is derived from a collection of other random variables as follows: first, a random variable is selected by chance from the collection according to given probabilities of selection, and then the value of the selected random variable is realized. The underlying random variables may be random real numbers, or they may be random vectors (each having the same dimension), in which case the mixture distribution is a multivariate distribution.
In cases where each of the underlying random variables is continuous, the outcome variable will also be continuous and its probability density function is sometimes referred to as a mixture density. The cumulative distribution function (and the probability density function if it exists) can be expressed as a convex combination (i.e. a weighted sum, with non-negative weights that sum to 1) of other distribution functions and density functions. The individual distributions that are combined to form the mixture distribution are called the mixture components, and the probabilities (or weights) associated with each component are called the mixture weights. The number of components in a mixture distribution is often restricted to being finite, although in some cases the components may be countably infinite in number. More general cases (i.e. an uncountable set of component distributions), as well as the countable case, are treated under the title of compound distributions.
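The two-stage definition and the convex-combination form of the density can be sketched in a few lines of Python. This is a minimal illustration, assuming univariate normal components; the weights, the component parameters and the function names are hypothetical and were chosen only for the example.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of a univariate normal distribution N(mu, sigma^2)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, weights, components):
    """Convex combination of the component densities: sum_i w_i * f_i(x)."""
    return sum(w * normal_pdf(x, mu, sigma)
               for w, (mu, sigma) in zip(weights, components))


def sample_mixture(weights, components, rng):
    """Two-stage draw: select a component by weight, then realize its value."""
    mu, sigma = rng.choices(components, weights=weights, k=1)[0]
    return rng.gauss(mu, sigma)


# Hypothetical example: mixture weights and (mean, standard deviation) pairs.
weights = [0.3, 0.7]
components = [(-2.0, 1.0), (3.0, 0.5)]

rng = random.Random(0)  # fixed seed so the run is reproducible
draws = [sample_mixture(weights, components, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)

# The mean of a mixture is the weighted average of the component means.
mixture_mean = sum(w * mu for w, (mu, _) in zip(weights, components))
```

With enough draws, `sample_mean` should be close to `mixture_mean`, and `mixture_pdf` integrates to 1 because the weights are non-negative and sum to 1.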
A distinction needs to be made between a random variable whose distribution function or density is a weighted sum of a set of components (i.e. a mixture distribution) and a random variable whose value is the sum of the values of two or more underlying random variables, in which case, if the variables are independent, the distribution is given by the convolution operator. As an example, the sum of two jointly normally distributed random variables, each with different means, will still have a normal distribution. On the other hand, a mixture density created as a mixture of two normal distributions with different means will have two peaks provided that the two means are far enough apart, showing that this distribution is radically different from a normal distribution.
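The contrast between a sum and a mixture can be checked numerically. The sketch below assumes two independent normal variables with a common standard deviation (independence is an assumption added here so that the variances add); the parameter values and helper names are hypothetical.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a univariate normal distribution N(mu, sigma^2)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def count_local_maxima(density, lo, hi, steps=4000):
    """Count interior local maxima of `density` on an evenly spaced grid."""
    ys = [density(lo + (hi - lo) * i / steps) for i in range(steps + 1)]
    return sum(1 for i in range(1, steps) if ys[i - 1] < ys[i] >= ys[i + 1])


mu1, mu2, sigma = -3.0, 3.0, 1.0  # hypothetical parameters


def sum_pdf(x):
    """Density of X1 + X2 for independent normals: means and variances add."""
    return normal_pdf(x, mu1 + mu2, math.sqrt(2.0) * sigma)


def mix_pdf(x):
    """Density of the 50/50 mixture: weighted sum of the two densities."""
    return 0.5 * normal_pdf(x, mu1, sigma) + 0.5 * normal_pdf(x, mu2, sigma)


peaks_sum = count_local_maxima(sum_pdf, -12.0, 12.0)
peaks_mix = count_local_maxima(mix_pdf, -12.0, 12.0)
```

For these well-separated means, the density of the sum has a single peak while the mixture density has two.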
Mixture distributions arise in many contexts in the literature, most naturally where a statistical population contains two or more subpopulations. They are also sometimes used as a means of representing non-normal distributions. Data analysis concerning statistical models involving mixture distributions is discussed under the title of mixture models, while the present article concentrates on simple probabilistic and statistical properties of mixture distributions and how these relate to properties of the underlying distributions.

Finite and countable mixtures

Uncountable mixtures

Mixtures within a parametric family

Properties

= Convexity

= Moments

= Modes

showing a one-to-one correspondence of modes of the mixture and those on the ridge elevation function $h(\alpha) = q(x^{*}(\alpha))$; thus one may identify the modes by solving $\frac{dh(\alpha)}{d\alpha} = 0$ with respect to $\alpha$ and determining the value $x^{*}(\alpha)$.
Using graphical tools, the potential multi-modality of mixtures with number of components $n \in \{2, 3\}$ is demonstrated; in particular it is shown that the number of modes may exceed $n$ and that the modes may not be coincident with the component means. For two components, a graphical tool for analysis is developed by instead solving the aforementioned equation with respect to the first mixing weight $w_{1}$ (which also determines the second mixing weight through $w_{2} = 1 - w_{1}$) and expressing the solutions as a function $\Pi(\alpha),\ \alpha \in [0,1]$, so that the number and location of modes for a given value of $w_{1}$ correspond to the intersections of the graph with the line $\Pi(\alpha) = w_{1}$. This in turn can be related to the number of oscillations of the graph and therefore to solutions of $\frac{d\Pi(\alpha)}{d\alpha} = 0$, leading to an explicit solution for the case of a two-component mixture with $\Sigma_{1} = \Sigma_{2} = \Sigma$ (sometimes called a homoscedastic mixture) given by

$$1 - \alpha(1 - \alpha)\, d_{M}(\mu_{1}, \mu_{2}, \Sigma)^{2}$$

where

$$d_{M}(\mu_{1}, \mu_{2}, \Sigma) = \sqrt{(\mu_{2} - \mu_{1})^{T} \Sigma^{-1} (\mu_{2} - \mu_{1})}$$

is the Mahalanobis distance between $\mu_{1}$ and $\mu_{2}$.
Since the above is quadratic it follows that in this instance there are at most two modes irrespective of the dimension or the weights.
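As a rough numerical companion to the quadratic above, the following sketch computes the Mahalanobis distance and the real roots in $\alpha$ of $1 - \alpha(1 - \alpha) d_M^2 = 0$. Reading the expression as an equation set to zero is an interpretation made here, and the restriction to a diagonal shared covariance is an assumption adopted only to avoid a matrix inverse; all function names and parameter values are hypothetical.

```python
import math


def mahalanobis_diag(mu1, mu2, variances):
    """Mahalanobis distance for a shared diagonal covariance (assumption):
    sqrt(sum_k (mu2_k - mu1_k)^2 / var_k)."""
    return math.sqrt(sum((b - a) ** 2 / v
                         for a, b, v in zip(mu1, mu2, variances)))


def quadratic_roots(d_m):
    """Real roots of 1 - alpha * (1 - alpha) * d_m**2 = 0, in ascending order.

    Rearranged, alpha = (1 +/- sqrt(1 - 4 / d_m**2)) / 2, so real roots
    exist only when d_m >= 2, and they then lie in [0, 1].
    """
    if d_m == 0.0:
        return []
    disc = 1.0 - 4.0 / d_m ** 2
    if disc < 0.0:
        return []  # the quadratic stays positive for every alpha
    root = math.sqrt(disc)
    return sorted({(1.0 - root) / 2.0, (1.0 + root) / 2.0})


# Hypothetical two-dimensional example with variances 1 and 4.
d_far = mahalanobis_diag((0.0, 0.0), (3.0, 4.0), (1.0, 4.0))   # sqrt(13)
d_near = mahalanobis_diag((0.0, 0.0), (1.0, 0.0), (1.0, 4.0))  # 1.0

roots_far = quadratic_roots(d_far)
roots_near = quadratic_roots(d_near)
```

Because the expression is quadratic in $\alpha$, there are never more than two roots, whatever the dimension; in this sketch the nearby means give none and the distant means give two.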
For normal mixtures with general $n > 2$ and $D > 1$, a lower bound for the maximum number of possible modes and, conditionally on the assumption that the maximum number is finite, an upper bound are known. For those combinations of $n$ and $D$ for which the maximum number is known, it matches the lower bound.

Examples

= Two normal distributions

Simple examples can be given by a mixture of two normal distributions. (See Multimodal distribution#Mixture of two normal distributions for more details.)
Given an equal (50/50) mixture of two normal distributions with the same standard deviation and different means (homoscedastic), the overall distribution will exhibit low kurtosis relative to a single normal distribution: the means of the subpopulations fall on the shoulders of the overall distribution. If the means are sufficiently separated, namely by more than twice the (common) standard deviation, so that $\left|\mu_{1} - \mu_{2}\right| > 2\sigma$, these form a bimodal distribution; otherwise the mixture simply has a wide peak. The variance of the overall population will also be greater than the variance of the two subpopulations (due to spread from different means), and the mixture thus exhibits overdispersion relative to a normal distribution with fixed standard deviation $\sigma$, though it will not be overdispersed relative to a normal distribution with variance equal to the variance of the overall population.
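The $2\sigma$ threshold and the inflated variance can be checked with a short grid search. This is a sketch for the equal-weight, equal-variance case only; the function names and the two test separations are hypothetical.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a univariate normal distribution N(mu, sigma^2)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def equal_mixture_pdf(x, mu1, mu2, sigma):
    """Density of the 50/50 mixture of N(mu1, sigma^2) and N(mu2, sigma^2)."""
    return 0.5 * normal_pdf(x, mu1, sigma) + 0.5 * normal_pdf(x, mu2, sigma)


def count_modes(mu1, mu2, sigma, steps=20000):
    """Count local maxima of the mixture density on an evenly spaced grid."""
    lo = min(mu1, mu2) - 6.0 * sigma
    hi = max(mu1, mu2) + 6.0 * sigma
    ys = [equal_mixture_pdf(lo + (hi - lo) * i / steps, mu1, mu2, sigma)
          for i in range(steps + 1)]
    return sum(1 for i in range(1, steps) if ys[i - 1] < ys[i] >= ys[i + 1])


def equal_mixture_variance(mu1, mu2, sigma):
    """Variance of the 50/50 mixture: sigma^2 + ((mu1 - mu2) / 2)^2."""
    return sigma ** 2 + ((mu1 - mu2) / 2.0) ** 2


sigma = 1.0
modes_close = count_modes(-0.9, 0.9, sigma)  # separation 1.8 < 2 * sigma
modes_far = count_modes(-1.5, 1.5, sigma)    # separation 3.0 > 2 * sigma
var_far = equal_mixture_variance(-1.5, 1.5, sigma)
```

The closer pair yields a single wide peak and the wider pair yields two modes, while the mixture variance exceeds $\sigma^2$ by the squared half-distance between the means.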
Alternatively, given two subpopulations with the same mean and different standard deviations, the overall population will exhibit high kurtosis, with a sharper peak and heavier tails (and correspondingly shallower shoulders) than a single distribution.
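The high kurtosis in the equal-mean case can be verified directly. The derivation below is a sketch for a two-component normal mixture with common mean $\mu$, weight $w \in (0, 1)$ and standard deviations $\sigma_1, \sigma_2$ (symbols introduced here for illustration); it uses the fact that a normal distribution with variance $\sigma^2$ has fourth central moment $3\sigma^4$.

```latex
% Mixture  w N(mu, sigma_1^2) + (1 - w) N(mu, sigma_2^2),  0 < w < 1.
\begin{align*}
\operatorname{Var}(X) &= w\sigma_1^2 + (1 - w)\sigma_2^2, \\
\operatorname{E}\!\left[(X - \mu)^4\right] &= 3\left(w\sigma_1^4 + (1 - w)\sigma_2^4\right), \\
\operatorname{Kurt}(X) &= \frac{3\left(w\sigma_1^4 + (1 - w)\sigma_2^4\right)}
                               {\left(w\sigma_1^2 + (1 - w)\sigma_2^2\right)^2} \;\ge\; 3 .
\end{align*}
```

The inequality follows from Jensen's inequality applied to the squares of the variances, with equality only when $\sigma_1 = \sigma_2$, so any genuine mixture of this kind has kurtosis above the normal value of 3.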

= A normal and a Cauchy distribution

The following example is adapted from Hampel, who credits John Tukey.
Consider the mixture distribution defined by

$F(x) = (1 - 10^{-10})\,(\text{standard normal}) + 10^{-10}\,(\text{standard Cauchy})$.
The mean of i.i.d. observations from F(x) behaves "normally" except for exorbitantly large samples, although the mean of F(x) does not even exist.
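One way to see why only exorbitantly large samples reveal the Cauchy component is to compute the chance that a sample contains any Cauchy draw at all. The sketch below relies on the two-stage view of a mixture, in which each observation comes from the Cauchy component with probability $10^{-10}$; the function name and the sample sizes are hypothetical.

```python
import math

EPS = 1e-10  # Cauchy mixing weight from the definition of F(x)


def prob_any_cauchy(n, eps=EPS):
    """P(at least one of n i.i.d. draws comes from the Cauchy component).

    Equals 1 - (1 - eps)^n, evaluated with log1p/expm1 so that the tiny
    weight is not lost to floating-point rounding.
    """
    return -math.expm1(n * math.log1p(-eps))


for n in (10**3, 10**6, 10**9, 10**10, 10**11):
    print(f"n = {n:>15,d}   P(any Cauchy draw) = {prob_any_cauchy(n):.3e}")
```

For a million observations this probability is roughly $10^{-4}$, and it only reaches about 0.63 at $n = 10^{10}$, so samples of ordinary size almost always consist purely of standard normal draws.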

Applications

Mixture densities are complicated densities expressible in terms of simpler densities (the mixture components). They are used both because they provide a good model for certain data sets (where different subsets of the data exhibit different characteristics and can best be modeled separately) and because they can be more mathematically tractable, since the individual mixture components can be more easily studied than the overall mixture density.
Mixture densities can be used to model a statistical population with subpopulations, where the mixture components are the densities on the subpopulations, and the weights are the proportions of each subpopulation in the overall population.
Mixture densities can also be used to model experimental error or contamination – one assumes that most of the samples measure the desired phenomenon, with some samples from a different, erroneous distribution.
Parametric statistics that assume no error often fail on such mixture densities – for example, statistics that assume normality often fail disastrously in the presence of even a few outliers – and instead one uses robust statistics.
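The sensitivity of a non-robust statistic to contamination can be illustrated with a toy data set. The readings below are made up for the example, with one value replaced by a gross error to stand in for a draw from an erroneous component; the sample median is used here as one simple robust alternative to the mean.

```python
import statistics

# Hypothetical measurements of a quantity whose true value is about 10.
clean = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 9.9]

# Same data with the last reading replaced by a gross error (contamination).
contaminated = clean[:-1] + [1000.0]

mean_clean = statistics.mean(clean)
mean_bad = statistics.mean(contaminated)
median_clean = statistics.median(clean)
median_bad = statistics.median(contaminated)

print(f"mean:   {mean_clean:.2f} -> {mean_bad:.2f}")
print(f"median: {median_clean:.2f} -> {median_bad:.2f}")
```

A single outlier moves the mean far from the bulk of the data, while the median shifts only slightly.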
In meta-analysis of separate studies, study heterogeneity causes the distribution of results to be a mixture distribution, and leads to overdispersion of results relative to predicted error. For example, in a statistical survey, the margin of error (determined by sample size) predicts the sampling error and hence the dispersion of results on repeated surveys. The presence of study heterogeneity (studies have different sampling bias) increases the dispersion relative to the margin of error.

See also

= Mixture

Mixture (probability)
Mixture model

= Hierarchical models

Graphical model
Hierarchical Bayes model

