Search Results for “censoring (statistics)”

Source: Censoring (statistics)

In statistics, censoring is a condition in which the value of a measurement or observation is only partially known.
For example, suppose a study is conducted to measure the impact of a drug on mortality rate. In such a study, it may be known that an individual's age at death is at least 75 years (but may be more). Such a situation could occur if the individual withdrew from the study at age 75, or if the individual is currently alive at the age of 75.
Censoring also occurs when a value occurs outside the range of a measuring instrument. For example, a bathroom scale might only measure up to 140 kg. If a 160 kg individual is weighed using the scale, the observer would only know that the individual's weight is at least 140 kg.
The problem of censored data, in which the observed value of some variable is partially known, is related to the problem of missing data, where the observed value of some variable is unknown.
Censoring should not be confused with the related idea of truncation. With censoring, every observation is recorded, either as an exact value or as the knowledge that the value lies within an interval. With truncation, observations never result in values outside a given range: values in the population outside the range are never seen, or are never recorded if they are seen. Note that in statistics, truncation is not the same as rounding.
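The distinction can be made concrete with a small Python sketch. The weights below are invented for illustration, and the 140 kg limit simply mirrors the bathroom-scale example; the function names `censor` and `truncate` are hypothetical helpers, not part of any library.

```python
# Hypothetical true weights in kg; the limit mirrors the bathroom-scale example.
LIMIT_KG = 140.0
true_weights = [62.5, 88.0, 140.0, 151.2, 97.3, 160.0]


def censor(values, limit):
    """Right-censor: keep every observation, recording values at or above
    the limit as the limit itself.

    Returns (recorded_value, is_exact) pairs. is_exact is False when all
    that is known is that the true value is at least `limit`.
    """
    return [(min(v, limit), v < limit) for v in values]


def truncate(values, limit):
    """Truncate: observations at or above the limit never enter the data."""
    return [v for v in values if v < limit]


censored = censor(true_weights, LIMIT_KG)
truncated = truncate(true_weights, LIMIT_KG)

print(len(censored), "censored-sample rows")    # all six individuals remain
print(len(truncated), "truncated-sample rows")  # only the three below the limit
```

The censored sample still tells the analyst how many individuals exceeded the limit; the truncated sample carries no trace of them.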

Types

Analysis

= Epidemiology

= Operating life testing

= Censored regression

= Likelihood

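For an observation known only to lie in the interval $(a, b]$ (interval censoring), the contribution to the likelihood is

$$\Pr(a < x \leqslant b) = F(b) - F(a),$$

where $F(x)$ is the CDF of the probability distribution, and the two special cases are: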

left censoring:

$$\Pr(-\infty < x \leqslant b) = F(b) - F(-\infty) = F(b) - 0 = F(b) = \Pr(x \leqslant b)$$

right censoring:

$$\Pr(a < x \leqslant \infty) = F(\infty) - F(a) = 1 - F(a) = 1 - \Pr(x \leqslant a) = \Pr(x > a)$$

For continuous probability distributions:

$$\Pr(a < x \leqslant b) = \Pr(a < x < b)$$
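As an illustrative sketch of these probabilities, the Python snippet below evaluates the difference of CDF values for a standard normal distribution. The choice of distribution and the names `normal_cdf` and `interval_prob` are assumptions made for the example; any CDF could be substituted.

```python
import math


def normal_cdf(x):
    """CDF of the standard normal distribution (an assumed example F)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def interval_prob(cdf, a, b):
    """Pr(a < x <= b) = F(b) - F(a) for an observation known to lie in (a, b]."""
    return cdf(b) - cdf(a)


# Left censoring: a = -infinity, so the contribution reduces to F(b).
left = interval_prob(normal_cdf, -math.inf, 1.0)

# Right censoring: b = +infinity, so the contribution reduces to 1 - F(a).
right = interval_prob(normal_cdf, 1.0, math.inf)

# General interval censoring between two finite bounds.
middle = interval_prob(normal_cdf, -1.0, 1.0)

print(left, right, middle)
```

Passing infinite bounds to the same function recovers both special cases, because the CDF evaluates to 0 at minus infinity and to 1 at plus infinity.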

Example

Suppose we are interested in survival times, $T_1, T_2, \ldots, T_n$, but we don't observe $T_i$ for all $i$. Instead, we observe $(U_i, \delta_i)$, with $U_i = T_i$ and $\delta_i = 1$ if $T_i$ is actually observed, and $(U_i, \delta_i)$, with $U_i < T_i$ and $\delta_i = 0$ if all we know is that $T_i$ is longer than $U_i$.
When $T_i > U_i$, $U_i$ is called the censoring time.
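A minimal sketch of how such pairs arise, assuming each subject has a fixed, known censoring time (for example, the end of the study). The data and the helper name `observe` are hypothetical.

```python
def observe(true_times, censoring_times):
    """Turn true survival times T_i into observed pairs (U_i, delta_i).

    If T_i does not exceed the censoring time, T_i itself is observed
    (delta_i = 1). Otherwise only the censoring time is recorded
    (delta_i = 0), and all that is known is T_i > U_i.
    """
    pairs = []
    for t, c in zip(true_times, censoring_times):
        if t <= c:
            pairs.append((t, 1))
        else:
            pairs.append((c, 0))
    return pairs


# Hypothetical survival times, with every subject followed for 4.0 time units.
true_times = [2.0, 5.5, 1.2, 9.0]
censoring_times = [4.0, 4.0, 4.0, 4.0]

observed = observe(true_times, censoring_times)
print(observed)
```

In real data the true times of censored subjects are of course unknown; they appear here only so the construction of the pairs can be seen.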
If the censoring times are all known constants, then the likelihood is

$$L = \prod_{i,\,\delta_i = 1} f(u_i) \prod_{i,\,\delta_i = 0} S(u_i)$$

where $f(u_i)$ is the probability density function evaluated at $u_i$, and $S(u_i)$ is the probability that $T_i$ is greater than $u_i$, called the survival function.
This can be simplified by defining the hazard function, the instantaneous force of mortality, as

$$\lambda(u) = f(u) / S(u)$$

so $f(u) = \lambda(u) S(u)$.
Then

$$L = \prod_i \lambda(u_i)^{\delta_i} S(u_i).$$
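The equivalence of the two forms of the likelihood can be checked numerically. The sketch below uses a Weibull distribution purely as an assumed example with a non-constant hazard; the parameter values, the data, and all function names are hypothetical.

```python
import math

# Assumed Weibull parameters, chosen only for illustration.
SHAPE, SCALE = 1.5, 3.0


def density(u):
    """Weibull probability density f(u)."""
    z = u / SCALE
    return (SHAPE / SCALE) * z ** (SHAPE - 1.0) * math.exp(-(z ** SHAPE))


def survival(u):
    """Weibull survival function S(u) = Pr(T > u)."""
    return math.exp(-((u / SCALE) ** SHAPE))


def hazard(u):
    """Hazard function lambda(u) = f(u) / S(u)."""
    return density(u) / survival(u)


def loglik_split(data):
    """log L with f(u_i) for observed times and S(u_i) for censored times."""
    return sum(
        math.log(density(u)) if d == 1 else math.log(survival(u))
        for u, d in data
    )


def loglik_hazard(data):
    """log L written as sum of delta_i * log lambda(u_i) + log S(u_i)."""
    return sum(d * math.log(hazard(u)) + math.log(survival(u)) for u, d in data)


data = [(2.0, 1), (4.0, 0), (1.2, 1), (4.0, 0)]  # hypothetical (u_i, delta_i)
print(loglik_split(data), loglik_hazard(data))
```

Working on the log scale avoids underflow when the product runs over many observations.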
For the exponential distribution, this becomes even simpler, because the hazard rate, $\lambda$, is constant, and $S(u) = \exp(-\lambda u)$. Then:

$$L(\lambda) = \lambda^{k} \exp\left(-\lambda \sum u_i\right),$$

where $k = \sum \delta_i$.
From this we easily compute $\hat{\lambda}$, the maximum likelihood estimate (MLE) of $\lambda$, as follows:

$$l(\lambda) = \log(L(\lambda)) = k \log(\lambda) - \lambda \sum u_i.$$

Then

$$dl/d\lambda = k/\lambda - \sum u_i.$$

We set this to 0 and solve for $\lambda$ to get:

$$\hat{\lambda} = k / \sum u_i.$$
Equivalently, the mean time to failure is:

$$1/\hat{\lambda} = \sum u_i / k.$$

This differs from the standard MLE for the exponential distribution in that censored observations are counted only in the numerator: they add to the total observed time $\sum u_i$ but not to the number of failures $k$.
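The estimate can be computed in a few lines of Python. This is a sketch on made-up data; `exponential_mle` is a hypothetical helper name, and the sample reuses the convention that each observation is a pair (u_i, delta_i).

```python
def exponential_mle(data):
    """MLE of the exponential rate from (u_i, delta_i) pairs.

    k counts the uncensored observations (delta_i = 1); every observation,
    censored or not, contributes its time u_i to the total.
    """
    k = sum(d for _, d in data)
    total_time = sum(u for u, _ in data)
    if k == 0:
        raise ValueError("no uncensored observations: the MLE is not defined")
    return k / total_time


# Hypothetical sample: two observed failures, two subjects censored at 4.0.
data = [(2.0, 1), (4.0, 0), (1.2, 1), (4.0, 0)]

rate = exponential_mle(data)
mean_time_to_failure = 1.0 / rate

# For comparison: treating censored times as if they were failures
# divides by n instead of k and understates the mean time to failure.
naive_mean = sum(u for u, _ in data) / len(data)

print(rate, mean_time_to_failure, naive_mean)
```

With this sample the total observed time is 11.2 and k = 2, so the estimated mean time to failure is 5.6, twice the naive average of 2.8 obtained by ignoring the censoring indicators.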

External links

"Engineering Statistics Handbook", NIST/SEMATECH
