- Source: Fujisaki model
The Fujisaki model is a superpositional model for representing the fundamental frequency (F0) contour of speech.
According to the model, the F0 contour is generated by superposing the outputs of two second-order linear filters on a base frequency value. One filter generates the phrase component of speech and the other generates the accent component. The base frequency is the minimum frequency value of the speaker. In other words, the logarithm of the F0 contour is obtained by adding the logarithm of the base frequency, the phrase components and the accent components. The model was proposed by Hiroya Fujisaki.
$$\ln F_{0}(t) = \ln F_{b} + \sum_{i=1}^{I} A_{pi}\, G_{pi}(t - T_{0i}) + \sum_{j=1}^{J} A_{aj} \left\{ G_{aj}(t - T_{1j}) - G_{aj}(t - T_{2j}) \right\}$$
where
$$G_{pi}(t) = \begin{cases} \alpha_{i}^{2}\, t\, \exp(-\alpha_{i} t), & t \geq 0 \\ 0, & t < 0 \end{cases}$$
$$G_{aj}(t) = \begin{cases} \min\left[\gamma_{j},\; 1 - (1 + \beta_{j} t)\, \exp(-\beta_{j} t)\right], & t \geq 0 \\ 0, & t < 0 \end{cases}$$
The symbols are defined as follows:
- $F_{b}$: bias level upon which all the phrase and accent components are superposed to form an $F_{0}$ contour,
- $I$: number of phrase commands,
- $J$: number of accent commands,
- $A_{pi}$: magnitude of the $i$th phrase command,
- $A_{aj}$: amplitude of the $j$th accent command,
- $T_{0i}$: instant of occurrence of the $i$th phrase command,
- $T_{1j}$: onset of the $j$th accent command,
- $T_{2j}$: end of the $j$th accent command,
- $\alpha_{i}$: natural angular frequency of the phrase control mechanism to the $i$th phrase command,
- $\beta_{j}$: natural angular frequency of the accent control mechanism to the $j$th accent command, and
- $\gamma_{j}$: ceiling level of the accent component for the $j$th accent command.
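The equations above can be evaluated directly. The following is a minimal Python sketch, not a reference implementation: the function names (`phrase_response`, `accent_response`, `fujisaki_f0`), the tuple layout of the commands, and all numeric values (base frequency, command timings, magnitudes, and the values chosen for α, β and γ) are assumptions made purely for illustration.

```python
import math


def phrase_response(t, alpha):
    """G_p(t): impulse response of the phrase control mechanism."""
    if t < 0:
        return 0.0
    return alpha ** 2 * t * math.exp(-alpha * t)


def accent_response(t, beta, gamma):
    """G_a(t): step response of the accent control mechanism, capped at gamma."""
    if t < 0:
        return 0.0
    return min(gamma, 1.0 - (1.0 + beta * t) * math.exp(-beta * t))


def fujisaki_f0(t, fb, phrase_cmds, accent_cmds):
    """Return F0 (Hz) at time t (seconds).

    phrase_cmds: list of (A_p, T_0, alpha) tuples (hypothetical layout).
    accent_cmds: list of (A_a, T_1, T_2, beta, gamma) tuples (hypothetical layout).
    """
    # The superposition happens in the log-frequency domain.
    log_f0 = math.log(fb)
    for a_p, t0, alpha in phrase_cmds:
        log_f0 += a_p * phrase_response(t - t0, alpha)
    for a_a, t1, t2, beta, gamma in accent_cmds:
        log_f0 += a_a * (accent_response(t - t1, beta, gamma)
                         - accent_response(t - t2, beta, gamma))
    return math.exp(log_f0)


# Illustrative values only (assumed, not taken from the source).
FB = 80.0                                # base frequency F_b in Hz
PHRASES = [(0.5, 0.0, 3.0)]              # one phrase command at t = 0 s
ACCENTS = [(0.4, 0.3, 0.6, 20.0, 0.9)]   # one accent command from 0.3 s to 0.6 s

# Sample the contour every 10 ms over 2 seconds.
contour = [fujisaki_f0(0.01 * n, FB, PHRASES, ACCENTS) for n in range(200)]
```

Because both components are added to ln(F_b) before exponentiating, positive command magnitudes can only raise the contour above the base frequency, and before any command has occurred the sketch returns F_b itself.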