Griffin-Lim Vocoder

A traditional vocoder based on an iterative algorithm.

Link: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1164317

Goal

Reconstruct a voice waveform from its amplitude spectrum. The phase spectrum is not given, so the task comes down to estimating it:

amplitude spectrum -> phase spectrum
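To see why the phase has to be estimated at all, here is a minimal sketch on a single frame. The toy signal and all variable names are made up for illustration. With both amplitude and phase the frame comes back exactly; with the amplitude alone (phase set to zero) it does not.

```python
import numpy as np

# One frame of a toy signal (hypothetical values, for illustration only).
t = np.arange(256) / 256.0
frame = np.sin(2 * np.pi * 8 * t) + 0.5 * np.sin(2 * np.pi * 20 * t + 1.0)

spectrum = np.fft.rfft(frame)   # complex spectrum of the frame
amplitude = np.abs(spectrum)    # what the vocoder is given
phase = np.angle(spectrum)      # what the vocoder has to estimate

# Amplitude and phase together recover the frame.
rebuilt = np.fft.irfft(amplitude * np.exp(1j * phase), n=len(frame))

# Amplitude alone, with the phase set to zero, gives a different waveform.
zero_phase = np.fft.irfft(amplitude, n=len(frame))

print(np.allclose(rebuilt, frame), np.allclose(zero_phase, frame))
```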

Method

  1. Initialize the phase spectrum randomly.
  2. Synthesize a voice waveform from the amplitude spectrum and the phase spectrum by ISTFT (Inverse Short-Time Fourier Transform).
  3. Apply STFT to the new waveform to get a new amplitude spectrum and a new phase spectrum.
  4. Discard the new amplitude spectrum, keep the new phase spectrum, and go back to step 2. Repeat for a fixed number of iterations.
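
import numpy as np

# S is the amplitude spectrogram; _stft, _istft and hparams come from the surrounding project.
phases = np.exp(2j * np.pi * np.random.rand(*S.shape))  # step 1: random initial phase
S_complex = np.abs(S).astype(complex)  # np.complex was removed in NumPy 1.24
y = _istft(S_complex * phases, hparams)  # step 2
for i in range(hparams.griffin_lim_iters):
    phases = np.exp(1j * np.angle(_stft(y, hparams)))  # steps 3 and 4: keep only the new phase
    y = _istft(S_complex * phases, hparams)  # back to step 2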
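
The snippet above depends on `_stft`, `_istft` and `hparams`, which are not shown here. As a rough self-contained sketch of the same loop, the version below uses only NumPy. The helpers `stft`, `istft`, `griffin_lim` and `spectral_error`, the Hann window, and the sizes `n_fft=512` and `hop=128` are my own assumptions for illustration, not the original project's settings.

```python
import numpy as np

def stft(y, n_fft=512, hop=128):
    """Hann-windowed STFT of a zero-padded signal, shape (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft + 1)[:-1]  # periodic Hann window
    y = np.pad(y, n_fft // 2)  # zero-pad so the edges are covered by several frames
    n_frames = 1 + (len(y) - n_fft) // hop
    frames = np.stack([y[i * hop:i * hop + n_fft] for i in range(n_frames)])
    return np.fft.rfft(frames * window, axis=1)

def istft(S, n_fft=512, hop=128):
    """Windowed overlap-add inverse of stft above, with the padding trimmed."""
    window = np.hanning(n_fft + 1)[:-1]
    frames = np.fft.irfft(S, n=n_fft, axis=1) * window
    length = n_fft + hop * (len(S) - 1)
    y = np.zeros(length)
    norm = np.zeros(length)
    for i, frame in enumerate(frames):
        y[i * hop:i * hop + n_fft] += frame
        norm[i * hop:i * hop + n_fft] += window ** 2
    y /= np.maximum(norm, 1e-8)  # avoid dividing by zero in the padded edges
    return y[n_fft // 2:length - n_fft // 2]

def griffin_lim(magnitude, n_iter=50, n_fft=512, hop=128, seed=0):
    """Estimate a waveform whose STFT amplitude approximates `magnitude`."""
    rng = np.random.default_rng(seed)
    phases = np.exp(2j * np.pi * rng.random(magnitude.shape))  # step 1
    y = istft(magnitude * phases, n_fft, hop)                  # step 2
    for _ in range(n_iter):
        # Steps 3 and 4: re-analyze, keep the phase, discard the amplitude.
        phases = np.exp(1j * np.angle(stft(y, n_fft, hop)))
        y = istft(magnitude * phases, n_fft, hop)
    return y

def spectral_error(y, magnitude, n_fft=512, hop=128):
    """Relative distance between the STFT amplitude of y and the target."""
    diff = np.abs(stft(y, n_fft, hop)) - magnitude
    return np.linalg.norm(diff) / np.linalg.norm(magnitude)

# Toy target: a chirp whose length is a multiple of the hop size.
t = np.arange(8192) / 8000.0
y_true = np.sin(2 * np.pi * (200 * t + 150 * t ** 2))
magnitude = np.abs(stft(y_true))

y_init = griffin_lim(magnitude, n_iter=0)   # random phase only
y_hat = griffin_lim(magnitude, n_iter=50)   # after 50 iterations
print(spectral_error(y_init, magnitude), spectral_error(y_hat, magnitude))
```

The second printed error should be smaller than the first. The reconstructed waveform is generally not sample-for-sample equal to `y_true`, since many different phase spectra can match the same amplitude spectrum closely.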

Why?

The appendix of the paper proves that the error between the amplitude spectrum of the estimated signal and the given amplitude spectrum never increases from one iteration to the next.

I can't follow the proof yet, so I may come back to it later.
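
For reference, the claim itself (not the proof) can be written down compactly. The notation is my own shorthand rather than the paper's exact symbols: $x^i$ is the waveform after iteration $i$, $X^i$ is its STFT, $|Y|$ is the given amplitude spectrum, $m$ indexes frames and $k$ indexes frequency bins.

```latex
D(x^i) = \sum_{m}\sum_{k} \bigl( |X^i(m,k)| - |Y(m,k)| \bigr)^2,
\qquad
D(x^{i+1}) \le D(x^i)
```

So the error never goes up, although this by itself does not guarantee that it reaches zero.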

  • Copyrights © 2021 BakerBunker
