Subjective Evaluation and Comparison of the Speech Quality of Text-to-Speech Systems for the German Language

Original title: Auditive Bestimmung und Vergleich der Sprachqualität von Sprachsynthesesystemen für die deutsche Sprache

This paper discusses the methodology and results of two extensive listening experiments that evaluated the quality of speech samples from 13 German text-to-speech systems. The experiments were carried out as a joint project of the Institute for Telecommunications of the Technical University of Berlin and the Berlin Research Centre of Deutsche Bundespost Telekom. The first experiment was a multidimensional category rating test standardized by the International Telecommunication Union (ITU). Its results make the broad term "speech quality" more precise by breaking it down into a complementary set of eight quality descriptors. In the second experiment, dictation sessions were conducted to measure the intelligibility of the synthesized speech more precisely. The results of both experiments show that systems based on the TD-PSOLA technique (Time Domain Pitch Synchronous Overlap and Add) produce clearly better speech quality than the others, above all compared with systems based on formant synthesis.

Language: German

Document Type: Research Article

Publication date: January 1, 1997

Publication: Acta Acustica united with Acustica, an international, peer-reviewed journal on acoustics, published together with the European Acoustics Association (EAA)
