
Development of Modified Analytical Model for Investigating Acceptable Delay of TCP-Based Speech Recognition

Many studies have proposed solutions to overcome the degradation of network speech recognition (NSR) caused by packet loss and jitter. The most popular cloud-based speech recognition systems, such as Google speech recognition and Apple Siri, currently employ TCP together with HTTP. As a reliable transport protocol, TCP delivers all speech data to the server but may incur delays under unexpected network conditions. For NSR to perform satisfactorily despite TCP delay, the delay must stay within an acceptable bound. In this paper, a scheme of TCP-based NSR with a speech segmenter at the client side is proposed, and an analytical model for investigating the acceptable delay is developed on the basis of a study of stored streaming via TCP that employs a discrete-time Markov model. The speech segmenter allows TCP to send the speech signal sentence by sentence, so that the resulting text is recognized using a language model. The acceptable delay is defined as a specified length of time required for the server to receive all the data of a sentence. A negative number of early packets within the acceptable delay bound indicates that the sentence streaming is slow. Our model is validated via ns-3 simulations. Moreover, the model is verified for a distribution of 2500 Indonesian sentences using TANGRAM-II to examine the real-time factor (RTF) of TCP-based speech recognition and to identify its working region. The model indicates that the RTF is not achieved when the loss rate is 0.014 and the round-trip time (RTT) is 100 ms. Streaming over TCP achieves satisfactory performance within an acceptable delay of eight seconds when the loss rate is smaller than 0.05 and the RTT is 100 ms. When the RTT is doubled, the streaming works within an acceptable delay of 17 seconds.
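
The idea of checking a sentence against an acceptable delay bound can be illustrated with a minimal sketch. It does not reproduce the discrete-time Markov model described in the abstract. As a stand-in, it assumes the well-known square-root approximation for steady-state TCP throughput, which ignores slow start and timeouts. The function names, the 200-packet sentence, and the simplified definition of "early packets" (packets deliverable within the bound minus packets the sentence needs) are hypothetical and chosen only for illustration.

```python
import math


def tcp_throughput_pps(rtt_s: float, loss_rate: float) -> float:
    """Rough steady-state TCP throughput in packets per second.

    Uses the square-root approximation sqrt(3 / (2p)) / RTT, which is an
    assumption of this sketch and not the paper's Markov model.
    """
    if rtt_s <= 0 or not 0 < loss_rate < 1:
        raise ValueError("rtt_s must be > 0 and loss_rate must be in (0, 1)")
    return math.sqrt(1.5 / loss_rate) / rtt_s


def early_packets(sentence_packets: int, delay_bound_s: float,
                  rtt_s: float, loss_rate: float) -> float:
    """Packets deliverable within the delay bound minus packets needed.

    A negative result means the sentence cannot arrive in time under this
    approximation, i.e. the sentence streaming is slow.
    """
    deliverable = tcp_throughput_pps(rtt_s, loss_rate) * delay_bound_s
    return deliverable - sentence_packets


if __name__ == "__main__":
    # Hypothetical sentence of 200 packets; all inputs are illustrative.
    margin = early_packets(sentence_packets=200, delay_bound_s=8.0,
                           rtt_s=0.1, loss_rate=0.05)
    print("on time" if margin >= 0 else "slow", round(margin, 1))
```

Because the approximation is much coarser than a Markov model validated against ns-3, its numbers should not be expected to match the working region reported in the abstract; it only shows how the sign of the early-packet margin separates timely from slow sentence delivery.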

Keywords: Acceptable Delay; Network Speech Recognition; Satisfactory Performance; Streaming Over TCP

Document Type: Research Article

Affiliations: 1: Department of Electrical Engineering, Faculty of Engineering, Universitas Indonesia Depok 16424, Indonesia 2: Center for Information and Communication Technology, Agency for the Assessment and Application of Technology PUSPIPTEK, Serpong, Tangerang Selatan 15314, Indonesia 3: Department of Mathematics, Faculty of Mathematics and Natural Sciences, Universitas Indonesia Depok 16424, Indonesia

Publication date: April 1, 2017

More about this publication?
  • ADVANCED SCIENCE LETTERS is an international peer-reviewed journal with very wide-ranging coverage. It consolidates research activities in all areas of (1) Physical Sciences, (2) Biological Sciences, (3) Mathematical Sciences, (4) Engineering, (5) Computer and Information Sciences, and (6) Geosciences, and publishes original short communications, full research papers, and timely brief (mini) reviews, with authors' photos and biographies, encompassing basic and applied research and current developments in educational aspects of these scientific areas.