A cascaded neuro-computational model for spoken word recognition
In human speech recognition, words are analysed at both the pre-lexical (i.e., sub-word) and lexical (word) levels. This paper proposes a constructive neuro-computational model that incorporates both levels as cascaded layers of pre-lexical and lexical units. The layered structure enables the system to handle the variability of real speech input. Within the model, the receptive fields of the pre-lexical layer consist of radial basis functions; the lexical layer is composed of units that perform pattern matching between their internal templates and a series of labels corresponding to the winning receptive fields in the pre-lexical layer. The model adapts through self-tuning of all units, combined with the formation of a connectivity structure through unsupervised (first layer) and supervised (higher layers) network growth. Simulation studies show that the model achieves spoken word recognition performance comparable to that of a benchmark approach using hidden Markov models, while enabling parallel access to word candidates in lexical decision making.
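The cascade described above can be illustrated with a minimal sketch: a pre-lexical layer of Gaussian radial basis functions whose winning units emit labels, and lexical units that match those label sequences against internal templates. All class names, parameters, and the position-wise matching score below are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

class PreLexicalLayer:
    """Radial basis function receptive fields over acoustic feature frames.
    (Centers and width are assumed parameters for illustration.)"""
    def __init__(self, centers, width=1.0):
        self.centers = np.asarray(centers, dtype=float)  # one center per receptive field
        self.width = width

    def respond(self, frame):
        # Gaussian RBF activation of each receptive field for one input frame
        d2 = np.sum((self.centers - np.asarray(frame, dtype=float)) ** 2, axis=1)
        return np.exp(-d2 / (2.0 * self.width ** 2))

    def winner(self, frame):
        # Label (index) of the maximally active receptive field
        return int(np.argmax(self.respond(frame)))

class LexicalUnit:
    """Matches its internal template against the series of winner labels.
    The fraction-of-matching-positions score is a simple stand-in for the
    paper's pattern matching, which is not specified in the abstract."""
    def __init__(self, word, template):
        self.word = word
        self.template = list(template)

    def match(self, labels):
        n = min(len(labels), len(self.template))
        hits = sum(a == b for a, b in zip(labels[:n], self.template[:n]))
        return hits / max(len(self.template), 1)

def recognize(frames, prelexical, lexicon):
    # Pre-lexical stage: reduce each frame to its winning receptive-field label
    labels = [prelexical.winner(f) for f in frames]
    # Lexical stage: all word candidates are scored at once (parallel access)
    return sorted(((u.match(labels), u.word) for u in lexicon), reverse=True)
```

For example, with three receptive-field centers and two word templates, `recognize` returns every candidate's match score in one pass, so ranking word hypotheses requires no sequential search through the lexicon.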
Keywords: artificial neural networks; automatic speech recognition; human speech recognition; network growing model; spoken word recognition
Document Type: Research Article
Affiliations: 1: Department of Mathematics, College of Science & Technology, Nihon University, Tokyo, Japan; Laboratory for Advanced Brain Signal Processing/Laboratory for Perceptual Dynamics, Brain Science Institute, RIKEN, Saitama, Japan 2: Laboratory for Perceptual Dynamics, Brain Science Institute, RIKEN, Saitama, Japan
Publication date: 01 March 2010