Automatic classification of multi-word expressions in print dictionaries

$39.11 plus tax (Refund Policy)

Buy Article:


SummaryThis work demonstrates the assignment of multi-word expressions in print dictionaries to POS classes with minimal linguistic resources. In this application, 32,000 entries from the W├Ârterbuch der deutschen Idiomatik (H. Schemann 1993) were classified using an inductive description of POS sequences in conjunction with a Brill Tagger trained on manually tagged idiomatic entries. This process assigned categories to 86% of entries with 88% accuracy. This classification supplies a meaningful preprocessing step for further applications: the resulting POS-sequences for all idiomatic entries might be used for the automatic recognition of multi-word lexemes in unrestricted text.

Document Type: Research Article


Affiliations: 1: Berlin-Brandenburg Academy of Sciences 2: California Institute of Technology

Publication date: January 1, 2003

Related content



Share Content

Access Key

Free Content
Free content
New Content
New content
Open Access Content
Open access content
Subscribed Content
Subscribed content
Free Trial Content
Free trial content
Cookie Policy
Cookie Policy
ingentaconnect website makes use of cookies so as to keep track of data that you have filled in. I am Happy with this Find out more