Automatic classification of multi-word expressions in print dictionaries

Summary: This work demonstrates the assignment of multi-word expressions in print dictionaries to POS classes with minimal linguistic resources. In this application, 32,000 entries from the Wörterbuch der deutschen Idiomatik (H. Schemann 1993) were classified using an inductive description of POS sequences in conjunction with a Brill Tagger trained on manually tagged idiomatic entries. This process assigned categories to 86% of entries with 88% accuracy. This classification supplies a meaningful preprocessing step for further applications: the resulting POS sequences for all idiomatic entries might be used for the automatic recognition of multi-word lexemes in unrestricted text.
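
The classification step described in the summary can be illustrated with a minimal sketch. It assumes the entries have already been POS-tagged (for example by a Brill tagger) and maps each tag sequence to a phrase class using a small pattern table. The STTS-style tag names, the three patterns, and the function names `classify` and `coverage` are hypothetical and chosen for illustration only; the summary does not specify the tagset or the pattern inventory actually used.

```python
import re

# Hypothetical pattern table: each regex runs over a space-separated POS tag
# sequence and maps it to a phrase class. Tag names are STTS-style labels;
# the real inventory of sequences is not given in the summary.
PATTERNS = [
    (re.compile(r"^(ART )?(ADJA )*NN$"), "NP"),        # optional article, adjectives, noun
    (re.compile(r"^APPR (ART )?(ADJA )*NN$"), "PP"),   # preposition followed by a noun phrase
    (re.compile(r"^.*VVINF$"), "VP"),                  # anything ending in an infinitive verb
]


def classify(tags):
    """Return the class of the first pattern matching the tag sequence, or None."""
    seq = " ".join(tags)
    for pattern, label in PATTERNS:
        if pattern.match(seq):
            return label
    return None


def coverage(entries):
    """Classify every entry and report the share that received a class."""
    labels = [classify(tags) for tags in entries]
    assigned = sum(1 for label in labels if label is not None)
    return labels, assigned / len(entries)


if __name__ == "__main__":
    # Toy tag sequences standing in for tagged dictionary entries.
    entries = [
        ["ART", "ADJA", "NN"],
        ["APPR", "ART", "NN"],
        ["ART", "NN", "VVINF"],
        ["KON"],
    ]
    labels, share = coverage(entries)
    print(labels, share)
```

Entries whose tag sequence matches no pattern stay unclassified, which is one way a coverage figure below 100% can arise. Measuring accuracy would additionally require a manually classified reference set, which this sketch does not include.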

Document Type: Research Article

Affiliations: 1: Berlin-Brandenburg Academy of Sciences 2: California Institute of Technology

Publication date: 2003-01-01
