Automatic classification of multi-word expressions in print dictionaries

Abstract:

This work demonstrates the assignment of multi-word expressions in print dictionaries to POS classes with minimal linguistic resources. In this application, 32,000 entries from the Wörterbuch der deutschen Idiomatik (H. Schemann 1993) were classified using an inductive description of POS sequences in conjunction with a Brill tagger trained on manually tagged idiomatic entries. This process assigned categories to 86% of entries with 88% accuracy. The classification supplies a meaningful preprocessing step for further applications: the resulting POS sequences for all idiomatic entries might be used for the automatic recognition of multi-word lexemes in unrestricted text.

Document Type: Research Article

DOI: http://dx.doi.org/10.1075/li.26.2.03gey

Affiliations: 1: Berlin-Brandenburg Academy of Sciences 2: California Institute of Technology

Publication date: January 1, 2003
