Handling conjunctions in named entities

Authors: Mazur, Pawel; Dale, Robert

Source: Lingvisticae Investigationes, Volume 30, Number 1, 2007, pp. 49-68(20)

Publisher: John Benjamins Publishing Company

Abstract:

Although the literature contains reports of very high accuracy figures for the recognition of named entities in text, there are still some named entity phenomena that remain problematic for existing text processing systems. One of these is the ambiguity of conjunctions in candidate named entity strings, an all-too-prevalent problem in corporate and legal documents. In this paper, we distinguish four uses of the conjunction in these strings, and explore the use of a supervised machine learning approach to conjunction disambiguation trained on a very limited set of 'name internal' features that avoids the need for expensive lexical or semantic resources. We achieve 84% correctly classified examples using k-fold evaluation on a data set of 600 instances. We argue that further improvements are likely to require the use of wider domain knowledge and name external features.
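To make the idea of 'name internal' features concrete, the following is a minimal sketch of how a candidate string containing a conjunction might be turned into a feature dictionary using only the string itself. The function name, the feature names, the conjunction list and the company-suffix list are all hypothetical illustrations; the abstract does not specify the feature set the authors used, and this is not their implementation.

```python
# Hypothetical illustration only: the features below are assumptions,
# not the feature set described in the paper.
CONJUNCTIONS = {"and", "&"}
COMPANY_SUFFIXES = {"Ltd", "Inc", "Pty", "Co"}  # assumed, illustrative list


def name_internal_features(candidate: str) -> dict:
    """Describe a candidate string using only tokens inside the string."""
    tokens = candidate.split()
    # Locate the first conjunction token in the candidate string.
    idx = next(
        (i for i, t in enumerate(tokens) if t.lower() in CONJUNCTIONS), None
    )
    if idx is None:
        raise ValueError("candidate contains no conjunction")
    left, right = tokens[:idx], tokens[idx + 1:]
    return {
        "conjunction": tokens[idx].lower(),
        "left_len": len(left),
        "right_len": len(right),
        "left_all_capitalised": all(t[0].isupper() for t in left),
        "right_all_capitalised": all(t[0].isupper() for t in right),
        "ends_with_company_suffix": tokens[-1].rstrip(".") in COMPANY_SUFFIXES,
    }


features = name_internal_features("Barnes & Noble Inc.")
```

A dictionary like this could then be passed to any supervised classifier; no lexicon or text outside the candidate string is consulted, which is the sense in which the features are name internal.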

Keywords: NAMED ENTITY RECOGNITION; CONJUNCTIONS; MACHINE LEARNING

Document Type: Research article

DOI: http://dx.doi.org/10.1075/li.30.1.05maz

Publication date: 2007-01-01
