A survey of named entity recognition and classification

Authors: Nadeau, David; Sekine, Satoshi

Source: Lingvisticae Investigationes, Volume 30, Number 1, 2007, pp. 3-26 (24 pages)

Publisher: John Benjamins Publishing Company

DOI: https://doi.org/10.1075/li.30.1.03nad

This survey covers fifteen years of research in the Named Entity Recognition and Classification (NERC) field, from 1991 to 2006. We report observations about languages, named entity types, domains and textual genres studied in the literature. From the start, NERC systems have been developed using hand-made rules, but now machine learning techniques are widely used. These techniques are surveyed along with other critical aspects of NERC such as features and evaluation methods. Features are word-level, dictionary-level and corpus-level representations of words in a document. Evaluation techniques, ranging from intuitive exact match to very complex matching techniques with adjustable cost of errors, are an indisputable key to progress.

Keywords: EVALUATION; FEATURE SPACE; LEARNING METHOD; NAMED ENTITY; SURVEY

Document Type: Research Article

Publication date: 01 January 2007
