
INTEX for the semi-automatic annotation of a corpus of anaphors (original title: INTEX pour l’annotation semi-automatique d’un corpus d’anaphores)


Anaphors are a well-known problem in automatic text generation and natural language understanding. Using corpora to study such phenomena could help in developing robust processing techniques. Building these resources, however, is tedious and time-consuming, and partial automation would make the task easier.

In this paper, we show how the INTEX system can be used for this task. In a newspaper corpus (here, Le Monde diplomatique), discursive grammatical anaphors can easily be located through their associated linguistic features. A series of transducers that generate tags for categories and functions can therefore be built; together they form an efficient pre-processing stage, although manual checking remains necessary. The heuristics, which are quick and easy to develop, are specific to the task. The study also shows, however, that discarding non-anaphoric pronouns is not straightforward in the case of non-referential personal pronouns or indefinite pronouns, and that the tagging of grammatical function appears limited without real syntactic processing.
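As a rough illustration of the kind of pre-processing described above, the sketch below tags pronoun candidates in French text and flags likely non-referential uses of "il" for manual checking. It is not INTEX and does not reproduce the authors' transducers: a Python regular expression stands in for a finite-state transducer, and the pronoun list, the impersonal-verb cues, the `<PRO type="...">` tag format, and the function name `annotate` are all assumptions made for this example.

```python
import re

# Hypothetical mini-lexicon: pronoun form -> category tag (illustrative only).
PRONOUNS = {
    "il": "pers", "elle": "pers", "ils": "pers", "elles": "pers",
    "celui-ci": "dem", "celle-ci": "dem", "ceux-ci": "dem", "celles-ci": "dem",
}

# Hypothetical cues: words that, right after "il", suggest an impersonal use
# ("il faut", "il y a", "il s'agit", ...). Real data would need a richer list.
IMPERSONAL_CUES = ("faut", "y a", "s'agit", "semble", "pleut")

# Longest forms first so that "celles-ci" is tried before "celle-ci", etc.
_FORMS = sorted(PRONOUNS, key=len, reverse=True)
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(form) for form in _FORMS) + r")\b",
    re.IGNORECASE,
)


def annotate(text: str) -> str:
    """Wrap each pronoun candidate in a <PRO type="..."> tag."""

    def tag(match: re.Match) -> str:
        word = match.group(0)
        label = PRONOUNS[word.lower()]
        # Look at what follows the pronoun to spot impersonal "il".
        following = text[match.end():].lstrip().lower()
        if word.lower() == "il" and following.startswith(IMPERSONAL_CUES):
            label = "impers"  # candidate only: still needs manual checking
        return f'<PRO type="{label}">{word}</PRO>'

    return PATTERN.sub(tag, text)


if __name__ == "__main__":
    print(annotate("Le ministre a parlé ; il a refusé."))
    print(annotate("Il faut partir."))
```

A surface heuristic like this mirrors the limits the abstract points out: the cue list cannot reliably separate referential from non-referential pronouns, and nothing here assigns a grammatical function, which would require actual syntactic analysis.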

Document Type: Research Article

DOI: http://dx.doi.org/10.1075/li.22.11tut

Affiliations: Équipe CRISTAL-GRESEC, Université Stendhal - Grenoble 3

Publication date: October 1, 2000
