Traitement des expressions figées avec INTEX (Processing Frozen Expressions with INTEX)
Abstract: The INTEX lexical parser processes linguistic units of four formal types: morphemes, simple words, compound words, and frozen expressions. Frozen expressions are units spelled as non-contiguous sequences of tokens (e.g., take ... into account), and recognizing them requires computation traditionally performed by syntactic parsers. Over 30,000 French frozen expressions have been described in the Cxx tables of the lexicon-grammar. We show how to use these tables to automatically construct finite-state transducers (FSTs) capable of recognizing and tagging frozen expressions in texts. Representing the result as tags poses some formal problems, which we discuss.
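To make the idea of a non-contiguous unit concrete, the following is a minimal sketch in Python of recognizing and tagging "take ... into account" in a token sequence. It is not INTEX's method and does not build an FST: it is a plain left-to-right scan. The entry format (ENTRY), the gap bound (MAX_GAP), the function names, and the FROZEN label are all assumptions made for illustration; none of them come from the lexicon-grammar tables. The sketch matches exact lowercase forms only, so inflected variants such as "took" are not handled.

```python
from typing import List, Optional, Tuple

# Hypothetical entry: the fixed head and tail of one frozen expression.
# This format is an assumption, not the layout of the Cxx tables.
ENTRY = (["take"], ["into", "account"])
MAX_GAP = 4  # assumed upper bound on the number of inserted tokens

Match = Tuple[int, int, int, int]  # (head_start, head_end, tail_start, tail_end)


def find_frozen(tokens: List[str], entry=ENTRY, max_gap: int = MAX_GAP) -> Optional[Match]:
    """Return the spans of the first match of head ... tail, or None."""
    head, tail = entry
    n = len(tokens)
    for i in range(n - len(head) + 1):
        if tokens[i:i + len(head)] != head:
            continue
        head_end = i + len(head)
        # Try every gap size from 0 to max_gap that still leaves room for the tail.
        last_start = min(head_end + max_gap, n - len(tail))
        for j in range(head_end, last_start + 1):
            if tokens[j:j + len(tail)] == tail:
                return i, head_end, j, j + len(tail)
    return None


def tag_tokens(tokens: List[str], match: Match, label: str = "FROZEN"):
    """Pair each token with the label if it belongs to the expression, else None."""
    head_start, head_end, tail_start, tail_end = match
    tagged = []
    for k, tok in enumerate(tokens):
        inside = head_start <= k < head_end or tail_start <= k < tail_end
        tagged.append((tok, label if inside else None))
    return tagged


if __name__ == "__main__":
    sentence = "we take these results into account".split()
    match = find_frozen(sentence)
    if match is not None:
        print(tag_tokens(sentence, match))
```

In the tagged output, the tokens inserted between the head and the tail carry no label even though they sit inside the span of a single lexical unit. This is one simple way to see why representing a discontinuous unit as per-token tags is not straightforward.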
Document Type: Research Article
Affiliations: IBM T.J. Watson Research Center
Publication date: 2000-10-01