Hybrid Approaches for Automatic Segmentation and Annotation of a Chinese Text Corpus
Author: Zhiwei F.
Source: International Journal of Corpus Linguistics, Volume 6, Special Issue, 2001, pp. 35-42(8)
Publisher: John Benjamins Publishing Company
Abstract:
This paper describes hybrid approaches to the automatic segmentation and annotation of a Chinese text corpus and reports some experimental results. The hybrid approaches combine a rule-based method, a statistical method, and an automatic learning method. This combination clearly improves the precision of segmentation and annotation of a Chinese text corpus.
Keywords: segmentation; tagging; hybrid approach; rule-based approach; HMM (Hidden Markov Model); CLAWS (Constituent-Likelihood Automatic Word-tagging System) algorithm; TBED (Transformation-Based Error-Driven); Brill method
Language: English
Document Type: Regular paper
DOI: 10.1075/ijcl.6.3.04fen
