Evolution and present situation of corpus research in China

Author: Feng, Zhiwei

Source: International Journal of Corpus Linguistics, Volume 11, Number 2, 2006, pp. 173-207(35)

Publisher: John Benjamins Publishing Company

Abstract:

In this paper, the author describes in detail the development and present situation of corpus linguistics in China: earlier corpora, large-scale & authentic text corpora, national corpora, speech corpora, bilingual corpora and corpora of minority languages in China. The various processing techniques for corpora are also introduced: automatic word segmentation of Chinese text, automatic PoS tagging, automatic tagging of phrase structure and automatic alignment of bilingual corpora. The paper thus gives a bird's-eye view of corpus linguistics in China. Finally, the author discusses several problems in present corpus research: standardization of corpus specifications, sharing of language resources, knowledge properties, etc.

Keywords: automatic alignment of bilingual corpora; automatic PoS tagging; automatic tagging of phrase structure; automatic word segmentation; bilingual corpora; corpora of minority languages in China; corpus; large-scale & authentic text; speech corpora

Document Type: Research article

DOI: 10.1075/ijcl.11.2.03fen

Affiliation: Institute of Applied Linguistics, China
