Article: On the development of a tagset for Northern Sotho with special reference to the issue of standardisation/Die ontwikkeling van 'n stel annoteringsmerkers vir Noord-Sotho, met spesiale verwysing na standaardiseringsaangeleenthede.(Report)
- Article from: Literator: Journal of Literary Criticism, comparative linguistics and literary studies
- Article date: April 1, 2008
- Author:
Copyright 2008 Literator Society of South Africa. This material is published under license from the publisher through the Gale Group, Farmington Hills, Michigan. All inquiries regarding rights should be directed to the Gale Group.
Abstract
Work with corpora in the South African Bantu languages has until now been limited to the use of raw corpora. Such corpora, however, have limited functionality. Thus the next logical step in any NLP application is the development of software for automatic tagging of electronic texts. The development of a tagset is one of the first steps in corpus annotation. The authors of this article argue that the design of a tagset cannot be isolated from the purpose of the tagset, or from the place of the tagset and its design within the bigger picture of the architecture of corpus annotation. Usage-related aspects therefore feature prominently in the ...