Welcome to the Centre for Text Technology (CTexT®) at the North-West University (Potchefstroom Campus).
This page shares interesting information about new developments, outputs and opportunities at CTexT.
Launch of African Wordnets
African languages achieved yet another victory in the quest to increase their relevance in the digital age when wordnets for Setswana, isiZulu, isiXhosa and Sesotho sa Leboa were officially launched at UNISA’s Muckleneuk Campus.
Well-known international wordnet specialists, Professors Christiane Fellbaum (Princeton University, USA) and Karel Pala (Masaryk University, Czech Republic), presented lectures at the launch seminar on 19 January 2011. They also shared their expertise at a two-day workshop preceding this event, where they advised the African Wordnet team on opportunities for further development and future research.
A wordnet is a lexical database consisting of words that are grouped into sets of synonyms called “synsets”. Wordnets are valuable resources for a large number of automatic text processing tasks and applications, including machine translation, information retrieval, text classification and summarisation. Development of the first wordnet started at Princeton University in the mid-1980s. Since then, the Princeton WordNet has been translated from English into numerous languages across the world and linked to other semantic databases.
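For readers who want a concrete picture of a synset, the following is a minimal sketch in Python. It assumes a toy in-memory structure; the `Synset` class, the identifiers and the English entries are invented for illustration and do not reflect the actual data or file format of the African Wordnets or the Princeton WordNet.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Synset:
    """One set of synonyms that share a single meaning (hypothetical layout)."""
    synset_id: str   # made-up identifier, not a real wordnet ID
    lemmas: tuple    # the words grouped together as synonyms
    gloss: str       # short definition of the shared meaning


# Toy entries invented for illustration only.
SYNSETS = [
    Synset("n-001", ("car", "automobile", "motorcar"), "a motor vehicle with four wheels"),
    Synset("n-002", ("car", "railcar", "carriage"), "a wheeled vehicle that runs on rails"),
    Synset("v-001", ("begin", "start", "commence"), "to take the first step in an action"),
]


def build_index(synsets):
    """Map each word to the list of synsets it belongs to."""
    index = defaultdict(list)
    for synset in synsets:
        for lemma in synset.lemmas:
            index[lemma].append(synset)
    return index


def synonyms(word, index):
    """Return, sorted, every other word that shares at least one synset with `word`."""
    found = set()
    for synset in index.get(word, []):
        found.update(synset.lemmas)
    found.discard(word)
    return sorted(found)


INDEX = build_index(SYNSETS)
print(synonyms("car", INDEX))
```

A word with more than one meaning, such as "car" in this toy data, simply appears in more than one synset, which is why the lookup collects words from every synset that contains it.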
Development of the African Wordnets commenced in 2008 and is a collaboration between the Department of African Languages at UNISA and the Centre for Text Technology (CTexT®) at the North-West University. This groundbreaking project is funded by the Department of Science and Technology (DST) via the National HLT Network (NHN).
“The project is an exciting development that will in future lead to the creation of many other applications and resources for our African languages”, said Handré Groenewald (CTexT®).
According to Prof. Sonja Bosch (UNISA), “the work done so far on African Wordnets afforded African language specialists a unique opportunity to contribute to the technological development of these languages, and should serve as inspiration for the inclusion of further African languages in the next phase of the project.”
The wordnets each contain between 5 000 and 15 000 synsets and will be completed by the end of January 2011.
- Open Source Translation Tools
Funded by the Department of Arts and Culture.
Developed by CTexT® in collaboration with the University of Pretoria.
Click here to download version 1.1.0 of the Autshumato ITE.
- The CTexT® Alignment Interface and CTexT® Alignment Interface Pro are now available. Read more
- Press Release
Language learning CD-ROMs:
With the click of a keyboard and a click of the tongue, joining new cliques is easy. Use your computer to learn a new language and click with more people.
Developed by the Centre for Text Technology (CTexT®) - 018 299 1541. Distributed by BlueTek Computers - 018 297 0164 and www.spel.co.za. Photography by Topcolor Kodak Express - 018 297 0835.
Spelling Checkers for South African Languages!
CTexT®, in collaboration with several linguistic partners, recently completed spelling checkers for nine South African languages, to be used in Microsoft® Office programmes such as Microsoft Word.
Afrikaanse SkryfGoed 2008:
- CTexT proudly presents: Afrikaanse SkryfGoed 2008 - a collection of writing aids to make the click of Afrikaans on your keyboard crystal clear.
(Read more about Afrikaanse SkryfGoed 2008)
- Afrikaanse SkryfGoed 2008 is available from www.spel.co.za, or contact +27 18 297 0164 or +27 800 203 048 to order.
- Find Afrikaanse SkryfGoed 2008 on Facebook!
Other important information:
PSearch is based on Paramsearch, a tool created by Antal van den Bosch for automatic algorithmic parameter optimisation for TiMBL and other machine learning algorithms... (read more)
TurboAnnotate is a user-friendly annotating environment (i.e. tool) for bootstrapping linguistic data for machine-learning purposes, or for manually creating gold standards or other annotated lists... (read more)
Postdoctoral Positions:
The Centre for Text Technology at the North-West University (Potchefstroom Campus) in South Africa is looking for candidates to fill two postdoctoral positions in Natural Language Processing...