Centre for Text Technology

 
Name NWU Bible corpus

Developer(s) Julia Trushkina
Affiliation(s) Centre for Text Technology (CTexT), North-West University, Potchefstroom, South Africa
Description A trilingual parallel corpus consisting of the 1983 version of the Afrikaans Bible, the Dutch Statenvertaling Bible, and the World English Bible. The corpus is fully aligned at sentence and word level.
Category(ies) Multilingual parallel corpus
Language(s) Afrikaans, Dutch, English
Source The 1983 version of the Afrikaans Bible, the Dutch Statenvertaling Bible, and the World English Bible.
Distribution DVD
Documentation • Technical Report: Development of Tools and Resources for South African Languages
• Article: North-West University Bible Corpus. Submitted to “Language Matters”.
Operating System(s) Linux
Programming Language -
Size: Linguistic Approx. 830 000 words per language
Size: File Alignment: 36 MB. English analysis: 15 MB. Dutch syntactic analysis: 30 MB. Dutch POS analysis: 62 MB. Afrikaans POS analysis: 30 MB + 11 MB
Execution Location Local
Required Software C5, Perl

Availability Commercial
Pricing: Academic n/a
Pricing: Multiple Users n/a
Pricing: Commercial To negotiate
Licence To negotiate
Other Information Academics interested in analysing the data for research purposes are welcome to contact us to discuss the possibilities.
 

© 2009 North-West University. All rights reserved.