Centre for Text Technology

 

Core Technologies

Name

TurboAnnotate

Developer(s)

GB van Huyssteen, MJ Puttkammer, M Schlemmer

Affiliation(s)

Centre for Text Technology (CTexT), North-West University, Potchefstroom, South Africa

 

 

Description

TurboAnnotate is a user-friendly annotating environment (i.e. tool) for bootstrapping linguistic data for machine-learning purposes, or for manually creating gold standards or other annotated lists.
This first version of TurboAnnotate was developed with the specific task of hyphenation for South African languages in mind.
In the annotation GUI, the annotator simply drags the mouse over the part of the word to be annotated, and on release of the mouse button, the selection changes colour.
The machine-learning system used in TurboAnnotate is the well-known Tilburg Memory-Based Learner (TiMBL; Daelemans et al., 2004).
Van Huyssteen & Puttkammer (2007) report that TurboAnnotate could not only ensure higher accuracy in human annotations, but could also reduce the human effort required (at least in the case of Afrikaans).
Work on TurboAnnotate continues.
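
To illustrate how hyphenation annotations of this kind can be fed to a memory-based learner, the following minimal Python sketch turns an annotated word into fixed-width character windows, one instance per letter, with a class label that marks whether a hyphenation point follows the focus letter. The function names, the window size, the padding symbol and the example word are all assumptions made for illustration; the feature layout actually used by TurboAnnotate is not described here. The comma-separated output with the class in the last field is a layout commonly used for instance-based learners.

```python
def hyphenation_instances(annotated, context=2, pad="_"):
    """Turn an annotated word such as 'wa-ter' into windowed instances.

    Hypothetical helper: each instance is (features, label), where the
    features are the focus letter with `context` letters on either side
    and the label is 1 if a hyphenation point directly follows the focus.
    """
    letters = []
    breaks = set()
    for ch in annotated:
        if ch == "-":
            # A break follows the most recently seen letter.
            breaks.add(len(letters) - 1)
        else:
            letters.append(ch)

    # Pad both ends so every letter has a full window around it.
    padded = [pad] * context + letters + [pad] * context
    instances = []
    for i in range(len(letters)):
        window = padded[i : i + 2 * context + 1]
        instances.append((window, 1 if i in breaks else 0))
    return instances


def to_lines(instances):
    """Render instances as comma-separated lines, class label last."""
    return [",".join(features + [str(label)]) for features, label in instances]


if __name__ == "__main__":
    for line in to_lines(hyphenation_instances("wa-ter")):
        print(line)
```

Each printed line is one training instance. A learner trained on such instances can propose hyphenation points for unseen words, which the annotator then only needs to confirm or correct.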

Category(ies)

Morphological Analysis to Annotation

Language(s): In

Languages using the character set of the Latin alphabet

Language(s): Out

Languages using the character set of the Latin alphabet

Distribution

Online

Documentation

  • Van Huyssteen, GB & Puttkammer, MJ. Accelerating the Annotation of Lexical Data for Less-Resourced Languages. Proceedings: Interspeech 2007 - Eurospeech, 10th European Conference on Speech Communication and Technology. Antwerp, Belgium, August 27-31, 2007.
  • Daelemans, W, Van den Bosch, A, Zavrel, J & Van der Sloot, A. TiMBL: Tilburg Memory Based Learner, Version 5.1, Reference Guide. ILK Technical Report, February 4, 2004.

Operating System(s)

Linux

Programming Language

Perl

Execution Location

Local

Required Software

TiMBL 5.02, Perl

Pricing: Academic

n/a

Pricing: Multiple Users

n/a

Pricing: Commercial

n/a

Licence

Open Source (GPL) 

Other Information

To acquire the source code, please send an e-mail to Martin Puttkammer (Martin.Puttkammer@nwu.ac.za).

 

© 2009 North-West University. All rights reserved.