HeidelTime and corpora downloads

The UIMA HeidelTime kit, the HeidelTime stand-alone version, the corpora preparation and evaluation scripts, and all accompanying materials are made available under the terms of the GNU General Public License.

The licensing terms of WikiWarsDE and WikiWarsVN are as follows:
All documents in the corpora are sourced from the German and Vietnamese Wikipedia. As a consequence, the corpora are released under the Creative Commons Attribution-ShareAlike 3.0 License; see the license text at http://en.wikipedia.org/wiki/Wikipedia:Text_of_Creative_Commons_Attribution-ShareAlike_3.0_Unported_License.

By downloading any of the following files, you agree to the licensing terms.

HeidelTime

Corpora

  • AncientTimes Corpus: [tar.gz] or [zip]
  • WikiWarsDE: [tar.gz] or [zip]
  • WikiWarsVN: [tar.gz] or [zip], v1.1: [tar.gz] or [zip]
  • Time4SMS: if you are interested in this corpus, please send us an email
  • Time4SCI: if you are interested in this corpus, please send us an email
  • The Arabic training-203, test-50, test-150, and test-50-star corpora can be generated automatically using our scripts package (see below) together with the ACE 2005 Evaluation corpus. General usage of the scripts package and the setup of the corpora folder structure can be viewed here; the relevant command to create all of the corpora is:
    # bash prepare_corpus.sh ace2005trainingArabic $EVALPATH
  • The TempEval-2 Chinese Improved and Clean data sets can be generated from the official TempEval-2 corpus with the scripts package linked below. The scripts package contains the annotation data, which is released under the Creative Commons Attribution-ShareAlike 3.0 Unported license.
    To create the Improved and Clean data sets, please take a look at our Wiki Page on the issue. The commands to create the two additional Chinese TempEval-2 corpora are:
    # bash prepare_corpus.sh tempeval2train-cn $EVALPATH
    # bash prepare_corpus.sh tempeval2eval-cn $EVALPATH
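
The three preparation commands listed above can be run in one pass. The following is only a minimal sketch, assuming it is started from the directory of the scripts package that contains prepare_corpus.sh and that the ACE 2005 and TempEval-2 source corpora are already placed in the expected folder structure. The default EVALPATH value and the DRY_RUN switch are assumptions of this sketch, not part of the scripts package; by default the commands are only printed so they can be checked before anything is executed.

```shell
#!/usr/bin/env bash
# Sketch: run the corpus preparation commands from this page in sequence.
# Assumptions (not part of the scripts package):
#   - EVALPATH defaults to ./evaluation, a placeholder path
#   - DRY_RUN is a hypothetical switch of this sketch; 1 = only print the commands
set -eu

EVALPATH="${EVALPATH:-./evaluation}"
DRY_RUN="${DRY_RUN:-1}"

prepare_all() {
  # Corpus identifiers exactly as they appear in the commands above.
  for corpus in ace2005trainingArabic tempeval2train-cn tempeval2eval-cn; do
    if [ "$DRY_RUN" = "1" ]; then
      # Print the command instead of running it.
      echo "bash prepare_corpus.sh $corpus $EVALPATH"
    else
      # Run the preparation script; set -e stops at the first failure.
      bash prepare_corpus.sh "$corpus" "$EVALPATH"
    fi
  done
}

prepare_all
```

Setting DRY_RUN=0 and EVALPATH to the actual evaluation folder runs the commands for real.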

Supplementary scripts

  • Corpora Preparation and Evaluation Scripts: [tar.gz] [zip]