“Cesax” is a computer program that handles several different types of syntactically parsed files. It allows semi-automatic coreference resolution for its own native psdx format. Cesax can handle files of the following types:


1. Psd - Text files that are part of the parsed series of English corpora (YCOE, PPCME2, etc.)

2. Psdx - XML equivalents of psd files

3. Tig - Negra (Tiger) format (stand-off XML)

4. Xml - Alpino-produced XML format


❖ Conversion options

  ➢ Treebank psd files to psdx:

    ▪ Individual psd files: use “Cesax” (File/Import)

    ▪ Batch conversion: use the program “TreebankToXml” (see below)

    ▪ Batch conversion: use “Cesax” (Tools/BatchConvertToPsdx)

  ➢ Treebank psdx files to psd:

    ▪ Use “Cesax” (Tools/ProducePsd or Tools/ProduceSimplePsd)

  ➢ Text files in different languages to psd (or psdx):

    ▪ English: use “Cesax” (Tools/EnglishTxtToPsd)

    ▪ German: use “Cesax” (Tools/GermanTxtToPsd)

    ▪ Dependency (CoNLL-X) files: use “Cesax” (Tools/ConllXtoPsdx)
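
For readers unfamiliar with the formats, the general idea behind a psd-to-XML conversion can be pictured with a small Python sketch. It assumes that psd files use Penn-style labelled bracketing, as in the parsed English corpora. The element names `node` and `word` and the function names are hypothetical: they are not the psdx schema, and this is not how Cesax or TreebankToXml performs the conversion.

```python
import xml.etree.ElementTree as ET


def tokenize(text):
    """Split a bracketed tree into '(', ')' and symbol tokens."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens, pos=0):
    """Parse one '(LABEL child ...)' constituent starting at tokens[pos].

    Returns ((label, children), next_pos). Each child is either a nested
    (label, children) tuple or a plain word string. An unlabelled wrapper
    bracket, as in '( (IP ...))', gets the empty label.
    """
    if tokens[pos] != "(":
        raise ValueError(f"expected '(' at token {pos}")
    pos += 1
    label = ""
    if tokens[pos] not in ("(", ")"):
        label = tokens[pos]
        pos += 1
    children = []
    while tokens[pos] != ")":
        if tokens[pos] == "(":
            child, pos = parse(tokens, pos)
        else:
            child, pos = tokens[pos], pos + 1
        children.append(child)
    return (label, children), pos + 1


def to_xml(node):
    """Map a parsed constituent to generic <node>/<word> elements.

    These element names are illustrative only, not the psdx schema.
    """
    label, children = node
    elem = ET.Element("node", {"label": label})
    for child in children:
        if isinstance(child, tuple):
            elem.append(to_xml(child))
        else:
            ET.SubElement(elem, "word").text = child
    return elem


def psd_to_xml(text):
    """Convert one bracketed tree to an XML string."""
    tree, _ = parse(tokenize(text))
    return ET.tostring(to_xml(tree), encoding="unicode")


if __name__ == "__main__":
    print(psd_to_xml("(IP-MAT (NP-SBJ (PRO he)) (VBD came))"))
```

The sketch handles a single well-formed tree. A real corpus file holds many trees plus identifiers and metadata, so the actual conversion should be done with the Cesax menu options or TreebankToXml listed above.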


❖ Documentation

  ➢ Paper on Cesax (submitted to the International Journal of Corpus Linguistics)

  ➢ Cesax manual (htm, pdf)

  ➢ Helsinki corpus festival (handout)

  ➢ TreebankToXml manual (htm, pdf)

❖ Download

  ➢ Install Cesax

  ➢ Install TreebankToXml

❖ Descriptions of formats used

  ➢ Psdx format (xsd)

  ➢ Referential chain dictionary (xsd)





Comments: E.komen@Let.ru.nl


Version information:





Version information is now available under “Help” in Cesax


Facilitated automatic download of the period definitions file from the CorpusStudio website


Added “Tree” tab


Resolved error with wildcards in chain dictionary


Improved calculation of SubjectSwitch using coreferential chains


Added feature Reference/List_Coreferential_Chains (see manual)


Repaired bug in ChainDictHas

Added feature View/Text