Introduction to the DTABf


The structural annotation of all DTA texts follows the DTA ›base format‹ (DTABf). The DTABf is based on the P5 Guidelines of the Text Encoding Initiative (TEI). Because the TEI Guidelines offer solutions for a very large number of tagging requirements, they are extensive and flexible, and they are meant to be adapted to the needs of the individual projects that work with the TEI. For the DTA, this adaptation was achieved by creating the DTABf, a proper subset of the TEI/P5 tagset that fixes not only the set of elements but also the corresponding attributes and (where applicable) their values. The DTABf tagset is fully conformant with the TEI/P5 Guidelines, i.e. the TEI tagset was only reduced and not extended in any way.
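The idea of a tagset restricted to fixed elements, attributes and values can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the `ALLOWED` table is a hypothetical, deliberately tiny allowlist and not the real DTABf tagset, and `find_violations` is a made-up helper name. Actual conformance would be checked against the DTABf schema itself rather than with a hand-written table like this one.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# HYPOTHETICAL allowlist for illustration only; NOT the real DTABf tagset.
# Maps element name -> {attribute name -> set of allowed values, or None for "any value"}.
ALLOWED = {
    "TEI": {},
    "text": {},
    "body": {},
    "div": {"type": {"chapter", "preface"}},
    "head": {},
    "p": {},
    "hi": {"rendition": None},
}


def find_violations(xml_text):
    """Return a message for every element, attribute or value outside the allowlist."""
    violations = []
    prefix = "{" + TEI_NS + "}"
    for elem in ET.fromstring(xml_text).iter():
        if not elem.tag.startswith(prefix):
            violations.append(f"element outside the TEI namespace: {elem.tag}")
            continue
        name = elem.tag[len(prefix):]
        if name not in ALLOWED:
            violations.append(f"element not in subset: {name}")
            continue
        for attr, value in elem.attrib.items():
            if attr not in ALLOWED[name]:
                violations.append(f"attribute not in subset: {name}/@{attr}")
                continue
            allowed_values = ALLOWED[name][attr]
            if allowed_values is not None and value not in allowed_values:
                violations.append(f"value not in subset: {name}/@{attr}={value!r}")
    return violations


GOOD = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>'
    '<div type="chapter"><head>Title</head>'
    '<p>Some <hi rendition="#b">bold</hi> text.</p></div>'
    '</body></text></TEI>'
)
# Same document with an unsupported attribute value and an unsupported element.
BAD = GOOD.replace('type="chapter"', 'type="song"').replace(
    "<head>Title</head>", "<caption>Title</caption>"
)

print(find_violations(GOOD))
print(find_violations(BAD))
```

Because the table only ever removes options from what TEI permits, any document that passes such a check is still a TEI document, which is the sense in which a reduced tagset stays conformant.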

The DTABf is part of the DTA Guidelines, which also contain the General Guidelines and the Transcription Guidelines. It is intended to allow unrestricted tagging of all structural phenomena that may occur while avoiding ambiguity in the tagging of similar phenomena. In this way we want to ensure coherent text structuring across the whole DTA corpus. Given the wide temporal coverage of the DTA corpus and the diversity of its text types and genres, this aim is a considerable challenge, because the heterogeneity of the texts is accompanied by great structural variability among the original sources.

With the DTABf we propose a standardized format for the structural annotation of digitized historical texts. The advantage of such an approach is that diverse TEI texts can be analyzed not only with similar methods but also in comparison with one another. The annotation guidelines underlying the DTABf are documented extensively, which ensures that the tagging remains comprehensible. Thus, DTABf conformity facilitates not only the integration of TEI texts into the DTA infrastructure but also their re-use in other full-text archives.

DTABf Documentation (German)

DTABf Schema

Useful Tools and Applications

Web Form for Metadata Entry:

The DTA provides a web form that facilitates the creation of DTABf-conformant TEI headers. Users therefore do not have to write the fairly complex TEI header by hand; they can fill out the form and generate a DTABf-conformant TEI header automatically.
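To give a rough sense of what generating a header from form input involves, the following Python sketch builds the minimal header structure that generic TEI P5 requires (a `fileDesc` containing `titleStmt`, `publicationStmt` and `sourceDesc`). It is only an assumption-laden illustration: `build_minimal_header` and its parameters are hypothetical names, the output is a bare TEI header rather than a DTABf-conformant one, and it does not reflect how the DTA web form is actually implemented.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# Serialize TEI elements with a default namespace instead of an "ns0:" prefix.
ET.register_namespace("", TEI_NS)


def tei(name):
    """Return the namespace-qualified tag for a TEI element name."""
    return f"{{{TEI_NS}}}{name}"


def build_minimal_header(title, publication_note, source_note):
    """Build a minimal generic TEI header (NOT a full DTABf header) as a string."""
    header = ET.Element(tei("teiHeader"))
    file_desc = ET.SubElement(header, tei("fileDesc"))

    title_stmt = ET.SubElement(file_desc, tei("titleStmt"))
    ET.SubElement(title_stmt, tei("title")).text = title

    publication_stmt = ET.SubElement(file_desc, tei("publicationStmt"))
    ET.SubElement(publication_stmt, tei("p")).text = publication_note

    source_desc = ET.SubElement(file_desc, tei("sourceDesc"))
    ET.SubElement(source_desc, tei("p")).text = source_note

    return ET.tostring(header, encoding="unicode")


# Placeholder values standing in for what a user might type into a form.
print(build_minimal_header(
    "Example Title",
    "Placeholder publication statement.",
    "Placeholder description of the source.",
))
```

A header that conforms to the DTABf carries considerably more metadata than this skeleton, which is why filling out a form is more convenient than writing the header by hand.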

Framework for Text Entry:

For text transcription and DTABf-conformant annotation, the DTA offers a framework for the Author mode of the oXygen XML Editor. This DTA oXygen framework, DTAoX, gives users an immediate visualization of their annotated texts and lets them transcribe and annotate texts from scratch in a WYSIWYG-like environment. DTAoX is available under the GNU Lesser General Public License (LGPL). The current version has been optimized for oXygen versions 14.2 and 15.

  • Version 1.1.1 (November 29th, 2013): Framework (.zip)