LangueDOC

Document transformations (Workflows)

The document workflow is as follows (we use Archi texts as an example):

  • Gloss the recorded text in Toolbox.
  • Apply the BoxReader program to transform the Toolbox document into an XHTML format that uses nested tags to represent structure. The resulting XHTML document can be viewed in a browser.
  • Apply an XSLT stylesheet to transform the XHTML file into the content.xml of the OpenOffice format. (An OpenDocument .odt file is a ZIP archive whose main component is a content.xml file.) There are two versions of the XSLT stylesheet:
    • The first one completely preserves the hierarchical structure, itself a representation of the original Toolbox structure;
    • The second one concatenates all morphemes of a word into a single element (a table cell).
  • Insert the resulting content.xml into a complete OpenOffice text document, replacing the old content.xml. This can be done with a free ZIP utility such as 7-Zip run from the command line.
  • The document can now be edited in OpenOffice Writer or saved from there as an MS Word document.
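
The step of replacing content.xml inside an existing OpenOffice document can also be scripted. The following is a minimal sketch, assuming Python's standard zipfile module is used in place of 7-Zip; the function name replace_content_xml and the file names in the usage note are hypothetical and not part of the original toolchain. It copies every entry of a template .odt into a new archive and swaps in the new content.xml.

```python
import zipfile


def replace_content_xml(template_odt, new_content, output_odt):
    """Copy template_odt to output_odt, using new_content (bytes) as content.xml.

    Hypothetical helper for illustration; not part of BoxReader or BoxWriter.
    """
    with zipfile.ZipFile(template_odt) as src, \
            zipfile.ZipFile(output_odt, "w") as dst:
        # infolist() returns entries in archive order, so "mimetype"
        # keeps its position at the start of the new archive.
        for item in src.infolist():
            if item.filename == "content.xml":
                dst.writestr("content.xml", new_content,
                             compress_type=zipfile.ZIP_DEFLATED)
            else:
                # Passing the ZipInfo object keeps the entry's original
                # compression method (e.g. an uncompressed "mimetype").
                dst.writestr(item, src.read(item.filename))
```

For example, replace_content_xml("template.odt", open("content.xml", "rb").read(), "archi_text.odt") would write a new document and leave the template untouched. Writing to a new file rather than modifying the archive in place is a design choice of this sketch: the zipfile module cannot overwrite an existing entry.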

Reverse transformations:

  • If any editing has been done in MS Word, open the final document in OpenOffice Writer and save it in the OpenOffice format.
  • Close OpenOffice Writer, open the .odt document in WinZip or an equivalent program, and extract the file content.xml.
  • Apply the reverse XSLT transformation to obtain the document in the nested-span XHTML format, which can be displayed in a browser.
  • If desired, run the BoxWriter program to obtain a Toolbox document that incorporates all the changes made in the intermediate formats.
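
The extraction step of the reverse workflow can be scripted in the same way. This is a minimal sketch, again assuming Python's standard zipfile module instead of WinZip; the function name extract_content_xml and its default output path are hypothetical. It reads content.xml from a saved .odt file and writes it to disk, ready for the XSLT step.

```python
import zipfile


def extract_content_xml(odt_path, out_path="content.xml"):
    """Write the content.xml stored in odt_path to out_path; return its bytes.

    Hypothetical helper for illustration; not part of BoxReader or BoxWriter.
    """
    with zipfile.ZipFile(odt_path) as odt:
        # Raises KeyError if the archive has no content.xml entry.
        data = odt.read("content.xml")
    with open(out_path, "wb") as out_file:
        out_file.write(data)
    return data
```

The document should be closed in OpenOffice Writer before the script is run, as in the manual procedure above.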