See other examples too: [[z:ath-philologist|Creating canonical URNs in Philologist]].

====== Athenaeus in oXygen --- examples and exercises ======

The XML editor [[http://www.oxygenxml.com/|oXygen]] (which we had to have available for the [[http://www.e-humanities.net/events/athenaeus-hackathon.html|Leipzig Hackathon]] in October 2012) offers several very efficient tools and utilities for exploring any TEI XML file, for example the file of Athenaeus' Deipnosophists.

Here you can see how to use oXygen to:

  * find a set of tags in Athenaeus
  * manipulate this found set (through a simple XSL sheet)

===== Finding a set of tags in an XML file with oXygen =====

  - Get a complete text of the Deipnosophists in TEI XML (available as a Google Drive document shared with Hackathon members).
  - Open it in oXygen. Everything should be OK, i.e. the file should [[http://www.oxygenxml.com/xml_editor/validation.html|validate]] without red signals on the right-hand scrollbar beside the text.
  - Use the oXygen XPath toolbar, located in the upper left-hand corner above the text. There you can enter any XPath expression. XPath is a language for finding specific parts of XML files (treated as XML, and not only as text). Here is a screenshot of the XPath toolbar region:((Cf. [[http://www.oxygenxml.com/xml_editor/xpath.html|the oXygen help pages on XPath]].)) {{ http://www.oxygenxml.com/img/xpathToolbar.gif |oXygen XPath toolbar}}
  - Enter the following into the XPath toolbar: <code>//name</code>

This XPath expression means: find anything, anywhere in the XML file, marked with the ''name'' tag. After the search (reported as **XPath - in progress** at the bottom of the window), a list of results is shown in a lower pane. From there you can jump from one result in the XML to another, save the results with a right-click, etc. Try moving between different results. Also examine the "XPath location" column in the results list.
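
To see what ''%%//name%%'' does and does not select, consider the following made-up fragment. It is only an illustration: the element contents and the ''type'' value are assumptions and are not taken from the Athenaeus file.

<code xml>
<!-- Hypothetical fragment for illustration; not copied from the Athenaeus file -->
<p>
  <name type="person">Πλάτων</name>  <!-- selected by //name -->
  <name>Ἀθῆναι</name>                <!-- selected by //name: attributes do not matter -->
  <rs type="nomorph">Ἀθήναιος</rs>   <!-- not selected: the tag is rs, not name -->
</p>
</code>

The expression looks only at the tag name, so both ''name'' elements are found wherever they sit in the file, while the ''rs'' element is ignored.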

Then we can additionally filter "anything tagged with ''name''", selecting just the tags that have a ''type'' attribute (as in ''%%<name type="...">%%''). Enter this into the XPath toolbar:

<code>//name[@type]</code>

To filter further, we can select all ''name'' tags whose ''type'' attribute has the value ''month''. Take care to close all the square brackets, as shown:

<code>//name[@type[. eq 'month']]</code>

So, how many names of the "month" type are there in Athenaeus' XML? If we want to find names of another type, we just change what we write after ''eq'', e.g. ''@type[. eq'' //'person'//'']'' etc.

**While you're doing this, you're both mastering XPath and examining the markup in Athenaeus' XML.**((An idiosyncratic selection of XML and XPath tools and information can be found in [[http://www.bibsonomy.org/user/filologanoga/xml|this collection of bookmarks]].))

===== Extract a found set from an XML file =====

Once we have found a set of tags (and data) that interests us, we can transform the XML file, e.g. by discarding everything except the interesting set.

Let's say we want to find names which are incorrectly tagged in Athenaeus. An examination will show that there are several types of "incorrectness"; one of them is that the names are marked with the tag ''rs'' (TEI shorthand for "referring string", marking any kind of reference to something else) with the ''type="nomorph"'' attribute and value. As an exercise, use the XPath toolbar to find all ''rs'' tags.

To "export" the set of tags we have found (i.e. to discard everything else from the XML file), we have to write a set of instructions (a program) known as an XSL stylesheet. XPath is of great importance for such stylesheets, as it tells the program //which// tags to include and which to discard. oXygen has everything we need to write XSL and process XML files with it.
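
As a rough idea of what such a stylesheet can look like, here is a minimal sketch. It is an illustration written under assumptions, not necessarily the stylesheet used at the Hackathon: it assumes that the elements in the file are in no namespace (if the file declares the TEI namespace, adding ''xpath-default-namespace="http://www.tei-c.org/ns/1.0"'' to ''xsl:stylesheet'' should be needed), that plain text with one name per line is the desired output, and that trimming whitespace with ''normalize-space()'' is acceptable.

<code xml>
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <!-- Assumption: we want plain text output, not XML -->
  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Start at the document root; nothing else in the file is copied -->
  <xsl:template match="/">
    <!-- The XPath decides what is kept: every rs element, anywhere,
         whose type attribute equals 'nomorph', in document order -->
    <xsl:for-each select="//rs[@type eq 'nomorph']">
      <!-- Write the text of the element with surrounding whitespace trimmed -->
      <xsl:value-of select="normalize-space(.)"/>
      <!-- End each result with a line break -->
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
</code>

Because the only template matches the document root and writes nothing but the selected ''rs'' elements, everything else in the file is discarded. Changing the ''select'' expression, for example to the month names found earlier, changes what is exported.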

The XSL we'll be using is here (it includes comments on key instructions):

==== Create an XSL file with oXygen ====

First, create an XSL file with oXygen:

  - From the main oXygen menu, select //File / New / XSLT Stylesheet//. Select "Create".
  - Delete the default elements created by oXygen and paste in the XSL quoted above.
  - Check that it validates.
  - Name the file and save it where you can find it again (//File / Save//).

==== Create an XSL transformation with oXygen ====

Now we have to connect the XML file (our Athenaeus) with the XSL instructions. See [[http://www.oxygenxml.com/doc/ug-editor/topics/defining-new-transformation-scenario.html|oXygen help on transformation scenarios]] for more details.

  - From the XML file window, select //Document / Transformation / Configure transformation scenario//. Alternatively, click on the wrench (spanner) symbol with a triangle next to it ({{http://www.oxygenxml.com/doc/ug-editor/img/bt_transform_config.png|XSL spanner}}).
  - Select //New / XML transformation with XSLT//.
  - As "name", type the name you want for the scenario.
  - As "XSL URL", give the (local) address of our XSL file. You can browse for the file through the folder symbol ({{http://www.oxygenxml.com/doc/ug-editor/img/bt_open_local_file.png|Folder-local-transformation}}).
  - As "Transformer", select Saxon-EE 9... (for XSLT 2.0).
  - Click OK.
  - Click "Apply associated". You'll see the message "Transformation in progress".
  - A lower pane opens where you'll see the results. You can save them as a new file, select them and paste them elsewhere, etc.

If everything went well, you should get a long list (how many lines are there?) of Greek words, each starting with a capital letter, in the order in which they appear in the //Deipnosophists//.