Lagarto Format

Minimal example

Part of file for epitome 142

id:142
title:Observationes de arte grammatica
%%
t1:Georgius ſymler Vuimpinenſis natione theuthonic <lb/> grâmatices obseruationes côpilauit
la:Georgius Symler Vuimpinensis natione teutonica grammatices obseruationes compilauit
es:Georg Symler, natural de Wimpfen, de la nación teutona, compiló unas 'Observaciones de gramática',
%%
t1:quib cuncta <lb/> fere grâmaticalia fundamenta potius agregata quâ <lb/> digesta uidentur
la:quibus cuncta fere grammaticalia fundamenta potius aggregata quam digesta uidentur,
es:en las que prácticamente todos los fundamentos gramaticales parecen entremezclados, antes que estructurados,
%%
t1:Vnde pro uectiorib potius quam <lb/> nouis tyrunculis illud lectitand censuerit
la:unde pro uectioribus potius quam nouis tirunculis illud lectitandum censuerit
es:de lo que habría aconsejado, más en favor de los más avanzados que de los principiantes jóvenes, leerlo repetidas veces,
%%

Lagarto abstract data type (ADT)

Informal skeleton overview:

- book                          [root element]
     - section                  [section 1 - ordered list - section = epitome]
          - head                [header - metadata - full list below]
               - id             [section identifier = epitome number]
               - title          [section title = epitome title]
          - body                [sets of aligned segments]
               - align          [multilingual aligned segments - align 1 - number implicit by the order]
                    - t0        [segment - clean or consolidated texts of t1 and t2]
                    - t1        [segment - transliteration from source 1 - there might be only one source]
                    - t2        [segment - transliteration from source 2 - up to N]
                    - es        [segment - Spanish translation]
                    - note-t0   [note - for segment 0]
                    - note-t1   [note - for segment 1]
                    - note-t2   [note - for segment 2 - up to N]
                    - note-es   [note - for segment es]
               - align          [multilingual aligned segments - align 2 - up to N]
     - section                  [section N]
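
The skeleton above can be sketched as a small set of Python dataclasses. This is only an illustrative model, not part of Lagarto itself: the class names `Book`, `Section` and `Align`, and the choice of dictionaries for segments and notes, are assumptions made to mirror the outline.

```python
from dataclasses import dataclass, field


@dataclass
class Align:
    """One set of aligned segments; its number is implicit in its position."""
    segments: dict[str, str] = field(default_factory=dict)  # keys such as "t1", "la", "es"
    notes: dict[str, str] = field(default_factory=dict)     # keys such as "note-t1"


@dataclass
class Section:
    """One section (an epitome): head metadata plus an ordered body."""
    head: dict[str, str] = field(default_factory=dict)      # "id", "title", ...
    body: list[Align] = field(default_factory=list)


@dataclass
class Book:
    """Root element: an ordered list of sections."""
    sections: list[Section] = field(default_factory=list)


# Build the first align of epitome 142 from the minimal example.
section = Section(
    head={"id": "142", "title": "Observationes de arte grammatica"},
    body=[Align(segments={
        "la": "Georgius Symler Vuimpinensis natione teutonica grammatices obseruationes compilauit",
    })],
)
book = Book(sections=[section])
```

Keeping `body` and `sections` as lists preserves the order that the ADT relies on for implicit numbering.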

XML skeleton examples: book - section

Lagarto Format

It is an implementation of the ADT in which each section is stored in a separate section file, named after its id. In the case of El Libro, a section corresponds to an epitome; for example, the filename for epitome 142 is 142.txt. There are 1877 epitomes.

Section file

It is a plain-text file in record-jar format: a series of records separated by %% at the beginning of a line, where each line within a record is a key-value pair. The first record is the head and the remaining records form the body, where each record is an align: aligned multilingual parallel segments (§10) and the corresponding notes; a full example.
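
A minimal sketch of how such a file could be read, assuming one key-value pair per line with no continuation lines and splitting each line on its first colon so that URIs in values stay intact. The function name `parse_section` is hypothetical and not part of any Lagarto tooling.

```python
def parse_section(text):
    """Split a section file into records; each record is a list of (key, value) pairs."""
    records, current = [], []
    for line in text.splitlines():
        if line.startswith("%%"):
            # A separator closes the current record; empty records are skipped.
            if current:
                records.append(current)
            current = []
        elif line.strip():
            # Split on the first colon only, so values may contain colons.
            key, _, value = line.partition(":")
            current.append((key.strip(), value.strip()))
    if current:
        records.append(current)
    return records


sample = """id:142
title:Observationes de arte grammatica
%%
t1:quib cuncta <lb/> fere
la:quibus cuncta fere
%%
"""
records = parse_section(sample)
head, body = dict(records[0]), records[1:]
```

Here `head` holds the first record as a dictionary and `body` the remaining records in file order. A real file such as 142.txt could be read first with `open("142.txt", encoding="utf-8").read()`.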

The rationale is to lower IT technical barriers: transcribers should focus on transcription and not worry about IT aspects, all of which are taken care of. A plain text editor is sufficient, and there is no need to install additional programs. When saving the file, select the character encoding UTF-8 without BOM; this is often the default, though some editors might not offer these options.

`head`

The first record in the file. It contains the header elements. Unused elements might be left blank or removed. Other elements might be added as needed. Note:

- elements might have several contents separated by "|"; these might be ordered or unordered
- the contents of related elements, such as `print` and `print-uri`, must be synchronised
- LC: two-character language code

Element name	Content	Example	Description
id	string	142	Epitome number
regb	Registrum B number	1386	Catálogo Concordado
regb-uri	id parameter in the query part of the Permalink	091E8E601D10651B670F9F0	Catálogo Concordado
status	valid status	FINAL	status values
date	YYMMDD[-HHMM]	240305	Timestamp of the last update
ustc	USTC number	689207	Universal Short Title Catalogue (USTC)
mei	URI MEI	https://data.cerl.org/mei/_search?query=Liber+decretorum+sive+Panormia	URI to the Material Evidence in Incunabula (MEI)
lic	CC BY-SA - Creative Commons Attribution-ShareAlike - see also http://lagarto.top/about		BY-SA is the preferred license; others might apply
ipr-head	name	John Doe	Intellectual Property Rights of the `head`
ipr-body	name1,nameN	John Doe	Intellectual Property Rights of the `body`. Singular includes plural: the owner of the transcription and translation, if the "sumista" (author) does not hold the IPR due to payment or otherwise.
curator	name1|nameN	John Doe	IT curator, ordered
sum-t1	name1|nameN	John Doe	Transcribers are called sumistas in honor of the original sumistas. Ordered. The first one is also the translator, unless otherwise indicated in a "sum-LC". They are also the IPR owner(s), unless otherwise indicated in `ipr-body`. It might be anonymous or empty.
title	string	Marie Virginis corona	Title of book
title-uri	URI	http://example.com	URI to the book
inc	string	Signum magnum	included; additional works included in the book
lang	LC	la	Language; main language of the book
materia	string	184	Reference to the Libro de las Materias, the number in a box next to some epitome number
author	name1|nameN	Georgius Simler Vuimpinensis	Author(s) of the book
author-uri	URI1|URIn	https://www.deutsche-biographie.de/pnd119832216.html	URI for the author(s)
tran	name1|nameN	Lorenzo Valla	Translator(s)
tran-uri	URI1|URIn	https://en.wikipedia.org/wiki/Lorenzo_Valla	URI for the translator(s)
print	name1|nameN	Iodocus Badius	Printer(s)/editor(s)
print-uri	URI1|URIn	https://en.wikipedia.org/wiki/Jodocus_Badius	URI for the printer(s)/editor(s)
per	name1|nameN	Bonifacii de Ceva	Any other person related to the book
per-uri	URI	http://example.com	URI for the person(s)
note-head	string	Lorem ipsum dolor sit amet	Header notes
c-pfrom	number	9	From page number in the Copenhagen manuscript (LE-C) - empty means not in LE-C
c-pto	number	9	To page number in the digitalised LE-C
c-rfrom	rvreference	3r	From page reference in recto-verso notation in digitalised LE-C
c-ro	rvreference	3r	To page reference in recto-verso notation in digitalised LE-C
c-size	number	0.25	Size of the epitome in number of pages, mostly calculated programmatically - 0 means less than a page
s-pfrom	string		Not empty means in the Sevilla manuscript (LE-S); more related fields might be added later
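
The notes on multi-valued and related elements can be illustrated with a short sketch. It assumes that "synchronised" means a name element and its URI element carry the same number of "|"-separated entries in the same order; that reading, the helper names, and the `SYNCED_PAIRS` list are assumptions made for illustration only.

```python
# Hypothetical list of related element pairs taken from the table above.
SYNCED_PAIRS = (
    ("author", "author-uri"),
    ("tran", "tran-uri"),
    ("print", "print-uri"),
)


def split_values(value):
    """Split a multi-valued head element on "|", trimming whitespace."""
    if not value.strip():
        return []
    return [part.strip() for part in value.split("|")]


def unsynchronised(head):
    """Return the name elements whose URI element has a different number of entries."""
    problems = []
    for name_key, uri_key in SYNCED_PAIRS:
        names = split_values(head.get(name_key, ""))
        uris = split_values(head.get(uri_key, ""))
        # An absent or blank URI element is allowed; only mismatched counts are flagged.
        if uris and len(names) != len(uris):
            problems.append(name_key)
    return problems


head = {
    "print": "Iodocus Badius",
    "print-uri": "https://en.wikipedia.org/wiki/Jodocus_Badius",
    "author": "Name One|Name Two",      # placeholder names
    "author-uri": "http://example.com",  # only one URI for two authors
}
mismatches = unsynchronised(head)
```

In this example `mismatches` flags `author`, because two names are paired with a single URI, while `print` and `print-uri` are consistent.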

`body`

The rest of the records are aligns: aligned multilingual parallel segments that should be as small as possible. Two-character keys are reserved for segments containing transcription and language data. t1 is "translated" into the clean version of the same language; as most of the texts are in Medieval Latin with abbreviations, they are transformed into Classical Latin, la.

Element name	Example	Description
t0	Lorem ipsum	consolidated text of sources t1-tN; it might be absent; if there is only one source, it might be used for the cleaned-up text; if present, all LC translations are made from this text
t1	Lorem ipsum	text of source 1
tN	Lorem ipsum	text of source N (1, 2, 3...); there might be only one source
LC	Lorem ipsum	translation into language LC; this element might be repeated with different LC, for example `la en`
note-t0	Lorem ipsum	notes for t0
note-tN	Lorem ipsum	notes for tN
note-LC	Lorem ipsum	notes for language LC
note-seg	Lorem ipsum	segment notes
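
The key families in the table above can be told apart with a small classifier. This is a sketch under stated assumptions: source keys are taken to be `t` followed by digits, language codes to be two lowercase letters, and the function name `classify` and its return labels are hypothetical.

```python
import re

SOURCE_KEY = re.compile(r"t\d+")        # t0, t1, ... tN
LANGUAGE_KEY = re.compile(r"[a-z]{2}")  # two-character language code (LC)


def classify(key):
    """Return "source", "translation", "note", or "unknown" for a body key."""
    if key.startswith("note-"):
        target = key[len("note-"):]
        if (target == "seg"
                or SOURCE_KEY.fullmatch(target)
                or LANGUAGE_KEY.fullmatch(target)):
            return "note"
        return "unknown"
    # Source keys are checked first so that t0 and t1 are not read as language codes.
    if SOURCE_KEY.fullmatch(key):
        return "source"
    if LANGUAGE_KEY.fullmatch(key):
        return "translation"
    return "unknown"


kinds = {key: classify(key) for key in ("t1", "la", "es", "note-es", "note-seg", "title")}
```

Checking source keys before language codes matters because `t0` to `t9` are also two characters long.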