Does anyone know of an example OSIS file that fully uses its capabilities? It wouldn't need to be a whole Bible; even a single book would do, as long as it goes beyond unformatted chapters and verses and shows headers, paragraphs, poetry, etc.
CrossWire publishes the OSIS XML file of its KJV2006.
As far as OSIS capabilities go, this is a pretty good one. It uses
extensive markup around words (Strong's numbers, morphology, lemmata, etc.). But much more
is possible: it has no paragraphing, for example.
There is, though, an example OSIS file on the CrossWire wiki.
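To give a rough idea of what "more than chapters and verses" can look like, here is a minimal sketch of an OSIS fragment that combines a heading, a paragraph, poetry lines and word-level markup, loaded with Python's standard library. The text, the work name and the lemma value are illustrative assumptions, and the fragment has not been validated against the OSIS schema; it is not taken from the KJV2006 file or from the wiki example.

```python
import xml.etree.ElementTree as ET
from collections import Counter

OSIS_NS = "http://www.bibletechnologies.net/2003/OSIS/namespace"

# Illustrative fragment only: the content and attribute values are assumptions.
SAMPLE = f"""<osis xmlns="{OSIS_NS}">
  <osisText osisIDWork="Sample" osisRefWork="Bible" xml:lang="en">
    <div type="book" osisID="Ps">
      <chapter osisID="Ps.1">
        <title>Illustrative heading</title>
        <p><verse osisID="Ps.1.1"><w lemma="strong:H0835">Blessed</w> is the man</verse></p>
        <lg>
          <l>a first poetic line</l>
          <l>a second poetic line</l>
        </lg>
      </chapter>
    </div>
  </osisText>
</osis>"""

def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.split("}", 1)[-1]

root = ET.fromstring(SAMPLE)
# Count how often each element type occurs in the fragment.
counts = Counter(local_name(elem.tag) for elem in root.iter())

for name in ("title", "p", "lg", "l", "verse", "w"):
    print(name, counts[name])
```

Counting element names like this is also a quick way to see which features a given OSIS file actually uses.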
The first thing that comes to mind are the "original" OSIS examples:
http://www.bibletechnologies.net./osistext/
I haven't looked at these in years, but they were intended to be examples of the markup.
And I'm attaching my take on the book of Ephesians (EphOsis.xml), to give you another point of view. The nice thing about OSIS is that it allows a fair amount of latitude. This also means it requires some design decisions.
> The nice thing about OSIS is that it allows a fair amount of latitude.
> This also means it requires some design decisions.
As an aside, I personally don't see why that is nice. It makes it
extremely difficult to develop an application that can consume an
arbitrary OSIS XML document, because every possible variation has to be
accounted for. I feel there should be a strict subset of the current OSIS
schema that specifies one and only one way to mark up a text.
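One concrete instance of this variation: OSIS lets a verse be written either as a container element or as a pair of milestones (`sID` / `eID`), and a consumer has to cope with both. The sketch below is only an illustration under simplifying assumptions (no namespace, no nested markup inside the container form, hypothetical function name `verse_texts`); it is not how any existing tool does it.

```python
import xml.etree.ElementTree as ET

# The same verse, marked up in two ways that OSIS permits.
CONTAINER = '<div><verse osisID="Eph.1.1">Paul, an apostle</verse></div>'
MILESTONE = ('<div><verse sID="Eph.1.1" osisID="Eph.1.1"/>Paul, an apostle'
             '<verse eID="Eph.1.1"/></div>')

def verse_texts(xml_string):
    """Return {osisID: text} for container and milestone verses alike."""
    root = ET.fromstring(xml_string)
    verses = {}
    current = None  # ID of the milestone verse that is currently open

    def walk(elem):
        nonlocal current
        if elem.tag == "verse" and "sID" in elem.attrib:
            current = elem.attrib["sID"]      # start milestone opens a verse
            verses[current] = ""
        elif elem.tag == "verse" and "eID" in elem.attrib:
            current = None                    # end milestone closes it
        elif elem.tag == "verse":
            # Container form: all text inside the element is the verse.
            verses[elem.attrib["osisID"]] = "".join(elem.itertext())
        else:
            if current is not None and elem.text:
                verses[current] += elem.text
            for child in elem:
                walk(child)
                # Text following a child belongs to the open milestone verse.
                if current is not None and child.tail:
                    verses[current] += child.tail

    walk(root)
    return verses

print(verse_texts(CONTAINER))
print(verse_texts(MILESTONE))
```

Both inputs yield the same dictionary, but only because the consumer carries two code paths. Every further permitted variation adds another one, which is the cost described above.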
> Peace,
> David
> On 5/11/2012 4:53 AM, Russell Allen wrote:
>> Hi guys,
>> Does anyone know of an example OSIS file that fully uses its
>> capabilities? It wouldn't need to be a whole Bible, even just a single
>> book, but something which is more than just unformatted chapters and verses
>> but shows headers, paragraphs, poetry etc...
>> Cheers, Russell
> --
> You received this message because you are subscribed to the Google Groups "Open Scriptures" group.
> To post to this group, send email to openscriptures@googlegroups.com.
> To unsubscribe from this group, send email to openscriptures+unsubscribe@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/openscriptures?hl=en.
On Sat, 2012-05-12 at 12:38 +1000, Russell Allen wrote:
> Thanks David!
> At the moment I'm using USFM, and I have some homemade Python which does USFM -> HTML, PDF, text etc. transformations.
> I'm looking into redoing my workflow to use OSIS, primarily to make verification of the files easier.
> Are there existing transform tools? I looked around, but OSIS seems to exist in a vacuum: there is a spec, but that's it.
> Or should I be sticking with USFM?
> Cheers, Russell
> On 12/05/2012, at 7:35 AM, David Troidl wrote:
> > Hi Russell,
> > The first thing that comes to mind are the "original" OSIS examples:
> > http://www.bibletechnologies.net./osistext/
> > I haven't looked at these in years, but they were intended to be examples of the markup.
> > And I'm attaching my take on the book of Ephesians, to give you another point of view. The nice thing about OSIS is that it allows a fair amount of latitude. This also means it requires some design decisions.
> > Peace,
> > David
Is OSIS sufficiently specified that I can be reasonably sure that if I am in compliant OSIS, your tools will work on it? Or do they work best on OSIS from usfm2osis.pl?
Last time I tried usfm2osis.pl it didn't work on my USFM perfectly, but if I did a one-off conversion then I could either hand fix the result or write my own USFM->OSIS converter (which wouldn't be hard as I've already got the backend).
Russell
On 12/05/2012, at 4:04 PM, Peter von Kaehne wrote:
> USFM is the most common format we receive at CrossWire from outside
> agencies.
> So we have a reasonable set of transform scripts, which are under active
> development: usfm2osis.pl, title_cleanup.pl and xreffix.pl.
> You can find them in our svn repos under sword-tools.
> Peter
As Weston writes, the arbitrariness of some OSIS constructs makes it
difficult to capture all permutations in software.
At CrossWire we have a loosely (un)defined way we prefer, allowing for a
fair amount of permutation and variation, but not all.
usfm2osis.pl is an incomplete tool that is still growing. This means:
a) some specific aspects are left out on purpose and are handled in
other scripts;
b) it captures what we encounter, and if we encounter something else it
gets added. So if you use USFM tags correctly (as
defined by the USFM handbook from UBS) and usfm2osis.pl does not deal
with them, we will happily accept patches or (if your work with it is
of a sustained and useful nature) permit you to commit directly.
We have tried to keep usfm2osis.pl multiplatform and not dependent on
someone having the Sword Perl bindings. Hence some aspects are not dealt with
properly, mainly cross-references. USFM allows any kind of reference,
in whatever language, but OSIS references are well-defined. So
To achieve that you need to parse the original reference, allowing both
for localised Bible book names and for a huge variety of localised ways
of marking references, including ranges, lists etc. Note the comma in
the German reference; in an English one it would likely be a colon.
libsword can do this mostly without hiccups, but to incorporate libsword
into a Perl script you need to have functioning Sword Perl bindings. So
we left this out of usfm2osis.pl. There are other aspects.
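To make the reference problem a little more concrete, here is a minimal sketch in Python of turning a localised reference into an OSIS-style one. It is not how usfm2osis.pl, xreffix.pl or libsword work; the tiny book table, the regular expression, the function name `to_osis_ref` and the sample references are all assumptions made for illustration, and it handles only a single verse or a simple verse range within one chapter.

```python
import re

# Hypothetical, deliberately tiny table: localised book name -> OSIS book ID.
BOOKS = {
    "genesis": "Gen", "1. mose": "Gen",
    "ephesians": "Eph", "epheser": "Eph",
    "john": "John", "johannes": "John",
}

# Book name, chapter, a ':' (English) or ',' (German) separator, verse,
# and an optional '-verse' range end.
REF = re.compile(
    r"^\s*(?P<book>.+?)\s+(?P<ch>\d+)\s*[:,]\s*(?P<v1>\d+)"
    r"(?:\s*-\s*(?P<v2>\d+))?\s*$"
)

def to_osis_ref(text):
    """Convert e.g. '1. Mose 3,5' or 'Genesis 3:5' to 'Gen.3.5'."""
    match = REF.match(text)
    if match is None:
        raise ValueError(f"unrecognised reference: {text!r}")
    book = BOOKS.get(match.group("book").lower())
    if book is None:
        raise ValueError(f"unknown book name: {match.group('book')!r}")
    start = f"{book}.{match.group('ch')}.{match.group('v1')}"
    if match.group("v2"):
        # A range is written as two full references joined by a hyphen.
        return f"{start}-{book}.{match.group('ch')}.{match.group('v2')}"
    return start

print(to_osis_ref("1. Mose 3,5"))
print(to_osis_ref("Genesis 3:5"))
print(to_osis_ref("Epheser 1,3-14"))
```

Even this toy version needs a per-language book table and separator rules. Lists, cross-chapter ranges and abbreviations multiply the cases quickly, which is why a full parser such as the one in libsword is needed for real texts.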
title_cleanup.pl creates titles which conform specifically to what our
frontends expect.
But in summary, I can say we fix things as we go along, as we receive
USFM texts that we need to deal with but cannot handle with the existing tool.
USFM is so huge and of such variable quality that it would be a gigantic
undertaking to really capture everything in one go.
Throwing in my $0.02 from an academic perspective. There's a philosophical
question about markup here. My experience is more with TEI than OSIS, so
the goals of OSIS may (probably do) diverge a bit here, but these systems
both address the same general needs: rich markup of texts.
I'd argue that markup allows an editor to describe the structure of a text
that he wants to communicate. Traditionally, that was done in printed
books with page layout, typesetting, etc. Embedded markup allows people to
be more precise, detailed, and explicit in what they describe. A markup
vocabulary (e.g., OSIS or TEI) provides a set of tools for describing a
range of features in many different texts according to a variety of
editorial perspectives and goals. For example, an editor might want to
describe the physical material a text was published on, the narrative
structure of the text, textual criticism, rhyming analysis in a poem,
literary allusions, etc. On this view, there is no sense in which a single
text can illustrate all possible features of a markup vocabulary.
As for Weston's concerns, from TEI's perspective, this latitude is nice in
that it allows editors the freedom to represent information in a way that
is best suited to their needs. The downside, of course, is that you can't
write software that supports "TEI documents"; you have to write tools to
support certain features of TEI documents or documents created as part of a
particular encoding project. TEI has gone to great lengths to avoid calling
itself a standard. Instead, the TEI consortium provides a vocabulary for
encoding documents, a set of guidelines for best practice, and a community
of practice.
By focusing on editorial freedom, TEI excludes the possibility of writing
one-size-fits-all software: you simply can't (er. . . shouldn't) write a
program to handle all possible interesting features of all possible texts
for all possible audiences. Alternatively, you could build a system that
says "this is the way you have to edit your texts" and force editors to
conform to the limitations of your system.
This maximal-flexibility-for-editors approach is what has allowed TEI to
gain wide adoption within academic humanities circles, because it
side-steps all of the editorial debates that the academics find to be the
most fruitful ground for discussion. It can do (almost) everything and you
build custom software to represent your content. With XML, that software
might be as simple as an XSLT to process your document into something
suitable for use in an off-the-shelf application.
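As a rough illustration of how small such project-specific software can be, here is a sketch that rewrites a handful of OSIS-like elements into HTML. It uses Python's standard library as a stand-in for XSLT (the standard library has no XSLT processor), and the tag mapping, sample text and function name `to_html` are assumptions chosen for the example, with namespaces left out for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from element names to HTML tags; anything not listed
# becomes a <span> that carries the original element name as its class.
TAG_MAP = {"div": "div", "title": "h2", "p": "p", "lg": "blockquote", "l": "div"}

SAMPLE = """<div type="section">
<title>Praise</title>
<p>Blessed be <hi type="italic">God</hi>.</p>
<lg><l>First line</l><l>Second line</l></lg>
</div>"""

def to_html(elem):
    """Recursively copy the tree, renaming elements according to TAG_MAP."""
    html_tag = TAG_MAP.get(elem.tag)
    if html_tag is None:
        out = ET.Element("span", {"class": elem.tag})
    else:
        out = ET.Element(html_tag)
    out.text = elem.text
    out.tail = elem.tail
    for child in elem:
        out.append(to_html(child))
    return out

html = ET.tostring(to_html(ET.fromstring(SAMPLE)), encoding="unicode")
print(html)
```

The mapping table is where the editorial decisions of one particular encoding project end up, which is the division of labour argued for above.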
Bottom line: my take is that document markup systems should emphasize
editorial freedom (that may be constrained by convention) and place the
burden of interpreting documents on the shoulders of software developers
because we value the intellectual contribution that a skilled editor brings
to the table.
Neal
On Sat, May 12, 2012 at 5:09 AM, Peter von Kaehne <ref...@gmx.net> wrote:
> As Weston writes, the arbitrariness of some OSIS constructs makes it
> difficult to capture all permutations in software.
> Peter
> On 12/05/12 09:15, Russell Allen wrote:
> > That's interesting, thanks.
> > Is OSIS sufficiently specified that I can be reasonably sure that if I am in compliant OSIS, your tools will work on it? Or do they work best on OSIS from usfm2osis.pl?
> > Last time I tried usfm2osis.pl it didn't work on my USFM perfectly, but if I did a one-off conversion then I could either hand fix the result or write my own USFM->OSIS converter (which wouldn't be hard as I've already got the backend).
> > Russell
> At the moment I'm using usfm and I have some homemade python which does usfm->html, pdf,text etc transformations.
> I'm looking into redoing my workflow to use osis, primarily to make verification of the files easier.
> Are there existing transform tools? I looked around but osis seems to exist in a vacuum - there is a spec, but that's it.
Looking around, I found:
http://code.google.com/p/osis-converters/wiki/Compatibility
Crosswire.org used to have a whole bunch of information on OSIS and a tool for USFM to OSIS conversion. Right now all Google links to their wiki seem to fail. Maybe someone from CrossWire could shed some light on this.
> Or should I be sticking with usfm?
There has been some discussion on the developer list about this.
Apparently it is easier to program an editor to write USFM, whereas OSIS has all the advantages of XML for parsing and validation. Personally, I just spent significant time and effort getting my translations from ODF files to OSIS, and I am very happy with the results. I also developed an XSL-FO transformation, so I can still get PDF output when I need it.
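As a small illustration of the verification side, here is a sketch of the kind of check that becomes easy once the text is XML. It only tests well-formedness plus two home-made rules (verse IDs look like `Book.chapter.verse` and are not repeated); the ID pattern, the function name `check` and the sample documents are simplifying assumptions. Real validation would run the file against the OSIS schema with a validating parser, which Python's standard library does not include.

```python
import re
import xml.etree.ElementTree as ET

# Simplified assumption about what a verse osisID looks like, e.g. Eph.1.3
OSIS_ID = re.compile(r"^[0-9A-Za-z]+\.\d+\.\d+$")

def check(xml_string):
    """Return a list of human-readable problems (empty if none were found)."""
    try:
        root = ET.fromstring(xml_string)
    except ET.ParseError as err:
        return [f"not well-formed: {err}"]
    problems = []
    seen = set()
    for verse in root.iter("verse"):
        osis_id = verse.get("osisID")
        if osis_id is None:
            continue  # e.g. an end milestone, which carries only eID
        if not OSIS_ID.match(osis_id):
            problems.append(f"odd osisID: {osis_id}")
        if osis_id in seen:
            problems.append(f"duplicate osisID: {osis_id}")
        seen.add(osis_id)
    return problems

GOOD = '<div><verse osisID="Eph.1.1">a</verse><verse osisID="Eph.1.2">b</verse></div>'
DUPLICATE = '<div><verse osisID="Eph.1.1">a</verse><verse osisID="Eph.1.1">b</verse></div>'
BROKEN = '<div><verse osisID="Eph.1.1">a</div>'

for doc in (GOOD, DUPLICATE, BROKEN):
    print(check(doc))
```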
I agree with Neal, though I see the problems that it introduces for
software developers. The other day I started digitizing
Abbott-Smith's Manual Greek Lexicon of the New Testament (https://github.com/dowens76/Abbott-Smith),
and I have found TEI (with some help from OSIS, according to the
schema defined by CrossWire.org) to be just the right tool for that
job. There are some limitations, but for the most part it allows me
to create rich markup. This satisfies the editorial interest in rich
markup.
To deal with the lack of support within SWORD for certain far-flung
features of TEI (i.e., the software developer challenge), I will
probably make some changes to simplify the text when moving to
create a SWORD module, perhaps using XSLT. I can leave the base text
in rich TEI but then simplify for the sake of function.
Daniel
On 05/12/2012 08:45 AM, Neal Audenaert wrote:
Throwing in my $0.02 from an academic perspective.
> Crosswire.org used to have a whole bunch of information on OSIS and a
> tool for USFM to OSIS conversion. Right now all Google links to their
> Wiki seem to fail. Maybe someone from Crosswire could shed some light
> on this.
Our server had a catastrophic hard drive failure and is being
rebuilt.
svn is working, and so is ftp.
The wiki should be up and running today or tomorrow.
I agree with Neal, though I see the problems that it introduces for
software developers. The other day I started digitizing
Abbott-Smith's Manual Greek Lexicon of the New Testament
(https://github.com/dowens76/Abbott-Smith),
and I have found TEI (with some help from OSIS, according to the
schema defined by CrossWire.org) to be just the right tool for that
job. There are some limitations, but for the most part it allows me
to create rich markup. I will probably make some changes to simplify
the text when moving to create a SWORD module, perhaps using XSLT.
Daniel
On 05/12/2012 08:45 AM, Neal Audenaert wrote:
Throwing in my $0.02 from an academic perspective.
There's a philosophical question about markup here. My experience
is more with TEI than OSIS, so the goals of OSIS may (probably do)
diverge a bit here, but these systems both address the same
general needs: rich markup of texts.
I'd argue that markup allows an editor to describe the
structure of a text that he wants to communicate. Traditionally,
that was done in printed books with page layout,
typesetting, etc. Embedded markup allows people to be more
precise, detailed, and explicit in what they describe. A markup
vocabulary (e.g., OSIS or TEI) provides a set of tools for
describing a range of features in many different texts according
to a variety of editorial perspectives and goals. For example,
an editor might want to describe the physical material a text
was published on, the narrative structure of the text, textual
criticism, rhyming analysis in a poem, literary allusions,
etc. On this view, there is no sense in which a single text can
illustrate all possible features of markup vocabulary.
As for Weston's concerns, from TEI's perspective, this
latitude is nice in that it allows editors the freedom to
represent information in a way that is best suited to their
needs. The downside of course, is that you can't write software
that supports "TEI documents", you have to write tools to
support certain features of TEI documents or documents created
as part of a particular encoding project. TEI has gone to great
lengths to avoid calling itself a standard. Instead, the TEI
consortium provides a vocabulary for encoding documents, a set
of guidelines for best practice, and a community of practice.
By focusing on editorial freedom, TEI excludes the
possibility of writing one-size-fits-all software: you simply
can't (er... shouldn't) write a program to handle all possible
interesting features of all possible texts for all possible
audiences. Alternatively, you could build a system that says
"this is the way you have to edit your texts" and force editors
to conform to the limitations of your system.
This approach of maximal flexibility for editors is what has
allowed TEI to gain wide adoption within academic humanities
circles, because it side-steps all of the editorial debates that
the academics find to be the most fruitful ground for
discussion. It can do (almost) everything and you build custom
software to represent your content. With XML, that software
might be as simple as an XSLT to process your document into
something suitable for use in an off the shelf application.
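As a rough illustration of how small such project-specific software can be, here is a sketch in Python rather than XSLT, using only the standard library's ElementTree. The sample fragment is a simplified assumption: it uses OSIS-style element names (div, title, p, verse with an osisID) but omits the OSIS namespace and header, uses container-style verses, and carries placeholder text. The to_html function is hypothetical and handles only these four elements.

```python
import xml.etree.ElementTree as ET
from html import escape

# Simplified OSIS-like fragment (no namespace, no header, placeholder text).
SAMPLE = """<div type="book" osisID="Eph">
  <title>Sample Book</title>
  <p><verse osisID="Eph.1.1">First verse.</verse>
     <verse osisID="Eph.1.2">Second verse.</verse></p>
</div>"""


def to_html(elem: ET.Element) -> str:
    """Render the handful of elements this sketch knows about as HTML."""
    if elem.tag == "title":
        return f"<h2>{escape(elem.text or '')}</h2>"
    if elem.tag == "verse":
        # The verse number is the last dot-separated part of the osisID.
        number = elem.get("osisID", "").rsplit(".", 1)[-1]
        return f"<sup>{escape(number)}</sup>{escape(elem.text or '')}"
    children = " ".join(to_html(child) for child in elem)
    if elem.tag == "p":
        return f"<p>{children}</p>"
    # Any other container (such as div) simply passes its children through.
    return children


print(to_html(ET.fromstring(SAMPLE)))
```

The point is the one made above: the code only understands the features this particular encoding uses, and a document marked up differently (milestone verses, for example) would need a different transform.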
Bottom line: my take is that document markup systems
should emphasize editorial freedom (that may be constrained by
convention) and place the burden of interpreting documents on
the shoulders of software developers because we value the
intellectual contribution that a skilled editor brings to the
table.
Neal
Supplementary question: is there an existing tool to take a Bible in either usfm or OSIS and change the chapter/verse structure from NRSV to the standard Jewish system (eg JPS Tanach)? Or can I mark up multiple numbering systems in a single OSIS file?
> On 5/11/2012 10:38 PM, Russell Allen wrote:
>> Thanks David!
>> At the moment I'm using usfm and I have some homemade python which does usfm->html, pdf,text etc transformations.
>> I'm looking into redoing my workflow to use osis, primarily to make verification of the files easier.
>> Are there existing transform tools? I looked around but osis seems to exist in a vacuum - there is a spec, but that's it.
> Looking around, I found:
> http://code.google.com/p/osis-converters/wiki/Compatibility
> Crosswire.org used to have a whole bunch of information on OSIS and a tool for USFM to OSIS conversion. Right now all Google links to their Wiki seem to fail. Maybe someone from Crosswire could shed some light on this.
>> Or should I be sticking with usfm?
> There has been some discussion on the developer list about this. Apparently it is easier to program an editor to write USFM, whereas OSIS has all the advantages of XML, for parsing and validation. Personally, I just spent significant time and effort getting my translations from ODF files to OSIS. And I am very happy with the results. I also developed an XSL-FO transformation, so I can still get PDF output, when I need it.
> Peace,
> David
>> Cheers, Russell
>> On 12/05/2012, at 7:35 AM, David Troidl wrote:
>>> Hi Russell,
>>> The first thing that comes to mind are the "original" OSIS examples:
>>> http://www.bibletechnologies.net./osistext/
>>> I haven't looked at these in years, but they were intended to be examples of the markup.
>>> And I'm attaching my take on the book of Ephesians, to give you another point of view. The nice thing about OSIS is that it allows a fair amount of latitude. This also means it requires some design decisions.
>>> Peace,
>>> David
>>> On 5/11/2012 4:53 AM, Russell Allen wrote:
>>>> Hi guys,
>>>> Does anyone know of an example OSIS file that fully uses its capabilities? It wouldn't need to be a whole Bible, even just a single book, but something which is more than just unformatted chapters and verses but shows headers, paragraphs, poetry etc...
>>>> Cheers, Russell
>>> --
>>> You received this message because you are subscribed to the Google Groups "Open Scriptures" group.
>>> To post to this group, send email to openscriptures@googlegroups.com.
>>> To unsubscribe from this group, send email to openscriptures+unsubscribe@googlegroups.com.
>>> For more options, visit this group at http://groups.google.com/group/openscriptures?hl=en.
>>> <EphOsis.xml>
It is possible to mark up multiple numbering systems, see the OSIS manual, but not recommended. In the WLC:
https://github.com/openscriptures/morphhb/downloads
I used the Hebrew system, and just added notes where the English system differs.
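For anyone wanting to script the renumbering themselves, a minimal sketch in Python might look like the following. It is not an existing tool and not taken from the WLC files. The table is only an illustrative excerpt covering two well-known differences (Joel and Malachi), and the names ENGLISH_TO_HEBREW and english_to_hebrew are made up for the example; a usable version would need a complete mapping, including the Psalm titles.

```python
# Illustrative excerpt only: English (e.g. NRSV) to Hebrew verse numbering.
# Each row: (OSIS book, English chapter, first verse, last verse,
#            Hebrew chapter, Hebrew verse that the first verse maps to)
ENGLISH_TO_HEBREW = [
    ("Joel", 2, 28, 32, 3, 1),  # Joel 2:28-32 is Joel 3:1-5 in Hebrew
    ("Joel", 3, 1, 21, 4, 1),   # Joel 3 is Joel 4 in Hebrew
    ("Mal", 4, 1, 6, 3, 19),    # Malachi 4:1-6 is Malachi 3:19-24 in Hebrew
]


def english_to_hebrew(osis_id: str) -> str:
    """Map an osisID like 'Joel.2.28' to Hebrew numbering where it differs."""
    book, chapter_text, verse_text = osis_id.split(".")
    chapter, verse = int(chapter_text), int(verse_text)
    for row_book, eng_ch, first, last, heb_ch, heb_first in ENGLISH_TO_HEBREW:
        if row_book == book and eng_ch == chapter and first <= verse <= last:
            return f"{book}.{heb_ch}.{heb_first + verse - first}"
    # Verses not listed in the table are numbered the same in both systems.
    return osis_id


print(english_to_hebrew("Joel.2.28"))  # Joel.3.1
print(english_to_hebrew("Mal.4.6"))    # Mal.3.24
```

Keeping the differences in a table like this also makes it easy to generate the kind of notes mentioned above instead of renumbering the text.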
> It is possible to mark up multiple numbering systems, see the OSIS
> manual, but not recommended. In the WLC:
> https://github.com/openscriptures/morphhb/downloads
> I used the Hebrew system, and just added notes where the English system
> differs.
> I don't know of any tool that does this directly.
> Peace,
> David