<Hebrew translator>
--
--
Hi, Have you seen the ODT spec? :-)
On 10/19/2012 01:10 AM, E L wrote:
Hi,
I started reading it; IMHO this file format is scary.
I don't see any non-expert being able to understand what's going on there,
which reduces the number of people who will be able to program for this file format.
Seriously, though -- comments accepted, but know that the spec on the wiki is more complex than the current one.
It's actually quite a bit simpler than what's in that doc, and it's relatively easy to digest simple documents. I'm attaching 3 example documents from Tanach:
1. A prose text
2. A poetic text
3. A book
(The split into chapters is because the original source - the WLC - is like that, and, while it's not ideal, it is somewhat standard.)
Actually, it doesn't. In the current incarnation, the names are arbitrary -- they're filenames. A reference either takes the form filename#id or filename#range(id1,id2). (The @cref portion has been removed because it's more difficult to implement).
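A minimal sketch of how the two reference forms described above (filename#id and filename#range(id1,id2)) might be parsed. The names `Reference` and `parse_reference`, and the sample filenames and ids in the usage note, are hypothetical and not part of the spec; the sketch also assumes ids contain no commas or parentheses.

```python
import re
from typing import NamedTuple, Optional


class Reference(NamedTuple):
    """Hypothetical container for a parsed reference."""
    filename: str
    start_id: str
    end_id: Optional[str]  # None when the reference points at a single id


# Matches the fragment of a range reference: range(id1,id2)
_RANGE = re.compile(r"^range\(([^,()]+),([^,()]+)\)$")


def parse_reference(ref: str) -> Reference:
    """Split 'filename#id' or 'filename#range(id1,id2)' into its parts."""
    filename, sep, fragment = ref.partition("#")
    if not sep or not filename or not fragment:
        raise ValueError(f"not a filename#fragment reference: {ref!r}")
    match = _RANGE.match(fragment)
    if match:
        return Reference(filename, match.group(1).strip(), match.group(2).strip())
    return Reference(filename, fragment, None)
```

For example, `parse_reference("genesis.xml#range(v1,v5)")` would give a start id of `v1` and an end id of `v5`, while `parse_reference("genesis.xml#v1")` leaves `end_id` as `None`.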
I have a lot of comments about different parts; for example, the reference to the Bible uses non-Jewish names and splits into chapters.
Actually, the references are completely generic. As long as there's a unique id, there's the ability to reference it.
Do you really want to use it? The citation allows you to cite whole psukim, but, for example,
in the Gemara and Rashi it's really important to know which words they chose to quote.
RelaxNG is a schema language. EPUB is essentially a formatting language (a limited version of HTML for ebooks).
The problem is that it's hard to comment on the whole thing. Maybe we should make a cross-project file format
that can be discussed in parts? Maybe people have different ideas, such as basing it on the RelaxNG or EPUB formats.
I would welcome discussing everything in parts (and I've been waiting for someone to want to discuss this format issue for a while!). I am certainly open to accepting an in-field de-facto standard over my concoction, but it should support the features that will be needed.
I would strongly recommend reading some parts of the TEI Guidelines at <http://www.tei-c.org>. A lot of the problems you'll encounter trying to work on book formats are solved there.
--
---
Efraim Feinstein
Lead Developer
Open Siddur Project
http://opensiddur.net
http://wiki.jewishliturgy.org
I see. Is there a way to do things like aazinu and shirat hayam?
Actually, the references are completely generic. As long as there's a unique id, there's the ability to reference it.
To be able to use something like that in a program in an efficient way, one will need to index it or even put it in a database.
We should consider both storing/working on the information and how it should be used from within programs.
For example, on a cell phone it's important that the texts be shared.
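As a rough illustration of the indexing point, here is a minimal sketch that builds an in-memory lookup from unique id to element. It assumes ids are carried in `xml:id` attributes, as TEI does; the element names and the sample document are made up for the example, and `build_id_index` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the XML namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def build_id_index(xml_text: str) -> dict:
    """Map every xml:id in the document to its element."""
    root = ET.fromstring(xml_text)
    index = {}
    for element in root.iter():
        element_id = element.get(XML_ID)
        if element_id is None:
            continue
        if element_id in index:
            raise ValueError(f"duplicate id: {element_id}")
        index[element_id] = element
    return index


# Made-up sample document, not taken from the attached examples.
SAMPLE = (
    '<text>'
    '<seg xml:id="v1">first verse</seg>'
    '<seg xml:id="v2">second verse</seg>'
    '</text>'
)

index = build_id_index(SAMPLE)
```

With the index built once, resolving a reference such as filename#v2 becomes a dictionary lookup (`index["v2"]`) instead of a scan of the whole file. A shared on-disk database would follow the same idea.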
RelaxNG is a schema language. EPUB is essentially a formatting language (a limited version of HTML for ebooks).
They both support separation between information and presentation, and both can use XML for extensions.
But I think what is important is to first set the design goals:
- Simple to use (by everyone, to the point where people can write their dvar Torah in that format and use references and so on when showing it to others).
- Can be used in distributed editing projects.
- Jewish-oriented. I know it's weird to set that as a goal, but if you are interested I can elaborate on why it's important.
- Supports multiple versions of the same book (e.g. you can reference a specific version, see differences easily, etc.).
- Supports references and quotes efficiently.
- Supports search and indexing.
I also think that keeping things as they are now should not be a design goal. For example, I see no reason to split the Gemara
by pages from 300 years ago when allowing people to reference it, or to keep any other odd decision like that.
The Bible, IMHO, should be split by Jewish books, and inside each book it should be split by parshiyot ptuchot (this is the way it was given from Sinai or by ruach hakodesh).
--
Hi, Let's discuss specifics!
On 10/26/2012 03:21 AM, E L wrote:
Hi,
I read a bit about TEI and the extra markup you suggested.
There are a lot of small things that I think we should add or do differently, but in general
It is a bit complicated, but besides the basic books the rest should be simpler to handle.
I think we could use one of those metadata wiki projects to integrate it with Wikibooks.
I'm not sure which project you're talking about? Link?
The format I'm suggesting is an *archival* format, suitable for a database. The advantage of using a standardized and well-specified archival format is that the output format could then take advantage of device-specific issues/features instead of cramming information about every device's preferred output into the archive. Separation of concerns is one of the fundamental principles of well-designed XML.
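To make the separation of concerns concrete, here is a minimal sketch in which one archival document is rendered two ways without the archive knowing anything about either output. The `chapter` and `verse` element names, the `n` attribute, and both renderer functions are hypothetical and are not taken from the actual format.

```python
import html
import xml.etree.ElementTree as ET

# Made-up archival fragment: content only, no presentation details.
ARCHIVE = (
    "<chapter>"
    "<verse n='1'>first verse</verse>"
    "<verse n='2'>second verse</verse>"
    "</chapter>"
)


def render_plain(root: ET.Element) -> str:
    """Plain-text output, e.g. for a small screen or a terminal."""
    return "\n".join(
        f"{verse.get('n')}. {verse.text or ''}" for verse in root.iter("verse")
    )


def render_html(root: ET.Element) -> str:
    """HTML output; all markup decisions live here, not in the archive."""
    items = "".join(
        f'<li value="{html.escape(verse.get("n"))}">{html.escape(verse.text or "")}</li>'
        for verse in root.iter("verse")
    )
    return f"<ol>{items}</ol>"


root = ET.fromstring(ARCHIVE)
plain_output = render_plain(root)
html_output = render_html(root)
```

Adding a new device or output format then means adding another renderer, while the archived text stays untouched.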
I do think that we should also think about ways to use it efficiently, especially on mobile devices
such as tablets and phones. If applications could use the same database of books and the same
library for rendering the information, it could spur a lot more mobile applications.
On Fri, Oct 26, 2012 at 10:47 PM, Efraim Feinstein <efraim.f...@gmail.com> wrote:
Hi, Let's discuss specifics!
Sure, but where do you start with such a big format?:)
I agree. Separation is very important. But since we are a very small community, we need to provide tools
for everything, from viewing to editing and anything else, or no one will ever use that format.
--
On 11/02/2012 05:01 AM, E L wrote:
I attached a few chapters of Tanach to a previous email.
Almost a week later, and my head hurts from reading the TEI manual ;)
Can we start in a simpler way? Can you show an example of a simple book?
Like, let's say, a chapter from the Rambam or something?
The TEI manual is actually one of the better-written specs I've read :-)
On Fri, Nov 2, 2012 at 6:15 PM, Efraim Feinstein <efraim.f...@gmail.com> wrote:
On 11/02/2012 05:01 AM, E L wrote:
I attached a few chapters of Tanach to a previous email.
It had tags on every word. Is that the usual use case of TEI? I assumed that the Tanach is special in some sense.
The TEI manual is actually one of the better-written specs I've read :-)
Yes, and still very complicated for us mortals :)
Note that no open-content Hebrew project besides Open Siddur is using that spec.
And since it's a good spec in general, it means that something is scaring people away.
I think that if I go over the spec on the mailing list, no one besides you will bother reading it.
If we want to get more people to support and participate, we need to somehow simplify it.