Guidelines for transcribing/proofreading/marking punctuation

Aharon Varady

Nov 4, 2012, 7:42:16 PM
to open...@googlegroups.com, Open Siddur Technical Discussion List
I'd like to revisit our guidelines for transcribing punctuation.

Our goals are to:

1) record the semantic data preserved in all textual witnesses.
2) preserve the variation between texts insofar as the variation is actually semantic data.


I'm going to assume that preserving punctuation differences is important -- both their presence and their absence. This, of course, doesn't make our task of transcription and proofreading any easier. Nevertheless, if punctuation is semantic data, then differences here are important. But perhaps some questions would be helpful in thinking about how to author guidelines for folks proofreading and tagging transcriptions with semantic markup.

While I'm transcribing texts, I'm wondering whether the absence of pauses or breaks in the printing of a psalm is semantic data. I'm wondering whether a period is synonymous with a comma in some texts, and whether a colon is synonymous with a line break in others.

Text by text, it might be good to ask:

What does a comma signify?
What does a period signify?
What does a colon signify?
What other punctuation is used and what does it signify?
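
The answers to these questions could be recorded per text and then compared. Here is a minimal sketch in Python, assuming a simple mark-to-role table for each witness. The class name, the function name, the role labels, and the two sample editions are all hypothetical placeholders, not data from any real edition.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class PunctuationLegend:
    """What each punctuation mark signifies in one textual witness."""
    witness: str
    roles: Dict[str, str] = field(default_factory=dict)

    def role_of(self, mark: str) -> str:
        # A mark with no recorded role is flagged rather than guessed at.
        return self.roles.get(mark, "undocumented")


def differing_marks(a: PunctuationLegend, b: PunctuationLegend) -> Dict[str, Tuple[str, str]]:
    """Return the marks whose recorded role differs between two witnesses."""
    marks = set(a.roles) | set(b.roles)
    return {
        m: (a.role_of(m), b.role_of(m))
        for m in sorted(marks)
        if a.role_of(m) != b.role_of(m)
    }


# Placeholder legends for two imaginary editions (illustrative only).
edition_a = PunctuationLegend(
    "edition A", {",": "minor pause", ".": "verse end", ":": "major pause"}
)
edition_b = PunctuationLegend(
    "edition B", {".": "minor pause", ":": "major pause"}
)

differences = differing_marks(edition_a, edition_b)
```

With tables like these, a proofreader could see at a glance that a period in one edition plays the role a comma plays in another, and that an edition does not use a given mark at all.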

Aharon

E L

Nov 5, 2012, 2:19:55 AM
to Aharon Varady, open...@googlegroups.com, Open Siddur Technical Discussion List
Hi,
In my humble opinion it depends very much on the text.
I have seen the colon and the period change meaning between texts from different periods.
Note that if you treat punctuation as semantic data, then in books like Tehilim you will need to treat
the Teamim as semantic data too. For example, atnach is very much like a comma, ravia is sometimes
used as a comma or a colon, and so on.
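
To illustrate, the two Teamim mentioned above could be located in a transcription and tagged with their rough punctuation analogues. This is only a sketch: it assumes the text is Unicode with the accents encoded as U+0591 (etnahta, i.e. atnach) and U+0597 (revia, i.e. ravia). The table, the function name, and the sample string are hypothetical, and the sample is a synthetic pair of letters rather than a real verse.

```python
from typing import List, Tuple

# Rough analogues as described in this message; not a complete or
# authoritative mapping of the Teamim.
ACCENT_ANALOGUES = {
    "\u0591": ("etnahta", ["comma"]),          # HEBREW ACCENT ETNAHTA
    "\u0597": ("revia", ["comma", "colon"]),   # HEBREW ACCENT REVIA
}


def accent_analogues(text: str) -> List[Tuple[int, str, List[str]]]:
    """List (position, accent name, possible analogues) for each known accent."""
    found = []
    for index, char in enumerate(text):
        if char in ACCENT_ANALOGUES:
            name, analogues = ACCENT_ANALOGUES[char]
            found.append((index, name, analogues))
    return found


# Synthetic sample: alef + etnahta, space, bet + revia (not a real verse).
sample = "\u05D0\u0591 \u05D1\u0597"
tagged = accent_analogues(sample)
```

Because ravia maps to more than one analogue, the sketch keeps a list of candidates and leaves the choice to whoever is tagging the text.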

Ely



Dovi Jacobs

Nov 5, 2012, 4:44:22 AM
to E L, Aharon Varady, open...@googlegroups.com, Open Siddur Technical Discussion List
Hi Aharon, I certainly agree that at the transcription stage the style of the edition should be followed. I thought you were talking about a generic version of Sefer Tehillim itself.

In general, though, there needs to be some systematic thought about numerous parallel editions of the same text. When you have dozens of editions that are extremely similar but not identical in terms of text and style, what is it that we actually want to do? Transcribe many dozens of siddurim (same for Tanakh and other books) individually? Create one giant coded text that incorporates the data from all of them in all of their myriad variations?

I personally lean towards something in the middle: creating some well-edited, documented and formatted basic text-types that stand on their own (e.g. Siddur Nosah Ashkenaz or Tanakh with Teamim) and that document within themselves a limited number of small variations. These can then be used in different directions: as the basis for transcribing a very specific and important edition, or as independent components within a larger mega-text that can show and deal with major variations.


