oral annotations outside of SayMore

Joyce in PNG

Jan 29, 2017, 8:10:36 PM

to SayMore

Hello everyone,
it's been a long while since I looked at the SayMore list.

The oral transcription and oral translation abilities are two of the best features of SayMore. However, in preparing for a brief upcoming fieldwork trip, in which we won't be bringing a computer or solar power, I am considering the idea of using two H2n Zoom recorders for oral transcription (and/or oral translation): one for the original story, and then starting & stopping (with the pause button) the first recorder, asking someone to give a phrase-by-phrase oral transcription/translation, with the second H2n recording.

Can you tell me if there is a way to put that oral transcription/translation file into SayMore, when we come back to the computer and the electrical power, in a way that SayMore knows what it is? Or will we be stuck with useful audio files that are a little less useful because we can't use SayMore to do more with them? Has anyone done anything like this already?

thanks very much,
Joyce

(question is urgent until Feb 7th.)

John Hatton

Jan 30, 2017, 8:22:08 AM

to say...@googlegroups.com

Hi Joyce,

On Sun, Jan 29, 2017 at 6:10 PM, Joyce in PNG <joyce...@sil.org> wrote:

Can you tell me if there is a way to put that oral transcription/translation file into SayMore, when we come back to the computer and the electrical power, in a way that SayMore knows what it is?

Hopefully someone else on the list can give you more specific steps, but if not, you should have time before you leave to work something out and verify that the following is not too much work.

You can in theory just produce exactly what SayMore produces, but using Audacity labels and "export multiple". You can then put those files into your session and SayMore should use them.

First, look at a SayMore session that has the careful speech and translation done, or make up a sample one. Look at the files. You'll see an "oralAnnotations.wav" file, then a folder of wav files with the same name. You will need to break your transcription and translation files up to look like that.

Open all three files in Audacity. Now go through making labels (Ctrl+b) in the careful speech and translation tracks. The challenge will be to name those labels with the time periods of the original recording. For example:

0_to_3.45_Careful.wav

0_to_3.45_Translation.wav

3.45_to_13.85_Careful.wav

3.45_to_13.85_Translation.wav

etc.

Finally, do an "Export Multiple..." for each of the two annotation files, so that you get files like the ones SayMore would produce.
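Typing those label names by hand is the error-prone part. As a rough sketch only (not something SayMore or Audacity ships), the Python below builds a label list in the tab-separated start/end/name layout that Audacity uses for label tracks, so the labels could be imported rather than typed. The timings, the function names, and the number formatting (up to three decimals, trailing zeros dropped) are assumptions for illustration; compare the names against a real SayMore session before relying on them.

```python
def format_seconds(t):
    """Format seconds with up to 3 decimals and no trailing zeros (assumed style)."""
    text = f"{t:.3f}".rstrip("0").rstrip(".")
    return text or "0"


def audacity_labels(segments, kind):
    """Build Audacity label-track text for one annotation recording.

    segments: list of (ann_start, ann_end, src_start, src_end) in seconds.
      ann_* locate the phrase inside the annotation recording;
      src_* give the matching stretch of the original recording.
    kind: "Careful" or "Translation".
    """
    lines = []
    for ann_start, ann_end, src_start, src_end in segments:
        # The label name carries the time range of the ORIGINAL recording.
        name = f"{format_seconds(src_start)}_to_{format_seconds(src_end)}_{kind}"
        # The label position is where the phrase sits in the annotation recording.
        lines.append(f"{ann_start:.6f}\t{ann_end:.6f}\t{name}")
    return "\n".join(lines) + "\n"


# Hypothetical timings, for illustration only.
segments = [
    (0.0, 4.2, 0.0, 3.45),
    (4.2, 15.9, 3.45, 13.85),
]
labels_text = audacity_labels(segments, "Careful")
print(labels_text, end="")
```

Saved to a text file, the output could be imported as a label track on the careful speech recording; "Export Multiple..." split by labels and named from the labels should then give 0_to_3.45_Careful.wav, 3.45_to_13.85_Careful.wav, and so on. Run it again with "Translation" for the other recording.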

Please try this out and let me know if you run into problems; I haven't tried this out myself, so I could be missing something.

--

John Hatton

Senior Software Engineer/Program Manager
Language Software Development
SIL International

Will

Jan 30, 2017, 9:58:29 AM

to SayMore

Hi, Joyce. I (along with Brenda Boerger, Sarah Moeller, and Stephen Self) have recently e-published a manual that goes into explicit detail on this exact issue (and many others). Find it here:

https://leanpub.com/languageandculturedocumentationmanual

I strongly urge you to look at the relevant sections of the book; it should answer your questions, and if it doesn't I'd really like to know about it for the next edition! (And you would get updated editions for free, once you've purchased a copy!)

I briefly looked at John's reply. His idea and approach are correct (I've done them); I'll look more closely at the process today or tomorrow to see if I can add anything.

I realize my opinion is biased, but I really think the book is your best way forward for this and for your data trip in general!

Reply and let me/us know how you decide and how you get along; I am very interested in your venture!

Will

Feb 1, 2017, 11:27:30 AM

to SayMore

Hi, again, Joyce. I ran some test data through SayMore this morning, and that helped clear cobwebs away from that section of my brain. Here's the lowdown: there are several things you can do with your audio files to make them more suitable for your SayMore project set, but you really do not have to do any of them.

If you've followed the BOLD approach, you will be coming back to your computer with a Source Recording, a Careful Speech Annotation Recording, and a Translation Annotation Recording. (You will actually have dozens of each, but I'll just exemplify the process by talking about one set.) Drag all three of them into a SayMore session. Each of the annotation recordings will have used the stereo channels for a specific purpose: my convention is to have the old information on the left channel and the annotation on the right channel, but there is really no problem with having it the other way around. These breaks in the data provide really nice fodder for SayMore's auto-segmentation tool. Run that tool on each annotation. Once it's done, inspect its work by clicking on the "Segment..." tool. The auto-segmenter should have gotten nearly everything segmented correctly, but you may need to move a segment or two, or tell it to ignore slated metadata, etc. Now when you close the Segmentation Tool, your Careful Speech and Translation recordings are ready for written transcription.

The downside here (one of them) is that you now have twice the number of segments you need: segments on the old information as well as on the Careful Speech or Translation parts. That's okay; at worst the presentation is a little cumbersome, but it may actually be helpful to have the old information segments on hand as the annotator keyboards the annotations.

So let me say here that doing what I've described can be fairly quick and needs no use of software external to SayMore. But . . .

Let me speak of the trade-off. By going to your data gathering location without your computer, your team is more lightweight and nimble, and needs less electrical power. However, SayMore's method of annotating creates annotation recordings that are half (in the case of Careful Speech Recordings) to a quarter (in the case of Translation Recordings) the size of what you would have when you use the workflow I described above. But there are things you can do to reduce the size of all three types of data files, after recording and before pulling the data into SayMore Sessions. These involve using an audio editor such as Audacity. With Audacity you can cut the size of each of your files by at least half, while maintaining the full functionality of your data.
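As one illustration of where a saving of about half can come from (an assumption, not necessarily the Audacity procedure meant here): if an annotation recording keeps the old information on the left channel and the annotation on the right, writing out only the annotation channel as a mono file halves its size. The sketch below does that with Python's standard wave module; the function name and the file names in the comment are hypothetical, and it only handles uncompressed stereo PCM WAV files.

```python
import wave


def keep_one_channel(src_path, dst_path, channel=1):
    """Copy one channel (0 = left, 1 = right) of a stereo PCM WAV into a mono WAV."""
    with wave.open(src_path, "rb") as src:
        if src.getnchannels() != 2:
            raise ValueError("expected a stereo file")
        width = src.getsampwidth()      # bytes per sample (2 for 16-bit audio)
        rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    frame_size = 2 * width              # one left sample plus one right sample
    offset = channel * width            # byte offset of the wanted channel
    mono = b"".join(
        frames[i + offset : i + offset + width]
        for i in range(0, len(frames), frame_size)
    )

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(width)
        dst.setframerate(rate)
        dst.writeframes(mono)


# Example call with hypothetical file names:
# keep_one_channel("story01_careful_stereo.wav", "story01_careful_mono.wav", channel=1)
```

Work on copies and keep the untouched originals, since the discarded channel cannot be recovered from the mono file.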

If you'd like ideas on how to so utilize Audacity, let me know. But I think I've now addressed your posted concern.

-Will

thesarah

Feb 6, 2017, 6:18:32 PM

to SayMore

If you use the book, please send (or post on its website) any feedback, and note any errors/typos!

Also, especially if you put the book to use in the field, we'd appreciate some published reviews :)

Thanks! *end of self-promo*
