recommended method for archiving FLEx data in SIL's REAP?


Kevin Warfel

Jul 24, 2014, 11:53:52 AM
to flex...@googlegroups.com

Can someone tell me the best way for an SIL member to archive his FLEx data in REAP?

 

Thanks,

Kevin

 

Kevin Warfel

Associate Dictionary and Lexicography Services Coordinator

a.k.a. Dictionary Development Coordinator

SIL International

 

Current technology makes it possible to provide those translating into just about any language with both a dictionary and a thesaurus in the target language, the standard tools of the trade for professional translators, so why are mother-tongue translators in minority languages still expected to do their work without these tools?  Ask me about Rapid Word Collection after reading about it at rapidwords.net.

 

Hugh Paterson III

Jul 24, 2014, 2:24:27 PM
to flex...@googlegroups.com
There are some resources in the REAP materials. However, in my opinion these are insufficient, because you have not described what is in the FLEx database. Since this is not a FLEx issue but an SIL-internal issue, I suggest that this topic is not appropriate for this venue.

- Hugh
-- 
You are subscribed to the publicly accessible group "FLEx list".
Only members can post but anyone can view messages on the website.
--- 
You received this message because you are subscribed to the Google Groups "FLEx list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to flex-list+...@googlegroups.com.
To post to this group, send email to flex...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/flex-list/2ea434f808f55bf130fc3b9077223718%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Eric Jackson

Jul 24, 2014, 2:40:04 PM
to flex...@googlegroups.com
Funny you should ask! I've just been having a conversation about how to archive complete Fieldworks backups with the REAP administrators and some of the other curators.

The simplest answer is that there is a "How to" at

https://www.help.insitehome.org/reap/how-tos/how-to-submit-your-flex-data

If your web access (or the web access of anyone else who needs this information) isn't sufficient to view this relatively small webpage, then I'm sure there are ways to get it as a PDF. (For instance, if you need it, I can send it to you off-list.)

A few other things to note, though: users are encouraged to use RAMP (a downloadable Adobe AIR package) to prepare uploads to REAP, rather than using the REAP web interface directly. I believe that Fieldworks 8 allows users to connect directly to RAMP, which helps in preparing the archival package. If you (or whoever is requesting the information) don't have sufficient Internet access to be able to upload the archival package from RAMP, one of the great features of RAMP is that it can prepare a package offline, and that package can then be passed or mailed (on a thumb drive, CD-R, or whatever is available) to someone else with better network access for uploading -- or even mailed to the archive in Dallas, if need be.

I'd suggest that as you prepare to archive this data, you also think through exactly why you're going to be doing this, that is, what your purpose in archiving your FLEx database is. Are you archiving it to serve as just a backup? (and if so, are you including other Fieldworks data besides just the FLEx lexicon, like texts, anthropology notes, or TE data?) Or are you trying to archive your lexicon so that it can be shared to other interested parties through the SIL.org website? Digital lexicons are very complex data objects, and if you're trying to archive it so that others can discover it and use that data, there are some other issues to be thinking about. An archiving specialist for the Americas Area has a lot of discussion of this on a page at the Insite wiki.

--Eric


-- 
Eric M. Jackson, Ph.D.          芮建生博士
SIL Intl, East Asia Group       世界少数民族语文研究院东亚部
http://sil.academia.edu/EricJackson/

Jon C

Jul 24, 2014, 5:22:01 PM
to flex...@googlegroups.com
I think that is all very good advice. I would add a few points:
- As Eric said, this is not an online backup system, nor much of an online publishing solution. You should generally submit to the archive whenever you reach a major milestone in the lexicon project. (RAMP helps with this because it can keep a copy of your previous submission on your hard drive. With REAP, I believe you still have to start from scratch each time, unless an administrator helps you.)
- Although REAP is an SIL archive, my understanding is that we are happy to receive submissions of non-SIL lexicons, and I've seen this done.
- I would add one step to that how-to: include an exported LIFT file in addition to the .fwbackup file. A LIFT file is logically organized and fairly human-readable, so it could be interpreted without special software (FLEx) or special knowledge. (It is not as complete as .fwbackup, however.)
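
To illustrate the point about a LIFT export being readable without FLEx, here is a minimal sketch that lists headwords and first glosses using only the Python standard library. It assumes the usual LIFT layout (`entry` > `lexical-unit` > `form` > `text`, and `sense` > `gloss` > `text`); the function name is made up for this example, and a real export may carry several forms or glosses per entry that this ignores.

```python
import xml.etree.ElementTree as ET


def list_lift_entries(lift_path):
    """Return (headword, first gloss) pairs from a LIFT export.

    Assumes the common LIFT element layout; missing pieces come back
    as empty strings rather than raising an error.
    """
    root = ET.parse(lift_path).getroot()
    rows = []
    for entry in root.iter("entry"):
        # First writing-system form of the lexical unit, if any.
        headword = entry.findtext("lexical-unit/form/text", default="")
        # First gloss of the first sense, if any.
        gloss = entry.findtext("sense/gloss/text", default="")
        rows.append((headword, gloss))
    return rows
```

Calling `list_lift_entries("MyProject.lift")` (the file name is a placeholder) would give a quick inventory of the lexicon. It reads only what is in the LIFT file, so anything that exists only in the .fwbackup does not show up.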

Jon

Beth Bryson

Jul 24, 2014, 9:40:42 PM
to flex...@googlegroups.com
The "easy" answer to the question is: "Use the 'Archive to REAP' function in the File menu available in FLEx 8.0.x."

But as others have noted, you do need to think about *why* you are archiving.  When I was at CoLang a couple weeks ago, representatives of two other archives were asking me about what they should ask for from people submitting FLEx data.

 - One of them is targeted specifically at language communities, and they want their archive artifacts to be readable by them.  They will probably ask for a PDF of FLEx output to be submitted along with the FLEx project.

 - Another archive was not comfortable with bundled files, or a .zip format.  They prefer flat text, like an XML file.  I pointed out that the FLEx .fwdata file is just an XML file, and originally it was designed to be at least somewhat readable.  They might ask for that file to be submitted separately along with the .fwbackup.

 - For SIL purposes, it's noted that the files related to a FLEx project make the most sense when they are taken all together, and so submitting a bundle (.fwbackup) is appropriate for us.

However, I don't believe we currently have the "publish" function (yet?), whereby outsiders can search the archives and see what is in them, and I was told that the SIL archive is only open to SIL submitters.

Basically, each archive can specify what artifacts they want a submitter to provide.  And for REAP, the best way to work with that currently is via the command in the FLEx File menu.  (Note that it is only active if you have downloaded and installed RAMP.)

-Beth

Hugh Paterson III

Jul 25, 2014, 12:01:21 AM
to flex...@googlegroups.com
Correction to Jon C
- Current SIL Archive policy (as of 2014) prevents the archive from accepting submissions which are not the intellectual property of SIL. Generally, when non-SIL persons create digital works, SIL has no claim to those works. Therefore these works (including FLEx and Toolbox databases) cannot currently be archived in SIL's digital repository under SIL policy. Some organizations have contractual agreements with SIL for archiving services. In these cases the materials are put in a special collection for that organization and are treated as property of that organization, stewarded for the duration of the contract by SIL in SIL's digital repository. When that contract for archiving services ends, it is logical that those objects will be removed from the archive, as they are not SIL's responsibility or property.

Again, this is not a FLEx issue, but rather an SIL internal issue. Perhaps there is a better venue for this discussion.

- Hugh Paterson III


Hugh Paterson III

Jul 25, 2014, 12:52:30 AM
to flex...@googlegroups.com
@Beth,

It seems that there is a lot of guesswork going on here and a lot of misinformation going around. The second use case you mention below sounds like it came from Nicholas Thieberger and Paradisec. I say this because I was involved in a discussion with Nick and with a submitter to Paradisec about the submission of a FLEx 8 dataset (by a non-SIL person). Nick, in my opinion, is on a purist kick where he espouses that "XML is the only archive-able option". I pointed out that other things can be included in the .fwbackup file: when I have discussed this with LSDev folks in the past, they have said that font data, keyboards, images, sound files, and other such data can be included. These data are not included in the LIFT file format (links to them might be, but not the data files themselves). The approach in archiving, contrary to Nick's approach, needs to be to preserve the digital artifact. This is what archivists do with physical artifacts: they preserve their integrity. Digital archivists need to do the same. By stripping the XML file from its wrapper, you also remove any contextual metadata which might be found in the .fwbackup file. This destroys the integrity of the file. What Paradisec was doing was rendering the file useless to future users, because they could not download the file and then load it into FLEx. Sure, they could download the file and access the remnant XML, but there was no restoration to the original context. This also shows that the Paradisec policy setters do not understand what can be contained in the .fwbackup. But where are they to get that information? This rolls back to being an SIL International communications failure. Here is what I told Nick and the submitter:


The .fwbackup is the archivable object/artifact. The SIL Language and Culture Archive has this file type associated with FLEx, so that when the archive ingests it, the MIME type check plus the file extension check associates it with the FLEx application. I suppose that your archive has a similar check. As the .fwbackup is the most basic format which functions as a single unit, please do not open it or try to parse it into pieces, as that will effectively destroy the artifact for future users. There are two bits of metadata I suggest your archive also maintain (aside from the name of the application which created the file, which currently is only FLEx). The first is the version number of the application which made the artifact. The internals of FLEx backup files can change from version to version, but there is a script in each version of the application which will update the backup to the next version. Future users of the artifact need to know at which FLEx version to start the update process. In terms of content, my understanding of current versions (version 8+) of FLEx backup files is that they can contain fonts, keyboard layouts, FLEx XML databases (which are not the same as LIFT files), parsing rules created in FLEx, anthropological notes and annotation for stories/texts, and FLEx texts. Not all of these things are necessarily in a FLEx backup, so a good artifact description would indicate whether these things are present or absent. Note that parsing rules, anthropological annotation of texts, and FLEx texts are never included in LIFT format exports (and LIFT is designed as an XML interchange format, not an archival format such as TEI or LMF). The second bit of metadata which would be useful is whether the .fwbackup file was created on Windows or on Linux, as FLEx runs on both OSes. This would be relevant to technicians who (in a disastrous case) might be called on to repair a damaged or corrupted backup file.

You can verify how SIL's language and culture archive is handling .fwbackup files by contacting archive...@sil.org. Your best bet in reaching the FLEx developers to verify what can and might not be included is to ask on the FLEx google group. https://groups.google.com/forum/#!forum/flex-list

I hope you find this helpful. And I apologize that this information is not available on sil.org. Any encouragement to make this kind of information more publicly available should be made to either the archive...@sil.org email address, or the google group.
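
As a rough illustration of the artifact description suggested above, the following sketch builds a small record for a backup file without unpacking or altering it. It rests on an assumption this thread does not confirm: that a .fwbackup can be read as a standard zip container. The function and field names are invented for the example, and the FLEx version and operating system are supplied by the submitter, since nothing here reads them from the file.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def describe_backup(path, flex_version, created_on_os):
    """Build a description record for a backup without extracting it.

    Assumption: the backup is a zip container. The file is opened
    read-only and only its table of contents is inspected.
    """
    with zipfile.ZipFile(path) as zf:
        names = [info.filename for info in zf.infolist() if not info.is_dir()]
    # Count members by extension so a curator can see at a glance
    # whether media, fonts, or keyboard files appear to be present.
    by_ext = Counter(PurePosixPath(n).suffix.lower() or "(none)" for n in names)
    return {
        "flex_version": flex_version,    # supplied by the submitter
        "created_on_os": created_on_os,  # supplied by the submitter
        "file_count": len(names),
        "extensions": dict(by_ext),
    }
```

A record like this sits alongside the intact .fwbackup in the archive's metadata; it is not a substitute for keeping the file whole.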

What is interesting is that Kevin is asking this on an open forum. Kevin is also in a leadership role in the lexicography support group, yet he is asking this question in an open forum. So, either he is genuinely asking, and doesn't know (to which all this info is likely helpful), or this is some sort of social experiment to determine what people are already thinking and doing. LSDev has not taken the initiative to post on the FLEx website how archives should treat digital artifacts produced by FLEx. Nor has the SIL archive produced such a statement. This seems to me to be irresponsible on the part of SIL International. In January 2014 I had a joint meeting with LSDev leadership (Rob Scebold), Lexicography support leadership (Verna Stutzman), Archiving leadership (Jeremy Nordmoe), and Language Program Training leadership (Fraser Bennett). The topic of this meeting was that SIL needs a clear set of open and accessible guidance for archiving and for the lexicography services rendered, because the people being served by these organizational units are not just SIL people. This statement needs to come from across these departments. I have yet to see any response to this "suggestion". I took the knowledge I have (recognizing that I have limits and need other people to provide certain kinds of technical details) and created a wiki page in the SIL internal wiki. I have recently asked Jon C to comment on that page (no response yet); I have also asked Verna and Jeremy to add material. My intent is that this content should be on the SIL.org website, where people will be able to find it via Google. But perhaps this conversation is the impetus to move that discussion forward.

I invite further discussion on that internal wiki page (which has already been linked to by Eric). I hope that the ensuing discussion results in a communications output which is helpful for the various audiences which SIL International endeavors to serve.

It should also be noted that SIL's Archiving tutorials on how to archive FLEx data say that if the team is using LanguageDepot, Archiving is not necessary. I strongly feel that this is errant, for reasons similar to those outlined to Nick and articulated above. I think it also shows that SIL archive managers do not understand the actual data differences between what is stored in .fwbackup and on LanguageDepot servers.

- Hugh Paterson III


Kevin Warfel

Jul 25, 2014, 9:28:46 AM
to flex...@googlegroups.com

My apologies for opening the proverbial can of worms. In response to the excerpt below from Hugh’s most recent message, let me clarify my motive for the initial question.

 

A FLEx user recently contacted me with this question: “I am being urged to get my backups into REAP and I thought I had heard that Lg Explorer Version 8 had a strategy for doing that?  I have 8.0.9.416879 and I haven’t seen any reference to REAP or RAMP.  Do I need to update again?” I didn’t have any specific information about this capability, but was aware that the FW Development team was working on features to facilitate dictionary publication, and thought that this rumored feature might be part of that, or that it might already be available in FLEx. In either case, I’ve found that asking the FLEx list is the quickest way to get an answer, so that’s why I wrote.

 

My focus in Dictionary and Lexicography Services is on Rapid Word Collection. I know quite a bit about that, but there are many other areas of lexicography where my knowledge is minimal. Archiving is one of those.

 

I hope that clarifies the reason for my question, and why I felt the FLEx list was an appropriate venue for it.

 

Thanks,

Kevin

 


 

From: flex...@googlegroups.com [mailto:flex...@googlegroups.com] On Behalf Of Hugh Paterson III
Sent: Friday, July 25, 2014 12:52 AM
To: flex...@googlegroups.com
Subject: Re: [FLEx] recommended method for archiving FLEx data in SIL's REAP?

 

<snip>

What is interesting is that Kevin is asking this on an open forum. Kevin is also in a leadership role in the lexicography support group, yet he is asking this question in an open forum. So, either he is genuinely asking, and doesn't know (to which all this info is likely helpful), or this is some sort of social experiment to determine what people are already thinking and doing.

<snip> 

 

- Hugh Paterson III

 

Ann Bush

Jul 25, 2014, 10:01:32 AM
to flex...@googlegroups.com

I’ve not used it, but REAP/RAMP is available through FLEx in version 8.  The help topic ‘Archive with RAMP (SIL)’ is in the help file.

 

Ann


Kevin Warfel

Jul 25, 2014, 11:03:08 AM
to flex...@googlegroups.com

Thanks, Ann. I think that’s what this user was looking for, and I think that’s what Beth was referring to, as well. So I’ve sent that information to the inquirer and I’ll see if it helps him move ahead. I suspect that the option is grayed out on his computer and that he didn’t realize that he needed to install RAMP in order to make it become active.

 

Thanks for your helpful input, everyone!

 

Kevin

Eric Jackson

Jul 25, 2014, 1:48:49 PM
to flex...@googlegroups.com
I don't want to beat a dead horse (or beat a can of dead worms?), but since I think open communication about archiving topics is a good value to have, I wanted to address another point that Beth brought up. I realize that some of this information is only relevant for SIL-internal users of FLEx, not an issue for all FLEx users in the world, but this is a place where many SIL FLEx users can be reached. Moreover, the archival storage of Fieldworks data, as well as methods to make that data available to others, is something that is relevant to linguists outside of SIL, and I think our community would benefit from open discussion about that.

On 07/24/2014 06:40 PM, Beth Bryson wrote:
[...]
However, I don't believe we currently have the "publish" function (yet?), whereby outsiders can search the archives and see what is in them, and I was told that the SIL archive is only open to SIL submitters.

The archives are actually set up now to feed the SIL.org website. If you go through the RAMP interface, on the page where you actually attach the bitstreams to the archival package, there is a checkbox for each item that says, "Post to SIL website?" If you check this box (and if other criteria are met, ie, the collection that you're submitting to is a public collection, and if you've set the sensitivity of that file to be public), that file, as well as the metadata about that resource, will be sent to the SIL.org website. You can browse the available content by going to www.sil.org, choosing the menu item "Resources", and then choosing "Language & Culture Archives." (or just go straight here)
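
The conditions in the paragraph above can be written out as a small check. This is only a sketch of the three criteria named there; the function and parameter names are invented for illustration and are not RAMP or REAP field names, and the thread later indicates that further fields are involved in actually triggering the push.

```python
def would_be_sent_to_sil_org(post_box_checked, collection_is_public, file_sensitivity):
    """Return True only when all three stated criteria hold.

    post_box_checked:     the "Post to SIL website?" checkbox for the file
    collection_is_public: whether the target collection is a public one
    file_sensitivity:     the sensitivity set for the file, e.g. "Public"
    """
    return bool(
        post_box_checked
        and collection_is_public
        and file_sensitivity == "Public"
    )
```

For instance, a file with the box checked in a public collection but with sensitivity "Insite users" would not qualify under this reading.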

As curator for East Asia Group, I just worked with an SIL member who submitted a Fieldworks backup including FLEx data as well as other information. Since this is a complex issue, I discussed it with the archives staff. Their response was in agreement with what others have said here, namely, that currently acceptable practice is to simply submit a Fieldworks backup as-is. However, since Fieldworks backup files are not really intended to be used for discovery or re-distribution of lexical data, the archives staff suggested that backups like this should not be submitted to a public collection, but to a confidential one; the usefulness of this kind of archival package would then be limited to data preservation, not discovery or re-distribution. I haven't done an exhaustive check, but when I searched for Fieldworks backups on the sil.org site just now, I was unable to find any.

If your goal (the generic "you") is to comply with SIL policy concerning archiving, then submitting a monolithic Fieldworks backup is sufficient; but archiving a monolithic Fieldworks backup in this way would not be an ideal method to share your lexical data or make it available to the community -- both the linguistics community and the language community where you work. Even for linguists outside of SIL who use Fieldworks for their lexical data (and possibly other language and culture related data, as well), finding a good way to subsequently make your data discoverable and shareable ought to be a goal. As Hugh points out in his discussion on the Insite wiki (ie, the page that I linked to in my earlier message), finding a good way to make FLEx datasets discoverable and accessible from an archive is a very complicated issue, and perhaps other non-archival outlets (Language Depot, Webonary, and others) are a part of the solution to that. I hope that this is something that language archives in general will discuss and be able to reach a consensus on, but I think part of the solution involves archive submitters (ie, users of Fieldworks) bugging the archive staff to work out a community consensus for how to do this. For the time being, I suggest that a place to start would be to have a discussion about this with the curator of the archive that serves your language or area. If you're in SIL, you can find out who is working to support archiving in your area by looking at this page on the Insite wiki. If you're not in SIL, you can find archives that cover your language or area at the Open Language Archives Community.

--Eric

Hugh Paterson

Jul 25, 2014, 2:20:47 PM
to flex...@googlegroups.com
I should like to point out that, in general, for repurposing the data into other systems, Eric is correct. However, in many situations lexical datasets are passed from one team to another. That is, someone may work on a language from 2010-2013 and then, some years later (e.g. 2015), want to pass their database on to another person working on that same language. That person may be a student, a colleague, or a native speaker of that language (there are a variety of use cases where passing a FLEx database intact is the desired and preferred option). One reason archives exist is so that there can be a time gap between the first compiler and the successive user of the data (and so that these two people do not have to actually meet each other). For these reasons I am not in favor of telling other non-SIL archives that .fwbackup files are not valid files for their patrons to download from their archive. However, in SIL contexts, if the file qualifies to be added to REAP, then that means that it is also copyright SIL International. SIL International, not the creator of the file, retains the right to first publication and to determine appropriate distribution. SIL International has chosen not to distribute the .fwbackup files because it is not in SIL International's best interests (or current business model) to do so. Part of SIL International's business model for data is that if successive data users enrich data from SIL's archive, then those users (if they are non-SIL) cannot resubmit their enriched content to SIL's archive. In terms of data ecology this is a bleeding relationship. Therefore, it is the responsibility of other archives to determine their own business models for distribution, archiving, and accepting enriched data. Data ecology is an important part of language development and in the long term will affect the perceived usefulness of an archive to a variety of archive patrons.

- Hugh Paterson III



Jon C

Jul 25, 2014, 2:27:22 PM
to flex...@googlegroups.com

On 7/24/2014 11:52 PM, Hugh Paterson III wrote:
> It should also be noted that SIL's Archiving tutorials on how to
> archive FLEx data says that if the team is using LanguageDepot that
> Archiving is not necessary. I strongly feel that this is errant for
> similar reasons as outlined to Nick and articulated above. I think it
> also shows that SIL archive managers do not understand the actual data
> differences between what is stored in .fwbackup and on LanguageDepot
> servers.
>
Are there differences? My understanding is that there are two kinds of
projects in LanguageDepot:
1. A LIFT project, which stores all the things WeSay does (LIFT, ranges,
and settings).
2. A FW project, which stores everything an .fwbackup file does
(send/receive breaks the .fwdata down into pieces during sync; I don't
know how the server stores it).

I would assume that "FLEx data" would refer to the latter, and that it
would be "complete" with regard to the data, and almost totally
incomplete with regard to all the metadata that REAP/RAMP require for an
archive submission. So there might be a gap there, but it may be that
they are storing what is considered the most essential metadata.

Jon

Hugh Paterson

Jul 25, 2014, 2:30:12 PM
to flex...@googlegroups.com
What about supporting files, like images, audio, and video - keyboard files, language settings, writing systems settings, etc.



Jon C

Jul 25, 2014, 2:50:00 PM
to flex...@googlegroups.com
My guess would be that it would be incomplete if any media files exceed the max file size supported by send/receive (something I wish Chorus would loudly warn the user about.) Project settings are included. I would suggest you ask the Palaso or FW devs whether keyboard files (which are not data but are useful for editing) get synched.
Jon
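
Since the concern above is media files that exceed the send/receive size limit, a quick local check could look like the sketch below. The actual limit is not stated in this thread, so it is left as a parameter the caller must supply; the function name is made up for the example and is not part of Chorus or FieldWorks.

```python
from pathlib import Path


def files_over_limit(project_dir, limit_bytes):
    """Return (relative path, size in bytes) for files larger than limit_bytes.

    limit_bytes is whatever limit applies to your send/receive setup;
    no particular value is assumed here.
    """
    root = Path(project_dir)
    hits = []
    for p in sorted(root.rglob("*")):
        if p.is_file():
            size = p.stat().st_size
            if size > limit_bytes:
                hits.append((p.relative_to(root).as_posix(), size))
    return hits
```

Running this on a project folder before relying on LanguageDepot as the only copy would show which files, if any, might have been left out of the sync.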



Jon C

Jul 25, 2014, 3:14:21 PM
to flex...@googlegroups.com

On 7/25/2014 12:48 PM, Eric Jackson wrote:
The archives are actually set up now to feed the SIL.org website. If you go through the RAMP interface, on the page where you actually attach the bitstreams to the archival package, there is a checkbox for each item that says, "Post to SIL website?"

Thanks, that's good to know! In case it's helpful to have a concrete example, here's part of the metadata for our project, currently visible to SIL users in REAP. Eric, based on your comment, I imagine that changing the "Access to presentation file(s)" would cause the PDF to become freely available at sil.org. Correct?

(Note: The MS Word files were intermediate files rather than a final product, so that's why they're "source" files. I probably should also have included a LIFT export, just for maximum future-safety. If we wanted to make it easier for others to fork/repurpose the lexicon, then I'd make that "presentation", and otherwise "source", correct?)

Again, for those of you wanting to "publish electronically", a PDF in REAP isn't much. Webonary (or even Lexique Pro) would be better options.
-Jon

Sensitivity settings
  Sensitivity of work:     Public
  Access to presentation file(s):     Insite users
  Access to source file(s):     Insite users
Files in this item
  Files     Size     Format     View     Description
  2011-06-kamus-saku-tado-1and2TrimmedCover2.pdf     1.823Mb     PDF     View/Open     Presentation file: dictionary content including front cover (PDF)
  2011-06-kamus-saku-tado.doc.zip     908.6Kb     application/zip     View/Open     Source file: dictionary content (MS Word 2003)
  FieldWorks_backup.zip     9.458Mb     application/zip     View/Open     Source file: This is a recent FieldWorks7 backup (the publication was done from FW6, which will rapidly be obsolete). To view just the pocket dictionary entries, filter the Publish field for saku.


Randy Regnier

Jul 25, 2014, 3:26:34 PM
to flex...@googlegroups.com
On Jul 25, 2014, at 11:49 AM, Jon C <jvco...@gmail.com> wrote:

> My guess would be that it would be incomplete if any media files exceed the max file size supported by send/receive (something I wish Chorus would loudly warn the user about.) Project settings are included. I would suggest you ask the Palaso or FW devs whether keyboard files (which are not data but are useful for editing) get synched.
> Jon
>

My own keyboarding files are stored in the S/R system, because I copied them into a folder within the project folder. I'm on my phone, and not my computer, so I can't say which folder it is.

Randy Regnier

Eric Jackson

Jul 25, 2014, 4:02:22 PM
to flex...@googlegroups.com
On 07/25/2014 12:14 PM, Jon C wrote:
> On 7/25/2014 12:48 PM, Eric Jackson wrote:
> > The archives are actually set up now to feed the SIL.org website. If you go through the RAMP interface, on the page where you actually attach the bitstreams to the archival package, there is a checkbox for each item that says, "Post to SIL website?"
>
> Thanks, that's good to know! In case it's helpful to have a concrete example, here's part of the metadata for our project, currently visible to SIL users in REAP. Eric, based on your comment, I imagine that changing the "Access to presentation file(s)" would cause the PDF to become freely available at sil.org. Correct?

Yes, that's at least part of the process, but not all of it. I see that this resource is in the Indonesia General Resources Collection, which (as far as I know) ought to be able to push to the website. In order to allow that push to take place, you'd need to make sure that both the "Access to presentation file(s)" and "Sensitivity of work" are set to "Public" (those correspond to the fields sil.sensitivity.presentation and sil.sensitivity.metadata), but you'd need to add another field to actually trigger that push. For another concrete example, you can look at the REAP item

https://www.reap.insitehome.org/handle/9284745/57474?show=full

and compare it to the website item at

http://www.sil.org/resources/archives/57474

Only five files are pushed to the website, even though eleven presentation files are marked as public. The ones that are sent to the website are all marked in a sil.website.silPublic field (to see these, you will need to display the full REAP record, which the link above should do). If someone were to request one of the other six presentation files directly from the archive, the archive would be permitted to distribute those files to them, but I chose not to send those files to the website because most people will probably be satisfied with the other five.
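
If it helps to see that rule spelled out, here is a rough Python sketch of how I understand it. This is only an illustration: the class names, the attribute names, and the function are all made up for this example and are not REAP's or RAMP's actual schema or code. It also assumes that only presentation files are ever candidates for the website, which matches what I have seen but is not something I have confirmed.

```python
from dataclasses import dataclass, field


@dataclass
class ArchiveFile:
    """One file in an archive item (hypothetical model, not REAP's schema)."""
    name: str
    role: str                  # "presentation" or "source"
    sil_public: bool = False   # stand-in for an entry in sil.website.silPublic


@dataclass
class ArchiveItem:
    """One archive record with the two sensitivity settings discussed above."""
    sensitivity_metadata: str      # "Sensitivity of work"
    sensitivity_presentation: str  # "Access to presentation file(s)"
    files: list = field(default_factory=list)


def files_pushed_to_website(item):
    """Return the names of files that would be pushed, under the assumed rule."""
    # Both sensitivity settings must be "Public" before anything is pushed.
    if item.sensitivity_metadata != "Public":
        return []
    if item.sensitivity_presentation != "Public":
        return []
    # Assumption: only presentation files that are explicitly flagged are sent.
    return [f.name for f in item.files
            if f.role == "presentation" and f.sil_public]


# Eleven public presentation files, of which five are flagged for the website.
example = ArchiveItem(
    sensitivity_metadata="Public",
    sensitivity_presentation="Public",
    files=[ArchiveFile(f"file{i}.pdf", "presentation", sil_public=(i < 5))
           for i in range(11)],
)
print(len(files_pushed_to_website(example)))  # 5
```

With the presentation access set to "Insite users", as in Jon's record, the same function returns an empty list no matter which files are flagged.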

Whoever is handling curation for Indonesia will be able to make this change for you, but it looks like you might not have a full-time curator yet; in that case, you could try contacting one of the super-curators (Barb Waugh for Asia Area, perhaps?).


> (Note: The MS Word files were intermediate files rather than a final product, so that's why they're "source" files. I probably should also have included a LIFT export, just for maximum future-safety. If we wanted to make it easier for others to fork/repurpose the lexicon, then I'd make that "presentation", and otherwise "source", correct?)

You're right that the marking of a file as "presentation" or "source", in this case, just determines the sensitivity settings that will be applied to that file. In packages that I've submitted, I've marked a number of files as "presentation files", even though from an archival perspective I wouldn't normally consider them to be a presentation format; they're often text documents or spreadsheets in OpenDocument formats, which I've felt that others might want to use. If you're looking for a wordlist, for instance, it's much more useful as a researcher to have that list in a spreadsheet (typically a source format) rather than as a PDF (which is a typical presentation format). I have marked them "presentation" just so they'll be distributable, and distinct from what I've marked as "source" files, which might include personally-identifiable information for survey participants, for instance, which we can't distribute.

Jon C

Jul 25, 2014, 4:37:06 PM
to flex...@googlegroups.com
Eric, thanks for the helpful explanation! -Jon
