Question about multilingual records

Frank Bennett

Oct 29, 2012, 8:17:31 PM
to bibliographic-ontolog...@googlegroups.com
I have a product that supports multilingual variants of field content. I would like to map the content of records to Bibliontology RDF for export and import.

Field elements in the application are of two types. In the terms I use locally, these are the "headline entry" (containing the original form) and "supplemental entries" (each containing a variant with an RFC 5646 tag indicating its language and script).

A language tag is optional on the "headline entry": it may or may not be set. However, because the headline entry represents the original value (without translation or transliteration), it is important that it be correctly identified when data is exported to RDF and re-imported. I'm not sure how to accomplish that.
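
To make the two entry types concrete, here is a minimal sketch of the data model in Python. The class and attribute names (`Variant`, `MultilingualField`, `headline_lang`) are illustrative assumptions and are not taken from the application itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Variant:
    """A supplemental entry: a variant form with a required RFC 5646 tag."""
    value: str
    lang: str


@dataclass
class MultilingualField:
    """One field: the original-form headline plus any tagged variants."""
    headline: str
    headline_lang: Optional[str] = None  # the tag is optional on the headline
    variants: List[Variant] = field(default_factory=list)


# Hypothetical record: original title with a romanized variant.
title = MultilingualField(
    headline="坊っちゃん",
    headline_lang="ja",
    variants=[Variant("Bottchan", "ja-alalc97")],
)
```

The point of the sketch is that the headline is distinguished by its position in the structure, not by its tag, which is the information that has to survive a round trip through RDF.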

RDF allows use of the xml:lang attribute to specify the language of a node or property element:

  http://www.w3.org/TR/REC-rdf-syntax/#section-Syntax-languages

Here is a sample of our current output:

<rdf:RDF
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns:res="http://purl.org/vocab/resourcelist/schema#"
 xmlns:dcterms="http://purl.org/dc/terms/"
 xmlns:bibo="http://purl.org/ontology/bibo/"
 xmlns:z="http://www.zotero.org/namespaces/export#">
    <z:UserItem rdf:about="http://zotero.org/users/67180/items/KBVT9SSW">
        <res:resource>
            <bibo:Book>
                <dcterms:title>坊っちゃん</dcterms:title>
                <dcterms:title xml:lang="ja-alalc97">坊っちゃん</dcterms:title>
                <dcterms:language>ja</dcterms:language>
            </bibo:Book>
        </res:resource>
    </z:UserItem>
</rdf:RDF>

In this example, dcterms:title has two forms. The first has no xml:lang value, and can safely be treated as the headline entry. However, if the first entry is given the attribute xml:lang="ja", then selecting the "first" as the headline entry is not possible, because (as I understand RDF) there is no sequential relation between the two alternatives. (The dcterms:language value cannot be used as a hint, since it applies to the language of the underlying source identified by the record, not of the field itself, and the two may differ.)
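
The problem can be sketched in Python with the standard library's `xml.etree.ElementTree`. This is an illustration under assumptions, not a proper RDF parser: the titles are collected into a set to mimic the unordered nature of the graph, only an xml:lang attribute set directly on the element is read (inherited values are ignored), and the function names and the cut-down sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

DCTERMS_TITLE = "{http://purl.org/dc/terms/}title"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Cut-down version of the sample record; the second title is romanized.
TEMPLATE = """<rdf:RDF
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns:dcterms="http://purl.org/dc/terms/"
 xmlns:bibo="http://purl.org/ontology/bibo/">
  <bibo:Book>
    <dcterms:title{headline_attr}>坊っちゃん</dcterms:title>
    <dcterms:title xml:lang="ja-alalc97">Bottchan</dcterms:title>
  </bibo:Book>
</rdf:RDF>"""


def title_literals(rdf_xml):
    """Return the titles as an unordered set of (value, lang) pairs."""
    root = ET.fromstring(rdf_xml)
    return {(el.text, el.get(XML_LANG)) for el in root.iter(DCTERMS_TITLE)}


def pick_headline(literals):
    """Return the single untagged title, or None if there is no such title."""
    untagged = [value for value, lang in literals if lang is None]
    return untagged[0] if len(untagged) == 1 else None


untagged_doc = TEMPLATE.format(headline_attr="")
tagged_doc = TEMPLATE.format(headline_attr=' xml:lang="ja"')

print(pick_headline(title_literals(untagged_doc)))  # the untagged title
print(pick_headline(title_literals(tagged_doc)))    # None: headline is lost
```

Once both titles carry a tag, nothing in the set of literals marks one of them as the original form.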

If I am mistaken and the dcterms:title elements here can be treated as a sequence, that would offer the simplest solution. If not, perhaps I am overlooking something basic in RDF generally or Bibo specifically?

Any advice greatly appreciated.

Frank

Frank Bennett

Oct 29, 2012, 8:20:47 PM
to bibliographic-ontolog...@googlegroups.com
(Sorry, the content of the example in my first message may be confusing. The second dcterms:title entry, the one tagged xml:lang="ja-alalc97", should have the transliterated value "Bottchan" rather than "坊っちゃん".)

Frank Bennett

Oct 29, 2012, 10:07:12 PM
to bibliographic-ontolog...@googlegroups.com
One possible solution would be to always export the "headline entry" with no language tag and, whenever the headline entry has a language tag in the original data, to also export a separate node with the same content and the language tag set. This would permit reconstruction of the original entry on import by matching field content, which would work so long as the field variants are all unique. I am still unsure whether this is the correct way of handling the use case, though.
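
A rough Python sketch of this round trip, working on (value, language tag) pairs instead of real RDF nodes. The function names are hypothetical, and the sets stand in for the unordered title literals of one record.

```python
def export_literals(headline, headline_lang, variants):
    """Emit the headline untagged; repeat it with its tag when it has one."""
    literals = {(headline, None)}
    if headline_lang is not None:
        literals.add((headline, headline_lang))
    literals.update(variants)
    return literals


def import_literals(literals):
    """Rebuild (headline, headline_lang, variants) by matching content."""
    untagged = [value for value, lang in literals if lang is None]
    if len(untagged) != 1:
        raise ValueError("expected exactly one untagged headline literal")
    headline = untagged[0]
    # A tagged literal with the same content is taken to carry the headline's tag.
    tags = [lang for value, lang in literals
            if value == headline and lang is not None]
    if len(tags) > 1:
        raise ValueError("headline content matches more than one tagged literal")
    headline_lang = tags[0] if tags else None
    variants = {(value, lang) for value, lang in literals if value != headline}
    return headline, headline_lang, variants


variants = {("Bottchan", "ja-alalc97")}
exported = export_literals("坊っちゃん", "ja", variants)
restored = import_literals(exported)
```

The uniqueness condition shows up directly in `import_literals`: if a supplemental entry happens to have the same content as an untagged headline, its tag would be wrongly attributed to the headline and the variant would be dropped.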

FB


