Modeling a document as a collection of paragraphs

Steve Wartik

September 7, 2022, 18:45:48
to bfo-d...@googlegroups.com
What is the best practice for modeling a document as a collection of paragraphs?

My question is really more specific. I am creating a knowledge base that can contain classified documents. I want to express the classification level of each paragraph. I assume classification level is a realizable entity.

I can think of two approaches. The first is to model the entire document, and each paragraph within the document, as independent continuants. The paragraphs are parts of the document.

The second approach is to model the entire document as an independent continuant that is the carrier of a generically dependent continuant. Other generically dependent continuants, each corresponding to a paragraph, are parts of this generically dependent continuant.
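The two approaches can be sketched as subject-predicate-object triples. This is only a minimal illustration in plain Python, not an OWL encoding: every `ex:` identifier and relation label below (`ex:part_of`, `ex:carrier_of`, the class names) is a hypothetical placeholder, not a term taken from BFO, RO, or IAO.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

# Approach 1: the document and each paragraph are independent continuants,
# and the paragraphs are parts of the document itself.
# (All ex: names are hypothetical placeholders.)
approach_1: Set[Triple] = {
    ("ex:doc1", "rdf:type", "ex:IndependentContinuant"),
    ("ex:para1", "rdf:type", "ex:IndependentContinuant"),
    ("ex:para2", "rdf:type", "ex:IndependentContinuant"),
    ("ex:para1", "ex:part_of", "ex:doc1"),
    ("ex:para2", "ex:part_of", "ex:doc1"),
}

# Approach 2: the document is an independent continuant that carries a
# generically dependent continuant (its content); the paragraphs are
# generically dependent continuants that are parts of that content.
approach_2: Set[Triple] = {
    ("ex:doc1", "rdf:type", "ex:IndependentContinuant"),
    ("ex:doc1_content", "rdf:type", "ex:GenericallyDependentContinuant"),
    ("ex:doc1", "ex:carrier_of", "ex:doc1_content"),
    ("ex:para1", "rdf:type", "ex:GenericallyDependentContinuant"),
    ("ex:para2", "rdf:type", "ex:GenericallyDependentContinuant"),
    ("ex:para1", "ex:part_of", "ex:doc1_content"),
    ("ex:para2", "ex:part_of", "ex:doc1_content"),
}


def parts_of(triples: Set[Triple], whole: str) -> list:
    """Return the sorted subjects that are asserted to be part of `whole`."""
    return sorted(s for (s, p, o) in triples if p == "ex:part_of" and o == whole)


print(parts_of(approach_1, "ex:doc1"))          # paragraphs hang off the document
print(parts_of(approach_2, "ex:doc1_content"))  # paragraphs hang off the content
```

In the second sketch the physical document has no paragraph parts at all; the paragraphs are reachable only through the content it carries.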

Conceptually, I prefer the second approach. The decomposition of a document into paragraphs is a human interpretation of the document’s content, not a set of physical relationships in the same way that gears are part of a clock. The parts of a paper document are pages. A digital file doesn’t necessarily have parts that are recognizable as paragraphs.

However, if I use the second approach, I do not see how to express the classification level. The documentation for the bearer-of property (RO_0000053) states that the bearer is an independent continuant. That would preclude expressing the classification level of a paragraph that’s a generically dependent continuant. (The documentation to which I refer is in the OWL version of BFO.)

So: what is the best practice for modeling a document as a collection of paragraphs?

Chris Mungall

September 7, 2022, 19:51:11
to bfo-d...@googlegroups.com
I have no opinions on this modeling question but just wanted to clarify something

> The documentation for the bearer-of property (RO_0000053) states that the bearer is an independent continuant.

This is in RO, not BFO. It is a generalization of the BFO concept of bearer-of that allows processes and other entities to have characteristics.

Bill Duncan

September 8, 2022, 07:26:01
to bfo-d...@googlegroups.com
IAO has a high-level 'document part' term (a subtype of information content entity (ICE)).

This seems to me to be the best parent term for 'paragraph' (i.e., paragraph is a type of ICE).

As for defining the paragraph as being classified, I would say the classification level is a quality rather than a realizable entity: it is always present, not manifested only when the paragraph participates in a process (e.g., someone reading the information). There is also an argument for the classification level being a type of role, but I think the quality approach is more straightforward.

As for relating the classification quality to the paragraph, there are multiple options:

- Follow @Chris' suggestion to use the 'has characteristic' relation. This could be between the paragraph ICE and the classification quality.

- Create appropriate subtypes of IAO 'material information bearer' (http://purl.obolibrary.org/obo/IAO_0000178) that concretize the paragraph ICE and bear the classification quality.
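A rough sketch of the two options as triples, in plain Python. The identifiers RO_0000053 and IAO_0000178 are the ones cited in this thread; everything prefixed `ex:` (including `ex:concretizes`, `ex:Paragraph`, `ex:ParagraphBearer`, and `ex:ClassificationQuality`) is a hypothetical placeholder, and the helper function is only an illustration of how a query could treat both options uniformly.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

# Identifiers cited in the thread.
HAS_CHARACTERISTIC = "obo:RO_0000053"
MATERIAL_INFORMATION_BEARER = "obo:IAO_0000178"
# Hypothetical placeholder label, not an actual ontology IRI.
CONCRETIZES = "ex:concretizes"

# Option 1: the paragraph ICE is related directly to the classification quality.
option_1: Set[Triple] = {
    ("ex:para1", "rdf:type", "ex:Paragraph"),
    ("ex:secret_level", "rdf:type", "ex:ClassificationQuality"),
    ("ex:para1", HAS_CHARACTERISTIC, "ex:secret_level"),
}

# Option 2: a subtype of material information bearer concretizes the
# paragraph ICE and bears the classification quality.
option_2: Set[Triple] = {
    ("ex:para1", "rdf:type", "ex:Paragraph"),
    ("ex:ParagraphBearer", "rdfs:subClassOf", MATERIAL_INFORMATION_BEARER),
    ("ex:para1_print", "rdf:type", "ex:ParagraphBearer"),
    ("ex:para1_print", CONCRETIZES, "ex:para1"),
    ("ex:secret_level", "rdf:type", "ex:ClassificationQuality"),
    ("ex:para1_print", HAS_CHARACTERISTIC, "ex:secret_level"),
}


def classification_of(triples: Set[Triple], paragraph: str) -> list:
    """Find classification qualities attached to the paragraph directly
    or to anything that concretizes it."""
    holders = {paragraph} | {
        s for (s, p, o) in triples if p == CONCRETIZES and o == paragraph
    }
    qualities = {
        s for (s, p, o) in triples
        if p == "rdf:type" and o == "ex:ClassificationQuality"
    }
    return sorted(
        o for (s, p, o) in triples
        if p == HAS_CHARACTERISTIC and s in holders and o in qualities
    )


print(classification_of(option_1, "ex:para1"))
print(classification_of(option_2, "ex:para1"))
```

With the second option the classification belongs to a particular physical copy, so two copies of the same paragraph could in principle carry different levels; with the first it is attached to the paragraph content itself.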

Hope this helps,
Bill
