What is the best practice for modeling a document as a
collection of paragraphs?
My actual question is more specific. I am creating a
knowledge base that can contain classified documents, and I want to express the
classification level of each paragraph. I assume a classification level is a
realizable entity.
I can think of two approaches. The first is to model the
entire document, and each paragraph within the document, as independent
continuants. The paragraphs are parts of the document.
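To make this first approach concrete, here is a minimal sketch that uses plain Python tuples as stand-ins for RDF triples. The individuals (ex:doc1, ex:para1, ex:level_secret, and so on) and the readable property labels are hypothetical placeholders chosen for illustration, not identifiers taken from BFO or RO.

```python
# Approach 1: the document and its paragraphs are all independent continuants.
# Every individual name below is an illustrative placeholder.
triples = {
    ("ex:doc1", "rdf:type", "independent continuant"),
    ("ex:para1", "rdf:type", "independent continuant"),
    ("ex:para2", "rdf:type", "independent continuant"),
    ("ex:para1", "part of", "ex:doc1"),
    ("ex:para2", "part of", "ex:doc1"),
    # Each paragraph is an independent continuant, so it can bear
    # a classification level directly.
    ("ex:para1", "bearer of", "ex:level_secret"),
    ("ex:para2", "bearer of", "ex:level_unclassified"),
}


def parts_of(whole, graph):
    """Return the subjects asserted to be part of `whole`, sorted by name."""
    return sorted(s for s, p, o in graph if p == "part of" and o == whole)


print(parts_of("ex:doc1", triples))
```

In this layout the classification level attaches to each paragraph without any intermediate entity.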
The second approach is to model the entire document as an
independent continuant that is the carrier of a generically dependent
continuant. Other generically dependent continuants, each corresponding to a
paragraph, are parts of this generically dependent continuant.
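A comparable sketch of the second approach, again with plain Python tuples standing in for triples. The individuals (ex:doc1_file, ex:doc1_content, ex:para1_content, ex:para2_content) and the "carrier of" label are assumptions made for illustration only.

```python
# Approach 2: the document is an independent continuant that carries a
# generically dependent continuant (its content); paragraph contents are
# parts of that content. All individual names are illustrative placeholders.
triples = {
    ("ex:doc1_file", "rdf:type", "independent continuant"),
    ("ex:doc1_content", "rdf:type", "generically dependent continuant"),
    ("ex:para1_content", "rdf:type", "generically dependent continuant"),
    ("ex:para2_content", "rdf:type", "generically dependent continuant"),
    ("ex:doc1_file", "carrier of", "ex:doc1_content"),
    ("ex:para1_content", "part of", "ex:doc1_content"),
    ("ex:para2_content", "part of", "ex:doc1_content"),
}


def paragraphs_carried_by(carrier, graph):
    """Find the content carried by `carrier`, then the parts of that content."""
    contents = {o for s, p, o in graph if s == carrier and p == "carrier of"}
    return sorted(s for s, p, o in graph if p == "part of" and o in contents)


print(paragraphs_carried_by("ex:doc1_file", triples))
```

Note that no classification level is attached here; that is exactly the open question.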
Conceptually, I prefer the second approach. The decomposition of a
document into paragraphs is a human interpretation of the document’s content,
not a physical part-whole relationship of the kind that holds between gears and a
clock. The physical parts of a paper document are pages, and a digital file doesn’t
necessarily have parts that are recognizable as paragraphs.
However, if I use the second approach, I do not see how to
express the classification level. The documentation for the bearer-of property (RO_0000053)
states that the bearer is an independent continuant. That would preclude
expressing the classification level of a paragraph that’s a generically
dependent continuant. (The documentation to which I refer is in the OWL version of BFO.)
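The conflict can be sketched as a simple check. This is only an approximation: an OWL domain axiom does not reject an assertion but lets a reasoner infer that the subject is an independent continuant, and the problem only surfaces as an inconsistency if that class is disjoint from generically dependent continuant, which I am assuming here. The individual names and the function bearer_of_conflicts are hypothetical.

```python
INDEPENDENT = "independent continuant"
GENERICALLY_DEPENDENT = "generically dependent continuant"

# Asserted types for the illustrative individuals (placeholders).
types = {
    "ex:doc1_file": INDEPENDENT,
    "ex:para1_content": GENERICALLY_DEPENDENT,
}

assertions = [
    ("ex:doc1_file", "bearer of", "ex:level_secret"),
    ("ex:para1_content", "bearer of", "ex:level_secret"),
]


def bearer_of_conflicts(assertions, types):
    """List (subject, object) pairs whose subject is not typed as an
    independent continuant, i.e. where the documented domain of
    bearer-of would clash with the asserted type."""
    return [
        (s, o)
        for s, p, o in assertions
        if p == "bearer of" and types.get(s) != INDEPENDENT
    ]


print(bearer_of_conflicts(assertions, types))
```

Only the paragraph-level assertion is flagged, which is the case I need to be able to express.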
So: what is the best practice for modeling a document as a
collection of paragraphs?