
On Aug 11, 2020, at 10:57 AM, Fan Li <lifa...@gmail.com> wrote:
Suppose I am dealing with financial data related to two retail companies, "W" and "T", each of which has a set of departments.
"W":
- Apparel
- Electronics
- Sports
"T":
- Clothes
- Electronics
- Sports & Recreation
As you can see, departments from different companies may have the same name.

I have modeled the companies and the departments as instances of Organizations and OrganizationalUnits, with relations similar to those in the W3C Organization Ontology (the specific ontology choice probably doesn't matter here).

Now I have the following financial data to import into EDG:
<Annotation 2020-08-11 105105.png>

I am using the "pattern-based" import strategy and trying to match the company and department entities based on their labels. The challenge I am facing is that some department names are ambiguous (e.g. "Electronics"), and EDG would, understandably, get it wrong sometimes.

I am asking the community: what is the best practice for solving this problem? Thank you!
--
You received this message because you are subscribed to the Google Groups "TopBraid Suite Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to topbraid-user...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/topbraid-users/3aed4890-f945-4bbb-bb9d-3bcc52e551dcn%40googlegroups.com.
On 12/08/2020 02:32, Irene Polikoff wrote:
For this to work, the spreadsheet will need to have a column with these concatenated values.
As for the data in EDG, you can use a SHACL rule to “infer” property values by concatenating the company name with the department name. While the rule could be defined as a property value rule that infers a value on demand, I don’t know whether such dynamic inference would work with the import matching logic. You could confirm it by trying, but I suspect that for the match to work, you will need to execute the rules and store the results before running the import. Use the Transform tab to do this.
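For the spreadsheet side, the concatenated column can be added before the import. The following is only a minimal Python sketch of that preprocessing step, not anything EDG provides: the column names (Company, Department, Revenue, MatchKey), the separator, and the sample figures are assumptions made for illustration, since the actual spreadsheet layout is only visible in the attached screenshot.

```python
import csv
import io

# Hypothetical separator; any string that cannot occur in the labels will do.
SEPARATOR = " / "


def add_match_key(csv_text, company_col="Company", dept_col="Department",
                  key_col="MatchKey"):
    """Return CSV text with an extra column that concatenates the
    company name and the department name of each row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames + [key_col],
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        company = row[company_col].strip()
        department = row[dept_col].strip()
        row[key_col] = f"{company}{SEPARATOR}{department}"
        writer.writerow(row)
    return out.getvalue()


# Made-up sample rows; the real column names and values will differ.
sample = (
    "Company,Department,Revenue\n"
    "W,Electronics,100\n"
    "T,Electronics,250\n"
)
print(add_match_key(sample))
```

The value stored on the department resources in EDG would have to be built with exactly the same separator and spelling, otherwise the label match will still fail.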
The out-of-the-box spreadsheet importer of EDG will not use inferred values for matching (and doing so might be slow). As Irene said, you would first need to materialize the inferences.
If you are somewhat familiar with JavaScript, then 6.4 offers more
flexibility for spreadsheet importing, including the ability to
use inferred values and to build helper data structures for
efficient matching. See
http://www.datashapes.org/active/import.html
and in particular the demo video linked from that page
https://www.youtube.com/watch?v=Dn7O8siZpTc&feature=youtu.be
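To illustrate what a helper data structure for matching could look like, here is a rough sketch of the idea in Python. It is not the JavaScript import API described on the page above: the function names, the ex: identifiers, and the in-memory department list are hypothetical. It builds a lookup table keyed by the (company, department) pair once, so each spreadsheet row is resolved with a single dictionary lookup, and ambiguous or missing keys are reported instead of guessed.

```python
from collections import defaultdict

# Hypothetical stand-ins for the department resources:
# (identifier, company label, department label).
DEPARTMENTS = [
    ("ex:W-Apparel", "W", "Apparel"),
    ("ex:W-Electronics", "W", "Electronics"),
    ("ex:W-Sports", "W", "Sports"),
    ("ex:T-Clothes", "T", "Clothes"),
    ("ex:T-Electronics", "T", "Electronics"),
    ("ex:T-SportsRecreation", "T", "Sports & Recreation"),
]


def normalize(label):
    """Trim whitespace and ignore case when comparing labels."""
    return label.strip().casefold()


def build_index(departments):
    """Map each (company, department) label pair to the matching identifiers."""
    index = defaultdict(list)
    for identifier, company, department in departments:
        index[(normalize(company), normalize(department))].append(identifier)
    return index


def resolve(index, company, department):
    """Return the single department matching the pair, or raise LookupError."""
    matches = index.get((normalize(company), normalize(department)), [])
    if len(matches) != 1:
        raise LookupError(
            f"{len(matches)} matches for company={company!r}, "
            f"department={department!r}"
        )
    return matches[0]


index = build_index(DEPARTMENTS)
print(resolve(index, "W", "Electronics"))
print(resolve(index, "T", "Electronics"))
```

Keying on the pair rather than on the department label alone is what removes the "Electronics" ambiguity; raising an error on zero or multiple matches keeps a bad row from being silently attached to the wrong department.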
Holger