Bug in ARC2 when the same URI is used more than 7 times

Knurg

Mar 2, 2011, 8:45:09 AM
to arc-dev
Some time ago I ran into a bug that I also reported on the old mailing list.
Please try the following in your ARC2 SPARQL endpoint:
SELECT * WHERE {
?x1 rdf:type <http://erlangen-crm.org/101001/E84_Information_Carrier> .
?x1 <http://erlangen-crm.org/101001/P108i_was_produced_by> ?x2 .
?x2 rdf:type <http://erlangen-crm.org/101001/E12_Production> .
?x2 <http://erlangen-crm.org/101001/P14_carried_out_by> ?x3 .
?x3 rdf:type <http://erlangen-crm.org/101001/E21_Person> .
?x3 <http://erlangen-crm.org/101001/P1_is_identified_by> ?x4 .
?x4 rdf:type <http://erlangen-crm.org/101001/E82_Actor_Appellation> .
?x5 rdf:type <http://erlangen-crm.org/101001/E84_Information_Carrier> .
?x5 <http://erlangen-crm.org/101001/P108i_was_produced_by> ?x6 .
?x6 rdf:type <http://erlangen-crm.org/101001/E12_Production> .
?x6 <http://erlangen-crm.org/101001/P14_carried_out_by> ?x7 .
?x7 rdf:type <http://erlangen-crm.org/101001/E21_Person> .
?x7 <http://erlangen-crm.org/101001/P1_is_identified_by> ?x8 .
?x8 rdf:type <http://erlangen-crm.org/101001/E82_Actor_Appellation> .
}
LIMIT 10

You don't need to have the corresponding triples in your store. The
ARC2_StoreSelectQueryHandler quits with the following messages:
Not all patterns could be rewritten to SQL JOINs in ARC2_StoreSelectQueryHandler
Unknown column 'T_0_0_0.s' in 'field list' via ARC2_StoreSelectQueryHandler

I tried to find out why this happens. The query works fine if it is
executed exactly as written. However, bengee seems to try to optimize
queries and therefore changes the indexes in function getSQL()
(line 56). The problem is that at line 72 the code tries five times to
find a better query. If no better query is found, the best one found so
far is used.

As you can see, this part of the current code causes problems. I
therefore suggest removing lines 72 to 87 in ARC2_StoreSelectQueryHandler.
Another solution would be to try the rearranging and fall back to the
original version if an error occurs. However, I tried this and the
resulting query returns the wrong result, because line 77 prefers shorter
versions of the query when the number of dependencies is the same.
A "wrong" query is always shorter, so it gets selected and gives wrong
results.

Sincerely,

Mark Fichtner

Knurg

Mar 4, 2011, 5:04:46 AM
to arc-dev
Upon doing some more research we found the source of the problem.

Lines 619 and 620 of ARC2_StoreSelectQueryHandler:
$deps[$id]['rank'] += ($id != $other_id) && preg_match('/' . $other_id . '/', $code) ? 1 : 0;
$deps[$id][$other_id] = ($id != $other_id) && preg_match('/' . $other_id . '/', $code) ? 1 : 0;
The preg_match calls must be:
$deps[$id]['rank'] += ($id != $other_id) && preg_match('/' . $other_id . '\\D/', $code) ? 1 : 0;
$deps[$id][$other_id] = ($id != $other_id) && preg_match('/' . $other_id . '\\D/', $code) ? 1 : 0;
because otherwise T_0_0_1 matches T_0_0_10 or T_0_0_13, although there
is no connection between them.
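
The prefix collision described above can be reproduced outside ARC2 with a few lines of PHP. This is only a minimal sketch: the helper referencesAlias() and the SQL fragments are hypothetical and made up for illustration, they are not taken from ARC2. It also shows a lookahead variant, (?!\d), as one possible alternative to \D. The two behave differently only when the alias is the very last thing in the string; whether ARC2 ever generates such a fragment is an assumption that is not checked here.

```php
<?php
// Hypothetical helper (not part of ARC2): reports whether $code mentions the
// table alias $alias. $suffix is appended to the pattern after the alias.
function referencesAlias($code, $alias, $suffix = '')
{
    return preg_match('/' . preg_quote($alias, '/') . $suffix . '/', $code) === 1;
}

// Made-up SQL fragment that mentions T_0_0_10 but never T_0_0_1.
$code = 'T_0_0_10.s = T_0_0_9.o';

var_dump(referencesAlias($code, 'T_0_0_1'));            // bool(true): false positive
var_dump(referencesAlias($code, 'T_0_0_1', '\\D'));     // bool(false): fix from this thread
var_dump(referencesAlias($code, 'T_0_0_1', '(?!\\d)')); // bool(false): lookahead variant

// Made-up fragment that ends with the alias: \D needs one more character
// to match, while the lookahead only requires that no digit follows.
$tail = 'T_0_0_3.o = T_0_0_1';

var_dump(referencesAlias($tail, 'T_0_0_1', '\\D'));     // bool(false)
var_dump(referencesAlias($tail, 'T_0_0_1', '(?!\\d)')); // bool(true)
```

In the generated SQL an alias is normally followed by a dot and a column name, which is why the \D fix is sufficient for the case reported here.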

Sincerely,

Mark Fichtner

zd

Mar 11, 2011, 7:19:07 PM
to arc-dev
Hello,

Has there been a patch applied or a release in which this has been
fixed?

Mark Fichtner

Mar 12, 2011, 11:01:06 AM
to arc...@googlegroups.com
Hello,

it has been fixed here:
https://github.com/Knurg/arc2/commit/6659da00256d5ad7897a371fb5ca94b50a80c977

sincerely,

Mark Fichtner
-------- Original Message --------
> Date: Fri, 11 Mar 2011 16:19:07 -0800 (PST)
> From: zd <perfec...@gmail.com>
> To: arc-dev <arc...@googlegroups.com>
> Subject: [arc-dev] Re: Bug in ARC2 when the same URI is used more than 7 times

Olivier Berger

Mar 19, 2011, 4:07:50 AM
to arc...@googlegroups.com
Hi.

On Saturday, 12 March 2011 at 17:01 +0100, Mark Fichtner wrote:
> Hello,
>
> it has been fixed here:
> https://github.com/Knurg/arc2/commit/6659da00256d5ad7897a371fb5ca94b50a80c977

May I suggest tracking such bugs and fixes through
https://github.com/semsol/arc2/issues ?

Now that development by the original author has stopped, it would be
great to avoid fragmenting completely, and at least keep track of useful
patches...

Any candidate for a new maintainer to merge these into a single repo?

Best regards,
--
Olivier BERGER <olivier...@it-sudparis.eu>
http://www-public.it-sudparis.eu/~berger_o/ - OpenPGP-Id: 2048R/5819D7E8
Research Engineer - Dept INF
Institut TELECOM, SudParis (http://www.it-sudparis.eu/), Evry (France)

Stéphane Corlosquet

Mar 23, 2011, 9:49:52 PM
to arc...@googlegroups.com, Olivier Berger
On Sat, Mar 19, 2011 at 4:07 AM, Olivier Berger <olivier...@it-sudparis.eu> wrote:
> Hi.
>
> On Saturday, 12 March 2011 at 17:01 +0100, Mark Fichtner wrote:
> > Hello,
> >
> > it has been fixed here:
> > https://github.com/Knurg/arc2/commit/6659da00256d5ad7897a371fb5ca94b50a80c977
>
> May I suggest tracking such bugs and fixes through
> https://github.com/semsol/arc2/issues ?

Yes, this is a good idea, and people can link to the branch where they are fixing the bug, so it can be tested and maybe pulled into the main repo.

> Now that development by the original author has stopped, it would be
> great to avoid fragmenting completely, and at least keep track of useful
> patches...
>
> Any candidate for a new maintainer to merge these into a single repo?

This question has come up several times now that the code has moved to GitHub:

1. Before we start modifying the main branch, we really ought to have some solid unit tests to make sure patches don't break anything else in ARC2. I remember Bengee had some tests on his local machine, but I don't think he posted them anywhere, so we might want to start from scratch. Is anyone familiar with unit testing for RDF libraries? Maybe we could look at how other libraries do it, and make use of the test suites available for each spec (I know the SPARQL spec has some, for example).

2. Since we're using GitHub, we can use its collaborative features, like leaving comments under commits, such as this message Bengee left [1]. This way we could "vote" on whether a commit should be merged into the main repository or not.

Does anyone else have other ideas on how to move forward?

Steph.
