$ python clonedigger.py /tmp/example1.py
Parsing /tmp/example1.py ... done
5 sequences
average sequence length: 4.400000
maximum sequence length: 9
Number of statements: 22
Calculating size for each statement... done
Building statement hash... done
Number of different hash values: 3
Building patterns... 4 patterns were discovered
Choosing pattern for each statement... done
Finding similar sequences of statements... 18 sequences were found
Refining candidates... 0 clones were found
Removing dominated clones... 0 clones were removed
I have also tried a variation of command-line options:
$ python clonedigger.py --fast /tmp/example1.py
Parsing /tmp/example1.py ... done
5 sequences
average sequence length: 4.400000
maximum sequence length: 9
Number of statements: 22
Calculating size for each statement... done
Building statement hash... done
Number of different hash values: 3
Marking each statement with its hash value
Finding similar sequences of statements... 19 sequences were found
Refining candidates... 0 clones were found
Removing dominated clones... 0 clones were removed
$ python clonedigger.py --clusterize-using-dcup --hashing-depth=0 /tmp/example1.py
Parsing /tmp/example1.py ... done
5 sequences
average sequence length: 4.400000
maximum sequence length: 9
Number of statements: 22
Calculating size for each statement... done
Building statement hash... done
Number of different hash values: 3
Marking each statement with its hash value
Finding similar sequences of statements... 19 sequences were found
Refining candidates... 0 clones were found
Removing dominated clones... 0 clones were removed
I have also tried to run clonedigger against a variation of the
original example:
The logs for running "python clonedigger.py", "python clonedigger.py --fast"
and "python clonedigger.py --clusterize-using-dcup --hashing-depth=0"
against this code snippet are identical to the ones above:
0 clones are found.
I wonder why clonedigger is failing to detect a clone here, since I'm
pretty sure that the definition "two sequences of statements form a
clone if one of them can be obtained from the other by replacing some
subtrees" applies to these code snippets.
Peter Bulychev
Sep 8, 2008, 3:58:32 AM
to clonedigg...@googlegroups.com
Hi.
> I wonder why clonedigger is failing to detect a clone here, since I'm
> pretty sure that the definition "two sequences of statements form a
> clone if one of them can be obtained from the other by replacing some
> subtrees" applies to these code snippets.
There are also thresholds. CD looks for clone pairs that are both large enough and similar enough. Here you have a lot of differences in names and constants.
If you run it with the --distance-threshold=30 argument (or something like that), you will probably detect this clone.
-- Best regards, Peter Bulychev.
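To illustrate why differences in names and constants can push a pair past a threshold, here is a toy sketch. It is not Clone Digger's algorithm: the functions `count_leaf_differences` and `is_reported` and the two snippets are made up for illustration, and the "distance" is simply the number of differing names and constants between two snippets of the same shape.

```python
import ast


def count_leaf_differences(src_a: str, src_b: str) -> int:
    """Count positions where two same-shaped snippets differ in a name or constant.

    Toy measure for illustration only; Clone Digger computes its own distance.
    """
    nodes_a = list(ast.walk(ast.parse(src_a)))
    nodes_b = list(ast.walk(ast.parse(src_b)))
    # Only snippets whose syntax trees have the same shape are comparable here.
    if [type(n) for n in nodes_a] != [type(n) for n in nodes_b]:
        raise ValueError("snippets do not share the same tree shape")
    diffs = 0
    for a, b in zip(nodes_a, nodes_b):
        if isinstance(a, ast.Name) and a.id != b.id:
            diffs += 1
        elif isinstance(a, ast.Constant) and a.value != b.value:
            diffs += 1
    return diffs


def is_reported(src_a: str, src_b: str, threshold: int) -> bool:
    """A pair counts as a clone only if its toy distance is within the threshold."""
    return count_leaf_differences(src_a, src_b) <= threshold


# Hypothetical snippets: same structure, different names and constants.
first = "total = price * 3\nprint(total)\n"
second = "amount = cost * 7\nprint(amount)\n"
```

With these two snippets the toy distance is 4, so `is_reported(first, second, 3)` is false while `is_reported(first, second, 4)` is true. In this sketch, raising the threshold is what turns a structurally identical pair into a reported clone.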
zpcspm
Sep 8, 2008, 4:23:17 AM
to Clone Digger general
On Sep 8, 10:58 am, "Peter Bulychev" <peter.bulyc...@gmail.com> wrote:
> If you run it with --distance-threshold=30 (or smth like this) argument,
> probably you'll detect this clone.
This works, thank you. CD is reporting false positives (like the help
promises), but fortunately the first clone is the biggest one. So
after a code refactoring all those false positives would vanish.
Peter Bulychev
Sep 8, 2008, 4:25:31 AM
to Clone Digger general
I've just realized that many future threads of this group could be
summarized into typical Frequently Asked Questions. Peter, what do you
think about the idea of CD having a FAQ file?
Summary for this thread:
Q: I'm running CloneDigger against a code snippet and it detects 0
clones, even though I can see that there are clones in the code.
A: To make CloneDigger detect more clones, try variations of the
command-line options. Try
"clonedigger.py --clusterize-using-dcup --hashing-depth=0".
If this doesn't help, also add the --distance-threshold option with an
explicit value bigger than the default one. Increase the value of
--distance-threshold until you are satisfied with the result.
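The "increase until satisfied" step can be scripted. The sketch below only builds the command lines, using the options quoted in this thread. The helper names `build_commands` and `run_all` and the threshold values 10 and 20 are assumptions for illustration (30 is the value suggested above), and the script is assumed to be run from the directory that contains clonedigger.py.

```python
import subprocess


def build_commands(target, thresholds, script="clonedigger.py"):
    """Return one clonedigger command line per distance threshold, in order."""
    commands = []
    for value in thresholds:
        commands.append([
            "python", script,
            "--clusterize-using-dcup",
            "--hashing-depth=0",
            "--distance-threshold=%d" % value,
            target,
        ])
    return commands


def run_all(commands):
    """Run each command in turn, echoing it first so the logs can be compared."""
    for cmd in commands:
        print("$", " ".join(cmd))
        # check=False: keep going even if one run exits with an error.
        subprocess.run(cmd, check=False)


# Hypothetical threshold ladder; stop at the first value whose report looks right.
commands = build_commands("/tmp/example1.py", [10, 20, 30])
```

Calling `run_all(commands)` executes the runs one after another. Since higher thresholds also report more false positives, as noted earlier in the thread, it is worth stopping at the lowest value that finds the clone you expect.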
Peter Bulychev
Sep 8, 2008, 4:38:50 AM
to clonedigg...@googlegroups.com
Good idea.
Hopefully I'll do that later, and your Q&A will surely be there :)