LCSubstringSolver usage: how to return indexes and matched length


Haluk Dogan

Mar 20, 2014, 4:39:27 AM
to concurrent-t...@googlegroups.com
Hi all,

Let's say I have a reference string and a query string, and I want to get the longest common substring as a start index in the reference plus the matched length.

For instance:

String reference = "GCAAAACAAAGT";
String query = "AAAGGCAAAATA";

So, longestCommonSubstring is "GCAAAA", and I want the result as (0,6).

Is there any way to return the start index?

Thanks.

Niall Gallagher

Mar 25, 2014, 1:10:46 PM
to concurrent-t...@googlegroups.com
Hi Haluk,

You can get the length of the common substring by calling CharSequence.length() on the object returned :)

Regarding the start index: no, it is not possible to return that, because there would be multiple start indexes! Say you had 5 documents: the common substring may occur at a different start index in each of them. The tree does not actually store start indexes, so currently the only way to find them is to take the common substring once you know it and scan each document for its offset.
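
The scan described above could look roughly like the sketch below. It uses only the JDK; the class name SubstringOffsets and the helper findOffsets are made up for illustration and are not part of concurrent-trees. The common substring is hard-coded here and is assumed to be whatever LCSubstringSolver returned for the two example strings.

```java
import java.util.ArrayList;
import java.util.List;

public class SubstringOffsets {

    // Hypothetical helper: returns every start index at which `common`
    // occurs in `document`, including overlapping occurrences.
    public static List<Integer> findOffsets(String document, CharSequence common) {
        List<Integer> offsets = new ArrayList<>();
        String needle = common.toString();
        if (needle.isEmpty()) {
            return offsets; // no common substring, nothing to locate
        }
        int idx = document.indexOf(needle);
        while (idx >= 0) {
            offsets.add(idx);
            // Resume one character later so overlapping matches are found too.
            idx = document.indexOf(needle, idx + 1);
        }
        return offsets;
    }

    public static void main(String[] args) {
        String reference = "GCAAAACAAAGT";
        String query = "AAAGGCAAAATA";

        // Assumption: this is the value the solver returned for the two
        // strings above; it is hard-coded to keep the sketch self-contained.
        CharSequence common = "GCAAAA";

        System.out.println("reference offsets: " + findOffsets(reference, common));
        System.out.println("query offsets:     " + findOffsets(query, common));
        System.out.println("matched length:    " + common.length());
    }
}
```

For the example strings this gives start index 0 in the reference and 4 in the query, with a matched length of 6, i.e. the (0,6) pair asked for. Each scan is a plain String.indexOf loop, so the cost grows with the number and size of the documents.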

It might be possible to enhance LCSubstringSolver to store the suffix offset in the tree in addition to the document reference, but I have not thought about that in detail.

Hope that helps,
Niall

