Hi I am using neo4j spatial to find out locations near by a lat/long. The results returned are sorted on OrthodromicDistance and are further paginated using the range function (100 results per page). The code for this is
Now to iterate over this GeoPipline, I am using hasNext() / next() functions but for some reason the first call for either of these functions takes long time to execute.
When the application was run, first call to next() or hasNext() took approximately 400 sec to execute, subsequent 99 calls took 0ms.
Can someone please point out where the mistake is and if there is a faster way to find the nearest places. The layer has 10 million entries.
Hi there,
the first run is probably with cold caches. In real world scenarios,
you are running with warm caches, so you should try to warm up the
database by doing a few searches before the real work, or maybe loop
through your interesting nodes or so?
<avdeshpa...@gmail.com> wrote:
> Hi
> I am using neo4j spatial to find out locations near by a lat/long. The
> results returned are sorted on OrthodromicDistance and are further paginated
> using the range function (100 results per page). The code for this is
> Now to iterate over this GeoPipline, I am using hasNext() / next() functions
> but for some reason the first call for either of these functions takes long
> time to execute.
> When the application was run, first call to next() or hasNext() took
> approximately 400 sec to execute, subsequent 99 calls took 0ms.
> Can someone please point out where the mistake is and if there is a faster
> way to find the nearest places. The layer has 10 million entries.
Actually this is not related to the very first run of query where the cache is empty.
The problem is that when ever I try to search places and request for 100 places per page in response, first call to Pipeline.hasNext() / Pipeline.next() always takes about 400 sec. time and this happens for every request.
Hope I am able to convey the problem that I am facing.
On Monday, 15 October 2012 18:35:04 UTC+5:30, Peter Neubauer wrote:
> Hi there, > the first run is probably with cold caches. In real world scenarios, > you are running with warm caches, so you should try to warm up the > database by doing a few searches before the real work, or maybe loop > through your interesting nodes or so?
> On Mon, Oct 15, 2012 at 2:10 PM, Abhijeet Deshpande > <avdes...@gmail.com <javascript:>> wrote: > > Hi > > I am using neo4j spatial to find out locations near by a lat/long. The > > results returned are sorted on OrthodromicDistance and are further > paginated > > using the range function (100 results per page). The code for this is
> > Now to iterate over this GeoPipline, I am using hasNext() / next() > functions > > but for some reason the first call for either of these functions takes > long > > time to execute.
> > When the application was run, first call to next() or hasNext() took > > approximately 400 sec to execute, subsequent 99 calls took 0ms.
> > Can someone please point out where the mistake is and if there is a > faster > > way to find the nearest places. The layer has 10 million entries.
I can suggest that since you have a sorting component in the pipeline, this
will cause the pipeline to internally cache the entire resultset on the
first hasNext or next call, sort it and return the first sorted result.
Subsequent calls will access the internal, sorted, cache. This is a
necessary consequence of the sorting algorithm, and is mostly unavoidable.
So avoid this, you would need to take away the sort. Is it possible to get
the same results you want without the sort? Or at least by moving the sort
into your own code, or perhaps after a filter or some other action that
reduces the resultset size?
On Fri, Oct 19, 2012 at 2:13 PM, Abhijeet Deshpande
<avdeshpa...@gmail.com>wrote:
> Actually this is not related to the very first run of query where the
> cache is empty.
> The problem is that when ever I try to search places and request for 100
> places per page in response, first call to Pipeline.hasNext() /
> Pipeline.next() always takes about 400 sec. time and this happens for every
> request.
> Hope I am able to convey the problem that I am facing.
> Regards
> Abhijeet
> On Monday, 15 October 2012 18:35:04 UTC+5:30, Peter Neubauer wrote:
>> Hi there,
>> the first run is probably with cold caches. In real world scenarios,
>> you are running with warm caches, so you should try to warm up the
>> database by doing a few searches before the real work, or maybe loop
>> through your interesting nodes or so?
>> On Mon, Oct 15, 2012 at 2:10 PM, Abhijeet Deshpande
>> <avdes...@gmail.com> wrote:
>> > Hi
>> > I am using neo4j spatial to find out locations near by a lat/long. The
>> > results returned are sorted on OrthodromicDistance and are further
>> paginated
>> > using the range function (100 results per page). The code for this is
>> > Now to iterate over this GeoPipline, I am using hasNext() / next()
>> functions
>> > but for some reason the first call for either of these functions takes
>> long
>> > time to execute.
>> > When the application was run, first call to next() or hasNext() took
>> > approximately 400 sec to execute, subsequent 99 calls took 0ms.
>> > Can someone please point out where the mistake is and if there is a
>> faster
>> > way to find the nearest places. The layer has 10 million entries.
Mmh,
I think the whole query is re-issued when you request the next page.
in order to avoid that, could you hold the result in cache on the
server side for the next request and page it on your own?
<avdeshpa...@gmail.com> wrote:
> Hi Peter,
> Thank you for the response.
> Actually this is not related to the very first run of query where the cache
> is empty.
> The problem is that when ever I try to search places and request for 100
> places per page in response, first call to Pipeline.hasNext() /
> Pipeline.next() always takes about 400 sec. time and this happens for every
> request.
> Hope I am able to convey the problem that I am facing.
> Regards
> Abhijeet
> On Monday, 15 October 2012 18:35:04 UTC+5:30, Peter Neubauer wrote:
>> Hi there,
>> the first run is probably with cold caches. In real world scenarios,
>> you are running with warm caches, so you should try to warm up the
>> database by doing a few searches before the real work, or maybe loop
>> through your interesting nodes or so?
>> On Mon, Oct 15, 2012 at 2:10 PM, Abhijeet Deshpande
>> <avdes...@gmail.com> wrote:
>> > Hi
>> > I am using neo4j spatial to find out locations near by a lat/long. The
>> > results returned are sorted on OrthodromicDistance and are further
>> > paginated
>> > using the range function (100 results per page). The code for this is
>> > Now to iterate over this GeoPipline, I am using hasNext() / next()
>> > functions
>> > but for some reason the first call for either of these functions takes
>> > long
>> > time to execute.
>> > When the application was run, first call to next() or hasNext() took
>> > approximately 400 sec to execute, subsequent 99 calls took 0ms.
>> > Can someone please point out where the mistake is and if there is a
>> > faster
>> > way to find the nearest places. The layer has 10 million entries.
Removing sorting helped quite a lot and I also reduced the result size from 100 to 10. Now the response times have come down to a few milliseconds but the downside is that the results are not sorted upon the distance.
The objective is to get 10 places per page starting with the closest one. To implement this I tried following GeoPipeline functions
1. startNearestNeighborSearch(layer, point, distance) and 2, startNearestNeighborLatLonSearch(layer, point, distance)
but none of them returned the results sorted on distance even if the function names seem to suggest so. As per the function definition, no additional sort filters are needed so please let me know if I am missing something.
I also observed that the results of startNearestNeighborLatLonSearch are better sorted than startNearestNeighborSearch and think it has got something to do with the propertyFilter that these functions use. I would like to know more on "OrthodromicDistance" and "Distance" properties and the difference between their respecitive property filters.
On Friday, 19 October 2012 19:04:56 UTC+5:30, Craig Taverner wrote:
> I can suggest that since you have a sorting component in the pipeline, > this will cause the pipeline to internally cache the entire resultset on > the first hasNext or next call, sort it and return the first sorted result. > Subsequent calls will access the internal, sorted, cache. This is a > necessary consequence of the sorting algorithm, and is mostly unavoidable.
> So avoid this, you would need to take away the sort. Is it possible to get > the same results you want without the sort? Or at least by moving the sort > into your own code, or perhaps after a filter or some other action that > reduces the resultset size?
> On Fri, Oct 19, 2012 at 2:13 PM, Abhijeet Deshpande <avdes...@gmail.com<javascript:> > > wrote:
>> Hi Peter, >> Thank you for the response.
>> Actually this is not related to the very first run of query where the >> cache is empty.
>> The problem is that when ever I try to search places and request for 100 >> places per page in response, first call to Pipeline.hasNext() / >> Pipeline.next() always takes about 400 sec. time and this happens for every >> request.
>> Hope I am able to convey the problem that I am facing.
>> Regards >> Abhijeet
>> On Monday, 15 October 2012 18:35:04 UTC+5:30, Peter Neubauer wrote:
>>> Hi there, >>> the first run is probably with cold caches. In real world scenarios, >>> you are running with warm caches, so you should try to warm up the >>> database by doing a few searches before the real work, or maybe loop >>> through your interesting nodes or so?
>>> On Mon, Oct 15, 2012 at 2:10 PM, Abhijeet Deshpande >>> <avdes...@gmail.com> wrote: >>> > Hi >>> > I am using neo4j spatial to find out locations near by a lat/long. The >>> > results returned are sorted on OrthodromicDistance and are further >>> paginated >>> > using the range function (100 results per page). The code for this is
>>> > Now to iterate over this GeoPipline, I am using hasNext() / next() >>> functions >>> > but for some reason the first call for either of these functions takes >>> long >>> > time to execute.
>>> > When the application was run, first call to next() or hasNext() took >>> > approximately 400 sec to execute, subsequent 99 calls took 0ms.
>>> > Can someone please point out where the mistake is and if there is a >>> faster >>> > way to find the nearest places. The layer has 10 million entries.
Hi As specified below, can someone please help me understand why startNearestNeighborSearch / startNearestNeighborLatLonSearch functions don't return the results sorted on the distance. I have read some documentation which suggests that these functions return the results sorted on distance starting with the nearest one.
> On Tuesday, 23 October 2012 01:11:02 UTC+5:30, Abhijeet Deshpande wrote:
> Hi Craig, Peter
> Thank you for the inputs.
> Removing sorting helped quite a lot and I also reduced the result size > from 100 to 10. Now the response times have come down to a few milliseconds > but the downside is that the results are not sorted upon the distance.
> The objective is to get 10 places per page starting with the closest one. > To implement this I tried following GeoPipeline functions
> 1. startNearestNeighborSearch(layer, point, distance) and > 2, startNearestNeighborLatLonSearch(layer, point, distance)
> but none of them returned the results sorted on distance even if the > function names seem to suggest so. As per the function definition, no > additional sort filters are needed so please let me know if I am missing > something.
> I also observed that the results of startNearestNeighborLatLonSearch are > better sorted than startNearestNeighborSearch and think it has got > something to do with the propertyFilter that these functions use. I would > like to know more on "OrthodromicDistance" and "Distance" properties and > the difference between their respecitive property filters.
> Regards > Abhijeet
> On Friday, 19 October 2012 19:04:56 UTC+5:30, Craig Taverner wrote:
>> I can suggest that since you have a sorting component in the pipeline, >> this will cause the pipeline to internally cache the entire resultset on >> the first hasNext or next call, sort it and return the first sorted result. >> Subsequent calls will access the internal, sorted, cache. This is a >> necessary consequence of the sorting algorithm, and is mostly unavoidable.
>> So avoid this, you would need to take away the sort. Is it possible to >> get the same results you want without the sort? Or at least by moving the >> sort into your own code, or perhaps after a filter or some other action >> that reduces the resultset size?
>> On Fri, Oct 19, 2012 at 2:13 PM, Abhijeet Deshpande <avdes...@gmail.com>wrote:
>>> Hi Peter, >>> Thank you for the response.
>>> Actually this is not related to the very first run of query where the >>> cache is empty.
>>> The problem is that when ever I try to search places and request for 100 >>> places per page in response, first call to Pipeline.hasNext() / >>> Pipeline.next() always takes about 400 sec. time and this happens for every >>> request.
>>> Hope I am able to convey the problem that I am facing.
>>> Regards >>> Abhijeet
>>> On Monday, 15 October 2012 18:35:04 UTC+5:30, Peter Neubauer wrote:
>>>> Hi there, >>>> the first run is probably with cold caches. In real world scenarios, >>>> you are running with warm caches, so you should try to warm up the >>>> database by doing a few searches before the real work, or maybe loop >>>> through your interesting nodes or so?
>>>> On Mon, Oct 15, 2012 at 2:10 PM, Abhijeet Deshpande >>>> <avdes...@gmail.com> wrote: >>>> > Hi >>>> > I am using neo4j spatial to find out locations near by a lat/long. >>>> The >>>> > results returned are sorted on OrthodromicDistance and are further >>>> paginated >>>> > using the range function (100 results per page). The code for this is
>>>> > Now to iterate over this GeoPipline, I am using hasNext() / next() >>>> functions >>>> > but for some reason the first call for either of these functions >>>> takes long >>>> > time to execute.
>>>> > When the application was run, first call to next() or hasNext() took >>>> > approximately 400 sec to execute, subsequent 99 calls took 0ms.
>>>> > Can someone please point out where the mistake is and if there is a >>>> faster >>>> > way to find the nearest places. The layer has 10 million entries.
Abhijeet,
from the docs and implementation, there is no sorting going on here.
Instead, all of the returned points satisfy the bonuding box you are
requesting
/**
* Extracts Layer items with a distance from the given point that is
less than or equal the given distance.
*
* @param layer with latitude, longitude coordinates
* @param point
* @param maxDistanceInKm
* @return geoPipeline
*/
public static GeoPipeline startNearestNeighborLatLonSearch(Layer
layer, Coordinate point, double maxDistanceInKm) {
Envelope searchWindow =
OrthodromicDistance.suggestSearchWindow(point, maxDistanceInKm);
GeoPipeline pipeline = start(layer, new SearchIntersectWindow(layer,
searchWindow))
.calculateOrthodromicDistance(point);
/**
* Calculates the distance between Layer items nearest to the given
point and the given point.
* The search window created is based on Layer items density and it
could lead to no results.
*
* @param layer
* @param point
* @param numberOfItemsToFind tries to find this number of items
for comparison
* @return geoPipeline
*/
public static GeoPipeline startNearestNeighborSearch(Layer layer,
Coordinate point, int numberOfItemsToFind) {
Envelope searchWindow =
SpatialTopologyUtils.createEnvelopeForGeometryDensityEstimate(layer,
point, numberOfItemsToFind);
return startNearestNeighborSearch(layer, point, searchWindow);
}
For this to happen, you could either contribute a
SortingNearestNeighborSearch (that does the sorting for you) or have a
second sort step?
<avdeshpa...@gmail.com> wrote:
> Hi
> As specified below, can someone please help me understand why
> startNearestNeighborSearch / startNearestNeighborLatLonSearch functions
> don't return the results sorted on the distance. I have read some
> documentation which suggests that these functions return the results sorted
> on distance starting with the nearest one.
> Thanks
> Abhijeet
> On Tuesday, 23 October 2012 01:11:02 UTC+5:30, Abhijeet Deshpande wrote:
>> Hi Craig, Peter
>> Thank you for the inputs.
>> Removing sorting helped quite a lot and I also reduced the result size
>> from 100 to 10. Now the response times have come down to a few milliseconds
>> but the downside is that the results are not sorted upon the distance.
>> The objective is to get 10 places per page starting with the closest one.
>> To implement this I tried following GeoPipeline functions
>> 1. startNearestNeighborSearch(layer, point, distance) and
>> 2, startNearestNeighborLatLonSearch(layer, point, distance)
>> but none of them returned the results sorted on distance even if the
>> function names seem to suggest so. As per the function definition, no
>> additional sort filters are needed so please let me know if I am missing
>> something.
>> I also observed that the results of startNearestNeighborLatLonSearch are
>> better sorted than startNearestNeighborSearch and think it has got something
>> to do with the propertyFilter that these functions use. I would like to know
>> more on "OrthodromicDistance" and "Distance" properties and the difference
>> between their respecitive property filters.
>> Regards
>> Abhijeet
>> On Friday, 19 October 2012 19:04:56 UTC+5:30, Craig Taverner wrote:
>>> I can suggest that since you have a sorting component in the pipeline,
>>> this will cause the pipeline to internally cache the entire resultset on the
>>> first hasNext or next call, sort it and return the first sorted result.
>>> Subsequent calls will access the internal, sorted, cache. This is a
>>> necessary consequence of the sorting algorithm, and is mostly unavoidable.
>>> So avoid this, you would need to take away the sort. Is it possible to
>>> get the same results you want without the sort? Or at least by moving the
>>> sort into your own code, or perhaps after a filter or some other action that
>>> reduces the resultset size?
>>> On Fri, Oct 19, 2012 at 2:13 PM, Abhijeet Deshpande <avdes...@gmail.com>
>>> wrote:
>>>> Hi Peter,
>>>> Thank you for the response.
>>>> Actually this is not related to the very first run of query where the
>>>> cache is empty.
>>>> The problem is that when ever I try to search places and request for 100
>>>> places per page in response, first call to Pipeline.hasNext() /
>>>> Pipeline.next() always takes about 400 sec. time and this happens for every
>>>> request.
>>>> Hope I am able to convey the problem that I am facing.
>>>> Regards
>>>> Abhijeet
>>>> On Monday, 15 October 2012 18:35:04 UTC+5:30, Peter Neubauer wrote:
>>>>> Hi there,
>>>>> the first run is probably with cold caches. In real world scenarios,
>>>>> you are running with warm caches, so you should try to warm up the
>>>>> database by doing a few searches before the real work, or maybe loop
>>>>> through your interesting nodes or so?
>>>>> On Mon, Oct 15, 2012 at 2:10 PM, Abhijeet Deshpande
>>>>> <avdes...@gmail.com> wrote:
>>>>> > Hi
>>>>> > I am using neo4j spatial to find out locations near by a lat/long.
>>>>> > The
>>>>> > results returned are sorted on OrthodromicDistance and are further
>>>>> > paginated
>>>>> > using the range function (100 results per page). The code for this is
>>>>> > Now to iterate over this GeoPipline, I am using hasNext() / next()
>>>>> > functions
>>>>> > but for some reason the first call for either of these functions
>>>>> > takes long
>>>>> > time to execute.
>>>>> > When the application was run, first call to next() or hasNext() took
>>>>> > approximately 400 sec to execute, subsequent 99 calls took 0ms.
>>>>> > Can someone please point out where the mistake is and if there is a
>>>>> > faster
>>>>> > way to find the nearest places. The layer has 10 million entries.
The best solution is to perform the query with a sufficiently large
bounding box to give at least the number of results you expect, and then
soft and limit in the client code afterwards. This works very well, if you
guess the bounding box correctly. That guess is best done with domain
knowledge, something the client code is more likely to have.
The fundamental problem here, and the reason why the sorting is not done
internally, is that the spatial index is based on location, not distance.
While it is possible to make an index based on distance, the origin of the
search would be specific to the index. This means the index would only work
for searches of distance from a particular point always, not generalized to
any point. So, to support searches around any point (the point you pass in
the search query), we need to build a bounding box, query the index on
that, and then filter to points at the right distance.
peter.neuba...@neotechnology.com> wrote:
> Abhijeet,
> from the docs and implementation, there is no sorting going on here.
> Instead, all of the returned points satisfy the bonuding box you are
> requesting
> /**
> * Extracts Layer items with a distance from the given point that
> is
> less than or equal the given distance.
> *
> * @param layer with latitude, longitude coordinates
> * @param point
> * @param maxDistanceInKm
> * @return geoPipeline
> */
> public static GeoPipeline startNearestNeighborLatLonSearch(Layer
> layer, Coordinate point, double maxDistanceInKm) {
> Envelope searchWindow =
> OrthodromicDistance.suggestSearchWindow(point, maxDistanceInKm);
> GeoPipeline pipeline = start(layer, new
> SearchIntersectWindow(layer,
> searchWindow))
> .calculateOrthodromicDistance(point);
> /**
> * Calculates the distance between Layer items nearest to the given
> point and the given point.
> * The search window created is based on Layer items density and it
> could lead to no results.
> *
> * @param layer
> * @param point
> * @param numberOfItemsToFind tries to find this number of items
> for comparison
> * @return geoPipeline
> */
> public static GeoPipeline startNearestNeighborSearch(Layer layer,
> Coordinate point, int numberOfItemsToFind) {
> Envelope searchWindow =
> SpatialTopologyUtils.createEnvelopeForGeometryDensityEstimate(layer,
> point, numberOfItemsToFind);
> return startNearestNeighborSearch(layer, point,
> searchWindow);
> }
> For this to happen, you could either contribute a
> SortingNearestNeighborSearch (that does the sorting for you) or have a
> second sort step?
> On Thu, Oct 25, 2012 at 12:10 PM, Abhijeet Deshpande
> <avdeshpa...@gmail.com> wrote:
> > Hi
> > As specified below, can someone please help me understand why
> > startNearestNeighborSearch / startNearestNeighborLatLonSearch functions
> > don't return the results sorted on the distance. I have read some
> > documentation which suggests that these functions return the results
> sorted
> > on distance starting with the nearest one.
> > Thanks
> > Abhijeet
> > On Tuesday, 23 October 2012 01:11:02 UTC+5:30, Abhijeet Deshpande wrote:
> >> Hi Craig, Peter
> >> Thank you for the inputs.
> >> Removing sorting helped quite a lot and I also reduced the result size
> >> from 100 to 10. Now the response times have come down to a few
> milliseconds
> >> but the downside is that the results are not sorted upon the distance.
> >> The objective is to get 10 places per page starting with the closest
> one.
> >> To implement this I tried following GeoPipeline functions
> >> 1. startNearestNeighborSearch(layer, point, distance) and
> >> 2, startNearestNeighborLatLonSearch(layer, point, distance)
> >> but none of them returned the results sorted on distance even if the
> >> function names seem to suggest so. As per the function definition, no
> >> additional sort filters are needed so please let me know if I am missing
> >> something.
> >> I also observed that the results of startNearestNeighborLatLonSearch are
> >> better sorted than startNearestNeighborSearch and think it has got
> something
> >> to do with the propertyFilter that these functions use. I would like to
> know
> >> more on "OrthodromicDistance" and "Distance" properties and the
> difference
> >> between their respecitive property filters.
> >> Regards
> >> Abhijeet
> >> On Friday, 19 October 2012 19:04:56 UTC+5:30, Craig Taverner wrote:
> >>> I can suggest that since you have a sorting component in the pipeline,
> >>> this will cause the pipeline to internally cache the entire resultset
> on the
> >>> first hasNext or next call, sort it and return the first sorted result.
> >>> Subsequent calls will access the internal, sorted, cache. This is a
> >>> necessary consequence of the sorting algorithm, and is mostly
> unavoidable.
> >>> So avoid this, you would need to take away the sort. Is it possible to
> >>> get the same results you want without the sort? Or at least by moving
> the
> >>> sort into your own code, or perhaps after a filter or some other
> action that
> >>> reduces the resultset size?
> >>> On Fri, Oct 19, 2012 at 2:13 PM, Abhijeet Deshpande <
> avdes...@gmail.com>
> >>> wrote:
> >>>> Hi Peter,
> >>>> Thank you for the response.
> >>>> Actually this is not related to the very first run of query where the
> >>>> cache is empty.
> >>>> The problem is that when ever I try to search places and request for
> 100
> >>>> places per page in response, first call to Pipeline.hasNext() /
> >>>> Pipeline.next() always takes about 400 sec. time and this happens for
> every
> >>>> request.
> >>>> Hope I am able to convey the problem that I am facing.
> >>>> Regards
> >>>> Abhijeet
> >>>> On Monday, 15 October 2012 18:35:04 UTC+5:30, Peter Neubauer wrote:
> >>>>> Hi there,
> >>>>> the first run is probably with cold caches. In real world scenarios,
> >>>>> you are running with warm caches, so you should try to warm up the
> >>>>> database by doing a few searches before the real work, or maybe loop
> >>>>> through your interesting nodes or so?
> >>>>> On Mon, Oct 15, 2012 at 2:10 PM, Abhijeet Deshpande
> >>>>> <avdes...@gmail.com> wrote:
> >>>>> > Hi
> >>>>> > I am using neo4j spatial to find out locations near by a lat/long.
> >>>>> > The
> >>>>> > results returned are sorted on OrthodromicDistance and are further
> >>>>> > paginated
> >>>>> > using the range function (100 results per page). The code for this
> is
> >>>>> > Now to iterate over this GeoPipline, I am using hasNext() / next()
> >>>>> > functions
> >>>>> > but for some reason the first call for either of these functions
> >>>>> > takes long
> >>>>> > time to execute.
> >>>>> > When the application was run, first call to next() or hasNext()
> took
> >>>>> > approximately 400 sec to execute, subsequent 99 calls took 0ms.
> >>>>> > Can someone please point out where the mistake is and if there is a
> >>>>> > faster
> >>>>> > way to find the nearest places. The layer has 10 million entries.
Thank you for the response. As suggested, I have removed the sorting component from the request and now simply fetching paginated results, 100 per page, using the following call.
I have observed that the graph search time is negligible but when I traverse the result in while loop, for lower page ranges (Page 1: 1-100, Page 2: 101-200, Page 3: 201-300) it takes about 800-900 milliseconds to execute the first hasNext() call and for later pages like 8,9,10 it takes about 4000 to 5000 ms. This slows down the overall performance.
Please let me know if it is possible to significantly reduce this result traversal time irrespective of the page being requested.
I suspect it may be related to the way I am creating the layer in graph database and hence I have also attached the java code file that populates graph database. Code steps at a high level are
1. Create graphDatabaseService 2. Create spatialDatabaseService 3. Create SimplePointLayer called places 4. Read a place record that contains lat and long along with other details from file and input file add it to the layer to create SpatialDatabaseRecord 5. To this newly created node add other properties like place name, address, zip code etc
Can the access time be reduced if we create some index which can be used while traversal?
On Thursday, 25 October 2012 17:15:55 UTC+5:30, Craig Taverner wrote:
> The best solution is to perform the query with a sufficiently large > bounding box to give at least the number of results you expect, and then > soft and limit in the client code afterwards. This works very well, if you > guess the bounding box correctly. That guess is best done with domain > knowledge, something the client code is more likely to have.
> The fundamental problem here, and the reason why the sorting is not done > internally, is that the spatial index is based on location, not distance. > While it is possible to make an index based on distance, the origin of the > search would be specific to the index. This means the index would only work > for searches of distance from a particular point always, not generalized to > any point. So, to support searches around any point (the point you pass in > the search query), we need to build a bounding box, query the index on > that, and then filter to points at the right distance.
> On Thu, Oct 25, 2012 at 1:23 PM, Peter Neubauer < > peter.n...@neotechnology.com <javascript:>> wrote:
>> Abhijeet, >> from the docs and implementation, there is no sorting going on here. >> Instead, all of the returned points satisfy the bonuding box you are >> requesting
>> /** >> * Extracts Layer items with a distance from the given point that >> is >> less than or equal the given distance. >> * >> * @param layer with latitude, longitude coordinates >> * @param point >> * @param maxDistanceInKm >> * @return geoPipeline >> */ >> public static GeoPipeline startNearestNeighborLatLonSearch(Layer >> layer, Coordinate point, double maxDistanceInKm) { >> Envelope searchWindow = >> OrthodromicDistance.suggestSearchWindow(point, maxDistanceInKm); >> GeoPipeline pipeline = start(layer, new >> SearchIntersectWindow(layer, >> searchWindow)) >> .calculateOrthodromicDistance(point);
>> /** >> * Calculates the distance between Layer items nearest to the >> given >> point and the given point. >> * The search window created is based on Layer items density and >> it >> could lead to no results. >> * >> * @param layer >> * @param point >> * @param numberOfItemsToFind tries to find this number of items >> for comparison >> * @return geoPipeline >> */ >> public static GeoPipeline startNearestNeighborSearch(Layer layer, >> Coordinate point, int numberOfItemsToFind) { >> Envelope searchWindow = >> SpatialTopologyUtils.createEnvelopeForGeometryDensityEstimate(layer, >> point, numberOfItemsToFind); >> return startNearestNeighborSearch(layer, point, >> searchWindow); >> }
>> For this to happen, you could either contribute a >> SortingNearestNeighborSearch (that does the sorting for you) or have a >> second sort step?
>> On Thu, Oct 25, 2012 at 12:10 PM, Abhijeet Deshpande >> <avdes...@gmail.com <javascript:>> wrote: >> > Hi >> > As specified below, can someone please help me understand why >> > startNearestNeighborSearch / startNearestNeighborLatLonSearch functions >> > don't return the results sorted on the distance. I have read some >> > documentation which suggests that these functions return the results >> sorted >> > on distance starting with the nearest one.
>> > Thanks >> > Abhijeet
>> > On Tuesday, 23 October 2012 01:11:02 UTC+5:30, Abhijeet Deshpande wrote:
>> >> Hi Craig, Peter
>> >> Thank you for the inputs.
>> >> Removing sorting helped quite a lot and I also reduced the result size >> >> from 100 to 10. Now the response times have come down to a few >> milliseconds >> >> but the downside is that the results are not sorted upon the distance.
>> >> The objective is to get 10 places per page starting with the closest >> one. >> >> To implement this I tried following GeoPipeline functions
>> >> 1. startNearestNeighborSearch(layer, point, distance) and >> >> 2, startNearestNeighborLatLonSearch(layer, point, distance)
>> >> but none of them returned the results sorted on distance even if the >> >> function names seem to suggest so. As per the function definition, no >> >> additional sort filters are needed so please let me know if I am >> missing >> >> something.
>> >> I also observed that the results of startNearestNeighborLatLonSearch >> are >> >> better sorted than startNearestNeighborSearch and think it has got >> something >> >> to do with the propertyFilter that these functions use. I would like >> to know >> >> more on "OrthodromicDistance" and "Distance" properties and the >> difference >> >> between their respecitive property filters.
>> >> Regards >> >> Abhijeet
>> >> On Friday, 19 October 2012 19:04:56 UTC+5:30, Craig Taverner wrote:
>> >>> I can suggest that since you have a sorting component in the pipeline, >> >>> this will cause the pipeline to internally cache the entire resultset >> on the >> >>> first hasNext or next call, sort it and return the first sorted >> result. >> >>> Subsequent calls will access the internal, sorted, cache. This is a >> >>> necessary consequence of the sorting algorithm, and is mostly >> unavoidable.
>> >>> So avoid this, you would need to take away the sort. Is it possible to >> >>> get the same results you want without the sort? Or at least by moving >> the >> >>> sort into your own code, or perhaps after a filter or some other >> action that >> >>> reduces the resultset size?
>> >>> On Fri, Oct 19, 2012 at 2:13 PM, Abhijeet Deshpande < >> avdes...@gmail.com> >> >>> wrote:
>> >>>> Hi Peter, >> >>>> Thank you for the response.
>> >>>> Actually this is not related to the very first run of query where the >> >>>> cache is empty.
>> >>>> The problem is that when ever I try to search places and request for >> 100 >> >>>> places per page in response, first call to Pipeline.hasNext() / >> >>>> Pipeline.next() always takes about 400 sec. time and this happens >> for every >> >>>> request.
>> >>>> Hope I am able to convey the problem that I am facing.
>> >>>> Regards >> >>>> Abhijeet
>> >>>> On Monday, 15 October 2012 18:35:04 UTC+5:30, Peter Neubauer wrote:
>> >>>>> Hi there, >> >>>>> the first run is probably with cold caches. In real world scenarios, >> >>>>> you are running with warm caches, so you should try to warm up the >> >>>>> database by doing a few searches before the real work, or maybe loop >> >>>>> through your interesting nodes or so?
>> >>>>> On Mon, Oct 15, 2012 at 2:10 PM, Abhijeet Deshpande >> >>>>> <avdes...@gmail.com> wrote: >> >>>>> > Hi >> >>>>> > I am using neo4j spatial to find out locations near by a lat/long. >> >>>>> > The >> >>>>> > results returned are sorted on OrthodromicDistance and are further >> >>>>> > paginated >> >>>>> > using the range function (100 results per page). The code for >> this is
> Thank you for the response. As suggested, I have removed the sorting
> component from the request and now simply fetching paginated results, 100
> per page, using the following call.
> I have observed that the graph search time is negligible but when I
> traverse the result in while loop, for lower page ranges (Page 1: 1-100,
> Page 2: 101-200, Page 3: 201-300) it takes about 800-900 milliseconds to
> execute the first hasNext() call and for later pages like 8,9,10 it takes
> about 4000 to 5000 ms. This slows down the overall performance.
> Please let me know if it is possible to significantly reduce this result
> traversal time irrespective of the page being requested.
> I suspect it may be related to the way I am creating the layer in graph
> database and hence I have also attached the java code file that populates
> graph database. Code steps at a high level are
> 1. Create graphDatabaseService
> 2. Create spatialDatabaseService
> 3. Create SimplePointLayer called places
> 4. Read a place record that contains lat and long along with other details
> from file and input file add it to the layer to create SpatialDatabaseRecord
> 5. To this newly created node add other properties like place name,
> address, zip code etc
> Can the access time be reduced if we create some index which can be used
> while traversal?
> Please let me know your thoughts on this.
> Regards
> Abhijeet
> On Thursday, 25 October 2012 17:15:55 UTC+5:30, Craig Taverner wrote:
>> The best solution is to perform the query with a sufficiently large
>> bounding box to give at least the number of results you expect, and then
>> soft and limit in the client code afterwards. This works very well, if you
>> guess the bounding box correctly. That guess is best done with domain
>> knowledge, something the client code is more likely to have.
>> The fundamental problem here, and the reason why the sorting is not done
>> internally, is that the spatial index is based on location, not distance.
>> While it is possible to make an index based on distance, the origin of the
>> search would be specific to the index. This means the index would only work
>> for searches of distance from a particular point always, not generalized to
>> any point. So, to support searches around any point (the point you pass in
>> the search query), we need to build a bounding box, query the index on
>> that, and then filter to points at the right distance.
>> On Thu, Oct 25, 2012 at 1:23 PM, Peter Neubauer <
>> peter.n...@neotechnology.**com> wrote:
>>> Abhijeet,
>>> from the docs and implementation, there is no sorting going on here.
>>> Instead, all of the returned points satisfy the bonuding box you are
>>> requesting
>>> /**
>>> * Extracts Layer items with a distance from the given point
>>> that is
>>> less than or equal the given distance.
>>> *
>>> * @param layer with latitude, longitude coordinates
>>> * @param point
>>> * @param maxDistanceInKm
>>> * @return geoPipeline
>>> */
>>> public static GeoPipeline startNearestNeighborLatLonSear**
>>> ch(Layer
>>> layer, Coordinate point, double maxDistanceInKm) {
>>> Envelope searchWindow =
>>> OrthodromicDistance.**suggestSearchWindow(point, maxDistanceInKm);
>>> GeoPipeline pipeline = start(layer, new
>>> SearchIntersectWindow(layer,
>>> searchWindow))
>>> .calculateOrthodromicDistance(**point);
>>> /**
>>> * Calculates the distance between Layer items nearest to the
>>> given
>>> point and the given point.
>>> * The search window created is based on Layer items density and
>>> it
>>> could lead to no results.
>>> *
>>> * @param layer
>>> * @param point
>>> * @param numberOfItemsToFind tries to find this number of items
>>> for comparison
>>> * @return geoPipeline
>>> */
>>> public static GeoPipeline startNearestNeighborSearch(**Layer
>>> layer,
>>> Coordinate point, int numberOfItemsToFind) {
>>> Envelope searchWindow =
>>> SpatialTopologyUtils.**createEnvelopeForGeometryDensi**tyEstimate(layer,
>>> point, numberOfItemsToFind);
>>> return startNearestNeighborSearch(**layer, point,
>>> searchWindow);
>>> }
>>> For this to happen, you could either contribute a
>>> SortingNearestNeighborSearch (that does the sorting for you) or have a
>>> second sort step?
>>> On Thu, Oct 25, 2012 at 12:10 PM, Abhijeet Deshpande
>>> <avdes...@gmail.com> wrote:
>>> > Hi
>>> > As specified below, can someone please help me understand why
>>> > startNearestNeighborSearch / startNearestNeighborLatLonSear**ch
>>> functions
>>> > don't return the results sorted on the distance. I have read some
>>> > documentation which suggests that these functions return the results
>>> sorted
>>> > on distance starting with the nearest one.
>>> > Thanks
>>> > Abhijeet
>>> > On Tuesday, 23 October 2012 01:11:02 UTC+5:30, Abhijeet Deshpande
>>> wrote:
>>> >> Hi Craig, Peter
>>> >> Thank you for the inputs.
>>> >> Removing sorting helped quite a lot and I also reduced the result size
>>> >> from 100 to 10. Now the response times have come down to a few
>>> milliseconds
>>> >> but the downside is that the results are not sorted upon the distance.
>>> >> The objective is to get 10 places per page starting with the closest
>>> one.
>>> >> To implement this I tried following GeoPipeline functions
>>> >> 1. startNearestNeighborSearch(**layer, point, distance) and
>>> >> 2, startNearestNeighborLatLonSear**ch(layer, point, distance)
>>> >> but none of them returned the results sorted on distance even if the
>>> >> function names seem to suggest so. As per the function definition, no
>>> >> additional sort filters are needed so please let me know if I am
>>> missing
>>> >> something.
>>> >> I also observed that the results of startNearestNeighborLatLonSear**ch
>>> are
>>> >> better sorted than startNearestNeighborSearch and think it has got
>>> something
>>> >> to do with the propertyFilter that these functions use. I would like
>>> to know
>>> >> more on "OrthodromicDistance" and "Distance" properties and the
>>> difference
>>> >> between their respecitive property filters.
>>> >> Regards
>>> >> Abhijeet
>>> >> On Friday, 19 October 2012 19:04:56 UTC+5:30, Craig Taverner wrote:
>>> >>> I can suggest that since you have a sorting component in the
>>> pipeline,
>>> >>> this will cause the pipeline to internally cache the entire
>>> resultset on the
>>> >>> first hasNext or next call, sort it and return the first sorted
>>> result.
>>> >>> Subsequent calls will access the internal, sorted, cache. This is a
>>> >>> necessary consequence of the sorting algorithm, and is mostly
>>> unavoidable.
>>> >>> So avoid this, you would need to take away the sort. Is it possible
>>> to
>>> >>> get the same results you want without the sort? Or at least by
>>> moving the
>>> >>> sort into your own code, or perhaps after a filter or some other
>>> action that
>>> >>> reduces the resultset size?
>>> >>> On Fri, Oct 19, 2012 at 2:13 PM, Abhijeet Deshpande <
>>> avdes...@gmail.com>
>>> >>> wrote:
>>> >>>> Hi Peter,
>>> >>>> Thank you for the response.
>>> >>>> Actually this is not related to the very first run of query where
>>> the
>>> >>>> cache is empty.
>>> >>>> The problem is that when ever I try to search places and request
>>> for 100
>>> >>>> places per page in response, first call to Pipeline.hasNext() /
>>> >>>> Pipeline.next() always takes about 400 sec. time and this happens
>>> for every
>>> >>>> request.
>>> >>>> Hope I am able to convey the problem that I am facing.
>>> >>>> Regards
>>> >>>> Abhijeet
>>> >>>> On Monday, 15 October 2012 18:35:04 UTC+5:30, Peter Neubauer wrote:
>>> >>>>> Hi there,
>>> >>>>> the first run is probably
Hi Peter, The graph database on which I am running these tests contains 10 million place records. Attached here is a sample file that contains about 10,000 records. Each line contains a record and is seperated by "|". The columns are as mentioned below.
id | latitude | longitude | name | address | city | state | zip
> On Wed, Nov 7, 2012 at 10:32 AM, Abhijeet Deshpande <avdes...@gmail.com<javascript:> > > wrote:
>> Hi Craig
>> Thank you for the response. As suggested, I have removed the sorting >> component from the request and now simply fetching paginated results, 100 >> per page, using the following call.
>> I have observed that the graph search time is negligible but when I >> traverse the result in while loop, for lower page ranges (Page 1: 1-100, >> Page 2: 101-200, Page 3: 201-300) it takes about 800-900 milliseconds to >> execute the first hasNext() call and for later pages like 8,9,10 it takes >> about 4000 to 5000 ms. This slows down the overall performance.
>> Please let me know if it is possible to significantly reduce this result >> traversal time irrespective of the page being requested.
>> I suspect it may be related to the way I am creating the layer in graph >> database and hence I have also attached the java code file that populates >> graph database. Code steps at a high level are
>> 1. Create graphDatabaseService >> 2. Create spatialDatabaseService >> 3. Create SimplePointLayer called places >> 4. Read a place record that contains lat and long along with other >> details from file and input file add it to the layer to create >> SpatialDatabaseRecord >> 5. To this newly created node add other properties like place name, >> address, zip code etc
>> Can the access time be reduced if we create some index which can be used >> while traversal?
>> Please let me know your thoughts on this.
>> Regards >> Abhijeet
>> On Thursday, 25 October 2012 17:15:55 UTC+5:30, Craig Taverner wrote:
>>> The best solution is to perform the query with a sufficiently large >>> bounding box to give at least the number of results you expect, and then >>> soft and limit in the client code afterwards. This works very well, if you >>> guess the bounding box correctly. That guess is best done with domain >>> knowledge, something the client code is more likely to have.
>>> The fundamental problem here, and the reason why the sorting is not done >>> internally, is that the spatial index is based on location, not distance. >>> While it is possible to make an index based on distance, the origin of the >>> search would be specific to the index. This means the index would only work >>> for searches of distance from a particular point always, not generalized to >>> any point. So, to support searches around any point (the point you pass in >>> the search query), we need to build a bounding box, query the index on >>> that, and then filter to points at the right distance.
>>> On Thu, Oct 25, 2012 at 1:23 PM, Peter Neubauer < >>> peter.n...@neotechnology.**com> wrote:
>>>> Abhijeet, >>>> from the docs and implementation, there is no sorting going on here. >>>> Instead, all of the returned points satisfy the bonuding box you are >>>> requesting
>>>> /** >>>> * Extracts Layer items with a distance from the given point >>>> that is >>>> less than or equal the given distance. >>>> * >>>> * @param layer with latitude, longitude coordinates >>>> * @param point >>>> * @param maxDistanceInKm >>>> * @return geoPipeline >>>> */ >>>> public static GeoPipeline startNearestNeighborLatLonSear** >>>> ch(Layer >>>> layer, Coordinate point, double maxDistanceInKm) { >>>> Envelope searchWindow = >>>> OrthodromicDistance.**suggestSearchWindow(point, maxDistanceInKm); >>>> GeoPipeline pipeline = start(layer, new >>>> SearchIntersectWindow(layer, >>>> searchWindow)) >>>> .calculateOrthodromicDistance(**point);
>>>> /** >>>> * Calculates the distance between Layer items nearest to the >>>> given >>>> point and the given point. >>>> * The search window created is based on Layer items density >>>> and it >>>> could lead to no results. >>>> * >>>> * @param layer >>>> * @param point >>>> * @param numberOfItemsToFind tries to find this number of items >>>> for comparison >>>> * @return geoPipeline >>>> */ >>>> public static GeoPipeline startNearestNeighborSearch(**Layer >>>> layer, >>>> Coordinate point, int numberOfItemsToFind) { >>>> Envelope searchWindow = >>>> SpatialTopologyUtils.**createEnvelopeForGeometryDensi** >>>> tyEstimate(layer, >>>> point, numberOfItemsToFind); >>>> return startNearestNeighborSearch(**layer, point, >>>> searchWindow); >>>> }
>>>> For this to happen, you could either contribute a >>>> SortingNearestNeighborSearch (that does the sorting for you) or have a >>>> second sort step?
>>>> On Thu, Oct 25, 2012 at 12:10 PM, Abhijeet Deshpande >>>> <avdes...@gmail.com> wrote: >>>> > Hi >>>> > As specified below, can someone please help me understand why >>>> > startNearestNeighborSearch / startNearestNeighborLatLonSear**ch >>>> functions >>>> > don't return the results sorted on the distance. I have read some >>>> > documentation which suggests that these functions return the results >>>> sorted >>>> > on distance starting with the nearest one.
>>>> > Thanks >>>> > Abhijeet
>>>> > On Tuesday, 23 October 2012 01:11:02 UTC+5:30, Abhijeet Deshpande >>>> wrote:
>>>> >> Hi Craig, Peter
>>>> >> Thank you for the inputs.
>>>> >> Removing sorting helped quite a lot and I also reduced the result >>>> size >>>> >> from 100 to 10. Now the response times have come down to a few >>>> milliseconds >>>> >> but the downside is that the results are not sorted upon the >>>> distance.
>>>> >> The objective is to get 10 places per page starting with the closest >>>> one. >>>> >> To implement this I tried following GeoPipeline functions
>>>> >> 1. startNearestNeighborSearch(**layer, point, distance) and >>>> >> 2, startNearestNeighborLatLonSear**ch(layer, point, distance)
>>>> >> but none of them returned the results sorted on distance even if the >>>> >> function names seem to suggest so. As per the function definition, no >>>> >> additional sort filters are needed so please let me know if I am >>>> missing >>>> >> something.
>>>> >> I also observed that the results of startNearestNeighborLatLonSear**ch >>>> are >>>> >> better sorted than startNearestNeighborSearch and think it has got >>>> something >>>> >> to do with the propertyFilter that these functions use. I would like >>>> to know >>>> >> more on "OrthodromicDistance" and "Distance" properties and the >>>> difference >>>> >> between their respecitive property filters.
>>>> >> Regards >>>> >> Abhijeet
>>>> >> On Friday, 19 October 2012 19:04:56 UTC+5:30, Craig Taverner wrote:
>>>> >>> I can suggest that since you have a sorting component in the >>>> pipeline, >>>> >>> this will cause the pipeline to internally cache the entire >>>> resultset on the >>>> >>> first hasNext or next call, sort it and return the first sorted >>>> result. >>>> >>> Subsequent calls will access the internal, sorted, cache. This is a >>>> >>> necessary consequence of the sorting algorithm, and is mostly >>>> unavoidable.
>>>> >>> So avoid this, you would need to take away the sort. Is it possible >>>> to >>>> >>> get the same results you want without the sort? Or at least by >>>> moving the >>>> >>> sort into your own code, or perhaps after a filter or some other >>>> action that >>>> >>> reduces the resultset size?
>>>> >>> On Fri, Oct 19, 2012 at 2:13 PM, Abhijeet Deshpande < >>>> avdes...@gmail.com> >>>> >>> wrote:
>>>> >>>> Hi Peter, >>>> >>>> Thank you for the response.
>>>> >>>> Actually this is not related to the very first run of query where >>>> the >>>> >>>> cache is empty.
>>>> >>>> The problem is that when ever I try to search places and
On Thursday, 8 November 2012 16:04:27 UTC+5:30, Abhijeet Deshpande wrote:
> Hi Peter, > The graph database on which I am running these tests contains 10 million > place records. > Attached here is a sample file that contains about 10,000 records. Each > line contains a record and is seperated by "|". The columns are as > mentioned below.
> id | latitude | longitude | name | address | city | state | zip
> -Abhijeet
> On Thursday, 8 November 2012 04:34:11 UTC+5:30, Peter Neubauer wrote:
>> So, >> this is with only a couple of hundred layer entries? Sounds very slow. Do >> you have a sample data file so I could test it?
>> On Wed, Nov 7, 2012 at 10:32 AM, Abhijeet Deshpande <avdes...@gmail.com>wrote:
>>> Hi Craig
>>> Thank you for the response. As suggested, I have removed the sorting >>> component from the request and now simply fetching paginated results, 100 >>> per page, using the following call.
>>> I have observed that the graph search time is negligible but when I >>> traverse the result in while loop, for lower page ranges (Page 1: 1-100, >>> Page 2: 101-200, Page 3: 201-300) it takes about 800-900 milliseconds to >>> execute the first hasNext() call and for later pages like 8,9,10 it takes >>> about 4000 to 5000 ms. This slows down the overall performance.
>>> Please let me know if it is possible to significantly reduce this result >>> traversal time irrespective of the page being requested.
>>> I suspect it may be related to the way I am creating the layer in graph >>> database and hence I have also attached the java code file that populates >>> graph database. Code steps at a high level are
>>> 1. Create graphDatabaseService >>> 2. Create spatialDatabaseService >>> 3. Create SimplePointLayer called places >>> 4. Read a place record that contains lat and long along with other >>> details from file and input file add it to the layer to create >>> SpatialDatabaseRecord >>> 5. To this newly created node add other properties like place name, >>> address, zip code etc
>>> Can the access time be reduced if we create some index which can be used >>> while traversal?
>>> Please let me know your thoughts on this.
>>> Regards >>> Abhijeet
>>> On Thursday, 25 October 2012 17:15:55 UTC+5:30, Craig Taverner wrote:
>>>> The best solution is to perform the query with a sufficiently large >>>> bounding box to give at least the number of results you expect, and then >>>> soft and limit in the client code afterwards. This works very well, if you >>>> guess the bounding box correctly. That guess is best done with domain >>>> knowledge, something the client code is more likely to have.
>>>> The fundamental problem here, and the reason why the sorting is not >>>> done internally, is that the spatial index is based on location, not >>>> distance. While it is possible to make an index based on distance, the >>>> origin of the search would be specific to the index. This means the index >>>> would only work for searches of distance from a particular point always, >>>> not generalized to any point. So, to support searches around any point (the >>>> point you pass in the search query), we need to build a bounding box, query >>>> the index on that, and then filter to points at the right distance.
>>>> On Thu, Oct 25, 2012 at 1:23 PM, Peter Neubauer < >>>> peter.n...@neotechnology.**com> wrote:
>>>>> Abhijeet, >>>>> from the docs and implementation, there is no sorting going on here. >>>>> Instead, all of the returned points satisfy the bonuding box you are >>>>> requesting
>>>>> /** >>>>> * Extracts Layer items with a distance from the given point >>>>> that is >>>>> less than or equal the given distance. >>>>> * >>>>> * @param layer with latitude, longitude coordinates >>>>> * @param point >>>>> * @param maxDistanceInKm >>>>> * @return geoPipeline >>>>> */ >>>>> public static GeoPipeline startNearestNeighborLatLonSear** >>>>> ch(Layer >>>>> layer, Coordinate point, double maxDistanceInKm) { >>>>> Envelope searchWindow = >>>>> OrthodromicDistance.**suggestSearchWindow(point, maxDistanceInKm); >>>>> GeoPipeline pipeline = start(layer, new >>>>> SearchIntersectWindow(layer, >>>>> searchWindow)) >>>>> .calculateOrthodromicDistance(**point);
>>>>> /** >>>>> * Calculates the distance between Layer items nearest to the >>>>> given >>>>> point and the given point. >>>>> * The search window created is based on Layer items density >>>>> and it >>>>> could lead to no results. >>>>> * >>>>> * @param layer >>>>> * @param point >>>>> * @param numberOfItemsToFind tries to find this number of items >>>>> for comparison >>>>> * @return geoPipeline >>>>> */ >>>>> public static GeoPipeline startNearestNeighborSearch(**Layer >>>>> layer, >>>>> Coordinate point, int numberOfItemsToFind) { >>>>> Envelope searchWindow = >>>>> SpatialTopologyUtils.**createEnvelopeForGeometryDensi** >>>>> tyEstimate(layer, >>>>> point, numberOfItemsToFind); >>>>> return startNearestNeighborSearch(**layer, point, >>>>> searchWindow); >>>>> }
>>>>> For this to happen, you could either contribute a >>>>> SortingNearestNeighborSearch (that does the sorting for you) or have a >>>>> second sort step?
>>>>> On Thu, Oct 25, 2012 at 12:10 PM, Abhijeet Deshpande >>>>> <avdes...@gmail.com> wrote: >>>>> > Hi >>>>> > As specified below, can someone please help me understand why >>>>> > startNearestNeighborSearch / startNearestNeighborLatLonSear**ch >>>>> functions >>>>> > don't return the results sorted on the distance. I have read some >>>>> > documentation which suggests that these functions return the results >>>>> sorted >>>>> > on distance starting with the nearest one.
>>>>> > Thanks >>>>> > Abhijeet
>>>>> > On Tuesday, 23 October 2012 01:11:02 UTC+5:30, Abhijeet Deshpande >>>>> wrote:
>>>>> >> Hi Craig, Peter
>>>>> >> Thank you for the inputs.
>>>>> >> Removing sorting helped quite a lot and I also reduced the result >>>>> size >>>>> >> from 100 to 10. Now the response times have come down to a few >>>>> milliseconds >>>>> >> but the downside is that the results are not sorted upon the >>>>> distance.
>>>>> >> The objective is to get 10 places per page starting with the >>>>> closest one. >>>>> >> To implement this I tried following GeoPipeline functions
>>>>> >> 1. startNearestNeighborSearch(**layer, point, distance) and >>>>> >> 2, startNearestNeighborLatLonSear**ch(layer, point, distance)
>>>>> >> but none of them returned the results sorted on distance even if the >>>>> >> function names seem to suggest so. As per the function definition, >>>>> no >>>>> >> additional sort filters are needed so please let me know if I am >>>>> missing >>>>> >> something.
>>>>> >> I also observed that the results of startNearestNeighborLatLonSear* >>>>> *ch are >>>>> >> better sorted than startNearestNeighborSearch and think it has got >>>>> something >>>>> >> to do with the propertyFilter that these functions use. I would >>>>> like to know >>>>> >> more on "OrthodromicDistance" and "Distance" properties and the >>>>> difference >>>>> >> between their respecitive property filters.
>>>>> >> Regards >>>>> >> Abhijeet
>>>>> >> On Friday, 19 October 2012 19:04:56 UTC+5:30, Craig Taverner wrote:
>>>>> >>> I can suggest that since you have a sorting component in the >>>>> pipeline, >>>>> >>> this will cause the pipeline to internally cache the entire >>>>> resultset on the >>>>> >>> first hasNext or next call, sort it and return the first sorted >>>>> result. >>>>> >>> Subsequent calls will access the internal, sorted, cache. This is a >>>>> >>> necessary consequence of the sorting algorithm, and is mostly >>>>> unavoidable.
>>>>> >>> So avoid this, you would need to take away the sort. Is it >>>>> possible to >>>>> >>> get the same results you want without the sort? Or at least by >>>>> moving the >>>>> >>> sort into your own code, or perhaps after a filter or some other >>>>> action that >>>>> >>> reduces the resultset
I've not tried your data yet, but I did have a chance to write some more
junit test code for the GeoPipes, and I tested two cases, both involving
creating a random sample of 10000 points in a point layer. The first test
times how long each chunk of 1000 points takes to read on the stream, just
iterating over the entire stream. The second uses the range() operation you
mentioned before, to select the page of data. So both tests have the same
final result, but the second one will always re-start the iterator and only
return the 1000 results for the relevant page.
Not surprisingly, the plain iterator runs at linear speed, with each 1000
points taking roughly as long to return, while the paging iterator slows
down, because it keeps restarting at the beginning.
So, this leads me to believe you are also restarting at the beginning. The
trick with using a plain iterator is to maintain state between calls. If
you are running in embedded mode, this is obviously trivial. But if you are
running a web app, I think it gets more subtle.
If you want to see my test code to verify if I am in fact modeling the same
process as you, I can commit it to neo4j-spatial. I plan to do that anyway,
after a little more double-checking.
Regards, Craig
On Thu, Nov 8, 2012 at 2:35 AM, Abhijeet Deshpande <avdeshpa...@gmail.com>wrote:
> On Thursday, 8 November 2012 16:04:27 UTC+5:30, Abhijeet Deshpande wrote:
>> Hi Peter,
>> The graph database on which I am running these tests contains 10 million
>> place records.
>> Attached here is a sample file that contains about 10,000 records. Each
>> line contains a record and is seperated by "|". The columns are as
>> mentioned below.
>> id | latitude | longitude | name | address | city | state | zip
>> -Abhijeet
>> On Thursday, 8 November 2012 04:34:11 UTC+5:30, Peter Neubauer wrote:
>>> So,
>>> this is with only a couple of hundred layer entries? Sounds very slow.
>>> Do you have a sample data file so I could test it?
>>> On Wed, Nov 7, 2012 at 10:32 AM, Abhijeet Deshpande <avdes...@gmail.com>wrote:
>>>> Hi Craig
>>>> Thank you for the response. As suggested, I have removed the sorting
>>>> component from the request and now simply fetching paginated results, 100
>>>> per page, using the following call.
>>>> I have observed that the graph search time is negligible but when I
>>>> traverse the result in while loop, for lower page ranges (Page 1: 1-100,
>>>> Page 2: 101-200, Page 3: 201-300) it takes about 800-900 milliseconds to
>>>> execute the first hasNext() call and for later pages like 8,9,10 it takes
>>>> about 4000 to 5000 ms. This slows down the overall performance.
>>>> Please let me know if it is possible to significantly reduce this
>>>> result traversal time irrespective of the page being requested.
>>>> I suspect it may be related to the way I am creating the layer in graph
>>>> database and hence I have also attached the java code file that populates
>>>> graph database. Code steps at a high level are
>>>> 1. Create graphDatabaseService
>>>> 2. Create spatialDatabaseService
>>>> 3. Create SimplePointLayer called places
>>>> 4. Read a place record that contains lat and long along with other
>>>> details from file and input file add it to the layer to create
>>>> SpatialDatabaseRecord
>>>> 5. To this newly created node add other properties like place name,
>>>> address, zip code etc
>>>> Can the access time be reduced if we create some index which can be
>>>> used while traversal?
>>>> Please let me know your thoughts on this.
>>>> Regards
>>>> Abhijeet
>>>> On Thursday, 25 October 2012 17:15:55 UTC+5:30, Craig Taverner wrote:
>>>>> The best solution is to perform the query with a sufficiently large
>>>>> bounding box to give at least the number of results you expect, and then
>>>>> soft and limit in the client code afterwards. This works very well, if you
>>>>> guess the bounding box correctly. That guess is best done with domain
>>>>> knowledge, something the client code is more likely to have.
>>>>> The fundamental problem here, and the reason why the sorting is not
>>>>> done internally, is that the spatial index is based on location, not
>>>>> distance. While it is possible to make an index based on distance, the
>>>>> origin of the search would be specific to the index. This means the index
>>>>> would only work for searches of distance from a particular point always,
>>>>> not generalized to any point. So, to support searches around any point (the
>>>>> point you pass in the search query), we need to build a bounding box, query
>>>>> the index on that, and then filter to points at the right distance.
>>>>> On Thu, Oct 25, 2012 at 1:23 PM, Peter Neubauer <
>>>>> peter.n...@neotechnology.**com> wrote:
>>>>>> Abhijeet,
>>>>>> from the docs and implementation, there is no sorting going on here.
>>>>>> Instead, all of the returned points satisfy the bonuding box you are
>>>>>> requesting
>>>>>> /**
>>>>>> * Extracts Layer items with a distance from the given point
>>>>>> that is
>>>>>> less than or equal the given distance.
>>>>>> *
>>>>>> * @param layer with latitude, longitude coordinates
>>>>>> * @param point
>>>>>> * @param maxDistanceInKm
>>>>>> * @return geoPipeline
>>>>>> */
>>>>>> public static GeoPipeline startNearestNeighborLatLonSear****
>>>>>> ch(Layer
>>>>>> layer, Coordinate point, double maxDistanceInKm) {
>>>>>> Envelope searchWindow =
>>>>>> OrthodromicDistance.**suggestSea**rchWindow(point, maxDistanceInKm);
>>>>>> GeoPipeline pipeline = start(layer, new
>>>>>> SearchIntersectWindow(layer,
>>>>>> searchWindow))
>>>>>> .calculateOrthodromicDistance(****point);
>>>>>> /**
>>>>>> * Calculates the distance between Layer items nearest to the
>>>>>> given
>>>>>> point and the given point.
>>>>>> * The search window created is based on Layer items density
>>>>>> and it
>>>>>> could lead to no results.
>>>>>> *
>>>>>> * @param layer
>>>>>> * @param point
>>>>>> * @param numberOfItemsToFind tries to find this number of items
>>>>>> for comparison
>>>>>> * @return geoPipeline
>>>>>> */
>>>>>> public static GeoPipeline startNearestNeighborSearch(**Lay**er
>>>>>> layer,
>>>>>> Coordinate point, int numberOfItemsToFind) {
>>>>>> Envelope searchWindow =
>>>>>> SpatialTopologyUtils.**createEnv**elopeForGeometryDensi**tyEstimat**
>>>>>> e(layer,
>>>>>> point, numberOfItemsToFind);
>>>>>> return startNearestNeighborSearch(**lay**er, point,
>>>>>> searchWindow);
>>>>>> }
>>>>>> For this to happen, you could either contribute a
>>>>>> SortingNearestNeighborSearch (that does the sorting for you) or have a
>>>>>> second sort step?
>>>>>> On Thu, Oct 25, 2012 at 12:10 PM, Abhijeet Deshpande
>>>>>> <avdes...@gmail.com> wrote:
>>>>>> > Hi
>>>>>> > As specified below, can someone please help me understand why
>>>>>> > startNearestNeighborSearch / startNearestNeighborLatLonSear****ch
>>>>>> functions
>>>>>> > don't return the results sorted on the distance. I have read some
>>>>>> > documentation which suggests that these functions return the
>>>>>> results sorted
>>>>>> > on distance starting with the nearest one.
>>>>>> > Thanks
>>>>>> > Abhijeet
>>>>>> > On Tuesday, 23 October 2012 01:11:02 UTC+5:30, Abhijeet Deshpande
>>>>>> wrote:
>>>>>> >> Hi Craig, Peter
>>>>>> >> Thank you for the inputs.
>>>>>> >> Removing sorting helped quite a lot and I also reduced the result
>>>>>> size
>>>>>> >> from 100 to 10. Now the response times have come down to a few
>>>>>> milliseconds
>>>>>> >> but the downside is that the results are not sorted upon the
>>>>>> distance.
On Fri, Nov 9, 2012 at 5:14 AM, Craig Taverner <cr...@amanzi.com> wrote:
> Hi,
> I've not tried your data yet, but I did have a chance to write some more
> junit test code for the GeoPipes, and I tested two cases, both involving
> creating a random sample of 10000 points in a point layer. The first test
> times how long each chunk of 1000 points takes to read on the stream, just
> iterating over the entire stream. The second uses the range() operation you
> mentioned before, to select the page of data. So both tests have the same
> final result, but the second one will always re-start the iterator and only
> return the 1000 results for the relevant page.
> Not surprisingly, the plain iterator runs at linear speed, with each 1000
> points taking roughly as long to return, while the paging iterator slows
> down, because it keeps restarting at the beginning.
> So, this leads me to believe you are also restarting at the beginning. The
> trick with using a plain iterator is to maintain state between calls. If
> you are running in embedded mode, this is obviously trivial. But if you are
> running a web app, I think it gets more subtle.
> If you want to see my test code to verify if I am in fact modeling the
> same process as you, I can commit it to neo4j-spatial. I plan to do that
> anyway, after a little more double-checking.
> Regards, Craig
> On Thu, Nov 8, 2012 at 2:35 AM, Abhijeet Deshpande <avdeshpa...@gmail.com>wrote:
>> Attached is the second chunk of 5000 records.
>> On Thursday, 8 November 2012 16:04:27 UTC+5:30, Abhijeet Deshpande wrote:
>>> Hi Peter,
>>> The graph database on which I am running these tests contains 10 million
>>> place records.
>>> Attached here is a sample file that contains about 10,000 records. Each
>>> line contains a record and is seperated by "|". The columns are as
>>> mentioned below.
>>> id | latitude | longitude | name | address | city | state | zip
>>> -Abhijeet
>>> On Thursday, 8 November 2012 04:34:11 UTC+5:30, Peter Neubauer wrote:
>>>> So,
>>>> this is with only a couple of hundred layer entries? Sounds very slow.
>>>> Do you have a sample data file so I could test it?
>>>> On Wed, Nov 7, 2012 at 10:32 AM, Abhijeet Deshpande <avdes...@gmail.com
>>>> > wrote:
>>>>> Hi Craig
>>>>> Thank you for the response. As suggested, I have removed the sorting
>>>>> component from the request and now simply fetching paginated results, 100
>>>>> per page, using the following call.
>>>>> I have observed that the graph search time is negligible but when I
>>>>> traverse the result in while loop, for lower page ranges (Page 1: 1-100,
>>>>> Page 2: 101-200, Page 3: 201-300) it takes about 800-900 milliseconds to
>>>>> execute the first hasNext() call and for later pages like 8,9,10 it takes
>>>>> about 4000 to 5000 ms. This slows down the overall performance.
>>>>> Please let me know if it is possible to significantly reduce this
>>>>> result traversal time irrespective of the page being requested.
>>>>> I suspect it may be related to the way I am creating the layer in
>>>>> graph database and hence I have also attached the java code file that
>>>>> populates graph database. Code steps at a high level are
>>>>> 1. Create graphDatabaseService
>>>>> 2. Create spatialDatabaseService
>>>>> 3. Create SimplePointLayer called places
>>>>> 4. Read a place record that contains lat and long along with other
>>>>> details from file and input file add it to the layer to create
>>>>> SpatialDatabaseRecord
>>>>> 5. To this newly created node add other properties like place name,
>>>>> address, zip code etc
>>>>> Can the access time be reduced if we create some index which can be
>>>>> used while traversal?
>>>>> Please let me know your thoughts on this.
>>>>> Regards
>>>>> Abhijeet
>>>>> On Thursday, 25 October 2012 17:15:55 UTC+5:30, Craig Taverner wrote:
>>>>>> The best solution is to perform the query with a sufficiently large
>>>>>> bounding box to give at least the number of results you expect, and then
>>>>>> soft and limit in the client code afterwards. This works very well, if you
>>>>>> guess the bounding box correctly. That guess is best done with domain
>>>>>> knowledge, something the client code is more likely to have.
>>>>>> The fundamental problem here, and the reason why the sorting is not
>>>>>> done internally, is that the spatial index is based on location, not
>>>>>> distance. While it is possible to make an index based on distance, the
>>>>>> origin of the search would be specific to the index. This means the index
>>>>>> would only work for searches of distance from a particular point always,
>>>>>> not generalized to any point. So, to support searches around any point (the
>>>>>> point you pass in the search query), we need to build a bounding box, query
>>>>>> the index on that, and then filter to points at the right distance.
>>>>>> On Thu, Oct 25, 2012 at 1:23 PM, Peter Neubauer <
>>>>>> peter.n...@neotechnology.**com> wrote:
>>>>>>> Abhijeet,
>>>>>>> from the docs and implementation, there is no sorting going on here.
>>>>>>> Instead, all of the returned points satisfy the bonuding box you are
>>>>>>> requesting
>>>>>>> /**
>>>>>>> * Extracts Layer items with a distance from the given point
>>>>>>> that is
>>>>>>> less than or equal the given distance.
>>>>>>> *
>>>>>>> * @param layer with latitude, longitude coordinates
>>>>>>> * @param point
>>>>>>> * @param maxDistanceInKm
>>>>>>> * @return geoPipeline
>>>>>>> */
>>>>>>> public static GeoPipeline startNearestNeighborLatLonSear****
>>>>>>> ch(Layer
>>>>>>> layer, Coordinate point, double maxDistanceInKm) {
>>>>>>> Envelope searchWindow =
>>>>>>> OrthodromicDistance.**suggestSea**rchWindow(point, maxDistanceInKm);
>>>>>>> GeoPipeline pipeline = start(layer, new
>>>>>>> SearchIntersectWindow(layer,
>>>>>>> searchWindow))
>>>>>>> .calculateOrthodromicDistance(****point);
>>>>>>> /**
>>>>>>> * Calculates the distance between Layer items nearest to
>>>>>>> the given
>>>>>>> point and the given point.
>>>>>>> * The search window created is based on Layer items density
>>>>>>> and it
>>>>>>> could lead to no results.
>>>>>>> *
>>>>>>> * @param layer
>>>>>>> * @param point
>>>>>>> * @param numberOfItemsToFind tries to find this number of items
>>>>>>> for comparison
>>>>>>> * @return geoPipeline
>>>>>>> */
>>>>>>> public static GeoPipeline startNearestNeighborSearch(**Lay**er
>>>>>>> layer,
>>>>>>> Coordinate point, int numberOfItemsToFind) {
>>>>>>> Envelope searchWindow =
>>>>>>> SpatialTopologyUtils.**createEnv**elopeForGeometryDensi**tyEstimat**
>>>>>>> e(layer,
>>>>>>> point, numberOfItemsToFind);
>>>>>>> return startNearestNeighborSearch(**lay**er, point,
>>>>>>> searchWindow);
>>>>>>> }
>>>>>>> For this to happen, you could either contribute a
>>>>>>> SortingNearestNeighborSearch (that does the sorting for you) or have
>>>>>>> a
>>>>>>> second sort step?
>>>>>>> On Thu, Oct 25, 2012 at 12:10 PM, Abhijeet Deshpande
>>>>>>> <avdes...@gmail.com> wrote:
>>>>>>> > Hi
>>>>>>> > As specified below, can someone please help me understand why
>>>>>>> > startNearestNeighborSearch / startNearestNeighborLatLonSear****ch
>>>>>>> functions
>>>>>>> > don't return the results sorted on the distance. I have read some
>>>>>>> > documentation which suggests that these
> On Fri, Nov 9, 2012 at 5:14 AM, Craig Taverner <cr...@amanzi.com> wrote:
>> Hi,
>> I've not tried your data yet, but I did have a chance to write some more
>> junit test code for the GeoPipes, and I tested two cases, both involving
>> creating a random sample of 10000 points in a point layer. The first test
>> times how long each chunk of 1000 points takes to read on the stream, just
>> iterating over the entire stream. The second uses the range() operation you
>> mentioned before, to select the page of data. So both tests have the same
>> final result, but the second one will always re-start the iterator and only
>> return the 1000 results for the relevant page.
>> Not surprisingly, the plain iterator runs at linear speed, with each 1000
>> points taking roughly as long to return, while the paging iterator slows
>> down, because it keeps restarting at the beginning.
>> So, this leads me to believe you are also restarting at the beginning.
>> The trick with using a plain iterator is to maintain state between calls.
>> If you are running in embedded mode, this is obviously trivial. But if you
>> are running a web app, I think it gets more subtle.
>> If you want to see my test code to verify if I am in fact modeling the
>> same process as you, I can commit it to neo4j-spatial. I plan to do that
>> anyway, after a little more double-checking.
>> Regards, Craig
>> On Thu, Nov 8, 2012 at 2:35 AM, Abhijeet Deshpande <avdeshpa...@gmail.com
>> > wrote:
>>> Attached is the second chunk of 5000 records.
>>> On Thursday, 8 November 2012 16:04:27 UTC+5:30, Abhijeet Deshpande wrote:
>>>> Hi Peter,
>>>> The graph database on which I am running these tests contains 10
>>>> million place records.
>>>> Attached here is a sample file that contains about 10,000 records. Each
>>>> line contains a record and is seperated by "|". The columns are as
>>>> mentioned below.
>>>> id | latitude | longitude | name | address | city | state | zip
>>>> -Abhijeet
>>>> On Thursday, 8 November 2012 04:34:11 UTC+5:30, Peter Neubauer wrote:
>>>>> So,
>>>>> this is with only a couple of hundred layer entries? Sounds very slow.
>>>>> Do you have a sample data file so I could test it?
>>>>> On Wed, Nov 7, 2012 at 10:32 AM, Abhijeet Deshpande <
>>>>> avdes...@gmail.com> wrote:
>>>>>> Hi Craig
>>>>>> Thank you for the response. As suggested, I have removed the sorting
>>>>>> component from the request and now simply fetching paginated results, 100
>>>>>> per page, using the following call.
>>>>>> I have observed that the graph search time is negligible but when I
>>>>>> traverse the result in while loop, for lower page ranges (Page 1: 1-100,
>>>>>> Page 2: 101-200, Page 3: 201-300) it takes about 800-900 milliseconds to
>>>>>> execute the first hasNext() call and for later pages like 8,9,10 it takes
>>>>>> about 4000 to 5000 ms. This slows down the overall performance.
>>>>>> Please let me know if it is possible to significantly reduce this
>>>>>> result traversal time irrespective of the page being requested.
>>>>>> I suspect it may be related to the way I am creating the layer in
>>>>>> graph database and hence I have also attached the java code file that
>>>>>> populates graph database. Code steps at a high level are
>>>>>> 1. Create graphDatabaseService
>>>>>> 2. Create spatialDatabaseService
>>>>>> 3. Create SimplePointLayer called places
>>>>>> 4. Read a place record that contains lat and long along with other
>>>>>> details from file and input file add it to the layer to create
>>>>>> SpatialDatabaseRecord
>>>>>> 5. To this newly created node add other properties like place name,
>>>>>> address, zip code etc
>>>>>> Can the access time be reduced if we create some index which can be
>>>>>> used while traversal?
>>>>>> Please let me know your thoughts on this.
>>>>>> Regards
>>>>>> Abhijeet
>>>>>> On Thursday, 25 October 2012 17:15:55 UTC+5:30, Craig Taverner wrote:
>>>>>>> The best solution is to perform the query with a sufficiently large
>>>>>>> bounding box to give at least the number of results you expect, and then
>>>>>>> soft and limit in the client code afterwards. This works very well, if you
>>>>>>> guess the bounding box correctly. That guess is best done with domain
>>>>>>> knowledge, something the client code is more likely to have.
>>>>>>> The fundamental problem here, and the reason why the sorting is not
>>>>>>> done internally, is that the spatial index is based on location, not
>>>>>>> distance. While it is possible to make an index based on distance, the
>>>>>>> origin of the search would be specific to the index. This means the index
>>>>>>> would only work for searches of distance from a particular point always,
>>>>>>> not generalized to any point. So, to support searches around any point (the
>>>>>>> point you pass in the search query), we need to build a bounding box, query
>>>>>>> the index on that, and then filter to points at the right distance.
>>>>>>> On Thu, Oct 25, 2012 at 1:23 PM, Peter Neubauer <
>>>>>>> peter.n...@neotechnology.**com> wrote:
>>>>>>>> Abhijeet,
>>>>>>>> from the docs and implementation, there is no sorting going on here.
>>>>>>>> Instead, all of the returned points satisfy the bonuding box you are
>>>>>>>> requesting
>>>>>>>> /**
>>>>>>>> * Extracts Layer items with a distance from the given
>>>>>>>> point that is
>>>>>>>> less than or equal the given distance.
>>>>>>>> *
>>>>>>>> * @param layer with latitude, longitude coordinates
>>>>>>>> * @param point
>>>>>>>> * @param maxDistanceInKm
>>>>>>>> * @return geoPipeline
>>>>>>>> */
>>>>>>>> public static GeoPipeline startNearestNeighborLatLonSear***
>>>>>>>> *ch(Layer
>>>>>>>> layer, Coordinate point, double maxDistanceInKm) {
>>>>>>>> Envelope searchWindow =
>>>>>>>> OrthodromicDistance.**suggestSea**rchWindow(point,
>>>>>>>> maxDistanceInKm);
>>>>>>>> GeoPipeline pipeline = start(layer, new
>>>>>>>> SearchIntersectWindow(layer,
>>>>>>>> searchWindow))
>>>>>>>> .calculateOrthodromicDistance(****point);
>>>>>>>> /**
>>>>>>>> * Calculates the distance between Layer items nearest to
>>>>>>>> the given
>>>>>>>> point and the given point.
>>>>>>>> * The search window created is based on Layer items
>>>>>>>> density and it
>>>>>>>> could lead to no results.
>>>>>>>> *
>>>>>>>> * @param layer
>>>>>>>> * @param point
>>>>>>>> * @param numberOfItemsToFind tries to find this number of items
>>>>>>>> for comparison
>>>>>>>> * @return geoPipeline
>>>>>>>> */
>>>>>>>> public static GeoPipeline startNearestNeighborSearch(**Lay*
>>>>>>>> *er layer,
>>>>>>>> Coordinate point, int numberOfItemsToFind) {
>>>>>>>> Envelope searchWindow =
>>>>>>>> SpatialTopologyUtils.**createEnv**elopeForGeometryDensi**tyEstimat*
>>>>>>>> *e(layer,
>>>>>>>> point, numberOfItemsToFind);
>>>>>>>> return startNearestNeighborSearch(**lay**er,
>>>>>>>> point, searchWindow);
>>>>>>>> }
>>>>>>>> For this to happen, you could either contribute a
>>>>>>>> SortingNearestNeighborSearch (that does the sorting for you) or
>>>>>>>> have a
>>>>>>>> second sort step?