SOLR


Geoff Parkhurst

Mar 11, 2015, 11:31:09 AM
to lu...@googlegroups.com
Hi folks

Is anyone using SOLR with Railo / Lucee? By that I mean a separate
SOLR server / cluster - not an inbuilt cfsearch / cfcollection.

I can't seem to see an extension or plugin to do the remote connection
unless I'm missing some config option somewhere.

SOLR fits our needs perfectly (shardable, faceted searching and all
that), but a bit of Stack Overflow seems to intimate we'd need to roll
our own Java-based persistent HTTP connection pool to get good
performance.

It's something we could consider sponsoring or paying someone to
open-source if need be (CMD are you listening?) but I hope we'd not be
reinventing the wheel...

Any input appreciated.

Best,
Geoff

Igal @ Lucee.org

Mar 11, 2015, 11:35:25 AM
to lu...@googlegroups.com
Geoff,

I think that most of us use ElasticSearch instead of SOLR.
see https://www.elastic.co/

Igal Sapir
Lucee Core Developer
Lucee.org

Andreas Eppinger

Mar 11, 2015, 11:56:08 AM
to lu...@googlegroups.com
We use Solr with an external Solr cluster, via a modified version of cfsolrlib.

--
You received this message because you are subscribed to the Google Groups "Lucee" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lucee+un...@googlegroups.com.
To post to this group, send email to lu...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/lucee/550060B5.1050301%40lucee.org.

For more options, visit https://groups.google.com/d/optout.

Robert Munn

Mar 11, 2015, 3:30:19 PM
to lu...@googlegroups.com
Maybe the SO post is out of date. If you use cfsolrlib, which uses the Java-based SolrJ under the covers, you are already using connection pooling. See this post:


My only issue with cfsolrlib is that it seems to be using the XML format for indexing and querying, which was good five years ago but is now unnecessary as Solr supports JSON. Might be worth investing some time to fork Shannon’s cfsolrlib repo and patch the library to use JSON as an optional format. 
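
As a rough illustration of the JSON option mentioned here: Solr's update handler accepts a JSON array of documents in place of the XML `<add>` envelope. The sketch below is plain Java, not cfsolrlib code. The class name `SolrJsonUpdate`, the field names and the URL in the comment are hypothetical, and the escaping covers only quotes and backslashes.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper: turns simple string-valued documents into the JSON
// array form that Solr's update handler accepts, e.g. when POSTed with
// Content-Type: application/json to http://localhost:8983/solr/products/update
// (host and core name are assumptions).
final class SolrJsonUpdate {

    // Minimal JSON string quoting: handles backslashes and double quotes only.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Each map becomes one JSON object; the list becomes a JSON array.
    static String toJson(List<Map<String, String>> docs) {
        return docs.stream()
                .map(doc -> doc.entrySet().stream()
                        .map(e -> quote(e.getKey()) + ":" + quote(e.getValue()))
                        .collect(Collectors.joining(",", "{", "}")))
                .collect(Collectors.joining(",", "[", "]"));
    }
}
```

In a real application a JSON library would be a safer choice than hand-built strings; the point is only that the payload is a plain array of field/value objects.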

If you want to roll your own, you could use:


or



I haven’t used ElasticSearch, but it seems to be more popular among new projects than Solr. 





Geoff Parkhurst

Mar 11, 2015, 6:13:20 PM
to lu...@googlegroups.com
Many thanks for the input all. I'll definitely take a look at
ElasticSearch and the cfsolrlib.

The SO question was this by the way - not that old (2013) - someone
trying to connect ACF10 to ElasticSearch as it turns out:

http://stackoverflow.com/questions/17434138/understanding-persistent-http-connections-in-coldfusion

Andreas Eppinger

Mar 11, 2015, 6:32:10 PM
to lu...@googlegroups.com
You can simply switch to the binary format of the SolrJ client in cfsolrlib by using the "binaryEnabled" flag.

Only the first version of cfsolrlib used the XML format / HTTP calls.

Geoff Parkhurst

Mar 12, 2015, 6:22:13 AM
to lu...@googlegroups.com
On 11 March 2015 at 15:35, Igal @ Lucee.org <ig...@lucee.org> wrote:
> Geoff,
>
> I think that most of us use ElasticSearch instead of SOLR.
> see https://www.elastic.co/

Thanks Igal. How are you connecting to an ES instance? Looks to me
like the same scenario as SOLR:

- Use cfhttp (slow: create connection, authenticate, get data, close
connection)
- Roll your own persistent connection pool with Java
- Invoke some pre-built wrapped Java driver
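
The second option in the list above does not necessarily need a hand-built pool. As a minimal sketch, assuming Java 11 or later on the JVM that runs Lucee, a single shared `java.net.http.HttpClient` keeps connections alive and reuses them across requests. The class name `SharedSearchClient`, the core URL and the timeouts are illustrative assumptions, not part of any existing extension.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

// Hypothetical wrapper around one long-lived HTTP client.
final class SharedSearchClient {

    // One client for the whole application: it pools idle connections and
    // reuses them instead of opening a new connection for every request.
    private static final HttpClient CLIENT = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(2)) // assumed value
            .build();

    // Builds a GET against Solr's /select handler; coreUrl is an assumption,
    // e.g. "http://localhost:8983/solr/products".
    static HttpRequest buildQuery(String coreUrl, String query) {
        String q = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(coreUrl + "/select?q=" + q + "&wt=json"))
                .timeout(Duration.ofSeconds(5)) // assumed value
                .GET()
                .build();
    }

    // Sends the query through the shared client and returns the raw body.
    static String search(String coreUrl, String query)
            throws IOException, InterruptedException {
        HttpResponse<String> response =
                CLIENT.send(buildQuery(coreUrl, query), HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

Whether this beats cfhttp in practice would still need measuring under realistic load.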

Was there a Railo elasticsearch extension at one time or did I imagine
that? Can't seem to find one under Lucee...

Many thanks
Geoff

Andrew Dixon

Mar 12, 2015, 7:01:23 AM
to lu...@googlegroups.com
Hi Geoff,

I was actually looking at ES last night and on the ES site there is a link to this CFML project on GitHub:


It says it is still beta, but there hasn't been a commit for 5 months, so I'm not sure what is going on with it. I did tweet at Jason Fill to ask, but I've not heard back. It appears to work OK for what I wanted to do. Looking in the source, it uses HTTP requests, but it honestly didn't feel slow; then again, I guess it depends what you are doing.

Kind regards,

Andrew

Julian Halliwell

Mar 12, 2015, 7:39:39 AM
to lu...@googlegroups.com
I looked at Jason's client too a while ago. Nice piece of work, but it
was written for an older ES release and I had issues with 1.x.

Having previously used the embedded Solr in CF9 I ended up writing my
own wrapper which mimics the CF/Solr behaviour. It uses (cf)http and
seems to perform well, but then my needs are fairly small-scale and
I'm not using the full clustering/sharding capability.

Geoff Parkhurst

Mar 12, 2015, 8:34:31 AM
to lu...@googlegroups.com
Thanks for that. It's that underlying connectivity that concerns me; I
can't shake the feeling that cfhttp is not the right method for
performance due to all the overheads with connection / auth etc.

I think we'd hit performance problems at both ends: getting fast
response times for e-commerce customers, and bulk inserting / updating.

One of the PHP SOLR libraries (Solarium) has a set of pluggable
connection methods ("adapters"), so you can choose cURL, HTTP, Zend, etc.

I've not yet looked at Jason's code but if that connection method is
abstracted into its own chunk, perhaps I could build on that with a
persistent java connection pool or something...

We're not yet ready to turn our whole ecom site AJAX'y and make the
client call SOLR / ES directly...

Still digging anyhow - many thanks



Igal @ Lucee.org

Mar 12, 2015, 11:10:04 AM
to lu...@googlegroups.com
I've toyed with the Bulk API in the past -- http://www.elastic.co/guide/en/elasticsearch/client/java-api/current/bulk.html -- but TBH the http client is rather efficient and I mostly use it, so test it before you conclude that it's too slow.
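
For readers weighing the two approaches: besides the Java API linked above, Elasticsearch also accepts bulk operations over plain HTTP at the `_bulk` endpoint, as newline-delimited JSON in which each action line is followed by its document source. The sketch below only builds such a request body. The class name `BulkBody`, the index name and the documents are hypothetical; ids are assumed to need no escaping, and the exact action metadata depends on the Elasticsearch version (older releases also expect a `_type`).

```java
import java.util.Map;

// Hypothetical helper that assembles a newline-delimited bulk request body.
final class BulkBody {

    // docsById maps a document id to its already-serialised JSON source.
    // Ids and the index name are written as-is, so they must not contain
    // characters that need JSON escaping.
    static String build(String index, Map<String, String> docsById) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : docsById.entrySet()) {
            // Action line: index this document under the given id.
            sb.append("{\"index\":{\"_index\":\"").append(index)
              .append("\",\"_id\":\"").append(e.getKey()).append("\"}}\n");
            // Source line: the document itself, on a single line.
            sb.append(e.getValue()).append('\n');
        }
        return sb.toString();
    }
}
```

Sending many documents in one request like this reduces the number of round trips, which is the main cost being discussed in this thread.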

how many requests per second do you expect?



Igal Sapir

Lucee Core Developer
Lucee.org

Alex Skinner

Mar 12, 2015, 3:12:45 PM
to lu...@googlegroups.com

I recommend this

https://github.com/DominicWatson/cfelasticsearch


Geoff Parkhurst

Mar 12, 2015, 5:20:44 PM
to lu...@googlegroups.com
On 12 March 2015 at 15:09, Igal @ Lucee.org <ig...@lucee.org> wrote:
> how many requests per second do you expect?

Well, right now, 5 front-end web servers are maintaining about 200
connections to our DBs and servicing about 200 requests per second.

The majority of those will be front-end catalogue requests - so maybe
30 - 40 cfhttp calls per second per server. (But we get 10x this
traffic at Christmas / Valentine's - which is why we need the
shardability)

It's the response time that I'm most interested in maintaining though
- you're right, I'd need to do some testing before writing off cfhttp
- just feels like a lot of setup / authenticate / close connection
traffic (lag) we could do without...

Igal @ Lucee.org

Mar 12, 2015, 5:33:22 PM
to lu...@googlegroups.com
Well, when I spoke with the people at Elastic a couple of years ago, they said that since we use Apache HttpClient, connections are reused by default, so we're good (unlike other platforms, where a new connection is created for each request). TBH I never tested that myself because I never bumped into performance issues.

I'm not sure what you mean by "authenticate". Are you planning to front Elasticsearch with a proxy server? Are you planning to use Elastic's Shield? (I imagine your servers sit behind a firewall and communicate with each other over a LAN.)

You should definitely run some tests first, and please share your results with us when you have them.

If performance is an issue, then look into the Bulk API that I mentioned in a previous email on this thread.


Igal Sapir
Lucee Core Developer
Lucee.org

Dominic Watson

Mar 13, 2015, 6:03:08 AM
to lu...@googlegroups.com

That repo (cfelasticsearch) hasn't been touched since 2012, and I believe it was only just getting started. I wouldn't recommend it! Indeed, I might just take it down. (This is not just me being defensive about my own code.)

D


Michael Offner

Mar 13, 2015, 12:00:02 PM
to lu...@googlegroups.com
FYI, Lucee 5 will move search (Lucene) into an extension, so you could even write an extension that replaces the current Lucene implementation.

Micha 