Cloudsearch Help
From: tr <ril...@gmail.com>
Date: Wed, 31 Oct 2012 20:02:29 -0700 (PDT)
Local: Wed, Oct 31 2012 11:02 pm
Subject: Cloudsearch Help

Hi,

I'm having issues getting Cloudsearch functioning correctly.  I have my two
domains set up and the appropriate API URLs in my .ini.  I have created
the indexes according to this thread:

https://groups.google.com/forum/?fromgroups=#!searchin/reddit-dev/clo...

I added a cloudsearch_q and it appears to be running.

Searches no longer error as they did before this config, but they are not
returning data, and it appears it's not uploading any data to Cloudsearch.  
I have added the IP of my server to the access policies in Cloudsearch.  I
see no errors to STDOUT when doing searches, or submitting new links.

Any ideas?  I'm not sure how to even troubleshoot this issue.

Thanks in advance.


 
From: Keith Mitchell <kemit...@reddit.com>
Date: Mon, 5 Nov 2012 10:53:39 -0800
Local: Mon, Nov 5 2012 1:53 pm
Subject: Re: [reddit-dev] Cloudsearch Help

In your AWS console for cloudsearch, does your domain appear to have any
documents? If not, then your uploads are not working, and we can figure out
what's breaking on that end. If so, then it's searches that are broken, and
we'll investigate that side.


 
From: tr <ril...@gmail.com>
Date: Mon, 5 Nov 2012 12:59:53 -0800 (PST)
Local: Mon, Nov 5 2012 3:59 pm
Subject: Re: [reddit-dev] Cloudsearch Help

Zero documents - so yes I assume an upload issue.


 
From: Keith Mitchell <kemit...@reddit.com>
Date: Mon, 5 Nov 2012 14:47:52 -0800
Local: Mon, Nov 5 2012 5:47 pm
Subject: Re: [reddit-dev] Cloudsearch Help

Did you backfill existing documents using
cloudsearch.py:rebuild_link_index()? Is your cloudsearch_changes queue
processor running, and if so, what's in the log?


 
From: tr <ril...@gmail.com>
Date: Mon, 5 Nov 2012 15:32:44 -0800 (PST)
Local: Mon, Nov 5 2012 6:32 pm
Subject: Re: [reddit-dev] Cloudsearch Help

Forgive me, I'm far from an expert on Python - how would I run the
rebuild_link_index() function manually?  I have not done that, but new
content is not being uploaded to Cloudsearch either.

I have "cloudsearch_q   1" in my consumer-counts file, and it is running.  
Is this different from the cloudsearch_changes queue processor?  Where do I
find the log?

Thanks very much.


 
From: Keith Mitchell <kemit...@reddit.com>
Date: Tue, 6 Nov 2012 10:27:19 -0800
Local: Tues, Nov 6 2012 1:27 pm
Subject: Re: [reddit-dev] Cloudsearch Help

You can get a python shell for running reddit code in the proper context by
cd'ing to {reddit}/r2, then running "paster shell your_ini_file.ini".

From there, you can do:

import r2.lib.cloudsearch as cs
cs.rebuild_link_index()

(And of course, you can import anything else from the reddit code base,
inspect objects, load things from the database, etc.)

I'm not 100% sure, but I think by default the q procs will write to syslog.


 
From: tr <ril...@gmail.com>
Date: Tue, 6 Nov 2012 12:38:38 -0800 (PST)
Local: Tues, Nov 6 2012 3:38 pm
Subject: Re: [reddit-dev] Cloudsearch Help

Thanks - running manually successfully processed, and I now see documents
in Cloudsearch.  Searches are returning successfully.

So it seems the issue is with the q proc?  I'll poke around syslog and see
if I see anything, otherwise if you have any suggestions let me know.  

Thanks for the help thus far.


 
From: Keith Mitchell <kemit...@reddit.com>
Date: Wed, 7 Nov 2012 11:30:08 -0800
Local: Wed, Nov 7 2012 2:30 pm
Subject: Re: [reddit-dev] Cloudsearch Help

Yup, sounds like the q_proc. If you're certain it's running ("sudo initctl
list | grep reddit" should help with that, I think) then the log is the
next important bit.


 
From: tr <ril...@gmail.com>
Date: Wed, 7 Nov 2012 16:28:23 -0800 (PST)
Local: Wed, Nov 7 2012 7:28 pm
Subject: Re: [reddit-dev] Cloudsearch Help

initctl list shows:

reddit-consumer-cloudsearch_q (1) start/running, process 28212

So it appears to be running as it should.  I'm not sure what to even look
for as far as logs go.  A find for *log* doesn't really come up with
anything significant, and I don't see anything significant in syslog either.
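
A "start/running" status only shows that the process exists, not that it is consuming anything. As a minimal sketch, the status line quoted above could be checked from Python like this. The regular expression and the helper names `parse_initctl_line` and `is_running` are illustrative assumptions based on that single line of output; they are not part of reddit or Upstart.

```python
import re

# Pattern modeled on the one line shown in this thread, e.g.
# "reddit-consumer-cloudsearch_q (1) start/running, process 28212"
# The instance "(1)" and the ", process N" suffix are treated as optional.
LINE_RE = re.compile(
    r"^(?P<job>\S+)(?: \((?P<instance>[^)]*)\))? "
    r"(?P<goal>\w+)/(?P<state>[\w-]+)(?:, process (?P<pid>\d+))?$"
)


def parse_initctl_line(line):
    """Return a dict of fields from one status line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    info = match.groupdict()
    info["pid"] = int(info["pid"]) if info["pid"] else None
    return info


def is_running(line, job_name):
    """True if the line describes `job_name` in the "running" state."""
    info = parse_initctl_line(line)
    return bool(info) and info["job"] == job_name and info["state"] == "running"


sample = "reddit-consumer-cloudsearch_q (1) start/running, process 28212"
print(parse_initctl_line(sample))
print(is_running(sample, "reddit-consumer-cloudsearch_q"))
```

Even when this reports the job as running, the consumer may still be idle, so the log output remains the next thing to check.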


 
From: Ricky Ramirez <r...@reddit.com>
Date: Wed, 7 Nov 2012 17:08:06 -0800
Local: Wed, Nov 7 2012 8:08 pm
Subject: Re: [reddit-dev] Cloudsearch Help

The output is sent to syslog via wrap-job. By default this should go to
/var/log/syslog. The default log facility is cron, so it might also be in
/var/log/cron.log. If that's not the case for you then you need to consult
your syslog documentation.
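
To check both files quickly, a small script along these lines may help. It is only a sketch: `lines_mentioning` and `scan_log` are hypothetical helpers, the search string `cloudsearch_q` is the consumer name from this thread, and the two paths are the defaults mentioned above, which may differ on your system.

```python
def lines_mentioning(log_lines, needle):
    """Yield the lines that contain `needle`, ignoring case."""
    needle = needle.lower()
    for line in log_lines:
        if needle in line.lower():
            yield line.rstrip("\n")


def scan_log(path, needle):
    """Return all lines in the file at `path` that mention `needle`."""
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(path, errors="replace") as handle:
        return list(lines_mentioning(handle, needle))


if __name__ == "__main__":
    # Default locations named in this thread; adjust for your syslog setup.
    for path in ("/var/log/syslog", "/var/log/cron.log"):
        try:
            hits = scan_log(path, "cloudsearch_q")
        except OSError as exc:
            print(f"{path}: could not read ({exc})")
            continue
        print(f"{path}: {len(hits)} matching line(s)")
        for line in hits[-5:]:
            print("  " + line)
```

Reading these files usually needs root or membership in the log-reading group, so the script reports unreadable files instead of failing.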

Ricky


 
From: tr <ril...@gmail.com>
Date: Wed, 7 Nov 2012 19:38:43 -0800 (PST)
Local: Wed, Nov 7 2012 10:38 pm
Subject: Re: [reddit-dev] Cloudsearch Help

Attached is the syslog output after submitting a link.  I don't see any
mention of cloudsearch_q in there, but other q procs such as scraper,
vote_link, etc. do show up.

Thanks

Attachment: syslog.txt (12K)

 
From: Keith Mitchell <kemit...@reddit.com>
Date: Thu, 8 Nov 2012 09:44:44 -0800
Local: Thurs, Nov 8 2012 12:44 pm
Subject: Re: [reddit-dev] Cloudsearch Help

I may have an inkling, as I think about it.

Try modifying reddit-consumer-cloudsearch_q.conf and make the following
change to the end of the wrap-job line:

change
'run_changed()'
into
'run_changed(min_size=0)'

Then restart the q proc.

Cloudsearch's document upload handling is most efficient when working on
batches of documents, so we wait for a sufficiently large number of items
to queue up before processing and sending. Presumably, you're not
generating enough new/changed submissions to hit the default minimum size
of 500, so the queue proc is just waiting and waiting.
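
The effect of the minimum batch size can be illustrated with a toy model. This is not reddit's `run_changed()` implementation; `take_batch` and the sample queue are hypothetical, and only the default of 500 and the `min_size=0` override come from the description above.

```python
DEFAULT_MIN_SIZE = 500  # default minimum batch size mentioned in this thread


def take_batch(pending, min_size=DEFAULT_MIN_SIZE):
    """Toy model: return (batch, remaining).

    The batch stays empty until at least `min_size` items are pending.
    """
    if not pending or len(pending) < min_size:
        return [], list(pending)
    return list(pending), []


# A low-traffic site: only three changed links are waiting.
queue = ["link_1", "link_2", "link_3"]

# With the default threshold nothing is sent; the items just keep waiting.
batch, remaining = take_batch(queue)
print(len(batch), len(remaining))

# With min_size=0 whatever is waiting is sent right away.
batch, remaining = take_batch(queue, min_size=0)
print(len(batch), len(remaining))
```

On a busy site the threshold is reached quickly and batching saves upload calls; on a small install it may never be reached, which would match the symptoms in this thread.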


 
From: tr <ril...@gmail.com>
Date: Thu, 8 Nov 2012 15:44:05 -0800 (PST)
Local: Thurs, Nov 8 2012 6:44 pm
Subject: Re: [reddit-dev] Cloudsearch Help

Your inkling was correct.  Looks like I'm all set.

Thanks both of you for all the help.


 