stale views


joe hobson

Jan 19, 2012, 12:50:39 PM
to learnin...@googlegroups.com
Now that we've gotten around to setting up our own node, I'm starting to learn a bit more about what's going on behind the scenes with the node processes. Inevitably this just leads to more questions, especially when things don't seem to be working correctly. My node is on 0.23.0, a deployment of the VM that was distributed at PlugFest2 - and I believe it hasn't been greatly modified since it was deployed.

My main issue seems to be with stale views. I hit slice and it doesn't return the data that should be there. For a while I wasn't even able to get harvest/listrecords to return any data. But I've heard talk about "forcing views to update" enough times that I was pretty sure that's what I needed. By watching the couch log (/opt/couchdb/var/log/couchdb/couch.log) I could see which views were being called for which services, and that all of them send "stale=ok" when obviously it's not okay (or the index isn't being rebuilt correctly). So when slice didn't work, I pulled up a browser and hit the view directly, and what do you know: I saw the view update in the log, and now slice works. listrecords works now too, though /status still shows a doc_count of 0.
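
For readers following along: in CouchDB 1.x, a view request with stale=ok is answered from the index as it currently stands, while a request without it makes CouchDB bring the index up to date before responding, which is why hitting the view directly in a browser "fixes" slice. A minimal sketch of the two kinds of URL, assuming a local CouchDB on the default port; the design document and view names below are placeholders, not necessarily the ones the LR services use:

```python
from urllib.parse import urlencode


def view_url(base, db, design_doc, view, stale_ok=False, limit=None):
    """Build a CouchDB view URL; stale_ok=True adds the stale=ok parameter."""
    params = {}
    if stale_ok:
        # Ask CouchDB to answer from the existing index without updating it.
        params["stale"] = "ok"
    if limit is not None:
        # limit only trims the number of rows in the response.
        params["limit"] = limit
    url = "{0}/{1}/_design/{2}/_view/{3}".format(base, db, design_doc, view)
    return url + ("?" + urlencode(params) if params else "")


# Assumed values: host, design document and view name are illustrative only.
BASE = "http://localhost:5984"
fast_but_maybe_stale = view_url(BASE, "resource_data", "filter", "docs", stale_ok=True)
forces_index_update = view_url(BASE, "resource_data", "filter", "docs", limit=1)
print(fast_but_maybe_stale)
print(forces_index_update)
```

Fetching the second URL (with curl or a browser) is the scripted equivalent of hitting the view by hand.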

It's been explained to me more than once that the views are updated every hour, or after 100 records have been published. I played with it again this morning, testing that rule, and it didn't work for me. I checked my slice for "identity=Brokers of Expertise", then published over 500 paradata records from Brokers of Expertise, gave it some time, and my slice was still returning the same number of results as before. I also tried slicing on today's date, but didn't get any records. Looking at the logs, it appears to check for changes every minute (/resource_data/_changes?feed=continuous&since=4617&include_docs=true). Once again, I wasn't able to get the slice to show new data until I hit the view directly.
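
The "every hour, or after 100 records" rule as described here amounts to a threshold check with two triggers. A rough sketch of that logic, written only from the description above and not taken from the LR code; the class and method names are made up for illustration:

```python
from datetime import datetime, timedelta


class ChangeThreshold(object):
    """Signals an action after N changes or after a time span, whichever comes first."""

    def __init__(self, count_threshold, time_threshold, now=None):
        self.count_threshold = count_threshold
        self.time_threshold = time_threshold
        self.count = 0
        self.last_action = now if now is not None else datetime.now()

    def record_change(self, now=None):
        """Register one change; return True when the action should run."""
        if now is None:
            now = datetime.now()
        self.count += 1
        due = (self.count >= self.count_threshold
               or now - self.last_action >= self.time_threshold)
        if due:
            # Reset both triggers once the action has been signalled.
            self.count = 0
            self.last_action = now
        return due


# 100 changes or one hour, matching the rule as it was explained on the list.
start = datetime(2012, 1, 19, 18, 0, 0)
handler = ChangeThreshold(100, timedelta(hours=1), now=start)
results = [handler.record_change(now=start) for _ in range(100)]
print(results.count(True))
```

If a node behaves as described, publishing 500 records should cross the count threshold several times, so a slice that never changes points at the action itself failing rather than at the threshold.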

Maybe something's going wrong with the check for changes? I'm not seeing any errors in any of the logs, but then again maybe I'm missing one. ... .joe


-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-
joe hobson, owner / director
   Navigation North Learning Solutions LLC



Jim Klo

Jan 19, 2012, 12:56:54 PM
to learnin...@googlegroups.com
take a look at this: https://docs.google.com/document/d/1CBWfVFT3Hg55iXk0w3HiSBuW8oQmhexasvo_jO-nv1A/edit?hl=en_US

There was a misconfiguration of the logging on the USB that I think was preventing the indexing job from running. There are notes in the above doc on how to fix it.

- Jim

Jim Klo
Senior Software Engineer
Center for Software Engineering
SRI International

joe hobson

Jan 19, 2012, 2:42:41 PM
to learnin...@googlegroups.com
I put that into my production.ini file and restarted the service. Went through the same process as before to see if views would get updated (checked a slice, published 300 records to the node, gave it some time, checked slice again), and unfortunately they are not. This looks like the error, in uwsgi.log:

18:16:32,129 DEBUG [lr.lib.couch_change_monitor.base_change_threshold_handler] [MainThread] class: UpdateViewsHandler Willl take action ...
18:16:32,129 DEBUG [lr.lib.couch_change_monitor.base_views_update_handler] [MainThread] class: UpdateViewsHandler Updating views ...
18:16:32,189 ERROR [lr.lib.couch_change_monitor.base_views_update_handler] [MainThread] global name 'appConfig' is not defined
18:16:32,189 DEBUG [lr.lib.couch_change_monitor.base_change_threshold_handler] [MainThread] class: DistributeThresholdHandler count: 100 countThreshold: 1000 timedelta: 0:09:55.906074 timethreshold: 999999999 days, 23:59:59.999999
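
The ERROR line has the shape of a Python NameError: a function refers to a module-level name that was never defined or imported, and the failure only shows up when that line runs. A minimal reproduction of the pattern, not the LR code; `update_views`, `run_handler` and the config key are hypothetical, and Python 3 words the message slightly differently ("name 'appConfig' is not defined", without "global"):

```python
def update_views():
    # appConfig is never defined or imported in this module, so this lookup
    # raises NameError at call time, not when the module is loaded.
    return appConfig["couchdb.url"]  # hypothetical config key


def run_handler():
    """Call the update and report the error text instead of crashing."""
    try:
        update_views()
    except NameError as exc:
        return str(exc)
    return None


message = run_handler()
print(message)
```

If the real handler catches and logs the exception in a similar way, the process would keep running while the views never update, which would be consistent with the behaviour described above.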

... .joe

Nick Syrotiuk

Jan 19, 2012, 5:07:04 PM
to learnin...@googlegroups.com
Hi there,

I'm just installing (another) node and working through the current Linux
Installation Guide.

I can't list the tags or check out the LR code:

learningregistry@alpha:~/gitrepos$ git clone https://github.com/LearningRegistry/LearningRegistry.git
Initialized empty Git repository in /home/learningregistry/gitrepos/LearningRegistry/.git/
remote: Counting objects: 6986, done.
remote: Compressing objects: 100% (1977/1977), done.
remote: Total 6986 (delta 4897), reused 6835 (delta 4751)
Receiving objects: 100% (6986/6986), 31.53 MiB | 490 KiB/s, done.
Resolving deltas: 100% (4897/4897), done.
learningregistry@alpha:~/gitrepos$
learningregistry@alpha:~/gitrepos$ git tag -l
fatal: Not a git repository (or any of the parent directories): .git
learningregistry@alpha:~/gitrepos$
learningregistry@alpha:~/gitrepos$
learningregistry@alpha:~/gitrepos$ git checkout 0.23.4
fatal: Not a git repository (or any of the parent directories): .git
learningregistry@alpha:~/gitrepos$

----------------------------------------

I suspect it's something very simple.......

Cheers, Nick

--
N Syrotiuk | Mimas | University of Manchester | Manchester

Damon Regan

Jan 19, 2012, 5:41:35 PM
to learnin...@googlegroups.com
Hi Nick,

I just spoke with Lou who has been working on the installation guides
and it sounds like Lou sent you guidance to change directories into
the LearningRegistry directory after issuing the git clone command.
(Hopefully we'll figure out why Lou couldn't respond back to the whole
group.)

I'm not a Git expert, so I hope others will chime in, but I think I
successfully checked out the code using the alternative sequence below
a couple days ago:

cd ~/gitrepos
git clone -b 0.23.4 https://github.com/LearningRegistry/LearningRegistry.git

So it sounds like there are two approaches that should work to check
out the code:

1. Slight modification to what you read in the guide:

cd ~/gitrepos
git clone https://github.com/LearningRegistry/LearningRegistry.git
[add] --> cd LearningRegistry
git tag -l
git checkout [latest tag version]

2. The approach I think I successfully used:

cd ~/gitrepos
git clone -b 0.23.4 https://github.com/LearningRegistry/LearningRegistry.git

Hopefully, we'll get an authoritative answer from the experts soon and
get you up and running one way or another.

Best Regards,
Damon

John Poyau

Jan 19, 2012, 5:50:53 PM
to Learning Registry Developers List
Joe,

Which version/tag of the LR code are you using?



Nick Syrotiuk

Jan 19, 2012, 6:19:55 PM
to learnin...@googlegroups.com
Hi Damon,

I followed the first approach. I knew it was something simple! Thanks
very much..........Nick

joe hobson

Jan 19, 2012, 6:40:21 PM
to learnin...@googlegroups.com
"git log" shows last update as...
commit f05f7bddc3d47456dc6ac414be86b29d49c6d653
Merge: 6446ab1 1f5b919
Author: John Poyau <john....@lmco.com>
Date:   Thu Dec 8 15:13:42 2011 -0800

Lou Wolford

Jan 20, 2012, 2:05:42 PM
to learnin...@googlegroups.com
Hi everyone,

To clear things up a bit: it's better to move into the LearningRegistry directory and run git checkout [latest tag version]. This way it grabs the entire tree and switches to that tag, whereas using git clone -b [latest tag version] only gets that version. If you have any other questions or concerns, let us know, Nick. Thanks!

John Poyau

Jan 20, 2012, 3:11:32 PM
to Learning Registry Developers List
Joe,

Did my update fix the problem that you were getting with the views?


Nick Syrotiuk

Jan 22, 2012, 5:27:00 AM
to learnin...@googlegroups.com
Hi there,

I seem to be having the same problem as Joe Hobson. I have two nodes
running on Ubuntu; one uses version 0.23.3 (I think!) of the LR code and
the other uses version 0.23.4.

Here's the problem as I see it:

$ curl http://alpha.mimas.ac.uk/status
{"node_name": "JLeRN alpha node", "node_id":
"989e14689205447aa0483e72620fccdd", "active": true, "timestamp":
"2012-01-22T00:04:34.940108Z", "start_time":
"2012-01-21T15:10:27.211390Z", "install_time":
"2012-01-21T15:10:27.211390Z", "earliestDatestamp": null, "doc_count": 0}$
$
$
$ curl -X POST -H "Content-Type:application/json" "http://alpha.mimas.ac.uk/publish" -d @test_data.json
{"document_results": [{"doc_ID": "b5c5449a8783453c8f041640c92d07c9",
"OK": true}], "OK": true}
$
[Time passes. Overnight actually. Joe mentioned the need to wait one
hour or to publish 100 docs. Just wondering where this is documented as
I seem to have missed it?]
$
$ curl http://alpha.mimas.ac.uk/obtain
{"documents":[]}
$
$
$ curl http://alpha.mimas.ac.uk/slice?from=2012-01-22
{"documents":[

], "resultCount":0}
$
$
$ curl http://alpha.mimas.ac.uk/status
{"node_name": "JLeRN alpha node", "node_id":
"989e14689205447aa0483e72620fccdd", "active": true, "timestamp":
"2012-01-22T00:07:38.848501Z", "start_time":
"2012-01-21T15:10:27.211390Z", "install_time":
"2012-01-21T15:10:27.211390Z", "earliestDatestamp": null, "doc_count": 0}$
$

--------------------------------------
If I were to log in to the server, open firefox, browse to Couchdb, and
run some views on "resource_data", the document I published would be there.

If I then were to execute this command again:
$ curl http://alpha.mimas.ac.uk/status

the doc_count would increase to 1, and obtain and slice would retrieve
the doc successfully. [I tested this on the other node.]

I should add that I'm using the default development.ini config file when
I start uwsgi. This file looks completely different from the config
file here: https://gist.github.com/1584446. Do you think I need to use
a different config file?

Cheers, Nick

--

joe hobson

Jan 22, 2012, 12:57:58 PM
to learnin...@googlegroups.com
I attempted to apply the patch without first thinking of backing up what I had (dumb). So your patch upgraded my entire installation to 0.23.3 rather than just patching the one file you changed. Unfortunately it's not running now, probably because there are other necessary steps for upgrading from 0.23.0 to 0.23.3. There is already a Pivotal story that addresses this specific issue. Maybe someone can give me the short version so I can get my node back up and running?

John Poyau

Jan 23, 2012, 11:14:38 AM
to Learning Registry Developers List
Joe,

Sorry to hear that. You can try backing out the last merge you did
and see if that helps. The following link has some info on undoing a
merge.

http://stackoverflow.com/questions/2389361/git-undo-a-merge

My changes were to only one file, so you should be able to just copy it
from GitHub
https://github.com/LearningRegistry/LearningRegistry/blob/master/LR/lr/model/resource_data_monitor/update_views_handler.py
and paste it into your update_views_handler.py file once you have been
able to restore the last working version.

What error are you getting?

joe hobson

Jan 23, 2012, 1:29:50 PM
to learnin...@googlegroups.com
John,
Thanks for your help. I was able to roll back my last merge using the page you listed below (I'm still learning git; we're mostly an svn shop). The update_views_handler.py file wasn't the issue (mine already matched yours). What fixed my issue was your update to base_views_update_handler.py in pull request 176. I updated that file only, restarted the service, checked the slice record count, published 200 docs, and slice showed the updated record count. The only error I noticed in the couch.log was this one, which seems unrelated...


[Mon, 23 Jan 2012 18:03:31 GMT] [info] [<0.19508.142>] checkpointing view update at seq 2752 for resource_data _design/oai-pmh-get-records
[Mon, 23 Jan 2012 18:03:31 GMT] [error] [<0.20434.142>] Uncaught error in HTTP request: {exit,
                                                         {timeout,
                                                          {gen_server,call,
                                                           [<0.19456.142>,
                                                            request_group_info]}}}
[Mon, 23 Jan 2012 18:03:31 GMT] [info] [<0.19525.142>] checkpointing view update at seq 2744 for resource_data _design/oai-pmh-identify-timestamp
[Mon, 23 Jan 2012 18:03:31 GMT] [info] [<0.19581.142>] checkpointing view update at seq 2283 for resource_data _design/oai-pmh-test-data
[Mon, 23 Jan 2012 18:03:31 GMT] [info] [<0.11263.0>] checkpointing view update at seq 5002 for resource_data _design/learningregistry-resource-location
[Mon, 23 Jan 2012 18:03:31 GMT] [info] [<0.19456.142>] checkpointing view update at seq 2399 for resource_data _design/filter
[Mon, 23 Jan 2012 18:03:31 GMT] [info] [<0.19544.142>] checkpointing view update at seq 2812 for resource_data _design/oai-pmh-list-identifiers
[Mon, 23 Jan 2012 18:03:31 GMT] [info] [<0.19561.142>] checkpointing view update at seq 2967 for resource_data _design/oai-pmh-list-metadata-formats
[Mon, 23 Jan 2012 18:03:31 GMT] [info] [<0.19508.142>] checkpointing view update at seq 2753 for resource_data _design/oai-pmh-get-records
[Mon, 23 Jan 2012 18:03:31 GMT] [info] [<0.20434.142>] Stacktrace: [{io_lib_pretty,cind_tag_tuple,7},
                                    {io_lib_pretty,while_fail,3},
                                    {io_lib_pretty,print,6},
                                    {io_lib_format,build,3},
                                    {io_lib_format,build,3},
                                    {io_lib_format,build,3},
                                    {io_lib_format,build,3},
                                    {io_lib_format,build,3}]
[Mon, 23 Jan 2012 18:03:31 GMT] [info] [<0.19525.142>] checkpointing view update at seq 2745 for resource_data _design/oai-pmh-identify-timestamp
[Mon, 23 Jan 2012 18:03:31 GMT] [info] [<0.19581.142>] checkpointing view update at seq 2284 for resource_data _design/oai-pmh-test-data
[Mon, 23 Jan 2012 18:03:31 GMT] [info] [<0.11263.0>] checkpointing view update at seq 5003 for resource_data _design/learningregistry-resource-location
[Mon, 23 Jan 2012 18:03:31 GMT] [info] [<0.19456.142>] checkpointing view update at seq 2400 for resource_data _design/filter
[Mon, 23 Jan 2012 18:03:31 GMT] [info] [<0.19544.142>] checkpointing view update at seq 2813 for resource_data _design/oai-pmh-list-identifiers
[Mon, 23 Jan 2012 18:03:31 GMT] [info] [<0.19561.142>] checkpointing view update at seq 2968 for resource_data _design/oai-pmh-list-metadata-formats
[Mon, 23 Jan 2012 18:03:31 GMT] [info] [<0.19508.142>] checkpointing view update at seq 2754 for resource_data _design/oai-pmh-get-records
[Mon, 23 Jan 2012 18:03:31 GMT] [info] [<0.20434.142>] 127.0.0.1 - - 'GET' /resource_data/_design/filter/_info 500
[Mon, 23 Jan 2012 18:03:31 GMT] [info] [<0.19525.142>] checkpointing view update at seq 2746 for resource_data _design/oai-pmh-identify-timestamp
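
The checkpoint lines show each design document's index sitting at a different sequence number while it catches up. One way to measure how far behind an index is: compare the database's update_seq (from GET /resource_data) with the view_index.update_seq reported by GET /resource_data/_design/<name>/_info, the same endpoint that returned the 500 above. A sketch that works on already-parsed responses, assuming the integer sequence numbers of CouchDB 1.x; the database sequence in the sample is an assumed value, the index sequence is taken from the log:

```python
def index_lag(db_info, design_info):
    """Return how many sequence numbers a design document's index is behind."""
    db_seq = db_info["update_seq"]
    index_seq = design_info["view_index"]["update_seq"]
    return db_seq - index_seq


# Sample documents shaped like CouchDB 1.x responses; 5003 is an assumed
# database sequence, 2400 is the _design/filter checkpoint seen in the log.
db_info = {"db_name": "resource_data", "update_seq": 5003}
design_info = {
    "name": "filter",
    "view_index": {"update_seq": 2400, "updater_running": True},
}
lag = index_lag(db_info, design_info)
print(lag)
```

A lag that shrinks to zero means the index has caught up; a lag that never moves points back at the update handler.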

Damon Regan

Jan 27, 2012, 4:55:37 PM
to learnin...@googlegroups.com
Hi Nick,

Are you still experiencing the problem? It sounds like John's fix for
the view update handler did the trick for Joe. John's fix is
currently in master. We plan to release a new version that will
include this soon, but in the meantime we recommend pulling from
master, which should give you an easy upgrade path.

It's a two-step process to upgrade your node to master:

1. Pull the most recent code from master

cd <your path to git repository>/LearningRegistry
git checkout master
git pull

2. Run the setup node python script

python setup_node.py -d

It is important to re-run the setup node python script as
configuration changes take place during updates. We hope to have an
update script soon.

Best Regards,
Damon

Nick Syrotiuk

Jan 28, 2012, 5:40:33 AM
to learnin...@googlegroups.com
Hi Damon et al,

Yes, that was definitely the fix. I think the indexes are getting
updated (instantly!) and contain the correct data now. :)

Thanks to everyone who attended the developer call on Thursday, and
thanks especially for your patience!

I greatly look forward to meeting Walt in Nottingham at the Cetis
conference next month.

Cheers, Nick
