I’ve pulled data dated 10/26/2011 through 11/7/2011 (a period of about two weeks) from the LR, using the Basic Harvest, for import into our SQL database. I’m pulling it in date order to get as much information as I can. For this roughly two-week period alone, I’ve received nearly 400,000 records from the Learning Registry, yet http://node01.public.learningregistry.net/status reports that there are 437,094 documents in the Learning Registry. I expect that the amount of data submitted each week remains fairly high for at least a few weeks (or months) beyond what I’ve already pulled. If that assumption is correct, there are a lot more documents in the LR than status is reporting.
My database experience has been with relational databases, not CouchDB or other NoSQL style databases.
· How many documents are there REALLY in the LR?
· At what point can I expect the number of documents submitted to the LR to drop significantly?
I need to know this so I can give a fairly accurate estimate to my boss of when we will be caught up with importing data from the LR into our SQL database and can just retrieve new items. If there are really 437,094 documents in the LR, then I’m in pretty good shape. If, on the other hand, there are ten times more records than that, then that would definitely be useful information to share with The Powers That Be, as they are chomping at the bit for the import to be caught up.
If there’s a CouchDB query that I could run against our node that would give me this information, I’d be okay with that too.
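As a rough illustration of the kind of query I mean: CouchDB’s database information endpoint (GET /{db}) returns a doc_count field, which may be enough here. The sketch below is only that, a sketch. The database name "resource_data" and the base URL "http://localhost:5984" are assumptions about how our node is set up, and the helper names are made up for this example. Note that doc_count also includes design documents, so it may run slightly higher than the number of actual records.

```python
import json
from urllib.request import urlopen


def db_info_url(couch_base, db_name):
    """Build the URL of CouchDB's database-information endpoint (GET /{db})."""
    return couch_base.rstrip("/") + "/" + db_name


def parse_doc_counts(info_json):
    """Return (live_docs, deleted_docs) from a GET /{db} response body."""
    info = json.loads(info_json)
    # doc_count and doc_del_count are standard fields of the CouchDB response.
    return info["doc_count"], info.get("doc_del_count", 0)


def fetch_doc_counts(couch_base, db_name):
    """Query a CouchDB node over HTTP and return (live_docs, deleted_docs).

    couch_base and db_name are assumptions; adjust them to the actual node.
    """
    with urlopen(db_info_url(couch_base, db_name)) as resp:
        return parse_doc_counts(resp.read().decode("utf-8"))
```

Calling fetch_doc_counts("http://localhost:5984", "resource_data") against our node would then give a total to compare with the figure that status reports.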
Jerome Grimmer
Southern Illinois University Carbondale
2450 Foundation Drive Suite 100
Springfield, IL
Phone: 217-786-3010 ext. 5857
Toll-free: 1-800-252-4822 ext. 5857
NOTE: My E-mail address has changed
"Your words have power. Use them wisely." --Unknown.
Hi Steve,
That is very helpful. It is also a bit of a relief on my end to know that there are probably only about 40,000 records I haven’t pulled and not 4 million! I would be interested in seeing a map/reduce that does what you showed (with source code, please :)) so I can learn from it.
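To make the request concrete, here is my rough guess at what such a map/reduce view might look like. It counts documents per day, so the totals can be compared against what I’ve already harvested. This is not Steve’s code, just a sketch under assumptions: the field names doc_type and node_timestamp, the database name "resource_data", and the design document name "harvest_stats" are all guesses or made up for illustration. The map function itself is JavaScript (as CouchDB expects) and the reduce is CouchDB’s built-in _count.

```python
import json
from urllib.parse import urlencode

# JavaScript map function stored in the design document.
# Assumed fields: doc_type and node_timestamp (an ISO 8601 string).
MAP_BY_DAY = """
function (doc) {
  if (doc.doc_type === "resource_data" && doc.node_timestamp) {
    // Key on the YYYY-MM-DD prefix of the timestamp.
    emit(doc.node_timestamp.substring(0, 10), null);
  }
}
"""


def design_doc():
    """Design document with one view: map by day, reduce with built-in _count."""
    return {
        "_id": "_design/harvest_stats",  # hypothetical name
        "language": "javascript",
        "views": {"docs_by_day": {"map": MAP_BY_DAY, "reduce": "_count"}},
    }


def view_url(couch_base, db_name, start_day, end_day):
    """URL that returns one row per day between start_day and end_day (inclusive)."""
    query = urlencode({
        "group": "true",                   # one reduced row per distinct key
        "startkey": json.dumps(start_day),  # view keys must be JSON-encoded
        "endkey": json.dumps(end_day),
    })
    return (couch_base.rstrip("/") + "/" + db_name
            + "/_design/harvest_stats/_view/docs_by_day?" + query)


def counts_by_day(view_json):
    """Turn the view response body into a {day: count} dictionary."""
    rows = json.loads(view_json)["rows"]
    return {row["key"]: row["value"] for row in rows}
```

The design document would be saved to the database once (an HTTP PUT of design_doc() as JSON), after which a GET on view_url(...) returns the per-day counts; summing the values of counts_by_day(...) gives the total for the date range.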
Jerome Grimmer
"Everybody is a genius. But if you judge a fish on its ability to climb a tree, it will live its whole life believing that it is stupid." – Albert Einstein