Bug in reading list continuations?

Mariano Kamp

Sep 28, 2009, 4:23:15 AM
to foug...@googlegroups.com
Hey,

  it seems to me that continuations on the reading list are a little bit buggy.

  I made a request to get the most recent 500 articles [1] and wrote the response to full500Req.xml. Then I did the same thing using 25 requests with continuations, in packs of 20 articles [2], writing the results to files req<ts>.xml. My understanding is that both should return the same articles when submitted at the same point in time.
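
  For context, the paging loop looks roughly like the sketch below. This is simplified and illustrative only: the Authorization value is a placeholder for whatever authenticated call you already make, and the continuation token is read from the feed's gr:continuation element and passed back via the c parameter.

require 'net/http'
require 'uri'
require 'rexml/document'

BASE_URL = 'http://www.google.com/reader/atom/user/-/state/com.google/reading-list'
AUTH = { 'Authorization' => 'GoogleLogin auth=<token>' } # placeholder, not a real token

def fetch_feed(url)
  uri = URI.parse(url)
  Net::HTTP.start(uri.host, uri.port) { |http| http.get(uri.request_uri, AUTH).body }
end

# Pages through the reading list in packs, writing each response to its
# own timestamped req<ts>.xml file and following the continuation token.
def fetch_in_packs(pack_size = 20, max_packs = 25)
  continuation = nil
  max_packs.times do
    url = "#{BASE_URL}?n=#{pack_size}&r=n"
    url += "&c=#{continuation}" if continuation
    body = fetch_feed(url)
    File.open("req#{(Time.now.to_f * 1000).to_i}.xml", 'w') { |f| f.write(body) }
    token = REXML::Document.new(body).elements['feed/gr:continuation']
    break unless token # no continuation element means the list is exhausted
    continuation = token.text
  end
end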

  But this doesn't seem to be the case. I found out about this because users complained that articles weren't updated in NewsRob. Further analysis brought to light that the articles in question weren't in the reading list even though they should have been.

  I wrote a simple script [3] that takes the article ids from the full request on the one hand and compares them to the article ids from the 20-article requests on the other.

  And the result shows that after a couple of requests the two get out of sync.
 
mariano-kamps-macbook-pro:requests mkamp$ ruby analyze.rb
parsing full500Req.xml.
parsing req1253365527968.xml.
[..]
parsing req1253366006414.xml.
parsing req1253366030567.xml.
---
["8d1eb435d37f64ff", "8d1eb435d37f64ff", true]
["74e6b581587738ca", "74e6b581587738ca", true]
["f192c4c9bee8ead7", "f192c4c9bee8ead7", true]
["01be91d746473a5f", "01be91d746473a5f", true]
["9d3f9bd2631ac558", "9d3f9bd2631ac558", true]
["679a06e4f540469d", "679a06e4f540469d", true]
["7cb87e18d2d14571", "7cb87e18d2d14571", true]
["d88b46bdac05c86c", "d88b46bdac05c86c", true]
["e474baa0f1febe12", "e474baa0f1febe12", true]
["ed9a6a301aa0be7d", "ed9a6a301aa0be7d", true]
["187b288b6332e76d", "187b288b6332e76d", true]
["e47d19b9ea80eb14", "e47d19b9ea80eb14", true]
["ad0b1778d5c7e6e0", "ad0b1778d5c7e6e0", true]
["2c4b83c3b5597856", "2c4b83c3b5597856", true]
["90ca27c3d155f7a8", "90ca27c3d155f7a8", true]
["4346c04f3f689416", "4346c04f3f689416", true]
["64f15489b9565629", "64f15489b9565629", true]
["967c6f579b03236d", "967c6f579b03236d", true]
["2de69eaf3df6787d", "2de69eaf3df6787d", true]
["a86274e45aef243c", "a86274e45aef243c", true]
--- 20
["3fce2ca48ae6ca60", "3fce2ca48ae6ca60", true]
[..] Omitted the remaining lines until the "anomaly"
["028a6f2e8b66b23c", "028a6f2e8b66b23c", true]
--- 160
["9197a687a7441a55", "9197a687a7441a55", true]
["6f56d7190adf03ba", "6f56d7190adf03ba", true]
["6ae10128c24205aa", "6ae10128c24205aa", true]
["5c0a3e93dabf85d7", "6eff16a8ae6cb4d1", false]
["6eff16a8ae6cb4d1", "06e9508963e25ce0", false]
["52c53bf78e9c10ee", "d3848ade7c820ba1", false]
["99aa9f316f024018", "0d34d85930e8eac4", false]
["6fa9fded446b1187", "998dc0d591fff445", false]
["025fc91a526bb0f8", "84ddaee650177ee8", false]
["06e9508963e25ce0", "b90645a8996f101d", false]
["20a60f75ab88b279", "e3e871661b887d5d", false]
["48d71784440b886c", "05dece5c449aeec6", false]
["4612141d42141f2e", "46fedb5fd93f06c8", false]
["868205cd8616ba4a", "9563ba84d22e7101", false]
["d3848ade7c820ba1", "3673ff90ce76b082", false]
["5d633deaa2978b04", "a8e9bd4b2bf0539f", false]
["eb438a756adaf6bf", "689cd2a91faf4929", false]
["2b24435bf9216706", "479cf3df51e69a28", false]
["0d34d85930e8eac4", "2d5dcb24e6e45d7e", false]
["998dc0d591fff445", "30a18317ba07cb06", false]
--- 180
["84ddaee650177ee8", "d0e5c77080536451", false]
[..]
["2d5dcb24e6e45d7e", "02f6c7e9855a4dd8", false]
--- 200
["30a18317ba07cb06", "d8781a41c28974ed", false]
[..] Omitted the rest until the end
["4c65094d8a552f50", "f97f5a37694eb1ff", false]
--- 500


  I attached the response files, the script, and the request data to this mail.

Cheers,
Mariano

[1] http://www.google.com/reader/atom/user/-/state/com.google/reading-list?n=500&r=n
[2] http://www.google.com/reader/atom/user/-/state/com.google/reading-list?n=20&r=n&c=CNGnpub2-pwC
[3]
require 'rexml/document'
require 'pp'

# Extracts the short item ids (the part after "item/" in each entry's
# <id> element) from one Atom response file.
def extract_ids(filename)
  puts "parsing #{filename}."
  rows = []
  d = REXML::Document.new(File.read filename)
  d.elements.each('/feed/entry/id') do |id_element|
    id_value = id_element.text.match(/item\/(.*)/)[1]
    rows << [id_value]
  end
  rows
end

# read full 500
rows = extract_ids('full500Req.xml')

# read the 20-article packs; the timestamped file names sort into
# fetch order, so idx walks the packs in the order they were requested
idx = 0
Dir.glob('req*.xml').sort.each do |fn|
  ids = extract_ids(fn)
  ids.each do |id|
    id_value = id.first
    # each row becomes [id from full request, id from pack, ids equal?]
    rows[idx] << id_value
    match = rows[idx].first.eql?(id_value)
    rows[idx] << match
    idx += 1
  end
end

# report, with a separator after every 20 rows (i.e. one pack)
puts "---"
rows.each_with_index do |row, i|
  pp row
  puts "--- #{i+1}" if (i+1) % 20 == 0
end

Mariano Kamp

Sep 28, 2009, 4:33:27 AM
to foug...@googlegroups.com
1. I should not forget the attachments.
2. I should not forget the attachments.
3. I should not forget the attachments.
4. I should not forget the attachments.
5. I should not forget the attachments.
[..]
500. I should not forget the attachments.

Here it is: http://claudia-und-mariano.net/req.zip

Mariano Kamp

Oct 10, 2009, 10:47:18 AM
to Friends of the Unofficial Google Reader API
The zip file contains my personal feeds and articles. Although that is
not exactly classified information, I don't want it to be public
forever, so I removed the zip file from my server again.
If there is interest, let me know and I will put it up again.

Btw., given the response here, I meanwhile don't use continuations
anymore; instead I use SAX for parsing.
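
For illustration, here is the stream-parsing idea in Ruby with REXML's
StreamListener (a simplified sketch, not the actual NewsRob code):

require 'rexml/parsers/streamparser'
require 'rexml/streamlistener'
require 'pp'

# Collects the short item ids from a reading-list Atom response without
# building a DOM, so memory stays flat even for very large responses.
class ItemIdListener
  include REXML::StreamListener

  attr_reader :ids

  def initialize
    @ids = []
    @in_id = false
  end

  def tag_start(name, attrs)
    @in_id = (name == 'id')
  end

  def text(value)
    # entry ids look like "tag:google.com,2005:reader/item/<short id>";
    # the feed-level <id> contains no "item/" and falls through here
    if @in_id && (m = value.match(/item\/(.*)/))
      @ids << m[1]
    end
  end

  def tag_end(name)
    @in_id = false if name == 'id'
  end
end

listener = ItemIdListener.new
File.open('full500Req.xml') do |f|
  REXML::Parsers::StreamParser.new(f, listener).parse
end
pp listener.ids.length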