Hey Marcus,
Thank you for following my suggestion to take the discussion here. I
didn't expect you to do that right away ;-), but here we are ;-)
Yes, that sounds about right, plus the actual content like images
etc., but those are only downloaded once.
As a side note: a 20-minute sync interval doesn't make much sense in
most cases, as afaik Google doesn't poll the feeds all that
frequently. So once an hour should be OK; my gf, for example, syncs
once every four hours, because she is looking for content, not
breaking news.
The only need for frequent updates I can see is syncing back to
Reader, because you might switch back to the Google Reader web
interface and expect your changes from the phone to be reflected
already. For this particular case, though, there is already a solution
in place: the upload-only sync, which happens automatically 5 minutes
after your last change.
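That delayed upload is essentially a debounce timer. A minimal sketch of the idea in Python (the class and names are my own illustration, not NewsRob's actual code, which is Java):

```python
import threading

class UploadOnlySync:
    """Debounced upload-only sync: every local change restarts the timer,
    so the upload runs once, a fixed delay after the *last* change."""

    def __init__(self, upload, delay_seconds=5 * 60):  # 5 minutes, as in NewsRob
        self._upload = upload          # callback that pushes state changes upstream
        self._delay = delay_seconds
        self._timer = None
        self._lock = threading.Lock()

    def on_local_change(self):
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()   # a new change resets the countdown
            self._timer = threading.Timer(self._delay, self._upload)
            self._timer.daemon = True
            self._timer.start()
```

The nice property is that a burst of read/star actions results in a single upload, instead of one network round-trip per tap.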
One way to shrink the MBs would be for NewsRob to use gzip when
talking to Google Reader. But as most users are on a flat rate, that
doesn't seem like the right tradeoff in most cases: gunzipping is
very CPU-intensive, and performance is not an aspect NewsRob is all
that good at right now. Furthermore, CPU usage means burning through
the battery faster and producing more heat, which will also reduce
battery life.
And it would only lessen the problem, not solve it.
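For context, the trade-off in a nutshell: the transfer shrinks a lot, but every sync then pays CPU (and therefore battery) to decompress. A quick stdlib sketch with a made-up, repetitive payload (real Atom/RSS feeds compress similarly well because of their repeated markup):

```python
import gzip

# A made-up, repetitive feed payload standing in for an Atom/RSS response.
payload = (b"<entry><title>Example article</title>"
           b"<summary>Some repetitive feed content.</summary></entry>") * 1000

compressed = gzip.compress(payload)

# The transfer shrinks considerably...
print(len(payload), len(compressed))   # original vs. gzipped bytes

# ...but every sync now spends CPU cycles undoing the compression.
restored = gzip.decompress(compressed)
assert restored == payload
```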
OK, now to the proposed approach; please consider this. You have a
capacity of 500 articles, and 500 articles are in the database after
a "full sync". 250 of them are read, 250 unread. OK so far?
Now you want to do a "partial sync": you ask Google Reader for the
latest 500 articles that are unread.
But the 250 articles that are already unread in the local database
get fetched again as part of that. So this approach only cuts the
problem in half: instead of 1,000,000 articles (the number from your
mail) it would be 500,000 articles.
And as this doesn't solve my problem, or even reduce it by an order
of magnitude, I would stop here. But for the sake of argument, and
because I might be missing something: now that we got 500 unread
articles from Google, the 250 articles among them that we don't have
in our local database would need to be added, right? Doing that, we
would reach a count of 750, but the limit was 500. Which 250 do we
delete? The read ones? The user might still want them; in particular,
they might just have marked one as read accidentally and would be
annoyed if they can't undo that action later on.
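To make the counting concrete, here is the scenario as a tiny set-based sketch (article IDs are made up, and I'm assuming Google's latest 500 unread include the 250 that are unread locally):

```python
CAPACITY = 500

# Local database after a full sync: 500 articles, half read, half unread.
local_read   = {f"a{i}" for i in range(250)}
local_unread = {f"a{i}" for i in range(250, 500)}

# Google's latest 500 unread: the 250 we already hold unread, plus 250 newer ones.
google_unread = local_unread | {f"a{i}" for i in range(500, 750)}

# The partial sync fetches all 500 of them, so half are redundant re-downloads;
# that is the "only cut in half" part (1,000,000 becomes 500,000).
refetched = google_unread & local_unread
new       = google_unread - local_unread
print(len(refetched), len(new))    # 250 250

# Adding the new ones overshoots the capacity: which 250 do we evict?
total = len(local_read) + len(local_unread) + len(new)
print(total, CAPACITY)             # 750 500
```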
The last issue might be circumvented somewhat by decoupling the
storage capacity (let's say we increase that to 5,000) from the
download chunk size (500 articles). But this would mean that an old
article that is in the read state locally will never learn that it
was set back to unread in Google Reader, at least not until the next
full sync.
Also, having both a full and a partial sync is a complicated concept.
Some users told me that they don't read the release notes or the
website and wouldn't even want to watch a video. So everything that
is non-intuitive needs explaining, and will therefore be lost on many
users.
Btw., this is exactly the reason why I will release a version of
NewsRob on Sunday that has a new navigation: the old way was too
complicated to explain. You'll see what I am talking about on
Sunday.
But continuing with the original problem: Users will be po'ed,
because they marked something old as read (or unread, or starred, or
shared, etc.) and it doesn't show up with the new state on their
phone. They will hate NewsRob for it, write flame mails to me and post
1-star ratings on the Android Market ;-(
The same problem exists when using another undocumented feature of
the Google Reader API, where I can specify when I last accessed their
service, but then it also swallows state changes. After discovering
that I didn't investigate this route further, but maybe there are
even more issues waiting for me.
As stated in my other reply, it comes down to this: I can try to
invest a huge amount of time in ugly, complicated workarounds, but
that is also time that could be spent on ugly performance
problems ;-)
So for the time being, performance and some missing features (feeds
as first-class citizens) are my priority, but I will get back to this
issue, if only to allow for capacity enlargements beyond 500
articles. During my performance tests I was working with 5,000
articles and that was OK, or at least roughly as OK as 500 is, but
until I get the "intelligent sync" problem solved I can't go down
that route.
I still hope that Google will add something helpful to their
protocol, but I am not holding my breath ;-) If you happen to know
somebody in the Google Reader team, please point them to this
problem ;-)
Thanks for taking the time to share your thoughts. I am sure that
using this process we will eventually come up with a solution that we
like or at least accept!
Cheers,
Mariano