Problem downloading/viewing items

cheek

May 2, 2010, 2:32:06 PM

to ReaderScope

Hi,

I am having some trouble using ReaderScope; items do not seem to want
to load. Everything appears to be functioning normally, but when I
select an item to read, the view changes to show the title at the top,
but the item content never actually loads, and I am left with a black
empty background, with a white bar at the top (about 5 pixels tall or
so).

It appears that the item attempts to load, but fails; I sometimes see
the loading percentage text above the white bar start at 0%. It never
seems to make it past about 10% before disappearing and leaving the
black screen, and sometimes it displays no percentage at all.

I noticed this problem before I purchased ReaderScope, but it only
happened on one of my feeds, so I assumed it was related to the feed
itself that RS was having a hard time with. It appears now to be
affecting everything, as I cannot read any articles. This effectively
makes RS useless :(

Devs, please help!

Jayesh Salvi

May 2, 2010, 10:44:26 PM

to reade...@googlegroups.com

Hi,

One of the reasons this might happen is that the news content
downloaded into the cache is no longer accessible. By default the cache
is stored on the SD card. After downloading the items, ReaderScope tries
to save the news content in the cache on the SD card. If writing to the
SD card fails at that point, OR it succeeds but the content later
becomes inaccessible due to SD card problems, then you will probably see
what you are seeing.

Could you go to Settings > Logs, scroll down to the error logs and see
if any SD card related messages are there? If you want me to look at
them, just press Menu > Email at that point and the logs will be emailed
to me.

HTH

Jayesh

cheek

May 3, 2010, 4:34:07 PM

to ReaderScope

It seems that the problem could in fact be related to missing cached
content. When I force a "Refresh" of a particular feed, the items
therein load just fine. (For the most part; I am still seeing a select
few feeds with no content.) I had a feed with two unread items,
published about a week ago, that failed to load in the manner described
above. Doing Menu > Refresh on that feed reloads the same two items
(which are still unread). Viewing them after the refresh, they load
fine. In this entire process, and while viewing/refreshing other such
feeds, I see no errors inserted into the "Error log" section of Menu >
Settings > Logs.

This is obviously a known behavior, but is it desirable? It seems that
the correct behavior for a content cache is to attempt to load from the
cache first; if an error occurs, or the cache does not contain the item,
or the cached item is null or 0-length, a fresh copy should be fetched
from the original data source, and potentially cached again. This
behavior would make sense whenever the user has not selected "Go
Offline", since that implies an active connection to the web.
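The fallback described above can be sketched roughly as follows. This is a minimal illustration in Python (ReaderScope itself is an Android app), and every name in it is hypothetical: `load_item`, the dict standing in for the cache, and the `fetch` callable are assumptions made for the example, not ReaderScope's actual API.

```python
from typing import Callable, Dict, Optional


def load_item(
    item_id: str,
    cache: Dict[str, bytes],
    fetch: Callable[[str], bytes],
    offline: bool = False,
) -> Optional[bytes]:
    """Return item content, preferring the cache and falling back to the source.

    `cache` is a stand-in for the on-disk cache and `fetch` is a stand-in for
    a network request; both are assumptions made for this sketch.
    """
    try:
        content = cache.get(item_id)
    except Exception:
        content = None  # a cache read error is treated like a miss

    if content:  # present and not zero-length
        return content

    if offline:
        return None  # "Go Offline" is selected, so no network fallback

    fresh = fetch(item_id)
    if fresh:
        cache[item_id] = fresh  # re-cache so the next read is a hit
    return fresh or None
```

A cache hit never touches the network; a missing or empty entry triggers one fetch, after which the entry is a hit again. In offline mode the function simply reports the miss.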

Failing that change, is there another 'preferred' use case for
ReaderScope? Should I be "Refresh"ing each feed before I view it to
ensure the items are available?

Thanks,
~N

fern...@bugabundo.net

May 3, 2010, 4:45:31 PM

to reade...@googlegroups.com

On Mon, May 3, 2010 at 21:34, cheek <chee...@gmail.com> wrote:

> It seems that the problem in fact could be related to missing cached
> content. When I force a "Refresh" of a particular feed, the items
> therein load just fine. (For the most part; still seeing a select few
> feeds with no content) I had a feed which had two unread items
> published about a week ago, that fail to load in the manner described
> above. Doing Menu > Refresh on that feed, reloads the same two items
> (which are still unread). Viewing them after refresh, they load fine.
> In this entire process, and viewing/refreshing other such feeds, I see
> no errors inserted into the "Error log" section of Menu > Settings >
> Logs.

You can clear the entire cache and then let it pull new items, _fixing_ the problem.

I have mine on auto purge every 3 days.

--
Fernando Pereira
(``-_-´´) -- BUGabundo :o)
http://mi.BUGabundo.net

cheek

May 3, 2010, 7:56:19 PM

to ReaderScope

This may fix the immediate problem of items failing to load, however
it does not guard against whatever conditions caused inconsistencies
between cache and item-database in the first place, so the issue is
likely to reoccur. It also does not correct a code path that leads to
a silent failure (no logs, no message, no action).

Also, I have "Cleanup" turned on, if this is the setting you're
referring to. Found under Menu > Settings > Storage Management >
Storage Cleanup (section). I currently have mine (for a while now) set
to delete items older than a week every day. This obviously does not
completely clear the cache, and the affected items are left.

Clearing the cache completely might not be a bad idea, though. Can I
simply delete everything in /sdcard/readerscope/webcontent? Will RS
recover gracefully and re-cache the correct items on the next refresh?

While we're on the subject of the cache, I'd also like to raise the
point that the storage mechanism chosen wastes about 90% of the storage
space allocated on disk (for my feeds/items anyway). I see A LOT of
very small files which, due to disk block size, can result in very
different values for actual "Size" vs. "Size On Disk"; I've currently
got about 3000 items in my cache folder, which is only around 7 MB of
actual data but takes up nearly 60 MB of disk space! The largest
offender seems to be the multitude of .link files. Please consider
moving these files into some other structure.
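The "Size" vs. "Size On Disk" gap comes from every non-empty file being rounded up to a whole number of allocation blocks. Here is a small Python sketch of that arithmetic. The 16 KiB block size and the 100-byte file size in the usage note are illustrative assumptions only; the actual values for this SD card are not stated in the thread.

```python
from typing import Iterable


def size_on_disk(size_bytes: int, block_bytes: int) -> int:
    """Bytes allocated for a file when space is handed out in whole blocks."""
    if size_bytes <= 0:
        return 0
    blocks = -(-size_bytes // block_bytes)  # ceiling division without floats
    return blocks * block_bytes


def wasted_fraction(sizes: Iterable[int], block_bytes: int) -> float:
    """Fraction of allocated space that holds no data (slack space)."""
    sizes = list(sizes)
    actual = sum(sizes)
    allocated = sum(size_on_disk(s, block_bytes) for s in sizes)
    return 0.0 if allocated == 0 else 1.0 - actual / allocated
```

For example, under the assumed numbers, `wasted_fraction([100] * 3000, 16 * 1024)` is above 0.99: three thousand 100-byte files hold about 300 KB of data but occupy 3000 full blocks. Many tiny files are therefore the worst case for this kind of overhead.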

Devs, any other advice?

~N


cheek

May 3, 2010, 7:58:45 PM

to ReaderScope

I also forgot to mention, regarding the cache, that the multitude of
directories and files created gives a lot of file managers trouble
because of the sheer volume. Windows handles this well, but both Astro
and the Estrongs file manager crash trying to open this directory....

Jayesh Salvi

May 3, 2010, 10:19:54 PM

to reade...@googlegroups.com

Thanks for the detailed discussion.

inline...

On Mon, May 03, 2010 at 04:58:45PM -0700, cheek wrote:
> Also forgot to mention regarding the cache, the multitude of
> directories and files created gives a lot of file managers trouble
> with the volume. Windows handles this well, but both Astro and
> Estrongs file manager both crash trying to open this directory....
>
>
>
> On May 3, 4:56 pm, cheek <cheek...@gmail.com> wrote:
> > This may fix the immediate problem of items failing to load, however
> > it does not guard against whatever conditions caused inconsistencies
> > between cache and item-database in the first place, so the issue is
> > likely to reoccur. It also does not correct a code path that leads to
> > a silent failure (no logs, no message, no action).
> >
> > Also, I have "Cleanup" turned on, if this is the setting you're
> > referring to. Found under Menu > Settings > Storage Management >
> > Storage Cleanup (section). I currently have mine (for a while now) set
> > to delete items older than a week every day. This obviously does not
> > completely clear the cache, and the affected items are left.
> >
> > Clearing the cache completely might not be a bad idea though, Can I
> > simply delete everything in /sdcard/readerscope/webcontent ??? Will RS
> > recover gracefully and recache the correct items next refresh?

Yes, that should be OK, but the recommended way of clearing the entire
cache is Settings > Storage Management > Cleanup Now > All Items.

> >
> > While we're on the subject of the cache, I'd also like to raise the
> > point that the storage mechanism chosen wastes about 90% of storage
> > space allocated on disk (for my feeds/items anyway.) I see A LOT of
> > very small files which due to disk block size, can result in very
> > different values for actual "Size" v.s. "Size On Disk"; I've currently
> > got about 3000 items in my cache folder which is only around 7Mb of
> > actual data, but takes up nearly 60Mb of disk space! The largest
> > offender seems to be the multitude of .link files. Please consider
> > moving these files into some other structure.

That's an interesting observation.

The .link files are meant to have a temporary existence. They contain
the URLs of the actual content. To speed up downloading of the news
feed, RS doesn't fetch the embedded images immediately but stores a
pointer to them in .link files. A background thread then fetches these
resources and deletes the .link files, OR if you happen to access that
news item before the background thread reaches it, the resources will be
cached then. In both cases the .link file will be deleted.
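The .link lifecycle described above can be sketched roughly as follows, in Python for brevity. This is not ReaderScope's code: the file layout is an assumption (each `.link` file is taken to hold a single URL as text, and the resource is stored under the same name without the `.link` suffix), and `fetch` is a hypothetical stand-in for the network download.

```python
import os
from typing import Callable

LINK_SUFFIX = ".link"


def resolve_link(link_path: str, fetch: Callable[[str], bytes]) -> str:
    """Download the resource a .link file points to, then delete the .link file."""
    if not link_path.endswith(LINK_SUFFIX):
        raise ValueError(f"not a .link file: {link_path}")
    with open(link_path, "r", encoding="utf-8") as f:
        url = f.read().strip()  # assumed format: one URL per .link file
    data = fetch(url)
    resource_path = link_path[: -len(LINK_SUFFIX)]
    with open(resource_path, "wb") as f:
        f.write(data)
    # Remove the pointer only after the resource was written successfully.
    os.remove(link_path)
    return resource_path


def resolve_all(cache_dir: str, fetch: Callable[[str], bytes]) -> int:
    """Background pass: resolve every pending .link file under cache_dir."""
    resolved = 0
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            if name.endswith(LINK_SUFFIX):
                resolve_link(os.path.join(root, name), fetch)
                resolved += 1
    return resolved
```

In this sketch, if `fetch` raises, the `.link` file is left in place and can be retried later. That is one way pointer files could pile up if downloads keep failing.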

The rest of the files in the cache are actual resources (html or image
files); their size is not under RS's control. They could be bunched
together into one single file, but then accessing and deleting
individual resources from that single file would be expensive in terms
of CPU and I/O time (imagine it to be a file system inside a
file). Using the existing file system as a cache is advantageous in
that case.

Regarding the behavior of cache and your original problem:

You are right about how a cache should behave: if the content gets
accidentally deleted, it should go and fetch the item again. However,
that is not practical in this case. There are two types of content in
the cache:
1. A news item (the html/text of the news item).
2. The embedded content inside the news item (mainly images).
The embedded content is fetched from the net on an as-needed basis using
the .link mechanism I explained above. However, the news item content is
actually populated by what was received in the news feed.

Therefore, if the news item cache file gets lost due to storage issues,
the only way to automatically reload the content is to fetch the entire
feed again (a minimum of 20 items at a time). So instead of doing this
automatically for the rare case of a storage accident, it is left to the
user to refresh the feed manually and fetch the contents again.

Hope this answers your questions.

cheek

May 4, 2010, 3:24:04 AM

to ReaderScope

On May 3, 7:19 pm, Jayesh Salvi <jayeshsa...@gmail.com> wrote:

(switched the order on these blocks to better flow)

> The rest of the files in cache are actual resources (html or image
> files), their size is not under RS's control. They could be bunched
> together into one single file, but then accessing and deleting
> individual resources from that single file would be expensive in terms
> of CPU and I/O time (imagine it to be a file system inside a
> file). Using the existing file system as a cache is advantageous in
> that case.

The existing behavior of the content cache for news item text/html
makes perfect sense. It is the perfect storage medium for that type of
data; html FILES, in the FILEsystem :) I would also guess that the
"size discrepancy" discussed above is minimal when storing this type of
data (arbitrary-length text). Attempting to store it in a secondary
file system certainly would lead to increased I/O time.

> The .link files are meant to have temporary existence. They contain the
> URLs of the actual content. To speed up the downloading the news feed,
> RS doesn't fetch the embedded images immediately but stores a pointer to
> them in .link files. A background thread then fetches these resources
> and deletes the .link files OR if you happen to access that news item
> before the background thread reaches it, it will be cached then. In both
> cases the .link file will be deleted.

(Bear in mind that I make some assumptions here about how your
software is set up.)

The .link files on the other hand, do not seem to be the natural
storage choice for this type of information. To paraphrase, these
files contain a collection of (string) urls or 'references' to other
resources (pictures, css probably, etc) associated with a particular
news item. These target files/items may be found in the cache
directory, or may have to be fetched from the web.

This information seems to clearly be 'structured data' associated
strongly to entries in your item database; as such it should live IN
the database. Just as you store a reference to html cache in the db
(you must, right?), you should also store references to _other_ items
required to display that news-item. This could be implemented as a one-
to-many relationship from your "item" table to a "resources" table,
which could contain a link to the parent news-item, an original
location (url), and the cached location.

Structuring this data in the database should lead to LESS I/O and
faster response all around, since you would have a lot fewer trips to
disk. You should have cleaner and speedier code in: Filling the cache--
background thread just SELECTs items to process; purging items and/or
cache--links would always be synchronized with items by cascading or
transactional DELETEs; viewing items--no need to parse .link files to
load on-demand. You also would mitigate the size discrepancy problem
and get a smaller and more efficient cache!
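The one-to-many layout proposed above can be sketched with SQLite (the database engine Android ships), driven from Python here for brevity. The table and column names (`item`, `resource`, `html_path`, `cached_path`) are hypothetical and chosen only for this example; ReaderScope's real schema is not shown in this thread.

```python
import sqlite3


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create the hypothetical item/resource schema and return a connection."""
    conn = sqlite3.connect(path)
    # SQLite enforces foreign keys only when enabled on each connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS item (
            id        INTEGER PRIMARY KEY,
            title     TEXT NOT NULL,
            html_path TEXT            -- location of the cached news-item html
        );
        CREATE TABLE IF NOT EXISTS resource (
            id          INTEGER PRIMARY KEY,
            item_id     INTEGER NOT NULL
                        REFERENCES item(id) ON DELETE CASCADE,
            url         TEXT NOT NULL, -- original location
            cached_path TEXT           -- NULL until the resource is downloaded
        );
        """
    )
    return conn


def pending_resources(conn: sqlite3.Connection):
    """What the background thread would SELECT: resources not yet cached."""
    return conn.execute(
        "SELECT id, url FROM resource WHERE cached_path IS NULL ORDER BY id"
    ).fetchall()
```

With this layout a row with `cached_path IS NULL` plays the role of a .link file, and deleting an item removes its resource rows through the cascade. The cached files themselves would still have to be removed from disk separately.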

> Therefore, if due to storage issues the news item cache file gets lost
> the only way to automatically reload the content is accessing the entire
> feed again (which is min 20 items at a time). So instead of doing this
> automatically for a remote case of storage accidents, it is left to
> manual choice to refresh the feed and fetch the contents again.

Wow, that's a bummer. Too bad there is no API to retrieve a single
item. If there is no simple way to retrieve specific items (or small
sets of items that will definitely contain the target), there's not
much you can do except to display an error message (or visual cue)
indicating that the item could not be found. How about "greying out"
the content area, or adding a "cache item not found" image/message?
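The "cache item not found" idea can be sketched as a small fallback when the content is rendered. Again this is Python for illustration only; `render_item`, the marker class name, and the message wording are all hypothetical, not anything ReaderScope is known to contain.

```python
import os
from html import escape

NOT_FOUND_MARKER = "cache-item-not-found"  # hypothetical CSS class name


def render_item(cache_dir: str, item_file: str, title: str) -> str:
    """Return the cached html, or a visible placeholder if it is missing or empty."""
    path = os.path.join(cache_dir, item_file)
    try:
        with open(path, "r", encoding="utf-8") as f:
            html = f.read()
    except (OSError, UnicodeDecodeError):
        html = ""  # a missing or unreadable file is treated as a cache miss

    if html.strip():
        return html

    # Show an explicit message instead of a blank screen.
    return (
        f'<div class="{NOT_FOUND_MARKER}" style="color: gray">'
        f"Cached content for '{escape(title)}' was not found. "
        "Refresh the feed to download it again.</div>"
    )
```

The point of the sketch is only that the failure becomes visible: the reader sees why the item is blank and what to do about it, rather than an empty black view.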

cheek

May 6, 2010, 3:08:39 PM

to ReaderScope

Ping!
Bummer, I was hoping to get a response to this.

Jayesh Salvi

May 6, 2010, 3:27:58 PM

to reade...@googlegroups.com

Sorry for the delay. Inline...

Yes you are right, but my design didn't go that route due to some other
architectural reasons.

The cache is an Android ContentProvider and the GUI app talks to this
provider over the standard interface. I am not sure whether that makes
them run in two separate processes or not, but the Android framework
hides that part. Therefore I did not want to access the same database
through these two different entities.

Still I could have separate databases; but that didn't seem necessary.

The .link files should be short-lived. I will, however, take a look at
whether they are leaking.

> >
> > Structuring this data in the database should lead to LESS I/O and
> > faster response all around, since you would have a lot fewer trips to
> > disk. You should have cleaner and speedier code in: Filling the cache--
> > background thread just SELECTs items to process; purging items and/or
> > cache--links would always be synchronized with items by cascading or
> > transactional DELETEs; viewing items--no need to parse .link files to
> > load on-demand. You also would mitigate the size discrepancy problem
> > and get a smaller and more efficient cache!
> >
> > > Therefore, if due to storage issues the news item cache file gets lost
> > > the only way to automatically reload the content is accessing the entire
> > > feed again (which is min 20 items at a time). So instead of doing this
> > > automatically for a remote case of storage accidents, it is left to
> > > manual choice to refresh the feed and fetch the contents again.
> >
> > Wow, that's a bummer. Too bad there is no API to retrieve a single
> > item. If there is no simple way to retrieve specific items (or small
> > sets of items that will definitely contain the target), there's not
> > much you can do except to display an error message (or visual cue)
> > indicating that the item could not be found. How about "greying out"
> > the content area, or adding a "cache item not found" image/message?

Yeah, I suppose an error message can be put in there. I will put it on
my to-do list.
