Binding list views to large data sets

4403 views
Skip to first unread message

THill

unread,
Feb 1, 2010, 1:01:07 AM2/1/10
to Android Developers
I need to be able to scroll through a list with (up to) 20K rows,
backed by a cursor on a read-only db.

Testing 20K rows on a G1, the query takes approx .02s regardless of
table size, while binding the adapter to the list takes 4.5s. Note
that this is before the view calls used in rendering.

What makes the binding delay more troublesome is that it blocks the UI
thread, and is incurred on every rotate. Combine these, and doing a
rotate just after starting the bind results in a delay up to 9s.

I tried lazy loading with a base adapter, but that hits the same delay
in the underlying cursor.getCount(), and the overall user experience
isn't ideal.

I also considered using an ArrayAdapter, adding synthesized objects
from the cursor rows, but then I'd incur the overhead of creating up
to 20K objects (not to mention the additional memory requirements).

Any recommended alternatives or workarounds?

Thanks in advance,
Tim

Nerdrow

unread,
Feb 1, 2010, 5:51:44 PM2/1/10
to Android Developers
The standard reply is going to be you'll never actually need to view
all 20K rows, so either page it out or filter the results to a more
manageable number. Then again, if an end user is willing to scroll
through 20K rows, they might be willing to wait 5s for it to load :)

Past that, I assume you're extending a CursorAdapter, you can use
setEmptyView on the ListView to show a progress dialog, run your query
on a background thread (or AsyncTask), then use a Handler (or the
AsyncTask) to call changeCursor w/the results of the query. It will
at least feel more responsive in that the UI doesn't just freeze while
it's working.

THill

unread,
Feb 2, 2010, 9:31:28 AM2/2/10
to Android Developers
Hehe, that was my reply when I started the project <sigh>.

A crude use case example is where the records are like short log
entries, and you want to scroll through them to watch events over a
time span. You need a big window, since in some time frames, there
may be few records; in others, there may be thousands -- the data
density varies & you can't just give them an index since they may not
know what time range will cover what they need to know.

I'm doing the query & changing the SimpleCursorAdapter cursor in the
background via AsyncTask, then binding the adapter to the list in
onPostExecute. Binding to the list has to be done in the UI thread,
hence the UI freeze during the bind (as the list requests the row
count from the adapter).

Now, I can avoid the UI freeze by doing a cursor.getCount (source of
the delay) before binding to the list, but then I have some cursor
management issues when the activity goes away due to a rotate --
either stall the onStop/onDestroy until the cursor.getCount completes
& the cursor can be closed, or keep the cursor static & risk the
overhead if it is abandoned.

Unless I'm misunderstanding your suggestion, paging won't work because
you'd still need to know the row count (incur delay) before deciding
if/how to paginate.

The iPhone version of our app works without any of these issues, so
I'm hoping that I'm just missing something (and that it's not an
Android limitation).

Other ideas?

Thanks
Tom


On Feb 1, 2:51 pm, Nerdrow <troybe...@gmail.com> wrote:
> The standard reply is going to be you'll never actually need to view
> all 20K rows, so either page it out or filter the results to a more
> manageable number.  Then again, if an end user is willing to scroll
> through 20K rows, they might be willing to wait 5s for it to load :)
>
> Past that, I assume you're extending a CursorAdapter, you can use
> setEmptyView on the ListView to show a progress dialog, run your query
> on a background thread (or AsyncTask), then use a Handler (or the
> AsyncTask) to call changeCursor w/the results of the query.  It will
> at least feel more responsive in that the UI doesn't just freeze while
> it's working.
>
> On Jan 31, 10:01 pm, THill <thill.dr...@gmail.com> wrote:
>

> > I need to be able to scroll through alistwith (up to) 20K rows,


> > backed by a cursor on a read-only db.
>
> > Testing 20K rows on a G1, the query takes approx .02s regardless of

> > table size, while binding the adapter to thelisttakes 4.5s.  Note


> > that this is before the view calls used in rendering.
>
> > What makes the binding delay more troublesome is that it blocks the UI
> > thread, and is incurred on every rotate.  Combine these, and doing a

> > rotate just after starting thebindresults in a delay up to 9s.

Mark Murphy

unread,
Feb 2, 2010, 9:53:58 AM2/2/10
to android-d...@googlegroups.com
> A crude use case example is where the records are like short log
> entries, and you want to scroll through them to watch events over a
> time span. You need a big window, since in some time frames, there
> may be few records; in others, there may be thousands -- the data
> density varies & you can't just give them an index since they may not
> know what time range will cover what they need to know.

That still doesn't mean you should be loading 20K records in a single query.

> I'm doing the query & changing the SimpleCursorAdapter cursor in the
> background via AsyncTask, then binding the adapter to the list in
> onPostExecute. Binding to the list has to be done in the UI thread,
> hence the UI freeze during the bind (as the list requests the row
> count from the adapter).

Correct. The "UI freeze" timing is because a query is done lazily -- the
query is not truly executed against the database until data is actually
needed. A call to getCount() is the likely first statement that will
trigger the actual query.

Hence, don't load 20K records in a single query.

> Now, I can avoid the UI freeze by doing a cursor.getCount (source of
> the delay) before binding to the list, but then I have some cursor
> management issues when the activity goes away due to a rotate --
> either stall the onStop/onDestroy until the cursor.getCount completes
> & the cursor can be closed, or keep the cursor static & risk the
> overhead if it is abandoned.

Or handle the UI change yourself by indicating in the manifest you will
handle those configuration changes, then override onConfigurationChanged()
and adjust your UI as needed.

> Unless I'm misunderstanding your suggestion, paging won't work because
> you'd still need to know the row count (incur delay) before deciding
> if/how to paginate.

Use the LIMIT and OFFSET terms on your SELECT statement to obtain data in
smaller chunks. Create a wrapping adapter (like my EndlessAdapter) that
only loads chunks when the user scrolls to the bottom of the list and
therefore needs more data.

http://github.com/commonsguy/cwac-endless

--
Mark Murphy (a Commons Guy)
http://commonsware.com
Android App Developer Books: http://commonsware.com/books.html


tchill

unread,
Feb 1, 2010, 7:12:53 PM2/1/10
to Android Developers
Hehe, that was my reply when I started the project <sigh>.

A crude use case example is where the records are like short log


entries, and you want to scroll through them to watch events over a
time span. You need a big window, since in some time frames, there
may be few records; in others, there may be thousands -- the data
density varies & you can't just give them an index since they may not
know what time range will cover what they need to know.

I'm doing the query & changing the SimpleCursorAdapter cursor in the


background via AsyncTask, then binding the adapter to the list in
onPostExecute. Binding to the list has to be done in the UI thread,
hence the UI freeze during the bind (as the list requests the row
count from the adapter).

Now, I can avoid the UI freeze by doing a cursor.getCount (source of


the delay) before binding to the list, but then I have some cursor
management issues when the activity goes away due to a rotate --
either stall the onStop/onDestroy until the cursor.getCount completes
& the cursor can be closed, or keep the cursor static & risk the
overhead if it is abandoned.

Unless I'm misunderstanding your suggestion, paging won't work because


you'd still need to know the row count (incur delay) before deciding
if/how to paginate.

The iPhone version of our app works without any of these issues, so


I'm hoping that I'm just missing something (and that it's not an
Android limitation).

Other ideas?

Thanks
Tom


On Feb 1, 2:51 pm, Nerdrow <troybe...@gmail.com> wrote:

THill

unread,
Feb 2, 2010, 3:29:25 PM2/2/10
to Android Developers
Thanks Mark.

> Use the LIMIT and OFFSET terms on your SELECT statement to obtain data in
> smaller chunks. Create a wrapping adapter (like my EndlessAdapter) that
> only loads chunks when the user scrolls to the bottom of thelistand
> therefore needs more data.
>

Doing the load in smaller chunks would still require at least the same
amount of time to fill the list, but it would let me interrupt the
load more quickly, so I can incorporate that & handle rotates normally
-- thanks for the whack on the side of the head!

Unfortunately, I still need to load the full set of records because I
need to be able to provide a sliding window anywhere in the result
set, not just add to the end of the the list during the load (i.e.,
the user is not just continuously scrolling to the bottom of this huge
list).

Similar to a scroll button in typical windowing systems, in this list
of events & timestamps, the user needs to be able to select a
position, say 2/3 of the way down the list & scroll to the entries
they want to review.

Is the list/adapter just creating 20K objects as it walks the result
set? If so, I'd be tempted to do this myself so I can give the user
some visual cue/index into the data. I don't want to add
significantly more overhead though.

Any additional suggestions/insights?

Thanks again,
Tim


THill

unread,
Feb 2, 2010, 3:31:32 PM2/2/10
to Android Developers
Thanks Mark.

> Use the LIMIT and OFFSET terms on your SELECT statement to obtain data in
> smaller chunks. Create a wrapping adapter (like my EndlessAdapter) that

> only loads chunks when the user scrolls to the bottom of thelistand
> therefore needs more data.
>

Doing the load in smaller chunks would still require at least the same
amount of time to fill the list, but it would let me interrupt the

load more quickly, so I may incorporate that & handle rotates normally
-- woot!

Unfortunately, I need to load the full set of records because I need


to be able to provide a sliding window anywhere in the result set, not
just add to the end of the the list during the load (i.e., the user is
not just continuously scrolling to the bottom of this huge list).

Similar to a scroll button in typical windowing systems, in this list
of events & timestamps, the user needs to be able to select a

position, say 2/3 of the way down the list & scroll up/down if that
position is after/before the entries they want to review.

Mark Murphy

unread,
Feb 2, 2010, 3:38:03 PM2/2/10
to android-d...@googlegroups.com
> Is the list/adapter just creating 20K objects as it walks the result
> set?

No, but your database query is probably making a 100K roster of objects or
more. Only as many views as are needed for the visible rows in the list
will be created.

theSmith

unread,
Feb 2, 2010, 3:49:51 PM2/2/10
to Android Developers

As Mark said, no its not creating 20K views, but using a custom
implementation of a ArrayAdapter will allow you to stop a bunch of
necessary calls to findViewById() and inflate() if you override the
getView() method and use the view holder strategy. This will allow
your app to flow a little more smoothly when dealing with a dataset
that large.

THill

unread,
Feb 2, 2010, 4:33:54 PM2/2/10
to Android Developers
Thanks.

Mark's comment about the object roster is what I was curious about.
SimpleCursorAdapter calls Cursor.moveToPosition() every time the list
calls bindView(), so I was wondering what the cursor does with result
set & when -- i.e., does it create objects (presumably indexes into
the db file) for the result rows up front, or only as requested?

Regardless, it looks like I'll be merging the results of multiple
cursors with some sort of Adapter wrapper, so I'll need to keep all of
those cursors active while the list is around.

Cheers
Tim


Mark Murphy

unread,
Feb 2, 2010, 5:00:44 PM2/2/10
to android-d...@googlegroups.com
> Mark's comment about the object roster is what I was curious about.
> SimpleCursorAdapter calls Cursor.moveToPosition() every time the list
> calls bindView(), so I was wondering what the cursor does with result
> set & when -- i.e., does it create objects (presumably indexes into
> the db file) for the result rows up front, or only as requested?

A Cursor represents the entire result set. If you run a query for 20K rows
and 5 columns, you will instantiate 100K objects at the time the query is
actually run.

A CursorAdapter will only convert rows into Views on demand. The ListView
will only demand as many rows as can be seen on screen.

So, for a query of 20K rows from a database with 5 columns, and a ListView
capable of showing 10 rows, you will create 100K objects for the Cursor
and a few dozen objects for the row Views.

Hence, your cost is in doing the query, not in displaying the results.

> Regardless, it looks like I'll be merging the results of multiple
> cursors with some sort of Adapter wrapper, so I'll need to keep all of
> those cursors active while the list is around.

Possibly. You could use an ArrayList and ArrayAdapter for the actual
ListView, and convert each query's Cursor into a new set of objects to be
added to the adapter. Then, you can close and release the Cursor. This has
the advantage of simplifying the merge logic, at the cost of extra
processing (and briefly extra memory) when you load in each chunk.

THill

unread,
Feb 2, 2010, 5:22:22 PM2/2/10
to Android Developers
That's what I feared. I was hoping the cursor was only keeping
primitive references to the data in the db file.

If I'm going to be doing many smaller queries & keeping the results
around, sounds like I might as well keep the objects myself. Lets me
index across the varying data density also.

There's still the lifecycle handling -- if I have this ArrayList, and
I want to keep it across rotates, I could keep it in a static. The
trick is knowing when to release/null it. First thought is to set a
flag in onRetainNonConfigurationInstance, otherwise I'll release it if
the flag isn't set when onDestroy comes along. Seem reasonable?

Very helpful dialog, I appreciate everyone's time & cycles.

Tim

Mark Murphy

unread,
Feb 2, 2010, 6:10:20 PM2/2/10
to android-d...@googlegroups.com
> There's still the lifecycle handling -- if I have this ArrayList, and
> I want to keep it across rotates, I could keep it in a static. The
> trick is knowing when to release/null it. First thought is to set a
> flag in onRetainNonConfigurationInstance, otherwise I'll release it if
> the flag isn't set when onDestroy comes along. Seem reasonable?

Why not just pass the ArrayList in onRetainNonConfigurationInstance()?

THill

unread,
Feb 2, 2010, 6:19:29 PM2/2/10
to Android Developers
Because that's what a smart person would do & I'm an idiot for not
thinking about it before asking <grin>. That's two whacks -- I'm
outta here.

Need more caffeine...
Cheers,
tim

On Feb 2, 3:10 pm, "Mark Murphy" <mmur...@commonsware.com> wrote:
> > There's still the lifecycle handling -- if I have this ArrayList, and
> > I want to keep it across rotates, I could keep it in a static.  The
> > trick is knowing when to release/null it.  First thought is to set a
> > flag in onRetainNonConfigurationInstance, otherwise I'll release it if
> > the flag isn't set when onDestroy comes along.  Seem reasonable?
>
> Why not just pass the ArrayList in onRetainNonConfigurationInstance()?
>
> --

> Mark Murphy (a Commons Guy)http://commonsware.com

Bob Kerns

unread,
Feb 3, 2010, 2:01:54 AM2/3/10
to Android Developers
On Feb 2, 2:00 pm, "Mark Murphy" <mmur...@commonsware.com> wrote:
...

> A Cursor represents the entire result set. If you run a query for
20K rows
> and 5 columns, you will instantiate 100K objects at the time the
query is
> actually run.
...

Actually, that shouldn't be quite right. (I know you know this, but
were focused on making your main point. I just want to clear up any
confusion).

A Cursor should represent a result set + a movable index in that
result set. The major reason for using a cursor, is precisely to AVOID
simultaneously allocating space for all 20K or 20M or 20B rows.

Now, a naive implementation is, certainly, to load in all the results
from a query. But a PROPER implementation will:

1) analyze the query
2) form a query plan -- sort of a mini program for generating the
query result.
3) on request for the first row, begin executing the plan, one or a
few rows at a time. The portion of the query results that have been
read are "the window".
4) Once you've read the data from one row, and move to the next, the
storage can be freed.

(Everything but the final reading should happen on the database side,
to avoid doing unnecessary work on the query).

Note that, at no time, have you allocated 20K rows worth of objects.
Yet, by the time the task is done, IF you actually read ALL the
objects, you you'll have allocated each of those 20K rows's objects.
In fact, if you move back an forth, you may allocate a good deal MORE.
But the garbage collector will save you, and your peak usage will only
be a few rows worth.

It's important to note that a cursor can't completely save you from an
expensive query! Behind the scenes, the database may have to do some
very storage-expensive work -- like sort the results of a complex
query. To do this, it may need to first collect ALL the rows, and then
sort. Indexes may help in some cases. This is the sort of thing that
makes database programming an interesting challenge sometimes...

But even in this case, there's a benefit to getting your results via a
cursor. You may need to have a complete list of rows - but you won't
need to turn those rows into Java objects until you need them. So you
won't need as much storage all at once.

This all works very synergistally with your CursorAdaptors! Just
stuffing all 20K rows into a ListView isn't going to save you
anything. But if 10 items fit on your screen, only asking for about 10
items is going to save you a factor of 2K in peak storage. And if you
only look at one or two screens, the performance difference will be
huge!

Anyway, you've just convinced me to go get your book. It's got to be
worth it, just for a pre-written EndlessAdaptor implementation! (Yes,
I can download it, but..)

THill

unread,
Feb 3, 2010, 4:47:31 PM2/3/10
to Android Developers
What you say makes sense Bob, but testing seems to indicate the
Android SQLite implementation isn't so proper.

I have a simple app that creates 20001 rows in a table, each with an
int _id & 2 varchar fields.

Getting the count of rows via db.rawQuery("select count(*) from
table", null) and getting the value from the resulting cursor takes
0.2s.

Getting the count of rows using
cursor=db.rawQuery("select _id, field1, field2 from table",
null); //NOTE: no 'where' or 'order by'
count=cursor.getCount()
takes 4.5s.

During this time, the log has messages:
E/CursorWindow( 695): need to grow: mSize = 1048576, size = 29,
freeSpace() = 19, numRows = 11832
E/CursorWindow( 695): not growing since there are already 11832
row(s), max size 1048576
E/Cursor ( 695): Failed allocating 29 bytes for text/blob at
11831,1
D/Cursor ( 695): finish_program_and_get_row_count row 8169

So, getCount() is certainly not optimal, and appears to be allocating
something based on result set size. The list view requests the count
of items up front, so a slow getCount() can impact the UI if you just
trust SimpleCursorAdapter on large, simple queries.

When scrolling down the list, each bindView results in a
cursor.moveToPosition() call. Doing this in a test app shows that
moving to 1000, 2000, ..., 20000 each takes 1ms -- except when going
from 11000 to 12000, which takes >2s. No additional log messages.

Where it gets u.g.l.y. is when moving from position 20000 backward to
0. 20000, ..., 12000 take 1ms, but moving from 12000 to 11000,
10000, ..., 0 *each* take 4.5s (!). This is really fun to observe
when scrolling backward in the list from 12000. Once you hit some
point, every row takes 4.5s before it is shown.

When this starts happening, every move results in log messages like:
D/dalvikvm( 320): GC freed 251 objects / 13256 bytes in 116ms
E/CursorWindow( 763): need to grow: mSize = 1048576, size = 29,
freeSpace() = 21, numRows = 11631
E/CursorWindow( 763): not growing since there are already 11631
row(s), max size 1048576
E/Cursor ( 763): Failed allocating 29 bytes for text/blob at
19630,1
D/Cursor ( 763): finish_program_and_get_row_count row 370

This would again indicate the cursor is not agnostic to the result set
size. I'll be doing more testing with many smaller query cursors
behind a single adapter, but there appears to be a clear threshold.


Bob Kerns

unread,
Feb 3, 2010, 11:23:33 PM2/3/10
to Android Developers
Ah, the difference between theory and practice is that in theory,
there IS no difference, but in practice, there IS.

What platform did you do this on?

I was afraid getCount() was going to turn out to be slow. But I'm
puzzled that it's slow, but SELECT COUNT(*) FROM <TABLE> is relatively
quick. Clearly, the database is capable of better behavior.

It's pretty typical for reverse cursor motion to be more expensive,
but this is rather larger than I'd have expected.

The discontinuity at 12000 rows, in each direction, makes me suspect
it's using a max window size of 12,000, which is nuts on a phone, and
really wants to be configurable.

I'm also wondering if CursorWindow.freeLastRow() is ever being called
-- and if not, what would happen if we were to call it at suitable
points.

You didn't indicate whether you are using PreparedStatements -- if
not, some of the overhead may come from parsing and managing the
queries. Or, given a weak implementation -- it might not matter
anyway.

I wonder if we could supply our own CursorWindow subclass that behaves
better? The documentation is rather lacking...

Another factor to consider here is the database's caching. If you use
LIMIT to only query a small portion, you could make a derived cursor
that (excluding transaction differences) behaves more reasonably in
terms of how much it fetches. But then you'll be executing changing
queries over and over, so you may be doing more work on the back end.
Given the memory constraints, I suspect each subsequent query will
cost about the same as the first...

Maybe, to be scientific about this, I should write my own tests, we
compare notes, trade programs, and get a bunch of people to try them
on different platforms. I might have time if I don't answer any
questions for a while... :=)!

This is an important use case; it's worth understanding well. I wonder
if anyone has already covered this ground?

jotobjects

unread,
Feb 6, 2010, 3:15:39 PM2/6/10
to Android Developers
It makes sense that getCount() on the cursor might involve reading all
the rows even if it probably doesn't actually instantiate all the
objects as Bob Kerns correctly points out.

A work around is to do a seperate "select count(*)" with the same
where clause but this is only an estimate of the number of rows the
cursor will return since possibly rows can be inserted or deleted
between selecting the count and selecting the cursor. You can also do
"select count(*), column1, column2..." so you get the count on every
row. Finding the current relative "page" is another challenge...

On Feb 3, 8:23 pm, Bob Kerns <r...@acm.org> wrote:
> Ah, the difference between theory and practice is that in theory,
> there IS no difference, but in practice, there IS.
>
> What platform did you do this on?
>

> I was afraidgetCount() was going to turn out to be slow. But I'm

> > So,getCount() is certainly not optimal, and appears to be allocating


> > something based on result set size.  The list view requests the count

> > of items up front, so a slowgetCount() can impact the UI if you just

Allan Chesarone

unread,
Dec 22, 2016, 3:39:07 PM12/22/16
to Android Developers, thill...@gmail.com
I know this almost a 6 year old post, but I am experiencing the exact issue that you have measured in a production application, and am wondering if anything more ever came of your research? I have a custom cursorAdapter that is handling a cursor with ~10k rows in some instances. While scrolling down is generally fine, there are points where scrolling back up, even one row, causes a block in the UI thread for a few seconds. If a user is attempting to quickly scroll up, the application can become unresponsive. This is a huge problem for us, and I have currently been unable to find a solution to this. You post here is really the only one I have come across that seems to match the problem that we are seeing in large data sets.

Thank you.

Глеб Никитенко

unread,
Aug 5, 2017, 6:06:50 PM8/5/17
to Android Developers, thill...@gmail.com
I absolutely agree with you, Alan!

The question is not only still remains open, but now (due introduction the room-library) becomes even more controversial.
I'd like to see a united logic of interaction between all parts AOSP(especially framework's API)

ContentProvider's -> Cursors -> CursorLoaders -> CursorAdapters

   seemed like a good start to consistency, but a bit raw.

So today it still looks undeveloped, and the "room" is still young(moreover, it is not a substitute: there is no support for IPC, memory efficiency), the cursor-philosophy is also not declared deprecated. ((
Reply all
Reply to author
Forward
0 new messages