[android-kernel] Eclair Cache Problem

Anthony

2010/05/07 20:06:10
To: android...@googlegroups.com
After switching my phone from Donut to Eclair, I've been experiencing
a lot of performance issues. The phone will work well for a while but
after 15 or 20 minutes of clicking through the UI, it will become
almost completely unresponsive and mostly stay that way. The CPU usage
is pretty low, and there is about 35MB of cache with a few MB of free
RAM. I was using 2.6.29 on both Donut and Eclair, although they have
some modifications on top of that.

When watching `top`, I can see a substantial amount of IO wait (>50%)
while the phone is in this state. When I write 1 to
/proc/sys/vm/block_dump and watch /proc/kmsg, most of the IO seems to
be coming from kswapd (the phone doesn't have swap enabled).

I've been trying to figure out the root cause of this, and the two
best indicators that I've found so far are that:

- When I look at /proc/meminfo when I'm in this state, "Active(file)"
and "Inactive(file)" are very low compared to my Donut device (<2MB
versus ~20MB)
- When I run `echo 3 > /proc/sys/vm/drop_caches`, the lag immediately
goes away (for a while)

Could it be that pages are getting stuck in the cache? Is there a way
for that to happen? I don't know how to get a good overview of the
cache, although I've been getting systemtap working on my device, so I
hope I can find a point to probe in the kernel to log this
information. The only alternative to stuck pages that I've considered
is that some setting dictates min/max amounts for the cache in terms
of page cache vs file cache.

Does anyone have suggestions about what I could investigate to find
the root cause of this? I'm not familiar at all with the kernel, but
I've started reading through some of the page cache related code,
though I haven't learned much from that, yet.

Thank you

--
unsubscribe: android-kerne...@googlegroups.com
website: http://groups.google.com/group/android-kernel

Seth Forshee

2010/05/12 15:48:16
To: android...@googlegroups.com
I've been looking into this issue with Anthony, and I wanted to follow
up with some of our findings and some additional questions.

The really peculiar thing about the /proc/meminfo output when the device
becomes unresponsive is the delta between the page cache size and the
number of pages in the file lrus. One example:

MemTotal: 177036 kB
MemFree: 2096 kB
Buffers: 104 kB
Cached: 34356 kB
Active: 70756 kB
Inactive: 72688 kB
Active(anon): 70136 kB
Inactive(anon): 70368 kB
Active(file): 620 kB
Inactive(file): 2320 kB

So a very large portion of the page cache is occupied by pages
containing something other than cached data from media-backed
filesystems, and it's not buffer cache data either. If we run 'stop;
echo 1 > /proc/sys/vm/drop_caches', this delta is reduced to reasonable
levels.
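
The delta described above can be computed directly from /proc/meminfo-style
text. The following is a minimal userspace sketch, not code from this thread:
the helper names meminfo_kb() and unaccounted_cache_kb() are hypothetical, and
it assumes the usual "Key: value kB" line layout shown in the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Look up one "Key: value kB" field in /proc/meminfo-style text.
 * Returns the value in kB, or -1 if the key is absent or malformed.
 * (Hypothetical helper, for illustration only.) */
static long meminfo_kb(const char *text, const char *key)
{
    size_t key_len;
    const char *line = text;

    assert(text != NULL && key != NULL);
    key_len = strlen(key);

    while (line != NULL && *line != '\0') {
        /* Match "Key:" at the start of a line only, so that "Active"
         * does not match "Active(file)". */
        if (strncmp(line, key, key_len) == 0 && line[key_len] == ':') {
            long value;

            if (sscanf(line + key_len + 1, "%ld", &value) == 1)
                return value;
            return -1;
        }
        line = strchr(line, '\n');
        if (line != NULL)
            line++; /* step past the newline to the next line */
    }
    return -1;
}

/* Page cache (Cached) that is not on either file LRU, in kB.
 * Returns -1 if any of the three fields is missing. */
static long unaccounted_cache_kb(const char *text)
{
    long cached = meminfo_kb(text, "Cached");
    long active = meminfo_kb(text, "Active(file)");
    long inactive = meminfo_kb(text, "Inactive(file)");

    if (cached < 0 || active < 0 || inactive < 0)
        return -1;
    return cached - (active + inactive);
}
```

For the figures quoted above this gives 34356 - (620 + 2320) = 31416 kB of
page cache that is not on the file LRUs.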

One thing we'd really like to do is track down where all these anonymous
pages in the page cache are coming from. The only other data I'm aware
of that is accounted to the page cache is from tmpfs. But the delta
here is awfully big. Does anyone know of any other data that resides in
the page cache? Does anyone know of any way to analyze the page cache
to find out the sources of the pages it contains? I did write some code
to get statistics about the pages used by ashmem via a proc node, and
that only accounted for about 30kB in the case quoted above. We also
know that the initramfs and userland tmpfs mounts only account for a
couple of megabytes.

The other odd thing we've noticed is that when we're in this state, the
lowmemorykiller isn't killing any processes. This is because
global_page_state(NR_FILE_PAGES) is still above the highest value we
have defined in lowmem_minfree[], but in reality we are extremely low on
free and reclaimable pages due to most of the pages represented in
NR_FILE_PAGES being pinned into the page cache. It seems that maybe
NR_FILE_PAGES isn't the best value to use in systems without swap. I'm
not an expert on the page cache, but it seems like the method below
might be a better way to go. It certainly helps our situation a lot.
Any comments?

Thanks,
Seth


diff --git a/drivers/staging/android/lowmemorykiller.c b/drivers/staging/android/lowmemorykiller.c
index 39d5e65..360d3a0 100644
--- a/drivers/staging/android/lowmemorykiller.c
+++ b/drivers/staging/android/lowmemorykiller.c
@@ -78,6 +78,16 @@ task_notify_func(struct notifier_block *self, unsigned long val, void *data)
 	return NOTIFY_OK;
 }

+static inline int other_file_pages(void)
+{
+#ifdef CONFIG_SWAP
+	return global_page_state(NR_FILE_PAGES);
+#else
+	return global_page_state(NR_ACTIVE_FILE) +
+		global_page_state(NR_INACTIVE_FILE);
+#endif
+}
+
 static int lowmem_shrink(int nr_to_scan, gfp_t gfp_mask)
 {
 	struct task_struct *p;
@@ -90,7 +100,7 @@ static int lowmem_shrink(int nr_to_scan, gfp_t gfp_mask)
 	int selected_oom_adj;
 	int array_size = ARRAY_SIZE(lowmem_adj);
 	int other_free = global_page_state(NR_FREE_PAGES);
-	int other_file = global_page_state(NR_FILE_PAGES);
+	int other_file = other_file_pages();

 	/*
 	 * If we already have a death outstanding, then
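
To illustrate why the choice of counter matters, here is a userspace sketch of
the kind of threshold scan that lowmem_shrink() performs. It is a model under
stated assumptions, not the driver code: the function name lowmem_min_adj(),
the NO_KILL sentinel, and the threshold tables used in the usage note are all
hypothetical, and the loop shape is assumed to mirror the driver's comparison
of both counters against each lowmem_minfree[] entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sentinel: no threshold was crossed, so nothing is killed. */
#define NO_KILL 1000

/* Model of the threshold scan: return the oom_adj level of the first
 * entry whose minfree value is above BOTH counters, or NO_KILL if the
 * counters never drop below any threshold. All values are in pages. */
static int lowmem_min_adj(const int *adj, const long *minfree, size_t n,
                          long other_free, long other_file)
{
    size_t i;

    assert(adj != NULL && minfree != NULL);

    for (i = 0; i < n; i++) {
        if (other_free < minfree[i] && other_file < minfree[i])
            return adj[i];
    }
    return NO_KILL;
}
```

With illustrative thresholds of {1536, 2048, 4096, 5120, 5632, 6144} pages and
4 kB pages (both assumptions), the meminfo sample above has roughly 524 free
pages. Feeding in about 8615 pages for NR_FILE_PAGES (Cached plus Buffers)
yields NO_KILL, while feeding in the 735 pages on the file LRUs selects the
first threshold.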

Seth Forshee

2010/05/13 23:42:05
To: android...@googlegroups.com
On Wed, May 12, 2010 at 02:48:16PM -0500, Seth Forshee wrote:
> +static inline int other_file_pages(void)
> +{
> +#ifdef CONFIG_SWAP
> +	return global_page_state(NR_FILE_PAGES);
> +#else
> +	return global_page_state(NR_ACTIVE_FILE) +
> +		global_page_state(NR_INACTIVE_FILE);
> +#endif
> +}

Maybe instead of this we could just use global_reclaimable_pages(). This
will make the lowmemorykiller less aggressive on systems with swap than it
is currently, since it will include all pages on the anonymous LRUs, not
just those in the page cache, but maybe that's appropriate (there are
probably few or no Android systems using swap anyway). Any comments?

Anthony

2010/05/14 14:08:15
To: android...@googlegroups.com
This turned out to be an issue in ashmem.c and has been resolved.

Naseer

2010/05/14 14:14:44
To: Android Linux Kernel Development
Hi Anthony,
Can you please elaborate on the solution?

Thanks,
-Naseer

Seth Forshee

2010/05/14 14:42:32
To: android...@googlegroups.com
On Fri, May 14, 2010 at 11:14:44AM -0700, Naseer wrote:
> Hi Anthony,
> Can you please elaborate on the solution?

This commit was missing.

http://android.git.kernel.org/?p=kernel/common.git;a=commit;h=11fd1772d96736c065f722d6d8a5092f086be3f7