> Changelog
> since v1: dropped the removal of the "recent_xxx" debug statistics, as
> Balbir suggested
>
> ===========================================
>
> The anon_scan_ratio feature isn't only useful for global VM pressure
> analysis; it is also useful for memcg memory pressure analysis.
>
> Thus, this patch adds an anon_scan_ratio field to the memory.stat file too.
>
> Cc: Balbir Singh <bal...@linux.vnet.ibm.com>
> Cc: KAMEZAWA Hiroyuki <kamezaw...@jp.fujitsu.com>
> Signed-off-by: KOSAKI Motohiro <kosaki....@jp.fujitsu.com>
Acked-by: KAMEZAWA Hiroyuki <kamezaw...@jp.fujitsu.com>
Thanks,
-Kame
===========================================
The anon_scan_ratio feature isn't only useful for global VM pressure
analysis; it is also useful for memcg memory pressure analysis.
Thus, this patch adds an anon_scan_ratio field to the memory.stat file too.
Cc: Balbir Singh <bal...@linux.vnet.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezaw...@jp.fujitsu.com>
Signed-off-by: KOSAKI Motohiro <kosaki....@jp.fujitsu.com>
---
mm/memcontrol.c | 65 +++++++++++++++++++++++++++++++++++-------------------
1 files changed, 42 insertions(+), 23 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 325df12..7348edc 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2950,6 +2950,11 @@ static int mem_control_stat_show(struct cgroup *cont, struct cftype *cft,
{
struct mem_cgroup *mem_cont = mem_cgroup_from_cont(cont);
struct mcs_total_stat mystat;
+ struct zone *zone;
+ unsigned long total_anon = 0;
+ unsigned long total_scan_anon = 0;
+ unsigned long recent_rotated[2] = {0};
+ unsigned long recent_scanned[2] = {0};
int i;
memset(&mystat, 0, sizeof(mystat));
@@ -2978,33 +2983,47 @@ static int mem_control_stat_show(struct cgroup *cont, struct cftype *cft,
cb->fill(cb, memcg_stat_strings[i].total_name, mystat.stat[i]);
}
-#ifdef CONFIG_DEBUG_VM
cb->fill(cb, "inactive_ratio", calc_inactive_ratio(mem_cont, NULL));
- {
- int nid, zid;
+ for_each_populated_zone(zone) {
+ int nid = zone->zone_pgdat->node_id;
+ int zid = zone_idx(zone);
struct mem_cgroup_per_zone *mz;
- unsigned long recent_rotated[2] = {0, 0};
- unsigned long recent_scanned[2] = {0, 0};
-
- for_each_online_node(nid)
- for (zid = 0; zid < MAX_NR_ZONES; zid++) {
- mz = mem_cgroup_zoneinfo(mem_cont, nid, zid);
-
- recent_rotated[0] +=
- mz->reclaim_stat.recent_rotated[0];
- recent_rotated[1] +=
- mz->reclaim_stat.recent_rotated[1];
- recent_scanned[0] +=
- mz->reclaim_stat.recent_scanned[0];
- recent_scanned[1] +=
- mz->reclaim_stat.recent_scanned[1];
- }
- cb->fill(cb, "recent_rotated_anon", recent_rotated[0]);
- cb->fill(cb, "recent_rotated_file", recent_rotated[1]);
- cb->fill(cb, "recent_scanned_anon", recent_scanned[0]);
- cb->fill(cb, "recent_scanned_file", recent_scanned[1]);
+ unsigned long anon;
+ unsigned long ratio;
+
+ mz = mem_cgroup_zoneinfo(mem_cont, nid, zid);
+
+ anon = MEM_CGROUP_ZSTAT(mz, LRU_INACTIVE_ANON);
+ anon += MEM_CGROUP_ZSTAT(mz, LRU_ACTIVE_ANON);
+
+ ratio = get_anon_scan_ratio(zone, mem_cont, mem_cont->swappiness);
+
+ /*
+ * We have a per-zone anon scan ratio, but we don't want to display such
+ * per-zone values directly. Instead, we display the following formula:
+ *
+ * sum(anon * ratio/100)
+ * --------------------- * 100
+ * sum(anon)
+ */
+ total_anon += anon;
+ total_scan_anon += anon * ratio;
+
+#ifdef CONFIG_DEBUG_VM
+ recent_rotated[0] += mz->reclaim_stat.recent_rotated[0];
+ recent_rotated[1] += mz->reclaim_stat.recent_rotated[1];
+ recent_scanned[0] += mz->reclaim_stat.recent_scanned[0];
+ recent_scanned[1] += mz->reclaim_stat.recent_scanned[1];
+#endif
}
+ cb->fill(cb, "anon_scan_ratio", total_scan_anon / total_anon);
+
+#ifdef CONFIG_DEBUG_VM
+ cb->fill(cb, "recent_rotated_anon", recent_rotated[0]);
+ cb->fill(cb, "recent_rotated_file", recent_rotated[1]);
+ cb->fill(cb, "recent_scanned_anon", recent_scanned[0]);
+ cb->fill(cb, "recent_scanned_file", recent_scanned[1]);
#endif
return 0;
--
1.6.5.2
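To make the weighted average in that comment concrete, here is a minimal
userspace sketch (not kernel code; the per-zone anon sizes and ratios are
made up) of the same computation the new memory.stat code performs:

#include <stdio.h>

int main(void)
{
        /* hypothetical per-zone anon LRU sizes and per-zone scan ratios (%) */
        unsigned long anon[]  = { 1000, 4000, 500 };
        unsigned long ratio[] = {   20,   60,  90 };
        unsigned long total_anon = 0, total_scan_anon = 0;
        int i;

        for (i = 0; i < 3; i++) {
                total_anon      += anon[i];
                total_scan_anon += anon[i] * ratio[i];
        }

        /* same computation as the patch: sum(anon * ratio) / sum(anon) */
        printf("anon_scan_ratio = %lu\n", total_scan_anon / total_anon);
        return 0;
}

With these example numbers the large 4000-page zone dominates and the
reported value comes out to 55.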
======================================
Vmscan folks have been asked "why does my system do so much swap-out?"
on lkml several times.
Each time, I made a debug patch to show the recent_anon_{scanned/rotated}
parameters; I have done so at least three times now.
Thus, these parameters should be shown in /proc/zoneinfo. It helps
vmscan folks with debugging.
Signed-off-by: KOSAKI Motohiro <kosaki....@jp.fujitsu.com>
Reviewed-by: Rik van Riel <ri...@redhat.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezaw...@jp.fujitsu.com>
---
include/linux/swap.h | 2 ++
mm/vmscan.c | 50 ++++++++++++++++++++++++++++++++++++--------------
mm/vmstat.c | 7 +++++--
3 files changed, 43 insertions(+), 16 deletions(-)
diff --git a/include/linux/swap.h b/include/linux/swap.h
index a2602a8..e95d7ed 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -280,6 +280,8 @@ extern void scan_unevictable_unregister_node(struct node *node);
extern int kswapd_run(int nid);
extern void kswapd_stop(int nid);
+unsigned long get_anon_scan_ratio(struct zone *zone, struct mem_cgroup *memcg, int swappiness);
+
#ifdef CONFIG_MMU
/* linux/mm/shmem.c */
extern int shmem_unuse(swp_entry_t entry, struct page *page);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 640486b..0900931 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1493,8 +1493,8 @@ static unsigned long shrink_list(enum lru_list lru, unsigned long nr_to_scan,
* percent[0] specifies how much pressure to put on ram/swap backed
* memory, while percent[1] determines pressure on the file LRUs.
*/
-static void get_scan_ratio(struct zone *zone, struct scan_control *sc,
- unsigned long *percent)
+static void __get_scan_ratio(struct zone *zone, struct scan_control *sc,
+ int need_update, unsigned long *percent)
{
unsigned long anon, file, free;
unsigned long anon_prio, file_prio;
@@ -1535,18 +1535,19 @@ static void get_scan_ratio(struct zone *zone, struct scan_control *sc,
*
* anon in [0], file in [1]
*/
- if (unlikely(reclaim_stat->recent_scanned[0] > anon / 4)) {
- spin_lock_irq(&zone->lru_lock);
- reclaim_stat->recent_scanned[0] /= 2;
- reclaim_stat->recent_rotated[0] /= 2;
- spin_unlock_irq(&zone->lru_lock);
- }
-
- if (unlikely(reclaim_stat->recent_scanned[1] > file / 4)) {
- spin_lock_irq(&zone->lru_lock);
- reclaim_stat->recent_scanned[1] /= 2;
- reclaim_stat->recent_rotated[1] /= 2;
- spin_unlock_irq(&zone->lru_lock);
+ if (need_update) {
+ if (unlikely(reclaim_stat->recent_scanned[0] > anon / 4)) {
+ spin_lock_irq(&zone->lru_lock);
+ reclaim_stat->recent_scanned[0] /= 2;
+ reclaim_stat->recent_rotated[0] /= 2;
+ spin_unlock_irq(&zone->lru_lock);
+ }
+ if (unlikely(reclaim_stat->recent_scanned[1] > file / 4)) {
+ spin_lock_irq(&zone->lru_lock);
+ reclaim_stat->recent_scanned[1] /= 2;
+ reclaim_stat->recent_rotated[1] /= 2;
+ spin_unlock_irq(&zone->lru_lock);
+ }
}
/*
@@ -1572,6 +1573,27 @@ static void get_scan_ratio(struct zone *zone, struct scan_control *sc,
percent[1] = 100 - percent[0];
}
+static void get_scan_ratio(struct zone *zone, struct scan_control *sc,
+ unsigned long *percent)
+{
+ __get_scan_ratio(zone, sc, 1, percent);
+}
+
+unsigned long get_anon_scan_ratio(struct zone *zone, struct mem_cgroup *memcg, int swappiness)
+{
+ unsigned long percent[2];
+ struct scan_control sc = {
+ .may_swap = 1,
+ .swappiness = swappiness,
+ .mem_cgroup = memcg,
+ };
+
+ __get_scan_ratio(zone, &sc, 0, percent);
+
+ return percent[0];
+}
+
+
/*
* Smallish @nr_to_scan's are deposited in @nr_saved_scan,
* until we collected @swap_cluster_max pages to scan.
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 6051fba..f690117 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -15,6 +15,7 @@
#include <linux/cpu.h>
#include <linux/vmstat.h>
#include <linux/sched.h>
+#include <linux/swap.h>
#ifdef CONFIG_VM_EVENT_COUNTERS
DEFINE_PER_CPU(struct vm_event_state, vm_event_states) = {{0}};
@@ -760,11 +761,13 @@ static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat,
"\n all_unreclaimable: %u"
"\n prev_priority: %i"
"\n start_pfn: %lu"
- "\n inactive_ratio: %u",
+ "\n inactive_ratio: %u"
+ "\n anon_scan_ratio: %lu",
zone_is_all_unreclaimable(zone),
zone->prev_priority,
zone->zone_start_pfn,
- zone->inactive_ratio);
+ zone->inactive_ratio,
+ get_anon_scan_ratio(zone, NULL, vm_swappiness));
seq_putc(m, '\n');
[snip]
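With both patches applied, the tail of each per-zone block in /proc/zoneinfo
would look something like the following (the numbers here are purely
illustrative):

  all_unreclaimable: 0
  prev_priority:     12
  start_pfn:         4096
  inactive_ratio:    3
  anon_scan_ratio:   37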
Looks good to me
Acked-by: Balbir Singh <bal...@linux.vnet.ibm.com>
On Wed, Jan 13, 2010 at 5:21 PM, KOSAKI Motohiro
<kosaki....@jp.fujitsu.com> wrote:
> Changelog
> from v1
> - get_anon_scan_ratio doesn't take zone->lru_lock anymore,
> because zoneinfo_show_print takes zone->lock.
When I first saw this changelog, I got confused,
because there is no relation between lru_lock and lock in struct zone.
Do you mean zoneinfo is allowed to have stale data?
I tend to agree with that.
>
>
> ======================================
> Vmscan folks have been asked "why does my system do so much swap-out?"
> on lkml several times.
> Each time, I made a debug patch to show the recent_anon_{scanned/rotated}
> parameters; I have done so at least three times now.
>
> Thus, these parameters should be shown in /proc/zoneinfo. It helps
> vmscan folks with debugging.
I support this suggestion.
>
> Signed-off-by: KOSAKI Motohiro <kosaki....@jp.fujitsu.com>
> Reviewed-by: Rik van Riel <ri...@redhat.com>
> Reviewed-by: KAMEZAWA Hiroyuki <kamezaw...@jp.fujitsu.com>
> ---
> include/linux/swap.h | 2 ++
> mm/vmscan.c | 50 ++++++++++++++++++++++++++++++++++++--------------
> mm/vmstat.c | 7 +++++--
> 3 files changed, 43 insertions(+), 16 deletions(-)
>
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index a2602a8..e95d7ed 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -280,6 +280,8 @@ extern void scan_unevictable_unregister_node(struct node *node);
> extern int kswapd_run(int nid);
> extern void kswapd_stop(int nid);
>
> +unsigned long get_anon_scan_ratio(struct zone *zone, struct mem_cgroup *memcg, int swappiness);
Today Andrew said to me :)
"The vmscan.c code makes an effort to look nice in an 80-col display."
Why did you add the new parameter 'need_update'?
Do you see any heavy lru_lock contention (reclaim vs. cat /proc/zoneinfo)?
I think probably not, so I am not sure a lock-free version is needed.
> + if (need_update) {
> + if (unlikely(reclaim_stat->recent_scanned[0] > anon / 4)) {
> + spin_lock_irq(&zone->lru_lock);
> + reclaim_stat->recent_scanned[0] /= 2;
> + reclaim_stat->recent_rotated[0] /= 2;
> + spin_unlock_irq(&zone->lru_lock);
> + }
> + if (unlikely(reclaim_stat->recent_scanned[1] > file / 4)) {
> + spin_lock_irq(&zone->lru_lock);
> + reclaim_stat->recent_scanned[1] /= 2;
> + reclaim_stat->recent_rotated[1] /= 2;
> + spin_unlock_irq(&zone->lru_lock);
> + }
> }
>
> /*
> @@ -1572,6 +1573,27 @@ static void get_scan_ratio(struct zone *zone, struct scan_control *sc,
> percent[1] = 100 - percent[0];
> }
>
> +static void get_scan_ratio(struct zone *zone, struct scan_control *sc,
> + unsigned long *percent)
> +{
> + __get_scan_ratio(zone, sc, 1, percent);
> +}
> +
If we really need this version and your changelog is right,
let's add a note. ;-)
/* Caller has to hold zone->lock */
--
Kind regards,
Minchan Kim
> Hi, Kosaki.
>
> On Wed, Jan 13, 2010 at 5:21 PM, KOSAKI Motohiro
> <kosaki....@jp.fujitsu.com> wrote:
> > Changelog
> > from v1
> > - get_anon_scan_ratio doesn't take zone->lru_lock anymore,
> > because zoneinfo_show_print takes zone->lock.
>
> When I first saw this changelog, I got confused,
> because there is no relation between lru_lock and lock in struct zone.
> Do you mean zoneinfo is allowed to have stale data?
> I tend to agree with that.
Well, zone->lock and zone->lru_lock should not be taken at the same time.
Patch [1/4] of my last version removed zone->lock, so get_anon_scan_ratio()
could take zone->lru_lock, but I dropped that patch; thus get_anon_scan_ratio()
can't take zone->lru_lock.
That is why I added the need_update parameter.
This is not a lock ordering issue. Both zone->lock and zone->lru_lock are
hot-path locks, so grabbing both at the same time might cause a performance impact.
thanks.
I only looked over the code, since I am out of the office.
I can't find any locking problem between zone->lock and zone->lru_lock.
Do you know of any lock ordering problem?
Could you explain it with a call graph, if you don't mind?
I am out of the office until tomorrow, so I can't reply quickly.
Sorry for the late response.
--
Kind regards,
Minchan Kim
On Thu, Jan 14, 2010 at 2:18 PM, KOSAKI Motohiro
<kosaki....@jp.fujitsu.com> wrote:
>> > Well, zone->lock and zone->lru_lock should not be taken at the same time.
>>
>> I only looked over the code, since I am out of the office.
>> I can't find any locking problem between zone->lock and zone->lru_lock.
>> Do you know of any lock ordering problem?
>> Could you explain it with a call graph, if you don't mind?
>>
>> I am out of the office until tomorrow, so I can't reply quickly.
>> Sorry for the late response.
>
> This is not a lock ordering issue. Both zone->lock and zone->lru_lock are
> hot-path locks, so grabbing both at the same time might cause a performance impact.
Sorry for the late response.
Your patch makes the anon_scan_ratio in zoneinfo stale.
The performance impact you mentioned matters when VM pressure is high.
I think stale data is all right normally.
But when VM pressure is high and we want to see the information in zoneinfo
(which is exactly the case you described), stale data is not good, I think.
If that is not a strong objection, I would like to use the old get_scan_ratio
in get_anon_scan_ratio.
--
Kind regards,
Minchan Kim
Please look at that function again.
Usually we use the recent_rotated/recent_scanned ratio, so the following
halving doesn't change the meaning of the scan ratio at all; it only
prevents the stat counters from overflowing.
if (unlikely(reclaim_stat->recent_scanned[0] > anon / 4)) {
spin_lock_irq(&zone->lru_lock);
reclaim_stat->recent_scanned[0] /= 2;
reclaim_stat->recent_rotated[0] /= 2;
spin_unlock_irq(&zone->lru_lock);
}
So, I don't think the current implementation can show stale data.
Thanks.
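A minimal standalone sketch of that point (illustrative values only, not
kernel code): halving both counters leaves the rotated/scanned ratio
unchanged while keeping the counters bounded.

#include <stdio.h>

int main(void)
{
        unsigned long scanned = 800000, rotated = 200000;

        printf("ratio before: %.3f\n", (double)rotated / scanned);  /* 0.250 */

        /* what get_scan_ratio() does once scanned exceeds anon / 4 */
        scanned /= 2;
        rotated /= 2;

        printf("ratio after:  %.3f\n", (double)rotated / scanned);  /* 0.250 */
        return 0;
}

Both printfs show 0.250, which is why the halving only affects overflow,
not the computed scan ratio.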
It also plays a primary role as a floating (decaying) average, as well as
preventing overflow. :)
So, it's important.
>
> if (unlikely(reclaim_stat->recent_scanned[0] > anon / 4)) {
> spin_lock_irq(&zone->lru_lock);
> reclaim_stat->recent_scanned[0] /= 2;
> reclaim_stat->recent_rotated[0] /= 2;
> spin_unlock_irq(&zone->lru_lock);
> }
>
>
> So, I don't think the current implementation can show stale data.
It can produce stale data when memory pressure is high.
>
> Thanks.
>
Moreover, I don't want to add more complexity (i.e. need_update) than the
old code has if it doesn't bring some benefit. (I think the lru_lock isn't a big overhead.)
--
Kind regards,
Minchan Kim
?? Why, and when?
I think that depends on what 'stale' means.
Currently (i.e. before the patch), get_scan_ratio() uses the following formula,
and in this region recent_scanned is not protected by zone->lru_lock.
ap = (anon_prio + 1) * (reclaim_stat->recent_scanned[0] + 1);
ap /= reclaim_stat->recent_rotated[0] + 1;
fp = (file_prio + 1) * (reclaim_stat->recent_scanned[1] + 1);
fp /= reclaim_stat->recent_rotated[1] + 1;
percent[0] = 100 * ap / (ap + fp + 1);
percent[1] = 100 - percent[0];
It means shrink_zone() doesn't use an exact recent_scanned value either, so
zoneinfo can use the same inexact value.
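For what it's worth, here is a rough worked example with made-up counter
values, assuming swappiness = 60 (i.e. anon_prio = 60 and file_prio = 140):

#include <stdio.h>

int main(void)
{
        unsigned long anon_prio = 60, file_prio = 140;      /* swappiness = 60 */
        unsigned long recent_scanned[2] = { 4000, 12000 };  /* [0] anon, [1] file */
        unsigned long recent_rotated[2] = { 1000, 2000 };
        unsigned long ap, fp, percent0;

        ap = (anon_prio + 1) * (recent_scanned[0] + 1);
        ap /= recent_rotated[0] + 1;
        fp = (file_prio + 1) * (recent_scanned[1] + 1);
        fp /= recent_rotated[1] + 1;

        percent0 = 100 * ap / (ap + fp + 1);
        printf("percent[0] = %lu, percent[1] = %lu\n", percent0, 100 - percent0);
        return 0;
}

With these numbers ap is about 243 and fp about 845, giving percent[0] = 22;
the result depends on the rotated/scanned ratios, so a slightly stale
recent_scanned barely moves it.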
> Moreover, I don't want to add more complexity (i.e. need_update) than the
> old code has if it doesn't bring some benefit. (I think the lru_lock isn't a big overhead.)
Hmm..
I think the lru_lock can cause a big overhead.
I believe the current code is intentional.
On Mon, Jan 18, 2010 at 10:54 AM, KOSAKI Motohiro
Absolutely right. I missed that. Thanks.
get_scan_ratio() takes the lru_lock around updating reclaim_stat->recent_xxxx,
but it doesn't take the lru_lock when computing ap/fp.
Is that intentional? I think you or Rik would know. :)
I think if we want exact values, we have to hold the lru_lock until
ap/fp have been computed.
If not, we don't need the lru_lock when we touch reclaim_stat->recent_xxxx either.
What do you think about it?
>
>
>> Moreover, I don't want to add more complexity (i.e. need_update) than the
>> old code has if it doesn't bring some benefit. (I think the lru_lock isn't a big overhead.)
>
> Hmm..
> I think the lru_lock can cause a big overhead.
I don't want to argue strongly about this,
because I haven't actually seen that contention myself.
If you have a concern about the lru_lock, I don't oppose your patch.
--
Kind regards,
Minchan Kim
> Absolutely right. I missed that. Thanks.
> get_scan_ratio() takes the lru_lock around updating reclaim_stat->recent_xxxx,
> but it doesn't take the lru_lock when computing ap/fp.
>
> Is that intentional? I think you or Rik would know. :)
> I think if we want exact values, we have to hold the lru_lock until
> ap/fp have been computed.
> If not, we don't need the lru_lock when we touch reclaim_stat->recent_xxxx either.
>
> What do you think about it?
This is definitely not intentional.
Race conditions in this code could throw off the
statistics by a factor of 2. I do not know how serious that
would be for the VM, or whether (and how quickly) it would
self-correct.
--
All rights reversed.
Okay. How about making a patch to get exact ap/fp values?
Even if the skew is not serious, or recovers quickly, I think it would be better
to hold the lru_lock for consistency, provided the lru_lock is not a heavily contended lock.
>
> --
> All rights reversed.
>
--
Kind regards,
Minchan Kim
Really?
OK, then I'll post the next patch.
Thanks.