[Patch v4] Do not reserve crashkernel high memory if crashkernel low memory reserving failed

Baoquan He

Sep 22, 2015, 7:50:08 AM
People reported that when allocating crashkernel memory using the
",high" and ",low" syntax, there were cases where the reservation
of the "high" portion succeeds but the reservation of the "low"
portion fails. kexec can then load the kdump kernel successfully, but
booting the kdump kernel fails because there is no low memory. This
happens because the allocation of low memory for the kdump kernel can
fail on large systems for a number of reasons. E.g. the manually
specified crashkernel low memory could be too large to be found in any
memblock region.

In this patch, add a return value to reserve_crashkernel_low(). Then
try to reserve crashkernel low memory after crashkernel high memory
has been allocated. If the crashkernel low memory reservation fails,
free the crashkernel high memory and return. Users can then take
measures when they find that the kdump kernel can't be loaded.

Signed-off-by: Baoquan He <b...@redhat.com>
---
v1->v2:
Boris commented that error value EINVAL is negative, should
use "return -EINVAL".

v2->v3:
Yinghai pointed out that memblock_reserve() may double the memblock
reserved array, and that the newly allocated array could overlap the
range chosen for crashkernel high. So we have to reserve crashkernel
high first, then free it if the crashkernel low memory allocation
fails.

v3->v4:
Dave suggested using "return -ENOMEM" when low memory reservation
failed and printing failure message anyway.
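
For illustration only, the ordering described in the notes above (reserve
high first, then low, and roll the high part back if low fails) can be
modelled in user space. This is a sketch under stated assumptions, not
kernel code: toy_reserve(), toy_free(), toy_reserve_low() and the
fixed-size table are hypothetical stand-ins for memblock, and the
low_available flag replaces the real memblock_find_in_range() search.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the memblock reserved array. */
struct toy_range {
	unsigned long long base;
	unsigned long long size;
	bool used;
};

#define TOY_MAX_RANGES 8
static struct toy_range toy_reserved[TOY_MAX_RANGES];

/* Record a reservation; returns 0 on success, -ENOMEM if the table is full. */
static int toy_reserve(unsigned long long base, unsigned long long size)
{
	for (size_t i = 0; i < TOY_MAX_RANGES; i++) {
		if (!toy_reserved[i].used) {
			toy_reserved[i].base = base;
			toy_reserved[i].size = size;
			toy_reserved[i].used = true;
			return 0;
		}
	}
	return -ENOMEM;
}

/* Drop any reservation that exactly matches base/size. */
static void toy_free(unsigned long long base, unsigned long long size)
{
	for (size_t i = 0; i < TOY_MAX_RANGES; i++) {
		if (toy_reserved[i].used && toy_reserved[i].base == base &&
		    toy_reserved[i].size == size)
			toy_reserved[i].used = false;
	}
}

/* Number of live reservations, used to observe the rollback. */
static size_t toy_reserved_count(void)
{
	size_t n = 0;

	for (size_t i = 0; i < TOY_MAX_RANGES; i++)
		if (toy_reserved[i].used)
			n++;
	return n;
}

/* Assumed low range for the model: 256M starting at 16M. */
static int toy_reserve_low(bool low_available)
{
	if (!low_available)
		return -ENOMEM;
	return toy_reserve(16ULL << 20, 256ULL << 20);
}

/* Mirrors the patched control flow: high first, low second, undo on failure. */
static int toy_reserve_crashkernel(unsigned long long crash_base,
				   unsigned long long crash_size,
				   bool low_available)
{
	if (toy_reserve(crash_base, crash_size))
		return -ENOMEM;

	/* Only a region above 4G needs a separate low reservation. */
	if (crash_base >= (1ULL << 32) && toy_reserve_low(low_available)) {
		toy_free(crash_base, crash_size);	/* roll back the high part */
		return -ENOMEM;
	}
	return 0;
}
```

In this model, a failed low reservation leaves toy_reserved_count() where
it was before the call, which is the property the patch is after.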

arch/x86/kernel/setup.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index fdb7f2a..e362f92 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -493,7 +493,7 @@ static void __init memblock_x86_reserve_range_setup_data(void)
 # define CRASH_KERNEL_ADDR_HIGH_MAX MAXMEM
 #endif

-static void __init reserve_crashkernel_low(void)
+static int __init reserve_crashkernel_low(void)
 {
 #ifdef CONFIG_X86_64
 	const unsigned long long alignment = 16<<20; /* 16M */
@@ -522,17 +522,15 @@ static void __init reserve_crashkernel_low(void)
 	} else {
 		/* passed with crashkernel=0,low ? */
 		if (!low_size)
-			return;
+			return 0;
 	}

 	low_base = memblock_find_in_range(low_size, (1ULL<<32),
 					low_size, alignment);

 	if (!low_base) {
-		if (!auto_set)
-			pr_info("crashkernel low reservation failed - No suitable area found.\n");
-
-		return;
+		pr_info("crashkernel low reservation failed - No suitable area found.\n");
+		return -ENOMEM;
 	}

 	memblock_reserve(low_base, low_size);
@@ -544,6 +542,7 @@ static void __init reserve_crashkernel_low(void)
 	crashk_low_res.end = low_base + low_size - 1;
 	insert_resource(&iomem_resource, &crashk_low_res);
 #endif
+	return 0;
 }

 static void __init reserve_crashkernel(void)
@@ -595,6 +594,11 @@ static void __init reserve_crashkernel(void)
 	}
 	memblock_reserve(crash_base, crash_size);

+	if (crash_base >= (1ULL<<32) && reserve_crashkernel_low()) {
+		memblock_free(crash_base, crash_size);
+		return;
+	}
+
 	printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
 			"for crashkernel (System RAM: %ldMB)\n",
 			(unsigned long)(crash_size >> 20),
@@ -604,9 +608,6 @@ static void __init reserve_crashkernel(void)
 	crashk_res.start = crash_base;
 	crashk_res.end = crash_base + crash_size - 1;
 	insert_resource(&iomem_resource, &crashk_res);
-
-	if (crash_base >= (1ULL<<32))
-		reserve_crashkernel_low();
 }
 #else
 static void __init reserve_crashkernel(void)
--
2.1.0


Baoquan He

Sep 22, 2015, 10:00:12 AM
Add Jerry to CC list.

Andrew Morton

Sep 22, 2015, 4:00:18 PM
On Tue, 22 Sep 2015 19:48:14 +0800 Baoquan He <b...@redhat.com> wrote:

> People reported that when allocating crashkernel memory using the
> ",high" and ",low" syntax, there were cases where the reservation
> of the "high" portion succeeds but the reservation of the "low"
> portion fails. kexec can then load the kdump kernel successfully, but
> booting the kdump kernel fails because there is no low memory. This
> happens because the allocation of low memory for the kdump kernel can
> fail on large systems for a number of reasons. E.g. the manually
> specified crashkernel low memory could be too large to be found in any
> memblock region.
>
> In this patch, add a return value to reserve_crashkernel_low(). Then
> try to reserve crashkernel low memory after crashkernel high memory
> has been allocated. If the crashkernel low memory reservation fails,
> free the crashkernel high memory and return. Users can then take
> measures when they find that the kdump kernel can't be loaded.
>
> ...
>
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -493,7 +493,7 @@ static void __init memblock_x86_reserve_range_setup_data(void)
> # define CRASH_KERNEL_ADDR_HIGH_MAX MAXMEM
> #endif
>
> -static void __init reserve_crashkernel_low(void)
> +static int __init reserve_crashkernel_low(void)
> {
> #ifdef CONFIG_X86_64
> const unsigned long long alignment = 16<<20; /* 16M */
> @@ -522,17 +522,15 @@ static void __init reserve_crashkernel_low(void)
> } else {
> /* passed with crashkernel=0,low ? */
> if (!low_size)
> - return;
> + return 0;

What's happening here? It's returning "success" when
parse_crashkernel_low() fails?

> }
>
> low_base = memblock_find_in_range(low_size, (1ULL<<32),
> low_size, alignment);
>
> if (!low_base) {
> - if (!auto_set)
> - pr_info("crashkernel low reservation failed - No suitable area found.\n");
> -
> - return;
> + pr_info("crashkernel low reservation failed - No suitable area found.\n");

That's not a terribly useful message. If kdump is now unavailable and
the operator needs to take some remedial action then we should inform
them of this.

Also, such a message should have higher severity than KERN_INFO?

Baoquan He

Sep 22, 2015, 8:10:08 PM
On 09/22/15 at 12:54pm, Andrew Morton wrote:
> > --- a/arch/x86/kernel/setup.c
> > +++ b/arch/x86/kernel/setup.c
> > @@ -493,7 +493,7 @@ static void __init memblock_x86_reserve_range_setup_data(void)
> > # define CRASH_KERNEL_ADDR_HIGH_MAX MAXMEM
> > #endif
> >
> > -static void __init reserve_crashkernel_low(void)
> > +static int __init reserve_crashkernel_low(void)
> > {
> > #ifdef CONFIG_X86_64
> > const unsigned long long alignment = 16<<20; /* 16M */
> > @@ -522,17 +522,15 @@ static void __init reserve_crashkernel_low(void)
> > } else {
> > /* passed with crashkernel=0,low ? */
> > if (!low_size)
> > - return;
> > + return 0;
>
> What's happening here? It's returning "success" when
> parse_crashkernel_low() fails?

That's the case where the user specifies "crashkernel=0,low" to
explicitly disable crashkernel low memory allocation. So when we parse
the cmdline and find we're in this case, we return 0 directly.
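
To make the three outcomes concrete, here is a small user-space sketch of
the return-value convention. It is only a model under assumptions: struct
toy_low_request, toy_low_result() and the 256M fallback are hypothetical
names and values chosen for illustration, and the largest_free_low
argument stands in for the real memblock search.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical view of what the low-memory reservation path is given. */
struct toy_low_request {
	bool parsed;			/* was "crashkernel=Y,low" on the cmdline? */
	unsigned long long size;	/* Y in bytes, meaningful only if parsed */
};

/* Assumed fallback size for the model when no ",low" value was given. */
#define TOY_DEFAULT_LOW (256ULL << 20)

/*
 * Returns 0 when there is nothing left to complain about (either the
 * reservation fits, or the user disabled it with size 0), and -ENOMEM
 * when a non-zero request cannot be satisfied.
 */
static int toy_low_result(struct toy_low_request req,
			  unsigned long long largest_free_low)
{
	unsigned long long low_size;

	if (!req.parsed)
		low_size = TOY_DEFAULT_LOW;	/* nothing specified: use a default */
	else if (req.size == 0)
		return 0;			/* crashkernel=0,low: explicitly disabled */
	else
		low_size = req.size;

	if (low_size > largest_free_low)
		return -ENOMEM;			/* no suitable low area */

	return 0;
}
```

The point of the sketch is that "return 0" for crashkernel=0,low is not a
parse failure being reported as success: the request was parsed fine and
asked for no low memory at all.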

>
> > }
> >
> > low_base = memblock_find_in_range(low_size, (1ULL<<32),
> > low_size, alignment);
> >
> > if (!low_base) {
> > - if (!auto_set)
> > - pr_info("crashkernel low reservation failed - No suitable area found.\n");
> > -
> > - return;
> > + pr_info("crashkernel low reservation failed - No suitable area found.\n");
>
> That's not a terribly useful message. If kdump is now unavailable and
> the operator needs to take some remedial action then we should inform
> them of this.
>
> Also, such a message should have higher severity than KERN_INFO?

Yes, how about KERN_ERR? It's an unexpected result from kdump side
though it doesn't harm the normal kernel.

Andrew Morton

Sep 22, 2015, 8:10:08 PM
On Wed, 23 Sep 2015 08:02:55 +0800 Baoquan He <b...@redhat.com> wrote:

> >
> > > }
> > >
> > > low_base = memblock_find_in_range(low_size, (1ULL<<32),
> > > low_size, alignment);
> > >
> > > if (!low_base) {
> > > - if (!auto_set)
> > > - pr_info("crashkernel low reservation failed - No suitable area found.\n");
> > > -
> > > - return;
> > > + pr_info("crashkernel low reservation failed - No suitable area found.\n");
> >
> > That's not a terribly useful message. If kdump is now unavailable and
> > the operator needs to take some remedial action then we should inform
> > them of this.
> >
> > Also, such a message should have higher severity than KERN_INFO?
>
> Yes, how about KERN_ERR? It's an unexpected result from kdump side
> though it doesn't harm the normal kernel.

Sure, KERN_ERR is good. Along with more useful message text.

Baoquan He

Sep 22, 2015, 8:30:09 PM
On 09/22/15 at 05:08pm, Andrew Morton wrote:
> On Wed, 23 Sep 2015 08:02:55 +0800 Baoquan He <b...@redhat.com> wrote:
>
> > >
> > > > }
> > > >
> > > > low_base = memblock_find_in_range(low_size, (1ULL<<32),
> > > > low_size, alignment);
> > > >
> > > > if (!low_base) {
> > > > - if (!auto_set)
> > > > - pr_info("crashkernel low reservation failed - No suitable area found.\n");
> > > > -
> > > > - return;
> > > > + pr_info("crashkernel low reservation failed - No suitable area found.\n");
> > >
> > > That's not a terribly useful message. If kdump is now unavailable and
> > > the operator needs to take some remedial action then we should inform
> > > them of this.
> > >
> > > Also, such a message should have higher severity than KERN_INFO?
> >
> > Yes, how about KERN_ERR? It's an unexpected result from kdump side
> > though it doesn't harm the normal kernel.
>
> Sure, KERN_ERR is good. Along with more useful message text.

OK, will do. Thanks for your suggestion.

Baoquan He

Sep 23, 2015, 11:30:09 PM
People reported that when allocating crashkernel memory using the
",high" and ",low" syntax, there were cases where the reservation
of the "high" portion succeeds but the reservation of the "low"
portion fails. kexec can then load the kdump kernel successfully, but
booting the kdump kernel fails because there is no low memory. This
happens because the allocation of low memory for the kdump kernel can
fail on large systems for a number of reasons. E.g. the manually
specified crashkernel low memory could be too large to be found in any
memblock region.

In this patch, add a return value to reserve_crashkernel_low(). Then
try to reserve crashkernel low memory after crashkernel high memory
has been allocated. If the crashkernel low memory reservation fails,
free the crashkernel high memory and return. Users can then take
measures when they find that the kdump kernel can't be loaded.

Signed-off-by: Baoquan He <b...@redhat.com>
---
v1->v2:
Boris commented that error value EINVAL is negative, should
use "return -EINVAL".

v2->v3:
Yinghai pointed out that during memblock_reserve, we could double
the memblock reserve array. New memblock reserve could be overlapped
with range for crashkernel high. So we have to reserve crashkernel
high firstly, then free it if crashkernel low memory allocation
failed.

v3->v4:
Dave suggested using "return -ENOMEM" when low memory reservation
failed and printing failure message anyway.

v4->v5:
Andrew suggested making the failure message more useful and raising
its severity to KERN_ERR.
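
As a rough user-space illustration of the v5 message change, the sketch
below formats the new text with an error-level prefix and converts the
size from bytes to MB with a 20-bit shift. The TOY_* macros and
toy_format_low_failure() are hypothetical names; the prefix bytes are
meant to mimic the kernel's log-level convention (an SOH byte followed by
a level digit) and the function is not how pr_err() is implemented.

```c
#include <stdio.h>

/* Assumed to mirror the kernel convention: SOH byte plus a level digit. */
#define TOY_KERN_SOH	"\001"
#define TOY_KERN_ERR	TOY_KERN_SOH "3"	/* error conditions */
#define TOY_KERN_INFO	TOY_KERN_SOH "6"	/* informational */

/*
 * Format the failure message for a low reservation of low_size bytes.
 * Returns the snprintf() result: the length that was (or would have
 * been) written, excluding the terminating NUL.
 */
static int toy_format_low_failure(char *buf, size_t len,
				  unsigned long long low_size)
{
	return snprintf(buf, len,
			TOY_KERN_ERR "Failed to reserve %luMB of low memory for crashkernel, please try smaller size.\n",
			(unsigned long)(low_size >> 20));	/* bytes -> MB */
}
```

Compared with the old pr_info() text, the message now names the size that
could not be reserved and tells the operator what to try next.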

arch/x86/kernel/setup.c | 20 +++++++++++---------
1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index fdb7f2a..e510e61 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -493,7 +493,7 @@ static void __init memblock_x86_reserve_range_setup_data(void)
 # define CRASH_KERNEL_ADDR_HIGH_MAX MAXMEM
 #endif

-static void __init reserve_crashkernel_low(void)
+static int __init reserve_crashkernel_low(void)
 {
 #ifdef CONFIG_X86_64
 	const unsigned long long alignment = 16<<20; /* 16M */
@@ -522,17 +522,16 @@ static void __init reserve_crashkernel_low(void)
 	} else {
 		/* passed with crashkernel=0,low ? */
 		if (!low_size)
-			return;
+			return 0;
 	}

 	low_base = memblock_find_in_range(low_size, (1ULL<<32),
 					low_size, alignment);

 	if (!low_base) {
-		if (!auto_set)
-			pr_info("crashkernel low reservation failed - No suitable area found.\n");
-
-		return;
+		pr_err("Failed to reserve %ldMB of low memory for crashkernel, please try smaller size.\n",
+		       (unsigned long)(low_size >> 20));
+		return -ENOMEM;
 	}

 	memblock_reserve(low_base, low_size);
@@ -544,6 +543,7 @@ static void __init reserve_crashkernel_low(void)
 	crashk_low_res.end = low_base + low_size - 1;
 	insert_resource(&iomem_resource, &crashk_low_res);
 #endif
+	return 0;
 }

 static void __init reserve_crashkernel(void)
@@ -595,6 +595,11 @@ static void __init reserve_crashkernel(void)
 	}
 	memblock_reserve(crash_base, crash_size);

+	if (crash_base >= (1ULL<<32) && reserve_crashkernel_low()) {
+		memblock_free(crash_base, crash_size);
+		return;
+	}
+
 	printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
 			"for crashkernel (System RAM: %ldMB)\n",
 			(unsigned long)(crash_size >> 20),
@@ -604,9 +609,6 @@ static void __init reserve_crashkernel(void)
 	crashk_res.start = crash_base;
 	crashk_res.end = crash_base + crash_size - 1;
 	insert_resource(&iomem_resource, &crashk_res);
-
-	if (crash_base >= (1ULL<<32))
-		reserve_crashkernel_low();
 }
 #else
 static void __init reserve_crashkernel(void)
--
2.4.0

Baoquan He

Oct 14, 2015, 2:20:06 AM
Hi all,

Ping.

Any other suggestions for this fix?

Joerg Roedel

Oct 14, 2015, 4:50:07 AM
Reviewed-by: Joerg Roedel <jro...@suse.de>

The patch has also been in SLES for some time now and was successfully
tested there.


Joerg

Borislav Petkov

Oct 14, 2015, 6:50:07 AM
On Wed, Oct 14, 2015 at 10:43:28AM +0200, Joerg Roedel wrote:
> Reviewed-by: Joerg Roedel <jro...@suse.de>
>
> The patch has also been in SLES for some time now and was successfully
> tested there.

Thanks, I massaged it a bit and ended up applying this:

---
From: Baoquan He <b...@redhat.com>
Date: Thu, 24 Sep 2015 11:24:51 +0800
Subject: [PATCH] x86/setup: Do not reserve crashkernel high memory if low
 reservation failed

People reported that when allocating crashkernel memory using the
",high" and ",low" syntax, there were cases where the reservation of the
high portion succeeds but the reservation of the low portion fails.

Then kexec can load the kdump kernel successfully, but booting the kdump
kernel fails as there's no low memory.

The low memory allocation for the kdump kernel can fail on large systems
for a couple of reasons. For example, the manually specified crashkernel
low memory can be too large and thus no adequate memblock region would
be found.

Therefore, we try to reserve low memory for the crash kernel *after*
the high memory portion has been allocated. If that fails, we free
crashkernel high memory too and return. The user can then take measures
accordingly.

Signed-off-by: Baoquan He <b...@redhat.com>
Reviewed-by: Joerg Roedel <jro...@suse.de>
Tested-by: Joerg Roedel <jro...@suse.de>
Cc: Andrew Morton <ak...@linux-foundation.org>
Cc: Andy Lutomirski <lu...@amacapital.net>
Cc: Dave Young <dyo...@redhat.com>
Cc: "H. Peter Anvin" <h...@zytor.com>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: jerry_...@hp.com
Cc: Jiri Kosina <jko...@suse.cz>
Cc: Juergen Gross <jgr...@suse.com>
Cc: Mark Salter <msa...@redhat.com>
Cc: Thomas Gleixner <tg...@linutronix.de>
Cc: WANG Chao <chao...@redhat.com>
Cc: x86-ml <x...@kernel.org>
Cc: yin...@kernel.org
Link: http://lkml.kernel.org/r/1443065091-28198-1...@redhat.com
[ Massage text. ]
Signed-off-by:
---
arch/x86/kernel/setup.c | 20 +++++++++++---------
1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index fdb7f2a2d328..1b36839e41eb 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -493,7 +493,7 @@ static void __init memblock_x86_reserve_range_setup_data(void)
 # define CRASH_KERNEL_ADDR_HIGH_MAX MAXMEM
 #endif

-static void __init reserve_crashkernel_low(void)
+static int __init reserve_crashkernel_low(void)
 {
 #ifdef CONFIG_X86_64
 	const unsigned long long alignment = 16<<20; /* 16M */
@@ -522,17 +522,16 @@ static void __init reserve_crashkernel_low(void)
 	} else {
 		/* passed with crashkernel=0,low ? */
 		if (!low_size)
-			return;
+			return 0;
 	}

 	low_base = memblock_find_in_range(low_size, (1ULL<<32),
 					low_size, alignment);

 	if (!low_base) {
-		if (!auto_set)
-			pr_info("crashkernel low reservation failed - No suitable area found.\n");
-
-		return;
+		pr_err("Cannot reserve %ldMB crashkernel low memory, please try smaller size.\n",
2.3.5

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
--