Time Clock Stops in FreeBSD 9.0 guest running under ESXi 5.0

Adam Strohl

Mar 10, 2012, 3:08:56 AM

to FreeBSD-Stable ML

I've now seen this on two different VMs on two different ESXi servers
(Xeon based hosts but different hardware otherwise and at different
facilities):

Everything runs fine for weeks, then (seemingly) suddenly/randomly the
clock STOPS. In the first case I saw a jump backwards of about 15
minutes (and then a 'freeze' of the clock). The second time it was just
'time standing still' with no backwards jump. Logging accuracy is of
course questionable given the nature of the issue, but nothing really
jumps out (i.e., I don't see NTPd adjusting the time just before this
happens or anything like that).

Naturally the clock stopping causes major issues, but the machine does
technically stay running. My open sessions respond, but anything that
relies on time moving forward hangs. I can't even gracefully reboot it
because shutdown/etc all rely on time moving forward (heh).

So I'm not sure if this is a VMware/ESXi issue or a FreeBSD issue, or
some kind of interaction between the two. I manage lots of VMware-based
FreeBSD VMs, but these are the only ESXi 5.0 servers and the only
FreeBSD 9.0 VMs. I have never seen anything quite like this before, and
last night, as I mentioned above, it happened for the second time on a
different VM + ESXi server combo, so I no longer think it's a fluke.
I've looked for other reports of this in both VMware and FreeBSD
contexts and am not seeing anything.

What is interesting is that the 2 servers that have shown this issue
perform similar tasks, which are different from the other VMs which have
not shown this issue (yet). This is 2 VMs out of a dozen VMs spread
over two ESXi servers on different coasts. This might be a coincidence
but seems suspicious. These two VMs run these services (whereas the
other VMs don't):

- BIND
- CouchDB
- MySQL
- NFS server
- Dovecot 2.x

I would also say that these two VMs probably are the most active, have
the most RAM and consume the most CPU because of what they do (vs. the
others).

I have disabled NTPd since I am running the Open VM Tools (which I
believe should be keeping the time in sync with the ESXi host, which
itself uses NTP). My only guess is that there was some kind of collision
where NTPd and the Open VM Tools were adjusting the time at the same
time. I'm playing the waiting game now to see what this brings (though,
again, I am running NTPd and the Open VM Tools on all the other VMs,
which have yet to show this issue).

Anyone seen anything like this? Ring any bells?

--

Adam Strohl
A-Team Systems
http://ateamsystems.com/

_______________________________________________
freebsd...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stabl...@freebsd.org"

Adam Strohl

Mar 10, 2012, 8:58:25 AM

to Bjoern A. Zeeb, FreeBSD-Stable ML

On 3/10/2012 17:10, Bjoern A. Zeeb wrote:

> On 10. Mar 2012, at 08:07 , Adam Strohl wrote:
>
>> I've now seen this on two different VMs on two different ESXi servers (Xeon based hosts but different hardware otherwise and at different facilities):
>>
>> Everything runs fine for weeks then (seemingly) suddenly/randomly the clock STOPS.
>

> Apart from the ntp vs. openvm-tools thing, do you have an idea what "for weeks" means in more detail? Can you check based on last/daily mails/.. how many days it was since last reboot to a) see if it's close to a integer wrap-around or b) to give anyone who wants to reproduce this maybe a clue on how long they'll have to wait? For that matter, is it a stock 9.0 or your own kernel? What other modules are loaded?

Uptime was 31 days on the first incident / server (occurred 5 days ago)
Uptime was 4 days on the second incident / server (occurred last night)

One additional unique factor I just thought of: the two problem VMs have
4 cores allocated to them inside ESXi, while the rest have 2 cores.

Kernel config is a copy of GENERIC (amd64) with the following lines
added to the bottom. All the VMs use this same kernel which I compiled
once and then installed via NFS on the rest:

# -- Add Support for nicer console
#
options VESA
options SC_PIXEL_MODE

# -- IPFW support
#
options IPFIREWALL
options IPFIREWALL_VERBOSE
options IPFIREWALL_VERBOSE_LIMIT=10
options IPDIVERT
options IPFIREWALL_FORWARD
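
On the wrap-around question, the two uptimes reported above (31 days and 4 days) can be compared against the overflow periods of some common counter sizes. This is a minimal sketch: the counter widths and tick rates below are assumptions picked for illustration, not values measured on these VMs.

```python
SECONDS_PER_DAY = 86_400


def wrap_days(bits, hz):
    """Days until a free-running counter of `bits` bits overflows at `hz` ticks/s."""
    return (2 ** bits) / hz / SECONDS_PER_DAY


# Hypothetical candidates: 31/32-bit tick counters at two assumed tick rates.
candidates = {
    "31-bit counter at 1000 Hz": wrap_days(31, 1000),
    "32-bit counter at 1000 Hz": wrap_days(32, 1000),
    "31-bit counter at 100 Hz": wrap_days(31, 100),
}

observed_uptimes_days = [31, 4]  # uptimes at the two incidents reported above

for name, days in candidates.items():
    print(f"{name}: wraps after {days:.1f} days")
print(f"observed uptimes: {observed_uptimes_days} days")
```

Under these assumptions none of the periods lands near both uptimes, which is a weak hint against a simple boot-relative overflow. It does not rule out a counter that is not reset when the guest boots.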

Ian Lepore

Mar 11, 2012, 1:04:33 PM

to Adam Strohl, FreeBSD-Stable ML

I've run into the "time standing still" problem, but only on bringing up
FreeBSD on new hardware (usually industrial single-board computers). In
those cases time never advances beyond the time obtained from the RTC
hardware at boot. I've never seen it happen that time runs normally for
a while then stops advancing, but I have almost no experience with
FreeBSD as a VM guest OS.

When I have seen the problem, it's always been due to interrupt
problems, such as the timer tick handler getting hung or the selected
timer hardware not generating interrupts.

It seems unlikely to me that ntpd and the vm tools would be fighting in
a way that caused this symptom. The way ntpd affects timing is to step
the clock (which gets logged), or to numerically steer the kernel's
timekeeping routines. The steering is clamped at 500 ppm; to make the
clock appear to stop it would have to steer at 1e6 ppm. I've always
assumed that VM guest services daemons that handle timekeeping use the
same ntp_adjtime() interface to the kernel timekeeping that ntpd itself
uses, so the same steering limits would apply.

If it happens again, interesting data might be found in the output of:

sysctl kern.timecounter
sysctl kern.eventtimer
vmstat -i
ntpdc -c kerninfo
<anything unusual in dmesg output>

-- Ian
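
To put the steering argument above in numbers: a slew of 500 ppm changes the clock by 500 microseconds for every second of real time, while stopping the clock would need the full 1,000,000 ppm. The sketch below is only arithmetic on the two figures quoted in the message; the function name `drift_seconds` is hypothetical.

```python
def drift_seconds(ppm, elapsed_seconds):
    """Offset accumulated by steering the clock at `ppm` for `elapsed_seconds`."""
    return ppm * 1e-6 * elapsed_seconds


MAX_SLEW_PPM = 500       # steering clamp quoted in the message above
STOP_PPM = 1_000_000     # rate that would cancel every second of real time

# At the clamp, the clock can be pulled by well under a minute per day.
print(f"500 ppm over one day: {drift_seconds(MAX_SLEW_PPM, 86_400):.1f} s")

# Over a 10-second observation window the clamp allows only milliseconds,
# whereas a stopped clock loses the whole 10 seconds.
print(f"500 ppm over 10 s:    {drift_seconds(MAX_SLEW_PPM, 10):.3f} s")
print(f"1e6 ppm over 10 s:    {drift_seconds(STOP_PPM, 10):.1f} s")
```

The gap between the two rates is a factor of 2000, which is why slewing alone is an unlikely explanation for a clock that appears frozen.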

Steve Wills

Mar 18, 2012, 2:38:20 PM

to Adam Strohl, Bjoern A. Zeeb, FreeBSD-Stable ML

On 03/10/12 08:56, Adam Strohl wrote:
> On 3/10/2012 17:10, Bjoern A. Zeeb wrote:
>> On 10. Mar 2012, at 08:07 , Adam Strohl wrote:
>>
>>> I've now seen this on two different VMs on two different ESXi servers
>>> (Xeon based hosts but different hardware otherwise and at different
>>> facilities):
>>>
>>> Everything runs fine for weeks then (seemingly) suddenly/randomly the
>>> clock STOPS.
>>
>> Apart from the ntp vs. openvm-tools thing, do you have an idea what
>> "for weeks" means in more detail? Can you check based on last/daily
>> mails/.. how many days it was since last reboot to a) see if it's
>> close to a integer wrap-around or b) to give anyone who wants to
>> reproduce this maybe a clue on how long they'll have to wait? For
>> that matter, is it a stock 9.0 or your own kernel? What other modules
>> are loaded?
>
> Uptime was 31 days on the first incident / server (occurred 5 days ago)
> Uptime was 4 days on the second incident / server (occurred last night)
>

I've experienced something similar once or twice with ESXi 5.0. The
second time it happened, I found that kern.timecounter.tc.HPET.counter
stopped changing. I was told on IRC that this indicated a "hardware"
problem, which I took to indicate a possible bug in ESXi. I haven't
upgraded to ESXi 5.0 Update 1 yet to see if that changes anything.
Rebooting of course fixed it. It has been a while since this happened
and it hasn't happened again since, so I haven't pursued it. Just another
data point, hope it helps.

Steve

Adam Strohl

Mar 19, 2012, 2:28:04 AM

to Steve Wills, Bjoern A. Zeeb, FreeBSD-Stable ML

On 3/12/2012 0:01, Ian Lepore wrote:
> It seems unlikely to me that ntpd and the vm tools would be fighting in
> a way that caused this symptom. The way ntpd affects timing is to step
> the clock (which gets logged), or to numerically steer the kernel's
> timekeeping routines. The steering is clamped at 500 ppm; to make the
> clock appear to stop it would have to steer at 1e6 ppm. I've always
> assumed that VM guest services daemons that handle timekeeping use the
> same ntp_adjtime() interface to the kernel timekeeping that ntpd itself
> uses, so the same steering limits would apply.

An excellent point.

>
> If it happens again, interesting data might be found in the output of:
>
> sysctl kern.timecounter
> sysctl kern.eventtimer
> vmstat -i
> ntpdc -c kerninfo
> <anything unusual in dmesg output>

Will do. I know there was nothing in dmesg, but I will definitely check
all of this if/when it happens again. I just brought up another ESXi
5.0 host with FreeBSD 9.0 VMs (created from dump/restore from the
existing ones), so hopefully there is an increased chance of me seeing
this and getting to the bottom of it. Or it never happens again :P

On 3/19/2012 1:36, Steve Wills wrote:
> I've experienced something similar once or twice with ESXi 5.0. The
> second time it happened, I found that kern.timecounter.tc.HPET.counter
> stopped changing. I was told on IRC that this indicated a "hardware"
> problem, which I took to indicate a possible bug in ESXi. I haven't
> upgraded to ESXi 5.0 Update 1 yet to see if that changes anything.
> Rebooting of course fixed it, it has been a while since this happened
> and it hasn't happened again since so I haven't pursued it. Just another
> data point, hope it hopes.

Thanks for the info! I didn't realize there was an update out already
for 5.0 (I don't see it on VMware's site).

Mike Tkachuk

Mar 22, 2012, 9:20:54 AM

to freebsd...@freebsd.org

Hello Ian,

I'm also facing the same problem. I just updated to ESXi 5 Update 1 and
will see if it changes anything.
It really looks like an ESXi problem, but I did not experience it
with FreeBSD 8.
Here is the output of the requested commands:

sysctl kern.timecounter
kern.timecounter.tick: 1
kern.timecounter.choice: TSC(-100) i8254(0) ACPI-fast(900) HPET(950) dummy(-1000000)
kern.timecounter.hardware: HPET
kern.timecounter.stepwarnings: 0
kern.timecounter.tc.HPET.mask: 4294967295
kern.timecounter.tc.HPET.counter: 1217640570
kern.timecounter.tc.HPET.frequency: 14318180
kern.timecounter.tc.HPET.quality: 950
kern.timecounter.tc.ACPI-fast.mask: 16777215
kern.timecounter.tc.ACPI-fast.counter: 708780
kern.timecounter.tc.ACPI-fast.frequency: 3579545
kern.timecounter.tc.ACPI-fast.quality: 900
kern.timecounter.tc.i8254.mask: 65535
kern.timecounter.tc.i8254.counter: 14912
kern.timecounter.tc.i8254.frequency: 1193182
kern.timecounter.tc.i8254.quality: 0
kern.timecounter.tc.TSC.mask: 4294967295
kern.timecounter.tc.TSC.counter: 1849391829
kern.timecounter.tc.TSC.frequency: 3411483000
kern.timecounter.tc.TSC.quality: -100
kern.timecounter.smp_tsc: 0
kern.timecounter.invariant_tsc: 1

sysctl kern.eventtimer
kern.eventtimer.choice: LAPIC(600) i8254(100) RTC(0)
kern.eventtimer.et.LAPIC.flags: 7
kern.eventtimer.et.LAPIC.frequency: 33000086
kern.eventtimer.et.LAPIC.quality: 600
kern.eventtimer.et.i8254.flags: 1
kern.eventtimer.et.i8254.frequency: 1193182
kern.eventtimer.et.i8254.quality: 100
kern.eventtimer.et.RTC.flags: 17
kern.eventtimer.et.RTC.frequency: 32768
kern.eventtimer.et.RTC.quality: 0
kern.eventtimer.periodic: 0
kern.eventtimer.timer: LAPIC
kern.eventtimer.idletick: 0
kern.eventtimer.singlemul: 4

vmstat -i
interrupt total rate
irq1: atkbd0 192 0
irq15: ata1 17 0
irq18: em0 3519101 2
cpu0:timer 92428991 63
irq256: mpt0 1887875 1
cpu2:timer 29982105 20
cpu1:timer 46249642 31
cpu3:timer 15983874 10
cpu6:timer 4721414 3
cpu7:timer 4554357 3
cpu4:timer 8397088 5
cpu5:timer 6163947 4
Total 213888603 146

----- about 10 seconds later -----

sysctl kern.timecounter
kern.timecounter.tick: 1
kern.timecounter.choice: TSC(-100) i8254(0) ACPI-fast(900) HPET(950) dummy(-1000000)
kern.timecounter.hardware: HPET
kern.timecounter.stepwarnings: 0
kern.timecounter.tc.HPET.mask: 4294967295
kern.timecounter.tc.HPET.counter: 1217640570
kern.timecounter.tc.HPET.frequency: 14318180
kern.timecounter.tc.HPET.quality: 950
kern.timecounter.tc.ACPI-fast.mask: 16777215
kern.timecounter.tc.ACPI-fast.counter: 802778
kern.timecounter.tc.ACPI-fast.frequency: 3579545
kern.timecounter.tc.ACPI-fast.quality: 900
kern.timecounter.tc.i8254.mask: 65535
kern.timecounter.tc.i8254.counter: 24402
kern.timecounter.tc.i8254.frequency: 1193182
kern.timecounter.tc.i8254.quality: 0
kern.timecounter.tc.TSC.mask: 4294967295
kern.timecounter.tc.TSC.counter: 3853372129
kern.timecounter.tc.TSC.frequency: 3411483000
kern.timecounter.tc.TSC.quality: -100
kern.timecounter.smp_tsc: 0
kern.timecounter.invariant_tsc: 1

sysctl kern.eventtimer
kern.eventtimer.choice: LAPIC(600) i8254(100) RTC(0)
kern.eventtimer.et.LAPIC.flags: 7
kern.eventtimer.et.LAPIC.frequency: 33000086
kern.eventtimer.et.LAPIC.quality: 600
kern.eventtimer.et.i8254.flags: 1
kern.eventtimer.et.i8254.frequency: 1193182
kern.eventtimer.et.i8254.quality: 100
kern.eventtimer.et.RTC.flags: 17
kern.eventtimer.et.RTC.frequency: 32768
kern.eventtimer.et.RTC.quality: 0
kern.eventtimer.periodic: 0
kern.eventtimer.timer: LAPIC
kern.eventtimer.idletick: 0
kern.eventtimer.singlemul: 4

vmstat -i
interrupt total rate
irq1: atkbd0 192 0
irq15: ata1 17 0
irq18: em0 3519133 2
cpu0:timer 92429487 63
irq256: mpt0 1887875 1
cpu2:timer 29983186 20
cpu1:timer 46250719 31
cpu3:timer 15983969 10
cpu6:timer 4721451 3
cpu7:timer 4554394 3
cpu4:timer 8397125 5
cpu5:timer 6164274 4
Total 213891822 146

--
Best regards,
Mike mailto:mi...@tkachuk.name
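
One way to read the two samples above is to diff the counter values between them: a timecounter whose value is identical in both samples is not advancing. The following is a minimal sketch using values copied from the output above; the helper names `parse_counters` and `frozen_counters` are hypothetical.

```python
import re

COUNTER_RE = re.compile(
    r"^kern\.timecounter\.tc\.([^.]+)\.counter:\s*(\d+)\s*$", re.MULTILINE
)


def parse_counters(sysctl_text):
    """Map timecounter name -> counter value from `sysctl kern.timecounter` output."""
    return {name: int(value) for name, value in COUNTER_RE.findall(sysctl_text)}


def frozen_counters(before, after):
    """Names of timecounters whose value did not change between two samples."""
    first, second = parse_counters(before), parse_counters(after)
    return sorted(n for n in first if n in second and first[n] == second[n])


# Counter lines from the first sample and from the one taken about 10 s later.
before = """\
kern.timecounter.hardware: HPET
kern.timecounter.tc.HPET.counter: 1217640570
kern.timecounter.tc.ACPI-fast.counter: 708780
kern.timecounter.tc.i8254.counter: 14912
kern.timecounter.tc.TSC.counter: 1849391829
"""
after = """\
kern.timecounter.hardware: HPET
kern.timecounter.tc.HPET.counter: 1217640570
kern.timecounter.tc.ACPI-fast.counter: 802778
kern.timecounter.tc.i8254.counter: 24402
kern.timecounter.tc.TSC.counter: 3853372129
"""

print(frozen_counters(before, after))
```

Here the HPET counter is the only one that stayed put, and HPET is also the selected `kern.timecounter.hardware`, which is consistent with the system clock standing still while the other counters keep moving.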

Andriy Gapon

Mar 22, 2012, 11:08:55 AM

to Mike Tkachuk, freebsd...@freebsd.org

on 22/03/2012 15:19 Mike Tkachuk said the following:
> kern.eventtimer.periodic: 0

It might make sense to try 1 here.
Also you could attempt to involve mav@ directly - he is an author of the code
and an expert on it.

--
Andriy Gapon

Volodymyr Kostyrko

Mar 22, 2012, 11:35:30 AM

to Andriy Gapon, freebsd...@freebsd.org, Mike Tkachuk

Andriy Gapon wrote:
> on 22/03/2012 15:19 Mike Tkachuk said the following:
>> kern.eventtimer.periodic: 0
>
> It might make sense to try 1 here.
> Also you could attempt to involve mav@ directly - here is an author of the code
> and an expert on it.

Better ask before setting as this doubles hpet0 (with HPET) or
cpu0:timer (with LAPIC) interrupt rate for me.

--
Sphinx of black quartz judge my vow.

Andriy Gapon

Mar 22, 2012, 11:57:52 AM

to Volodymyr Kostyrko, freebsd...@freebsd.org, Mike Tkachuk

on 22/03/2012 17:33 Volodymyr Kostyrko said the following:

> Andriy Gapon wrote:
>> on 22/03/2012 15:19 Mike Tkachuk said the following:
>>> kern.eventtimer.periodic: 0
>>
>> It might make sense to try 1 here.
>> Also you could attempt to involve mav@ directly - here is an author of the code
>> and an expert on it.
>
> Better ask before setting as this doubles hpet0 (with HPET) or cpu0:timer (with
> LAPIC) interrupt rate for me.

Does it make your system unusable?
Are you comparing with pre-eventtimers version of FreeBSD?

--
Andriy Gapon

Volodymyr Kostyrko

Mar 22, 2012, 12:15:09 PM

to Andriy Gapon, freebsd...@freebsd.org, Mike Tkachuk

Andriy Gapon wrote:
> on 22/03/2012 17:33 Volodymyr Kostyrko said the following:
>> Andriy Gapon wrote:
>>> on 22/03/2012 15:19 Mike Tkachuk said the following:
>>>> kern.eventtimer.periodic: 0
>>>
>>> It might make sense to try 1 here.
>>> Also you could attempt to involve mav@ directly - here is an author of the code
>>> and an expert on it.
>>
>> Better ask before setting as this doubles hpet0 (with HPET) or cpu0:timer (with
>> LAPIC) interrupt rate for me.
>
> Does it make your system unusable?
> Are you comparing with pre-eventtimers version of FreeBSD?

In the short term, no. I haven't tested it thoroughly. Results are the same
(double interrupt rate according to `systat 1 -v`) for:
* i386 and amd64 9-STABLE;
* amd64 9.0.

As everything related to timing/freq/acpi can be unpredictable, I wouldn't
recommend this to anyone. I own at least two Intel CPUs failing
somewhere near timing/apic when loading cpufreq and enabling powerd.

--
Sphinx of black quartz judge my vow.

Andriy Gapon

Mar 22, 2012, 12:34:57 PM

to Volodymyr Kostyrko, freebsd...@freebsd.org, Mike Tkachuk

on 22/03/2012 18:13 Volodymyr Kostyrko said the following:

> Andriy Gapon wrote:
>> on 22/03/2012 17:33 Volodymyr Kostyrko said the following:
>>> Andriy Gapon wrote:
>>>> on 22/03/2012 15:19 Mike Tkachuk said the following:
>>>>> kern.eventtimer.periodic: 0
>>>>
>>>> It might make sense to try 1 here.
>>>> Also you could attempt to involve mav@ directly - here is an author of the code
>>>> and an expert on it.
>>>
>>> Better ask before setting as this doubles hpet0 (with HPET) or cpu0:timer (with
>>> LAPIC) interrupt rate for me.
>>
>> Does it make your system unusable?
>> Are you comparing with pre-eventtimers version of FreeBSD?
>
> In short term - no. Haven't tested it thoroughly. Results are the same (double
> interrupt rate according to `systat 1 -v`) for:
> * i386 and amd64 9-STABLE;
> * amd64 9.0.

No comment.

> As everything related to timing/freq/acpi can be unpredictive I wouldn't recommend
> this to anyone. I own at least two Intel CPU's failing somewhere near timing/apic
> when loading cpufreq and enabling powerd.
>

What exactly wouldn't you recommend?
Let's not introduce unrelated topics and vague uncertainties.

Setting kern.eventtimer.periodic to 1 makes the eventtimer subsystem behave less
efficiently but more like the pre-eventtimer code. So this is the #1 suggestion
when people run into some new problems with eventtimers, which is what this
thread is about.

--
Andriy Gapon

Ian Lepore

Mar 22, 2012, 12:36:24 PM

to Volodymyr Kostyrko, Mi...@freebsd.org, freebsd...@freebsd.org, Andriy Gapon, Tkachuk

On Thu, 2012-03-22 at 18:13 +0200, Volodymyr Kostyrko wrote:
> Andriy Gapon wrote:
> > on 22/03/2012 17:33 Volodymyr Kostyrko said the following:
> >> Andriy Gapon wrote:
> >>> on 22/03/2012 15:19 Mike Tkachuk said the following:
> >>>> kern.eventtimer.periodic: 0
> >>>
> >>> It might make sense to try 1 here.
> >>> Also you could attempt to involve mav@ directly - here is an author of the code
> >>> and an expert on it.
> >>
> >> Better ask before setting as this doubles hpet0 (with HPET) or cpu0:timer (with
> >> LAPIC) interrupt rate for me.
> >
> > Does it make your system unusable?
> > Are you comparing with pre-eventtimers version of FreeBSD?
>
> In short term - no. Haven't tested it thoroughly. Results are the same
> (double interrupt rate according to `systat 1 -v`) for:
> * i386 and amd64 9-STABLE;
> * amd64 9.0.
>
> As everything related to timing/freq/acpi can be unpredictive I wouldn't
> recommend this to anyone. I own at least two Intel CPU's failing
> somewhere near timing/apic when loading cpufreq and enabling powerd.
>

I'm not sure I understand that advice. We have someone whose system is
failing (time stops counting) when using the new event timer code. The
recommendation is to set kern.eventtimer.periodic=1, which as I
understand it makes the new code work more like it did before. That
seems to be a reasonable attempt to work around the problem.

If it works, the system becomes 100% more usable than it is now, even if
that comes at the cost of timers interrupting twice as fast as they did
in previous OS releases. It also generates another datapoint that might
somehow help track down why the event timer code has trouble on some
hardware. Enough such datapoints may eventually lead to an "aha -- it
happens on all systems that have the xyz chipset."

-- Ian

Volodymyr Kostyrko

Mar 22, 2012, 12:48:19 PM

to Andriy Gapon, freebsd...@freebsd.org, Mike Tkachuk

Andriy Gapon wrote:

>> As everything related to timing/freq/acpi can be unpredictive I wouldn't recommend
>> this to anyone. I own at least two Intel CPU's failing somewhere near timing/apic
>> when loading cpufreq and enabling powerd.
> What exactly you wouldn't recommend?
> Let's not introduce unrelated topics and vague uncertainties.
>
> Setting kern.eventtimer.periodic to 1 makes eventtimer subsystem to behave less
> efficiently but more similar to the pre-eventtimer code. So this is #1 suggestion
> when people run into some new problems with eventtimers. Which is what this
> thread is about.

I'm sorry, I totally misunderstood the meaning of this tunable. It's
unclear from the man page which value enables or disables periodic code.

--
Sphinx of black quartz judge my vow.

Daniel Braniss

Apr 8, 2012, 5:36:33 AM

to Ian Lepore, Volodymyr Kostyrko, Tkachuk, freebsd...@freebsd.org, Andriy Gapon, Mi...@freebsd.org

Just a me too,
but it was running 8.2-stable!
Since it's a production machine, I had no choice but to reboot it.
Also the BIOS time got stuck, so I had to fix the time manually! ntpd doesn't
like to advance past a certain delta.

cheers,
danny

Adam Strohl

Aug 3, 2012, 3:47:54 AM

to FreeBSD-Stable ML

Just a heads up on the original issue, which is FreeBSD's timer/clock
stopping under ESXi 5.0 and some later versions of VMware Workstation.

I've gotten a few direct messages that this thread ranks high on Google
but people are missing the solution. A few months ago I found this
forum posting (I believe this was linked in this thread already)
http://unix.derkeiler.com/Mailing-Lists/FreeBSD/stable/2012-03/msg00201.html

The long and short of it is that changing the kern.timecounter.hardware
sysctl value to ACPI-fast (or ACPI-safe if you're not running 9.x yet)
has fixed the hanging issue so far for us.

To temporarily enable it under 9.x:
sysctl kern.timecounter.hardware=ACPI-fast

Pre 9.x (which doesn't have the ACPI-fast mode):
sysctl kern.timecounter.hardware=ACPI-safe

To make this persist across reboots and be enabled by default add this
line to your /etc/sysctl.conf

Under 9.x:
kern.timecounter.hardware=ACPI-fast

Pre 9.x:
kern.timecounter.hardware=ACPI-safe

Hope this helps anyone running across this issue.

--
Adam Strohl
http://www.ateamsystems.com/
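
The choice between the two values can be made from the `kern.timecounter.choice` line, which lists the timecounters the kernel found. The sketch below is only an illustration of that selection, assuming the choice string has the `NAME(quality)` format shown earlier in the thread; `parse_choices` and `pick_timecounter` are hypothetical helpers, and the script only prints the line to use rather than changing any setting.

```python
import re

# Preference order taken from the advice above: ACPI-fast first, then ACPI-safe.
PREFERRED = ("ACPI-fast", "ACPI-safe")


def parse_choices(choice_line):
    """Parse 'TSC(-100) i8254(0) ...' into {'TSC': -100, 'i8254': 0, ...}."""
    pairs = re.findall(r"([^\s(]+)\((-?\d+)\)", choice_line)
    return {name: int(quality) for name, quality in pairs}


def pick_timecounter(choice_line):
    """Return the first preferred ACPI timecounter that is available, else None."""
    available = parse_choices(choice_line)
    for name in PREFERRED:
        if name in available:
            return name
    return None


# Value of kern.timecounter.choice as posted earlier in the thread.
choice = "TSC(-100) i8254(0) ACPI-fast(900) HPET(950) dummy(-1000000)"

name = pick_timecounter(choice)
if name is None:
    print("no ACPI timecounter listed; leaving the setting alone")
else:
    # This is the line that would go into /etc/sysctl.conf.
    print(f"kern.timecounter.hardware={name}")
```

Checking the list first avoids writing a timecounter name into /etc/sysctl.conf that the running kernel does not offer.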

Adam Strohl

Aug 3, 2012, 4:01:19 AM

to FreeBSD-Stable ML

Doh, correct URL for the forum post is:
http://forums.freebsd.org/showthread.php?t=31929&page=2

Marcelo Gondim

Aug 3, 2012, 8:14:46 AM

to freebsd...@freebsd.org

Hi all,

I sent a PR [1] but I decided to also send the problem here.
If you try to destroy a geom_virstor that does not exist, this causes a
kernel panic immediately.

Just try:

gvirstor load
gvirstor destroy tatata

# uname -a
FreeBSD zeus.xxxx.xxx.br 9.1-PRERELEASE FreeBSD 9.1-PRERELEASE #27: Mon
Jul 16 01:41:24 BRT 2012
ro...@zeus.xxxx.xxx.br:/usr/obj/usr/src/sys/GONDIM amd64

[1] http://www.freebsd.org/cgi/query-pr.cgi?pr=170199

Best regards,
Gondim

Mark Saad

Aug 3, 2012, 8:31:40 AM

to Adam Strohl, FreeBSD-Stable ML

---

Did you ask VMware to update this KB?

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1006427

They need to have a FreeBSD section.

It's slightly sad VMware couldn't figure this out. They have notes on disabling HPET due to screwy clock issues under Linux; I wonder why they never tried to disable it under FreeBSD?

---
Mark Saad | mark...@longcount.org

Jim Harris

Aug 3, 2012, 12:52:20 PM

to Marcelo Gondim, freebsd...@freebsd.org

On Fri, Aug 3, 2012 at 5:06 AM, Marcelo Gondim <gon...@bsdinfo.com.br> wrote:
> Hi all,
>
> I sent a PR [1] but I decided to also send the problem here.
> If you try to destroy a geom_virstor that does not exist, this causes a
> kernel panic immediately.
>
> Just try:
>
> gvirstor load
> gvirstor destroy tatata
>
> # uname -a
> FreeBSD zeus.xxxx.xxx.br 9.1-PRERELEASE FreeBSD 9.1-PRERELEASE #27: Mon Jul
> 16 01:41:24 BRT 2012 ro...@zeus.xxxx.xxx.br:/usr/obj/usr/src/sys/GONDIM
> amd64
>
> [1] http://www.freebsd.org/cgi/query-pr.cgi?pr=170199
>
> Best regards,
> Gondim
>

Hi Gondim,

Can you test the following patch?

Index: sys/geom/virstor/g_virstor.c
===================================================================
--- sys/geom/virstor/g_virstor.c (revision 238909)
+++ sys/geom/virstor/g_virstor.c (working copy)
@@ -235,6 +235,12 @@
 			return;
 		}
 		sc = virstor_find_geom(cp, name);
+		if (sc == NULL) {
+			gctl_error(req, "Don't know anything about '%s'", name);
+			g_topology_unlock();
+			return;
+		}
+
 		LOG_MSG(LVL_INFO, "Stopping %s by the userland command",
 		    sc->geom->name);
 		update_metadata(sc);

Thanks,

-Jim

Marcelo Gondim

Aug 3, 2012, 1:06:18 PM

to Jim Harris, freebsd...@freebsd.org

Em 03/08/2012 13:49, Jim Harris escreveu:
> ===================================================================
> --- sys/geom/virstor/g_virstor.c (revision 238909)
> +++ sys/geom/virstor/g_virstor.c (working copy)
> @@ -235,6 +235,12 @@
> return;
> }
> sc = virstor_find_geom(cp, name);
> + if (sc == NULL) {
> + gctl_error(req, "Don't know anything about '%s'", name);
> + g_topology_unlock();
> + return;
> + }
> +
> LOG_MSG(LVL_INFO, "Stopping %s by the userland command",
> sc->geom->name);
> update_metadata(sc);

Hi Jim,

When I applied the patch, it gave this error:

# patch < /root/patch.diff
Hmm... Looks like a unified diff to me...
The text leading up to this was:
--------------------------

|--- sys/geom/virstor/g_virstor.c (revision 238909)
|+++ sys/geom/virstor/g_virstor.c (working copy)

--------------------------
Patching file sys/geom/virstor/g_virstor.c using Plan A...
Hunk #1 failed at 235.
1 out of 1 hunks failed--saving rejects to sys/geom/virstor/g_virstor.c.rej
done

# cat sys/geom/virstor/g_virstor.c.rej
***************
*** 235,240 ****

return;
}
sc = virstor_find_geom(cp, name);

LOG_MSG(LVL_INFO, "Stopping %s by the userland command",
sc->geom->name);
update_metadata(sc);

--- 235,246 ----

return;
}
sc = virstor_find_geom(cp, name);
+ if (sc == NULL) {
+ gctl_error(req, "Don't know anything about
'%s'", name);
+ g_topology_unlock();
+ return;
+ }
+
LOG_MSG(LVL_INFO, "Stopping %s by the userland command",
sc->geom->name);
update_metadata(sc);

Marcelo Gondim

Aug 3, 2012, 1:48:37 PM

to Jim Harris, freebsd...@freebsd.org

Em 03/08/2012 14:22, Jim Harris escreveu:

> On Fri, Aug 3, 2012 at 10:04 AM, Marcelo Gondim <gon...@bsdinfo.com.br> wrote:
>> Hi Jim,
>>
>> When I applied the patch gave this error:
>>
>> # patch < /root/patch.diff
>> Hmm... Looks like a unified diff to me...
>> The text leading up to this was:
>> --------------------------
>>
>> |--- sys/geom/virstor/g_virstor.c (revision 238909)
>> |+++ sys/geom/virstor/g_virstor.c (working copy)
>> --------------------------
>> Patching file sys/geom/virstor/g_virstor.c using Plan A...
>> Hunk #1 failed at 235.
>> 1 out of 1 hunks failed--saving rejects to sys/geom/virstor/g_virstor.c.rej
>> done
>>
>>
>>

> Strange. It applies with no issues on my checkout.
>
> Let's try an attachment instead. If this doesn't work, could you
> kindly apply the patch by hand?
>
> Thanks,
>
> -Jim
Hi Jim,

# patch < /root/patch.diff
Hmm... Looks like a unified diff to me...
The text leading up to this was:
--------------------------

|Index: sys/geom/virstor/g_virstor.c
|===================================================================
|--- sys/geom/virstor/g_virstor.c (revision 239019)

|+++ sys/geom/virstor/g_virstor.c (working copy)
--------------------------
Patching file sys/geom/virstor/g_virstor.c using Plan A...

Hunk #1 succeeded at 234.
done

I will now compile and test.
I will return with the result. :)

Marcelo Gondim

Aug 3, 2012, 4:14:24 PM

to Jim Harris, freebsd...@freebsd.org

Em 03/08/2012 14:22, Jim Harris escreveu:

> On Fri, Aug 3, 2012 at 10:04 AM, Marcelo Gondim <gon...@bsdinfo.com.br> wrote:
>> Hi Jim,
>>
>> When I applied the patch gave this error:
>>
>> # patch < /root/patch.diff
>> Hmm... Looks like a unified diff to me...
>> The text leading up to this was:
>> --------------------------
>>
>> |--- sys/geom/virstor/g_virstor.c (revision 238909)
>> |+++ sys/geom/virstor/g_virstor.c (working copy)
>> --------------------------
>> Patching file sys/geom/virstor/g_virstor.c using Plan A...
>> Hunk #1 failed at 235.
>> 1 out of 1 hunks failed--saving rejects to sys/geom/virstor/g_virstor.c.rej
>> done
>>
>>
>>

> Strange. It applies with no issues on my checkout.
>
> Let's try an attachment instead. If this doesn't work, could you
> kindly apply the patch by hand?
>
> Thanks,
>
> -Jim
Hi Jim,

Perfect!!!

# gvirstor destroy tudo
gvirstor: Don't know anything about 'tudo'

Jim Harris

Aug 3, 2012, 4:38:11 PM

to Marcelo Gondim, freebsd...@freebsd.org

On Fri, Aug 3, 2012 at 1:12 PM, Marcelo Gondim <gon...@bsdinfo.com.br> wrote:
>
> Hi Jim,
>

> Perfect!!!
>
> # gvirstor destroy tudo
> gvirstor: Don't know anything about 'tudo'
>
>
>

Patch applied to head as r239021. I have requested approval from re@
to merge to stable/9.

Thank you for confirming the patch on your end.

Regards,

-Jim