I have a problem with the network while running VirtualBox.
As soon as I _run_ a VirtualBox I am not able to copy large files
(e.g. virtual disks or ZFS snapshots) using ssh/scp to another machine.
The ssh crashes with "Write failed: Cannot allocate memory",
thrown by a write(2) in /usr/src/crypto/openssh/roaming_common.c (in
function roaming_write). write(2) returns ENOMEM (an error it should
never return, according to the manpage ;-)
It works again immediately when I stop the VirtualBox, even if the
VirtualBox kernel modules are still loaded.
I also "replaced" the VirtualBox load with lookbusy occupying the 2GB
of memory the VirtualBox usually uses (to emulate the memory
footprint), but scp still works then.
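For reference, the lookbusy run was roughly along these lines
(assuming I recall lookbusy's -m/--mem-util syntax correctly; the size
is only meant to approximate the VM's footprint):

# lookbusy -m 2048MB

The scp to the other machine kept working the whole time it ran.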
I first experienced the problem with VirtualBox 3.2, and the recent
upgrade of VirtualBox to 4.0.8 and of the base system did not help.
> uname -a
FreeBSD DellT410one.vv.fda 8.2-STABLE FreeBSD 8.2-STABLE #1: Thu Jun
30 17:07:18 EST 2011
ro...@DellT410one.vv.fda:/usr/obj/usr/src/sys/GENERIC amd64
I discussed it on the -stable mailing list before, as the link to
VirtualBox wasn't obvious to me at first. Scott Sipe had the same
experience; he started the thread:
http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063172.html
He comes to the same conclusion:
http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063234.html
> This is it -- I'm seeing the exact same thing.
>
> Scp dies reliably with VirtualBox running. Quit VirtualBox and I was able to
> scp about 30 large files with no errors. Once I started VirtualBox an
> in-progress scp died within seconds.
>
> Ditto that the Kernel modules merely being loaded don't seem to make a
> difference, it's VirtualBox actually running.
>
> virtualbox-ose-3.2.12_1
Some technical details related to my machine are here:
http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063221.html
and Scott's here: http://www.cap-press.com/misc/
A similar problem was reported by Mahlon E. Smith in September 2010:
http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058708.html
I became aware of it because the same bce(4) card was involved,
but that may be a red herring. Scott is using em(4).
At the moment it is a real showstopper for running VirtualBox/FreeBSD
in production because I cannot back up the VirtualBox machines. Mahlon
gave up on it and uses Citrix now (but is still keen to have this solved).
Any idea what causes the problem? I am happy to gather information,
apply patches etc. if it helps.
Thanks for any help
Peter
> Hi all,
>
> I have a problem with the network while running VirtualBox.
>
> As soon as I _run_ a VirtualBox I am not able to copy large files (e.g.
> virtual disks or ZFS snapshots) using ssh/scp to another machine.
>
> The ssh crashes with "Write failed: Cannot allocate memory"
> <snip>
> At the moment it is a real showstopper for running VirtualBox/FreeBSD
> production because I cannot backup VirtualBoxes. Mahlon gave up on it and
> uses Citrix by now (but is still keen to have this solved).
>
> Any idea what causes the problem? I am happy to gather information,
> applying patches etc. if it helps.
>
Just a thought, does using ssh from ports make any difference? Do you have
any more info about the threshold of file size for when this problem starts
occurring? Is it always the same? E.g. if VBox has 2 GB mapped out and you
get an error at a certain file size, does reducing the VBox memory footprint
allow a larger file to be sent successfully?
--
Adam Vande More
> On Wed, Jul 13, 2011 at 6:57 PM, Peter Ross
> <Peter...@bogen.in-berlin.de> wrote:
>
>> Hi all,
>>
>> I have a problem with the network while running VirtualBox.
>>
>> As soon as I _run_ a VirtualBox I am not able to copy large files (e.g.
>> virtual disks or ZFS snapshots) using ssh/scp to another machine.
>>
>> The ssh crashes with "Write failed: Cannot allocate memory"
>> <snip>
>> At the moment it is a real showstopper for running VirtualBox/FreeBSD
>> production because I cannot backup VirtualBoxes. Mahlon gave up on it and
>> uses Citrix by now (but is still keen to have this solved).
>>
>> Any idea what causes the problem? I am happy to gather information,
>> applying patches etc. if it helps.
>>
>
> Just a thought, does using ssh from ports make any difference?
I am running named on the same box. Over time I have seen some errors
there as well:
Apr 13 05:17:41 bind named[23534]: internal_send: 192.168.50.145#65176: Cannot allocate memory
Jun 21 23:30:44 bind named[39864]: internal_send: 192.168.50.251#36155: Cannot allocate memory
Jun 24 15:28:00 bind named[39864]: internal_send: 192.168.50.251#28651: Cannot allocate memory
Jun 28 12:57:52 bind named[2462]: internal_send: 192.168.165.154#1201: Cannot allocate memory
Jul 13 19:43:05 bind named[4032]: internal_send: 192.168.167.147#52736: Cannot allocate memory
coming from a sendmsg(2).
My theory there is: my scp sends a lot of data at once, while named
sends a lot of data over time - both increase the likelihood of the
error.
> Do you have
> any more info about the threshold of file size for when this problem starts
> occurring? is it always the same?
No, it varies, usually after a few GB. E.g. the last transfer lasted 11GB,
but I have had failures below 8GB before.
The system itself is quite stable regarding running processes and
memory usage otherwise; here is a description of it:
This machine is running:
- DHCP server (host)
- NTP server (host)
- Nagios monitor (nagios jail)
- DNS server (bind jail)
- MySQL server (mysql jail)
- Apache server with ITWiki (apache jail)
- Admin mail server (adminmail jail)
- Zimbra 7.0 Mail server (zimbra VirtualBox)
The machine has 8GB of RAM, and the footprint of the jails is minimal
(the MySQL server is for the MediaWiki only, which is used by two
people at the moment, and not heavily).
Here is a top(1) output sorted by size:
last pid: 30169; load averages: 0.38, 0.41, 0.41 up 8+19:04:43 11:51:39
159 processes: 1 running, 158 sleeping
CPU: 0.4% user, 0.0% nice, 0.4% system, 0.0% interrupt, 99.2% idle
Mem: 84M Active, 356M Inact, 4516M Wired, 1004K Cache, 33M Buf, 2943M Free
Swap: 8188M Total, 8188M Free
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
92688 root 24 44 0 2078M 1991M IPRT S 8 18.3H 5.86% VBoxHeadle
4768 88 16 51 0 213M 21672K sigwai 8 2:02 0.00% mysqld
57180 www 1 46 0 140M 10344K accept 3 0:00 0.00% httpd
6223 www 1 76 0 139M 2400K accept 14 0:09 0.00% httpd
78674 www 1 44 0 138M 27056K accept 9 0:02 0.00% httpd
78924 www 1 44 0 138M 25928K accept 8 0:02 0.00% httpd
36114 www 1 44 0 138M 25424K accept 2 0:01 0.00% httpd
3997 www 1 44 0 138M 25180K accept 1 0:00 0.00% httpd
57410 www 1 44 0 138M 24476K accept 8 0:01 0.00% httpd
48202 www 1 44 0 138M 18488K accept 10 0:00 0.00% httpd
29695 www 1 44 0 134M 4920K accept 8 0:00 0.00% httpd
> EG if Vbox has 2 GB mapped out and you
> get an error at a certain file size, does reducing the Vbox memory footprint
> allow a larger file to be successfully sent?
Given that the amount of data transferred before the failure varies
randomly, I cannot see how to get reliable numbers out of such an
experiment.
While doing the transfers I monitored the memory usage using top and
vmstat, but there does not seem to be a shortage.
I also tried lookbusy to occupy 2GB when VirtualBox wasn't running. I
even put slightly more pressure on the system than VirtualBox does
(that means the free memory was below the typical numbers seen when
VirtualBox is running) - but the result is the same:
It works as long as I do not start the VirtualBox.
Regards
Peter
> I am running named on the same box. I have overtime some errors there as
> well:
>
> Apr 13 05:17:41 bind named[23534]: internal_send: 192.168.50.145#65176:
> Cannot allocate memory
> Jun 21 23:30:44 bind named[39864]: internal_send: 192.168.50.251#36155:
> Cannot allocate memory
> Jun 24 15:28:00 bind named[39864]: internal_send: 192.168.50.251#28651:
> Cannot allocate memory
> Jun 28 12:57:52 bind named[2462]: internal_send: 192.168.165.154#1201:
> Cannot allocate memory
> Jul 13 19:43:05 bind named[4032]: internal_send: 192.168.167.147#52736:
> Cannot allocate memory
>
> coming from a sendmsg(2).
>
> My theory there is: my scp sends a lot data at the same time while the
> named is sending a lot of data over time - both increasing the likelyhood of
> the error.
That doesn't really answer the question of whether using a different ssh
binary helps, but I'm guessing it won't. You can try different scp options
like the encryption algorithm, compression, -l, and -v to see if any clues
are gained.
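For example, something along these lines (the cipher, limit and paths are
just placeholders to illustrate the options):

# scp -v -C -c arcfour -l 20000 /path/to/bigfile.img user@otherhost:/backup/

-v gives verbose output, -C enables compression, -c selects the cipher and
-l limits the bandwidth in Kbit/s.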
>
>
> Do you have
>> any more info about the threshold of file size for when this problem
>> starts
>> occurring? is it always the same?
>>
>
> No, it varies. Usually after a few GB. E.g. he last one lasted 11GB but I
> had failures below 8GB transfer before.
>
My machine specs are fairly similar to yours, although this is mostly a
desktop system (virtualbox-ose-4.0.10). I am unable to reproduce this error
after several attempts at scp'ing a 20GB /dev/random file around. I assume
this would have been enough to trigger it on your system?
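If you want to repeat the test, something like this should produce a
comparable file (paths and count are arbitrary):

# dd if=/dev/random of=/tmp/testfile.bin bs=1m count=20480
# scp /tmp/testfile.bin user@otherhost:/tmp/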
> EG if Vbox has 2 GB mapped out and you
>
>> get an error at a certain file size, does reducing the Vbox memory
>> footprint
>> allow a larger file to be successfully sent?
>>
>
> Given that the amount of data is randomly just now I cannot imagine how to
> get reliable numbers in this experiment.
>
I suspect this has less to do with actual memory and more to do with some
other buffer-like bottleneck. Does tuning any of the network buffers make
any difference? A couple to try:
net.inet.ip.intr_queue_maxlen
net.link.ifqmaxlen
kern.ipc.nmbclusters
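For example, to inspect the current values and to bump the first one at
runtime (the number is only an arbitrary starting point, not a
recommendation):

# sysctl net.inet.ip.intr_queue_maxlen net.link.ifqmaxlen kern.ipc.nmbclusters
# sysctl net.inet.ip.intr_queue_maxlen=2048

net.link.ifqmaxlen is, as far as I remember, a boot-time tunable, so it
would have to go into /boot/loader.conf instead.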
If possible, does changing from VM bridged -> NAT or vice-versa result in
any behavior change?
> I suspect this has less to do with actual memory and more to do with some
> other buffer-like bottleneck. Does tuning any of the network buffers make
> any difference? A couple to try:
>
> net.inet.ip.intr_queue_maxlen
> net.link.ifqmaxlen
> kern.ipc.nmbclusters
>
> If possible, does changing from VM bridged -> NAT or vice-versa result in
> any behavior change?
>
Also check vmstat -z; net.graph.maxdata may be a candidate as well.
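E.g. something like this shows the netgraph zones (the failures column is
the interesting one) and the current value of the tunable:

# vmstat -z | head -1
# vmstat -z | grep -i netgraph
# sysctl net.graph.maxdata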
> On Wed, Jul 13, 2011 at 10:02 PM, Adam Vande More
> <amvan...@gmail.com> wrote:
>
>> I suspect this has less to do with actual memory and more to do with some
>> other buffer-like bottleneck. Does tuning any of the network buffers make
>> any difference? A couple to try:
>>
>> net.inet.ip.intr_queue_maxlen
>> net.link.ifqmaxlen
>> kern.ipc.nmbclusters
>>
>> If possible, does changing from VM bridged -> NAT or vice-versa result in
>> any behavior change?
>>
>
> Also check vmstat -z, net.graph.maxdata may be a candidate as well.
I tried FTP (to have something completely different) and it fails as well:
(ftp: netout: Cannot allocate memory)
I watched vmstat -z, and every time it fails, I have another failure
reported for "NetGraph data items".
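Watching it was nothing more sophisticated than a loop of roughly this
shape running alongside the transfer:

# while true; do vmstat -z | grep -i 'netgraph data'; sleep 5; done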
Regards
Peter
> I tried FTP (to have something completely different) and it fails as well:
> (ftp: netout: Cannot allocate memory)
>
> I watched vmstat -z, and every time it fails, I have another failure
> reported for "NetGraph data items".
>
> Regards
> Peter
>
>
First, are your kernel and world in sync? If not, you'll want to make sure
they are. Does raising the value of net.graph.maxdata help? Set it in
/boot/loader.conf.
Raising the values of kern.ipc.maxsockbuf, net.graph.maxdgram and
net.graph.recvspace may also help.
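For example (the values below are only illustrative starting points, not
tested recommendations), in /boot/loader.conf:

net.graph.maxdata="4096"

and at runtime:

# sysctl kern.ipc.maxsockbuf=16777216
# sysctl net.graph.maxdgram=4194304
# sysctl net.graph.recvspace=4194304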
--
Adam Vande More
Hi,
I have AMD64/STABLE + VirtualBox [4.0.10|4.1 Beta] on a Dell R710 with
two scenarios:
1.- Sending large files from the host to the guest with scp.
2.- rsync+ssh on the host machine (with VirtualBox) sending large
files to a remote machine.
In fact both scenarios are the same: the host machine sending large
chunks of data. Both fail, as they fail for everyone on the list.
Ok. So far so good. To track down the problem I tested many configuration
changes with STABLE (if you want the details please let me know
and I'll send them to the list). None did the trick. I even tried
communication between the host and the guest with vboxnet; scp failed
with the same "cannot allocate memory" problem.
Now I tried AMD64/CURRENT + VirtualBox 4.0.10 on a laptop. I tested a
scenario like the one described in number 1. It worked just fine. In
fact I tried something like this on the host machine just to be sure:
# for i in {1..10}
do
  cat "large_file.data" | ssh -l root 192.168.56.101 "cat - > /dev/null"
done
Where 192.168.56.101 is the guest machine (I'm using vboxnet). The
large file is an 8GB file, so this gave an 80GB transfer in total. It
worked fine. This scenario would have failed with STABLE.
So I guess it has something to do with the combination STABLE+VirtualBox.
There must be a change in CURRENT that doesn't trigger the problem. I think
it would be appropriate if anyone else could also try with CURRENT.
As an addition, I remember all of this worked with 8.1 and an early
version of the VirtualBox 3.2 series, but I can't say which revision of 8.1
I had when I was using those kinds of scenarios.
> On Mon, Jul 18, 2011 at 12:30 AM, Peter Ross
> <Peter...@bogen.in-berlin.de> wrote:
>
>> I tried FTP (to have something completely different) and it fails as well:
>> (ftp: netout: Cannot allocate memory)
>>
>> I watched vmstat -z, and every time it fails, I have another failure
>> reported for "NetGraph data items".
> Does raising the value of net.graph.maxdata help? Set it in
> /boot/loader.conf.
Indeed it does. I raised it to 65536 and now I can copy large files
and do not see "NetGraph data items" failures in vmstat -z anymore.
I wonder whether this could become a recommendation in the VirtualBox
port? I am not the first one to be bitten by it, so it would make sense
to send a warning.
E.g. Mahlon discarded the whole FreeBSD/VirtualBox setup and went to
Citrix instead. It does not have to be like that ;-)
In a way it makes sense that _starting_ the VirtualBox makes the
difference. It is a busy company mail server with SMTP and HTTP access
and a lot of traffic going through it - and all of that traffic has to
pass through the netgraph items.
Of course, in my setup I have another way of working around the
problem: at the moment VirtualBox is using the same interface as the
host. I have a still unused interface I should use instead to separate
the traffic.
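Roughly like this, assuming the VM is called "zimbra" and the spare
interface is, say, bce1 (both names are only placeholders for my setup):

# VBoxManage modifyvm zimbra --nic1 bridged --bridgeadapter1 bce1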
Regards
Peter
> On 13/07/2011 23:57, Peter Ross wrote:
>> I have a problem with the network while running VirtualBox.
>>
>> As soon as I _run_ a VirtualBox I am not able to copy large files
>> (e.g. virtual disks or ZFS snapshots) using ssh/scp to another
>> machine.
>>
>> The ssh crashes with "Write failed: Cannot allocate memory"
>>
>> thrown by a write(2) in /usr/src/crypto/openssh/roaming_common.c
>> (in function roaming_write). It returns the ENOMEM (an error it
>> should never return, according to the mainpage;-)
>>
>> It is immediately working when I stop the VirtualBox, even if the
>> VirtualBox kernel modules are still loaded.
>> ..
>> I experienced the problem with VirtualBox 3.2 first but the upgrade
>> to VirtualBox 4.0.8 and the base system recently did not help.
>
I experienced the problem first with VirtualBox 3.2 and
FreeBSD-8.2-PRERELEASE; Mahlon did already in September 2010.
The laptop isn't the ultimate test as long as you don't have the same
amount of data going through the netgraph subsystem.
Can you try to set net.graph.maxdata as well (see my other e-mail)?
Does it solve your problem too?
Thanks for your help
Peter
Hi,
My test system is FreeBSD 8.2/AMD64 Stable r222508 on a Dell R710
with 24GB of RAM, a dual Intel PRO/1000 and 4 Broadcom NetXtreme II
BCM5709. The Broadcoms are lagged together, giving the virtual machines
bridged connectivity to the real world. I'm also using a vboxnet
network to provide a dedicated network between the virtual machines and
the host system.
I've been testing with net.graph.maxdata="65536". I've been able to
do large transfers (like 25GB with rsync and scp) from the host to a
remote system. I've also been able to do transfers of about 10GB from
the host system to one guest system, and they went well. Without setting
net.graph.maxdata I usually triggered the problem with transfers of
about 1 or 2GB. As you can see, I saw no problems with either the real
network or the virtual network.
I did not report back earlier because I wanted to be sure. Now I
would say that raising maxdata removed the problem for me.
What I do not get is why it works on CURRENT with
net.graph.maxdata="256". Anyway, I think it would be nice to know why
setting maxdata solves the problem, because it would allow the
VirtualBox team to add a note to pkg-message with the appropriate
explanation, or even to propose a PR against STABLE to fix the issue.
Gustau
> On 19/07/2011 08:40, Peter Ross wrote:
>> Quoting "Gustau Pérez" <gpe...@entel.upc.edu>:
>>
>>> On 13/07/2011 23:57, Peter Ross wrote:
>>>> I have a problem with the network while running VirtualBox.
>>>>
>>>> As soon as I _run_ a VirtualBox I am not able to copy large files
>>>> (e.g. virtual disks or ZFS snapshots) using ssh/scp to another
>>>> machine.
>>>>
>>>> The ssh crashes with "Write failed: Cannot allocate memory"
>>>>
>>>> thrown by a write(2) in /usr/src/crypto/openssh/roaming_common.c
>>>> (in function roaming_write). It returns the ENOMEM (an error it
>>>> should never return, according to the mainpage;-)
>>>>
>>>> It is immediately working when I stop the VirtualBox, even if the
>>>> VirtualBox kernel modules are still loaded.
>>>> ..
>>>> I experienced the problem with VirtualBox 3.2 first but the
>>>> upgrade to VirtualBox 4.0.8 and the base system recently did not
>>>> help.
>>>
>> I experienced the problem first on VirtualBox 3.2 and
>> FreeBSD-8.2-PRERELEASE. Marlon already in September 2010.
>>
>> The laptop isn't the ultimate test as long as you haven't the same
>> data going through the netgraph subsystem.
>>
>> Can you try to set net.graph.maxdata as well (see my other e-mail).
>> Does it solve your problem too?
>>
>> Thanks for your help
>> Peter
>>
>>
>
> Hi,
>
> My test system is a FreeBSD8.2/AMD64 Stable r222508 in a DELL
> R710 with 24GB of RAM with a dual Intel PRO/1000 and 4 Broadcom
> Extreme II BCM5709. The broadcoms are lagged together giving bridged
> connectivity to the virtual machines to the real world. I'm also
> using a vboxnet network to bring a dedicated network between the
> virtual machines and the host system.
>
> I've been testing with net.graph.maxdata="65536". I've been able
> to do large transfers (like 25GB with rsync and scp) from the host
> to a remote system. Also I've been able to do transfers of about
> 10GB from the host system to one guest system and they went well.
> Without setting net.graph.maxdata I usually triggered the problem
> with transfers of about 1 or 2GB. As you can see, I saw no problems
> with the real network nor with the virtual network.
>
> I did not report earlier because I wanted to be sure enough. Now I
> would say that maxdata removed the problem for me.
>
> What I do not get is why it works with CURRENT having
> net.graph.maxdata="256".
Was this tested on the laptop? Are you able to upgrade the Dell to
-CURRENT to confirm? (Or did you do that already?)
> Anyway I think it would be nice to know why setting the maxdata
> solves the problem because it would allow the virtualbox team to
> add a note in pkg-message with the appropriate explanation or even
> proposing a PR to STABLE to fix the issue.
Of course 65536 is a very high value. I just used it as a starting
point for testing, but I am planning to decrease it gradually next week.
Are your VirtualBoxes busy "network-wise"?
Mine is. It is a Zimbra mail server running on Red Hat Enterprise
Linux (the only in-house Linux server I could not convert to native
FreeBSD, but I did not want to give it an extra physical server of its
own). It serves 50 accounts and gets incoming SMTP traffic (spam),
Outlook connections over HTTP, a web interface, internal redirections
to SpamAssassin and amavis, etc.
The buffers are limited by default to avoid general kernel memory
shortages, I guess. There is some advice around related to network
tuning (e.g. http://www.freebsdonline.com/content/view/49/63/).
But I (and others too, as it looks) did not see the reason why it
failed. You could call it pilot error that I did not consider the
network load an issue. In the beginning I saw the scp as an isolated
process and wondered why it failed (the host system isn't otherwise
really busy).
It is probably not really a VirtualBox issue at all. It is just that
VirtualBox is "the box" that hides all that is going on inside it; it
is always good to remember that the load is still there.
I had issues in the past with VMware as well. It was difficult for
people to understand "why it's so slow" when grunty boxes were
"subdivided" by developers into many, many virtual servers. The combined
load was very hard to grasp, and it is difficult to have all the relevant
data available when an intermittent problem occurs. We had VMware
consultants, called in by the management, who walked away after a day
with "the problem does not exist"...
Regards