Issue booting OSv on Xen with static IP / Is tweaking the network config possible?

Vincent Schwarzer

Jan 18, 2016, 11:35:32 AM
to Osv Dev
Hey OSv Developers,


I am running a Xen instance where IP addresses have to be assigned statically.
It seems that I have to pass them using the run.py script.

My current run.py call is the following:

sudo scripts/run.py -nVv -b xenbr0 -p xen -c 1 -m 3G -i build/release.x64/osv.raw --script vif-bridge -e "--ip=eth0,xxx.xx.xx.xxx,xxx.xxx.xxx.xxx --defaultgw=xxx.xx.xx.x --nameserver=xxx.xx.xxx.xx `cat build/release/cmdline`"
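
For anyone assembling that -e string from a script, here is a minimal sketch of how it could be built. It only uses the three options shown in the command above; the helper name, the example addresses (documentation ranges) and the "/hello" application are made up for illustration, and it assumes the second field of --ip is the netmask.

```python
import ipaddress


def build_osv_cmdline(app_cmdline, iface, ip, netmask, gateway, nameserver):
    """Prefix an application command line with OSv static-network options.

    Hypothetical helper: it only mirrors the option syntax used in the
    run.py call above (--ip, --defaultgw, --nameserver).
    """
    # Validate on the host so a typo fails here instead of inside the guest.
    for value in (ip, netmask, gateway, nameserver):
        ipaddress.IPv4Address(value)
    options = [
        "--ip={},{},{}".format(iface, ip, netmask),
        "--defaultgw={}".format(gateway),
        "--nameserver={}".format(nameserver),
    ]
    return " ".join(options + [app_cmdline])


cmdline = build_osv_cmdline(
    "/hello", "eth0", "192.0.2.10", "255.255.255.0", "192.0.2.1", "192.0.2.53"
)
print(cmdline)
```

The resulting string is what would be passed to run.py after -e.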


I have the following questions:
  1. Does anybody have an idea why OSv gets stuck at "10240MB <Virtual Block Device> at device/vbd/768attaching as vblk0"?
    I have a fully functional config script that boots OSv, but without the static IP assignment.
    For details, see the description further down in this email.

  2. When I reference a raw image in the run.py script with the -i flag, I get the following warning:
    WARNING: Image format was not specified for '/tmp/osv.raw' and probing guessed raw. Automatically detecting the format is dangerous for raw images, write operations on block 0 will be restricted.
    Specify the 'raw' format explicitly to remove the restrictions.

    I didn't find any argument in the run.py script to specify the image format explicitly.
    Did I miss or overlook something?

  3. Is there another way to pass arguments to OSv for static IP configuration besides using run.py (like using the extra / bootargs parameters in Xen configs)? I tried passing arguments with both Xen parameters, without any success so far.
    See for reference: http://wiki.xenproject.org/wiki/Xen_3.x_Configuration_File_Options

  4. Is it possible to tweak other system properties, like the number of file descriptors, the number of available ports (e.g. net.ipv4.ip_local_port_range), the TCP timeout (tcp_fin_timeout) and the reuse of TIME_WAIT sockets (tcp_tw_reuse), ...?


Currently OSv gets stuck at the last line of the output below:

Parsing config from /tmp/tmpNe9psQ
OSv v0.24-46-g464f4e0
1 CPUs detected
Firmware vendor: Xen
bsd: initializing - done
VFS: mounting ramfs at /
VFS: mounting devfs at /dev
RAM disk at 0x0xffff800002e97040 (4096K bytes)
net: initializing - done
vga: Add VGA device instance
eth0: ethernet address: xx:xx:xx:xx:xx:xx
Back-end specified ring-pages of 16 limited to front-end limit of 15.
Back-end specified ring-pages of 15 is not a power of 2. Limited to 8.
Back-end specified max_requests of 512 limited to front-end limit of 256.
backend features: feature-sg feature-gso-tcp4
10240MB <Virtual Block Device> at device/vbd/768attaching as vblk0

The configuration it creates and parses looks like this:
builder='hvm'
xen_platform_pci=1
acpi=1
apic=1
boot='c'
vncdisplay=1
memory=3072
vcpus=1
maxcpus=1
name='osv-2647'
disk=['/dev/loop2647,raw,hda,rw']
serial='pty'
paused=0
on_crash='preserve'
vif=['bridge=xenbr0,script=vif-bridge']

I also have a hand-written Xen config that boots an instance successfully (though without static IP assignment).
Here it is for reference:

builder='hvm'
xen_platform_pci=1
acpi=1
apic=1
boot='c'
vncdisplay=1
memory=3072
vcpus=1
maxcpus=1
name='osv-redis'
disk=['/tmp/osv_redis_3.0.5.raw,raw,hda,rw']
serial='pty'
paused=0
console_autoconnect=1
on_crash='preserve'
vif=['script=vif-bridge,bridge=xenbr0']

Thanks for any hints / help!

Best,

Vincent

P.S.: I added the --script argument for Xen to the run.py script and will create a pull request upstream.



Nadav Har'El

Jan 18, 2016, 11:59:33 AM
to Vincent Schwarzer, Glauber Costa, Osv Dev
On Mon, Jan 18, 2016 at 6:35 PM, 'Vincent Schwarzer' via OSv Development <osv...@googlegroups.com> wrote:
> 2. When I reference an raw image in the run.py script with the -i flag I get following warning:
> WARNING: Image format was not specified for '/tmp/osv.raw' and probing guessed raw. Automatically detecting the format is dangerous for raw images, write operations on block 0 will be restricted.
> Specify the 'raw' format explicitly to remove the restrictions.

This is a new Qemu feature - it doesn't like to guess that the image file is a "raw image" because this could theoretically lead to some security problems (search the Web for more info...), so they added this complaint.

I think you can safely ignore this warning. It won't appear if you use a qcow image instead of a raw image.
 
> I didn't found any argument in the run.py script to specify the image format explicitly.
> Did I miss / overlooked something?

We never added such an option. We could, I guess.
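
For reference, QEMU itself accepts an explicit format on its -drive option (format=raw), which is what the warning asks for. A rough sketch of how a launcher script could build that argument follows; the helper name is hypothetical and this is not how run.py currently works.

```python
def qemu_drive_arg(image_path, image_format=None):
    """Return a QEMU -drive argument pair, optionally with an explicit format.

    Hypothetical helper, not part of run.py. Passing image_format="raw"
    tells QEMU not to probe the image, which removes the warning.
    """
    # QEMU separates -drive suboptions with commas, so a literal comma
    # inside the file name has to be doubled.
    parts = ["file={}".format(image_path.replace(",", ",,"))]
    if image_format is not None:
        parts.append("format={}".format(image_format))
    return ["-drive", ",".join(parts)]


drive = qemu_drive_arg("/tmp/osv.raw", "raw")
print(" ".join(drive))
```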


> 3. Is there another way to pass arguments to OSv for static IP configuration beside using run.py (like using the extra / bootargs parameter in Xen Configs)? I tried passing arguments with both XEN parameter without any success so far.
> See for reference: http://wiki.xenproject.org/wiki/Xen_3.x_Configuration_File_Options

The "--ip" et al. are parameters passed to the OSv kernel. run.py writes the string you asked for into a special place in the image that holds the command line. Once set, the command line is stored in the image itself, so you can run the image again by whatever method you choose and it will run the same thing with the same parameters.
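
To illustrate the idea of a command line stored inside the image, here is a small sketch that writes a NUL-terminated string at a fixed offset of a file and reads it back. The offset value and both function names are assumptions made for this example; the real location and encoding are defined by OSv's own scripts and should be checked there.

```python
import os
import tempfile

# ASSUMPTION for illustration only; check OSv's scripts for the real offset.
CMDLINE_OFFSET = 512


def set_cmdline(image_path, cmdline, offset=CMDLINE_OFFSET):
    """Overwrite the stored command line with a NUL-terminated string."""
    data = cmdline.encode("ascii") + b"\0"
    with open(image_path, "r+b") as image:
        image.seek(offset)
        image.write(data)


def get_cmdline(image_path, offset=CMDLINE_OFFSET, max_len=4096):
    """Read the stored command line back, up to the first NUL byte."""
    with open(image_path, "rb") as image:
        image.seek(offset)
        raw = image.read(max_len)
    return raw.split(b"\0", 1)[0].decode("ascii")


# Demonstrate on a scratch file rather than on a real image.
with tempfile.NamedTemporaryFile(delete=False) as scratch:
    scratch.write(b"\0" * 8192)
    scratch_path = scratch.name

set_cmdline(scratch_path, "--ip=eth0,192.0.2.10,255.255.255.0 /hello")
stored = get_cmdline(scratch_path)
os.remove(scratch_path)
print(stored)
```

Because the string lives in the image, any later boot of that file sees the same parameters regardless of how the VM is started.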
 

> 4. Is it possible to tweak other system properties like # of file descriptors, # of Available ports (e.g. net.ipv4.ip_local_port_range), TCP Timeout (tcp_fin_timeout) and reuse of TIME_WAIT sockets (tcp_tw_reuse),... ?

Good question. I think we do have similar parameters, but don't remember how to set them. Search the source code :-)
They are probably named more similarly to what exists on BSD, not Linux, because our networking code is based on FreeBSD's.

Hmm, why do you even suspect the static IP part? For me, the more suspicious part is disk=['/dev/loop2647,raw,hda,rw']. What is "/dev/loop2647"? In the example below which worked for you, you had a real file there - /tmp/osv_redis_3.0.5.raw. Maybe OSv is hanging because reading from the "/dev/loop2647" hangs on the host?

If you could attach gdb to OSv (as explained in https://github.com/cloudius-systems/osv/wiki/Debugging-OSv) you could try to figure out what is stuck.
 



Glauber Costa

Jan 18, 2016, 2:20:18 PM
to Nadav Har'El, Vincent Schwarzer, Osv Dev
I don't recall seeing this message before - it must be new in QEMU, but if it says that writes will be restricted - I doubt you can ignore it. This is very likely the cause for the guest getting stuck. It is consistent with what I used to get for devices that would go bad for whatever reason - not being writable certainly qualifies.

Nadav Har'El

Jan 18, 2016, 4:57:41 PM
to Glauber Costa, Vincent Schwarzer, Osv Dev
On Mon, Jan 18, 2016 at 9:20 PM, Glauber Costa <gla...@scylladb.com> wrote:
>> When I reference an raw image in the run.py script with the -i flag I get
>> following warning:
>>>
>>> WARNING: Image format was not specified for '/tmp/osv.raw' and probing
>>> guessed raw. Automatically detecting the format is dangerous for raw images,
>>> write operations on block 0 will be restricted.
>>> Specify the 'raw' format explicitly to remove the restrictions.
>
>
> This is a new Qemu feature - it doesn't like to guess that the image file is
> a "raw image" because this could theoretically lead to some security
> problems (search the Web for more info...), so they added this complaint.
>
> I think you can safely ignore this warning. It won't appear if you use a
> qcow image instead of a raw image.

> I don't recall seeing this message before - it must be new in QEMU,

It is indeed a new message (and checks).

Although this is very off-topic for this list, it seems there's popular interest, so I'll try to explain why they did it. Sorry if I botch up the explanation :-)

The change fixes a very real security hole that existed for cloud providers using qemu (but not very relevant for people who run VMs on their own machine for their personal use).

The hole was this: Imagine that you are a cloud provider who uses raw images (not qcow2 or anything else, but raw images). Because qemu used to recognize their format correctly, you never bothered to explicitly state that these are raw images. But here comes the problem: Inside the raw image, the VM has the ability to write every byte in the image. What if the rogue VM (again, imagine that you're a cloud provider and anybody can put a VM on your machine) writes to the first bytes of the raw image the header of a qcow2 file?
On the next boot of this VM, the image will look like a qcow2 image to QEMU, and it will process it like one.
Where's the harm in that, you might ask? Well, the harm is that among the features of qcow2 is the ability to include a "base image" - the qcow2 can specify the path name of another qcow2 image, and the VM will miraculously be able to read that image too! With some path guessing or inside information, the attacker can figure out the paths of other guests' images, and get access to them!

There are various ways to prevent this attack (including my favorite, using different Unix accounts for different tenants - see http://www.computer.org/csdl/proceedings/msst/2013/0217/00/06558424.pdf for a paper I once co-authored ;-)). But the simplest way of all is to ask the cloud provider running QEMU to specify explicitly when the image is "raw" - when this is done on every boot, the image is treated as "raw" even if the header in the first block is changed.

And if the QEMU runner did *not* specify raw, and the file is detected as raw, what QEMU does is to prevent the VM from changing the first block of the image. This means that the VM will not be able to forge a qcow2-like header in the first block, and the next boot is guaranteed to continue to see a raw image and not any magic qcow2 stuff.
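
A toy model of the probing problem described above, as a sketch only: it checks the first bytes of a block for the qcow2 magic ("QFI" followed by 0xfb) and reads the backing-file pointer from the header. The layout used (magic, version, 64-bit backing-file offset, 32-bit backing-file length, all big-endian) follows the qcow2 header; everything else, including the function names and the path, is invented for the example, and the forged block is not a complete or bootable qcow2 image.

```python
import struct

QCOW_MAGIC = b"QFI\xfb"
# Big-endian: magic, version, backing-file offset, backing-file name length.
HEADER_PREFIX = ">4sIQI"


def probe_format(first_block):
    """Guess the format from the first bytes, like an automatic prober would.

    Simplified: only the magic is checked, not the version field.
    """
    return "qcow2" if first_block[:4] == QCOW_MAGIC else "raw"


def forged_first_block(backing_path, version=2):
    """Build a first block that a guest with raw write access could plant."""
    backing = backing_path.encode("utf-8")
    backing_offset = 512  # arbitrary position inside the block, for illustration
    header = struct.pack(HEADER_PREFIX, QCOW_MAGIC, version, backing_offset, len(backing))
    block = bytearray(1024)
    block[:len(header)] = header
    block[backing_offset:backing_offset + len(backing)] = backing
    return bytes(block)


def backing_file(first_block):
    """Return the backing-file path named in the header, or None."""
    magic, _version, offset, size = struct.unpack(HEADER_PREFIX, first_block[:20])
    if magic != QCOW_MAGIC or offset == 0:
        return None
    return first_block[offset:offset + size].decode("utf-8")


honest = bytes(1024)  # stand-in for the first block of an ordinary raw image
forged = forged_first_block("/var/lib/images/other-tenant.img")  # hypothetical path
print(probe_format(honest), probe_format(forged), backing_file(forged))
```

Blocking guest writes to the first block when the format was only guessed means the magic can never be planted this way, which is exactly the restriction the warning announces.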
 
> but if it says that writes will be restricted - I doubt you can ignore it. This is very likely the cause for the guest getting stuck.

Only writes to the first block are blocked, for reasons I explained above.
Does a regular OSv run need to write to the first block?

Anyway, the easiest way to confirm or discredit your theory is to convert the image (with qemu-img convert, or our scripts/convert) to qcow2, and run it. The qcow2 image will not have this warning (or the write protection), and if it hangs in the same way, this write protection is not the problem.
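
A minimal sketch of that conversion step, using qemu-img convert with explicit source and destination formats (-f and -O). The helper names and the output path are made up for the example; the input path is the one from the run.py call earlier in the thread.

```python
import subprocess


def convert_command(src, dst, src_format="raw", dst_format="qcow2"):
    """Build the qemu-img command line for converting an image."""
    return ["qemu-img", "convert", "-f", src_format, "-O", dst_format, src, dst]


def convert_image(src, dst, src_format="raw", dst_format="qcow2"):
    """Run the conversion; raises CalledProcessError if qemu-img fails."""
    subprocess.check_call(convert_command(src, dst, src_format, dst_format))


# Only the command is built here; call convert_image() to actually run it.
command = convert_command("build/release.x64/osv.raw", "/tmp/osv.qcow2")
print(" ".join(command))
```

Naming the source format with -f also avoids format probing on the input side.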
 

Vincent Schwarzer

Jan 19, 2016, 5:34:38 AM
to Nadav Har'El, Glauber Costa, Osv Dev
Hey,
 

> Hmm, why do you even suspect the static IP part? For me, the more suspicious part is disk=['/dev/loop2647,raw,hda,rw']. What is "/dev/loop2647"? In the example below which worked for you, you had a real file there - /tmp/osv_redis_3.0.5.raw. Maybe OSv is hanging because reading from the "/dev/loop2647" hangs on the host?
>
> If you could attach gdb to OSv (as explained in https://github.com/cloudius-systems/osv/wiki/Debugging-OSv) you could try to figure out what is stuck.

I had made an error by passing two nameservers instead of one. The /dev/loop<xxxx> device is intentional, judging by the run.py script:


#create a loop device backed by image file
subprocess.call(["losetup", "/dev/loop%s" % os.getpid(), options.image_file])

What's the advantage of using this indirection instead of taking the absolute path to the image? I think I located the error in another part of the
run.py workflow. To be precise, it gets stuck because of the imgedit.py script.
 
I got OSv running with a static IP address by first setting the args for usr.img and then converting it to the raw format.
I'm still not sure why it won't boot when the file is already a raw image and the parameters are set after the conversion.

I also noticed that I can only specify one nameserver. In our actual network setup we have multiple nameservers, so it would come in handy if I could pass more than one nameserver as an argument to OSv.
 
> Good question. I think we do have similar parameters, but don't remember how to set them. Search the source code :-)
> They are probably named more similarly to what exists on BSD, not Linux, because our networking code is based on FreeBSD's.

I had a look and found the files in question. In case someone else stumbles upon this topic, here are the files:

Timeout and Timing stuff:
Port Range:

I will write a short OSv wiki article about it this week. :)


> Anyway, the easiest way to confirm or discredit your theory is to convert the image (with qemu-img convert, or our scripts/convert) to qcow2, and run it. The qcow2 image will not have this warning (or the write protection), and if it hangs in the same way, this write protection is not the problem.

I tried it with qcow2 as well (convert img --> qcow2-old); that image won't even boot and already gets stuck at the parsing step. Maybe another issue is looming here?

- Vincent

Nadav Har'El

Jan 20, 2016, 6:36:20 AM
to Vincent Schwarzer, Glauber Costa, Osv Dev
On Tue, Jan 19, 2016 at 12:34 PM, Vincent Schwarzer <vincent....@yahoo.de> wrote:


>> Anyway, the easiest way to confirm or discredit your theory is to convert the image (with qemu-img convert, or our scripts/convert) to qcow2, and run it. The qcow2 image will not have this warning (or the write protection), and if it hangs in the same way, this write protection is not the problem.
>
> I tried it with qcow2  (convert img --> qcow2-old) as well which wont even boot and get stuck already at the parsing part. Maybe another issue here looming?

Things are getting stranger and stranger... In theory, it doesn't matter which format the image has - if it's qcow2 or raw - it should run the same. But it seems you're getting a different error for qcow2 images. That's strange.