Increasing 2GB limit for OSV_KERNEL_VM_SHIFT


Darren Lim

Dec 7, 2023, 3:52:53 AM
to OSv Development
Hello,

Per the patch here (https://github.com/cloudius-systems/osv/commit/2a1795db8a22b0b963a64d068f5d8acc93e5785d), I was hoping to get help with increasing the kernel limit from 2GB to a larger size.

For context, I am trying to load a large project (~3GB) into the unikernel, along with fairly large language runtimes (Java, Python), by creating a custom image for the build script. The build currently fails in mmu.cc with:

Assertion failed: virt >= phys_mem (core/mmu.cc: virt_to_phys: 184)

Thank you!

Waldek Kozaczuk

Dec 7, 2023, 9:56:34 AM
to Darren Lim, OSv Development
Hi,

The 2GB limit and the commit you are referring to should only limit the size of position-dependent executables (such executables typically need to be loaded at the address range where the OSv kernel used to reside before this commit).

Are your executables larger than 2GB? Can you run `readelf -W -l` against java and node, like in this example:

readelf -W -l /usr/lib/jvm/java-8-openjdk-amd64/bin/java

Elf file type is DYN (Position-Independent Executable file)
Entry point 0x10b0
There are 13 program headers, starting at offset 64

Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000000040 0x0000000000000040 0x0002d8 0x0002d8 R   0x8
  INTERP         0x000318 0x0000000000000318 0x0000000000000318 0x00001c 0x00001c R   0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x000750 0x000750 R   0x1000
  LOAD           0x001000 0x0000000000001000 0x0000000000001000 0x0001a9 0x0001a9 R E 0x1000
  LOAD           0x002000 0x0000000000002000 0x0000000000002000 0x00011c 0x00011c R   0x1000
  LOAD           0x002d48 0x0000000000003d48 0x0000000000003d48 0x0002c8 0x0002d0 RW  0x1000
  DYNAMIC        0x002d58 0x0000000000003d58 0x0000000000003d58 0x000260 0x000260 RW  0x8
  NOTE           0x000338 0x0000000000000338 0x0000000000000338 0x000030 0x000030 R   0x8
  NOTE           0x000368 0x0000000000000368 0x0000000000000368 0x000044 0x000044 R   0x4
  GNU_PROPERTY   0x000338 0x0000000000000338 0x0000000000000338 0x000030 0x000030 R   0x8
  GNU_EH_FRAME   0x002038 0x0000000000002038 0x0000000000002038 0x000034 0x000034 R   0x4
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x10
  GNU_RELRO      0x002d48 0x0000000000003d48 0x0000000000003d48 0x0002b8 0x0002b8 R   0x1
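
As a side note, the EXEC-vs-DYN distinction that readelf prints is just the e_type field at offset 16 of the ELF header, so it can also be checked without readelf. A minimal Python sketch of that check (not OSv-specific, just the standard ELF header layout):

```python
import struct
import sys

# ELF e_type values relevant here (from the ELF specification)
ET_EXEC = 2  # position-dependent executable
ET_DYN = 3   # shared object / position-independent executable (PIE)

def elf_type(path):
    """Return the e_type of an ELF file, or None if the file is not ELF."""
    with open(path, 'rb') as f:
        header = f.read(18)  # e_ident (16 bytes) + e_type (2 bytes)
    if len(header) < 18 or header[:4] != b'\x7fELF':
        return None
    # Byte 5 of e_ident gives the data encoding: 1 = little-endian, 2 = big-endian
    endian = '<' if header[5] == 1 else '>'
    return struct.unpack_from(endian + 'H', header, 16)[0]

if __name__ == '__main__':
    path = sys.argv[1] if len(sys.argv) > 1 else sys.executable
    names = {ET_EXEC: 'EXEC (position-dependent)', ET_DYN: 'DYN (PIE/shared)'}
    print(path, names.get(elf_type(path), 'not a recognized ELF type'))
```

For a position-dependent executable it reports EXEC, which is the case the 2GB limit concerns.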

Can you give us more details about your use case and the error you are getting with the existing limit of 2GB? (I am assuming there is a reason you want to increase it.)

Regards,
Waldek


Darren L

Dec 7, 2023, 6:43:00 PM
to OSv Development
Hi Waldek,

Thanks for the quick response. For more details: I am trying to run a research prototype that requires both Python 3 and Java 8 to run correctly, and I placed the prototype's executable files (which include large amounts of static data required to run the program) as an app in the apps directory to be linked into the image. The individual files are not huge (up to a few hundred MB each), but there are a lot of files that need to be included to run the prototype (totalling 2-3 GB). I wasn't sure if this was the best approach. It seems that I could also add them to the .img file after the fact, since I only need the Python and Java images to run the program correctly?

The error occurs when I run the following command: `./scripts/build fs_size_mb=8192 image=python-from-host,java8,prototype`, where `prototype` contains my large executable files. The error appears during the build process:

```
Assertion failed: virt >= phys_mem (core/mmu.cc: virt_to_phys: 184)

[backtrace]
0x0000000040244a7c <__assert_fail+28>
0x00000000402b85fc <mmu::virt_to_phys(void*)+92>
0x00000000402ef3e2 <void mmu::virt_to_phys<virtio::vring::add_sg(void*, unsigned int, virtio::vring_desc::flags)::{lambda(unsigned long, unsigned long)#1}>(void*, unsigned long, virtio::vring::add_sg(void*, unsigned int, virtio::vring_desc::flags)::{lambda(unsigned long, unsigned long)#1})+34>
0x00000000402ef1e6 <virtio::blk::make_request(bio*)+470>
0x000010000006b1e5 <???+438757>
0x000010000009620a <???+614922>
0x000010000006e476 <???+451702>
0x000010000009620a <???+614922>
0x0000000040260e1e <???+1076235806>
0x0000000040260ef2 <taskqueue_thread_loop+82>
0x000000004037db2d <thread_main_c+45>
0x000000004030b361 <???+1076933473>

```

I'm using the most recent OSv pulled from GitHub. I'm not using node, but Python. To the best of my understanding, this is the `readelf` output for Java and Python; I ran it against the files on my own machine that were then transferred into OSv. I wasn't sure how to use `readelf` within the unikernel itself, so if this is not sufficient, I might need guidance on running it within OSv.

openjdk-8-zulu-full's java binary:

```
Elf file type is EXEC (Executable file)
Entry point 0x400570
There are 9 program headers, starting at offset 64

Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000400040 0x0000000000400040 0x0001f8 0x0001f8 R E 0x8
  INTERP         0x000238 0x0000000000400238 0x0000000000400238 0x00001c 0x00001c R   0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x000000 0x0000000000400000 0x0000000000400000 0x0008c4 0x0008c4 R E 0x200000
  LOAD           0x000d98 0x0000000000600d98 0x0000000000600d98 0x00027c 0x000290 RW  0x200000
  DYNAMIC        0x000dd0 0x0000000000600dd0 0x0000000000600dd0 0x000210 0x000210 RW  0x8
  NOTE           0x000254 0x0000000000400254 0x0000000000400254 0x000044 0x000044 R   0x4
  GNU_EH_FRAME   0x000818 0x0000000000400818 0x0000000000400818 0x000024 0x000024 R   0x4
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x8
  GNU_RELRO      0x000d98 0x0000000000600d98 0x0000000000600d98 0x000268 0x000268 R   0x1

 Section to Segment mapping:
  Segment Sections...
   00    
   01     .interp
   02     .interp .note.ABI-tag .note.gnu.build-id .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_d .gnu.version_r .rela.dyn .rela.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame
   03     .ctors .dtors .jcr .data.rel.ro .dynamic .got .got.plt .data .bss
   04     .dynamic
   05     .note.ABI-tag .note.gnu.build-id
   06     .eh_frame_hdr
   07    
   08     .ctors .dtors .jcr .data.rel.ro .dynamic .got

```

python3.10.12:

```
Elf file type is DYN (Position-Independent Executable file)
Entry point 0x22cb80
There are 13 program headers, starting at offset 64

Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000000040 0x0000000000000040 0x0002d8 0x0002d8 R   0x8
  INTERP         0x000318 0x0000000000000318 0x0000000000000318 0x00001c 0x00001c R   0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x06c228 0x06c228 R   0x1000
  LOAD           0x06d000 0x000000000006d000 0x000000000006d000 0x2b0dad 0x2b0dad R E 0x1000
  LOAD           0x31e000 0x000000000031e000 0x000000000031e000 0x23ee58 0x23ee58 R   0x1000
  LOAD           0x55d810 0x000000000055e810 0x000000000055e810 0x045528 0x08b6c8 RW  0x1000
  DYNAMIC        0x562bc8 0x0000000000563bc8 0x0000000000563bc8 0x000220 0x000220 RW  0x8
  NOTE           0x000338 0x0000000000000338 0x0000000000000338 0x000030 0x000030 R   0x8
  NOTE           0x000368 0x0000000000000368 0x0000000000000368 0x000044 0x000044 R   0x4
  GNU_PROPERTY   0x000338 0x0000000000000338 0x0000000000000338 0x000030 0x000030 R   0x8
  GNU_EH_FRAME   0x4e72a4 0x00000000004e72a4 0x00000000004e72a4 0x012e74 0x012e74 R   0x4
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x10
  GNU_RELRO      0x55d810 0x000000000055e810 0x000000000055e810 0x0067f0 0x0067f0 R   0x1

 Section to Segment mapping:
  Segment Sections...
   00    
   01     .interp
   02     .interp .note.gnu.property .note.gnu.build-id .note.ABI-tag .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt
   03     .init .plt .plt.got .plt.sec .text .fini
   04     .rodata .stapsdt.base .eh_frame_hdr .eh_frame
   05     .init_array .fini_array .data.rel.ro .dynamic .got .data .PyRuntime .probes .bss
   06     .dynamic
   07     .note.gnu.property
   08     .note.gnu.build-id .note.ABI-tag
   09     .note.gnu.property
   10     .eh_frame_hdr
   11    
   12     .init_array .fini_array .data.rel.ro .dynamic .got 

```

Let me know if there is anything else you would like to see, or if you would like more clarification on anything mentioned above. Thank you for the help!

Sincerely,
Darren

Waldek Kozaczuk

Dec 11, 2023, 3:03:01 PM
to OSv Development
Hi,


I do not think this error has anything to do with the ELF layout or the kernel shift. This issue is different: it happens when OSv runs during the build process (the so-called ZFS builder) to create the ZFS disk and upload all the files.
I am still very interested in replicating and fixing it.

My wild guess is that it may be caused by the bug somebody else discovered and fixed in this pull request - https://github.com/cloudius-systems/osv/pull/1284/files. Can you update just the include/osv/buf.h file to see if your problem goes away? Otherwise, I would have to be able to replicate it somehow.

Also, can you try to build a ROFS image (add fs=rofs to your build command) and run it?
 
There is also an option to use Virtio-FS (see https://github.com/cloudius-systems/osv/wiki/virtio-fs).
By the way, you can use a stock JDK from your Linux host (see modules/openjdk9_1x-from-host -> image=...,openjdk9_1x-from-host), and the same for Python (apps/python-from-host -> image=...,python-from-host).

Let me know if you have any issues.

Darren L

Dec 13, 2023, 10:57:13 PM
to OSv Development
Hi Waldek,

I am able to make the issue appear again by creating a big file (e.g. `dd if=/dev/urandom of=1GB.bin bs=64M count=16 iflag=fullblock`), creating a new app within the apps folder called big_file, setting its usr.manifest to include `/big_file/**: ${MODULE_DIR}/**`, and running:

./scripts/build fs_size_mb=8192 image=python-from-host,java8,big_file
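
Laid out as a module, that reproduction looks roughly like this (illustrative layout; the manifest line follows the usr.manifest glob convention):

```
apps/big_file/
├── 1GB.bin
└── usr.manifest   # contains the line: /big_file/**: ${MODULE_DIR}/**
```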

I updated the buf.h file, and the issue remains.

When I use fs=rofs instead, the build succeeds, but the issue returns when I run the run.py script. For example, the last few lines of the build state:

First block: 5242316, blocks count: 5127
Directory entries count 43078
Symlinks count 10
Inodes count 43079

But when I run

./scripts/run.py -e "/python3.10"

I get the following message:

OSv v0.57.0-86-g873cb55a

Assertion failed: virt >= phys_mem (core/mmu.cc: virt_to_phys: 184)

[backtrace]
0x0000000040244a7c <__assert_fail+28>
0x00000000402b85fc <mmu::virt_to_phys(void*)+92>
0x00000000402ef3e2 <void mmu::virt_to_phys<virtio::vring::add_sg(void*, unsigned int, virtio::vring_desc::flags)::{lambda(unsigned long, unsigned long)#1}>(void*, unsigned long, virtio::vring::add_sg(void*, unsigned int, virtio::vring_desc::flags)::{lambda(unsigned long, unsigned long)#1})+34>
0x00000000402ef148 <virtio::blk::make_request(bio*)+312>
0x00000000403bfd75 <multiplex_strategy+197>
0x00000000403cacbb <rofs_read_blocks(device*, unsigned long, unsigned long, void*)+107>
0x00000000403c91fd <???+1077711357>
0x00000000403c1476 <sys_mount+694>
0x00000000403beadb <mount_rofs_rootfs+59>
0x0000000040239491 <do_main_thread(void*)+7809>
0x00000000403e4e69 <???+1077825129>
0x000000004037db2d <thread_main_c+45>
0x000000004030b361 <???+1076933473>


When I use Virtio-FS, the error does not appear, but I seem to run into a different problem. When trying to run Python in the unikernel, I get:

sudo PATH=/usr/lib/qemu:$PATH ./scripts/run.py --virtio-fs-tag=myfs --virtio-fs-dir=$(pwd)/build/export -e "/python3.10"
OSv v0.57.0-86-g873cb55a
eth0: 192.168.122.15
Booted up in 109.01 ms
Cmdline: /python3.10
Fatal Python error: _Py_HashRandomization_Init: failed to get random numbers to initialize Python
Python runtime state: preinitialized


Hope this is enough to replicate the issue. Thank you!

Waldek Kozaczuk

Dec 19, 2023, 12:59:51 AM
to OSv Development
Hi,

Sorry for the late reply.

I did manage to replicate your issue. It turns out that when building such large images, one needs more memory to run the so-called ZFS builder. After increasing the memory from 512M to 1G, I was able to build the image successfully.

diff --git a/scripts/upload_manifest.py b/scripts/upload_manifest.py
index a3796f95..65e91a9c 100755
--- a/scripts/upload_manifest.py
+++ b/scripts/upload_manifest.py
@@ -164,7 +164,7 @@ def main():
         console = '--console=serial'
         zfs_builder_name = 'zfs_builder-stripped.elf'
 
-    osv = subprocess.Popen('cd ../..; scripts/run.py -k --kernel-path build/release/%s --arch=%s --vnc none -m 512 -c1 -i "%s" --block-device-cache unsafe -s -e "%s --norandom --nomount --noinit --preload-zfs-library /tools/mkfs.so; /tools/cpiod.so --prefix /zfs/zfs/; /zfs.so set compression=off osv" --forward tcp:127.0.0.1:%s-:10000' % (zfs_builder_name,arch,image_path,console,upload_port), shell=True, stdout=subprocess.PIPE)
+    osv = subprocess.Popen('cd ../..; scripts/run.py -k --kernel-path build/release/%s --arch=%s --vnc none -m 1G -c1 -i "%s" --block-device-cache unsafe -s -e "%s --norandom --nomount --noinit --preload-zfs-library /tools/mkfs.so; /tools/cpiod.so --prefix /zfs/zfs/; /zfs.so set compression=off osv" --forward tcp:127.0.0.1:%s-:10000' % (zfs_builder_name,arch,image_path,console,upload_port), shell=True, stdout=subprocess.PIPE)
 
     upload(osv, manifest, depends, upload_port)
 
If you see a similar error when running the image, also try to bump up the memory, e.g. `run.py -m 1G`.

We should probably detect this condition in core/mmu.cc and handle it better, possibly by informing the user that there is not enough physical memory:

 181
 182     // For now, only allow non-mmaped areas.  Later, we can either
 183     // bounce such addresses, or lock them in memory and translate
 184     assert(virt >= phys_mem);
 185     return reinterpret_cast<uintptr_t>(virt) & (mem_area_size - 1);
 186 }