S2E x86_64 issues

290 views
Skip to first unread message

Jonathan Einstoss

unread,
Oct 26, 2015, 12:45:03 PM10/26/15
to S2E Developer Forum
Hello,
I've been having problems running the x86_64 version of S2E with the minimal example found here. There are three non-deterministic errors that I have found so far. The errors do not occur every time, sometimes S2E will terminate all the states without encountering any errors. These errors are NOT seen when I run the same example with x86 S2E. Also, the OS used for the x86_64 tests was CentOS 7, and CentOS 6.7 for the x86 tests.

The first one is the TLB assertion error documented here. The assertion error does not occur as often as the other two error.

The second error is a bug in S2EDeviceState::getBuffer() that I was able to patch. However, I don't think the patch I applied actually solves the issue, but provides a band-aid for S2E not to crash. Basically, the position and size passed to S2EDeviceState::getBuffer() are out of bounds for the buffer they access. The following is a stack trace for this error:
(gdb) back
#0  0x00007ffff44b40c6 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x0000555555a6659f in s2e::S2EDeviceState::getBuffer (this=0x55557fad9af0, buf=0x5555765b2020 "", pos=32768, size=32768) at /home/jonathan/repos/s2e/qemu/s2e/S2EDeviceState.cpp:273
#2  0x0000555555a66005 in s2e_qemu_get_buffer (buf=0x5555765b2020 "", pos=32768, size=32768) at /home/jonathan/repos/s2e/qemu/s2e/S2EDeviceState.cpp:87
#3  0x0000555555aa742d in qemu_memfile_get_buffer (opaque=0x55557652a380, buf=0x5555765b2020 "", pos=32768, size=32768) at /home/jonathan/repos/s2e/qemu/savevm.c:218
#4  0x0000555555aacb4a in qemu_fill_buffer (f=0x5555765b1fd0) at /home/jonathan/repos/s2e/qemu/savevm.c:567
#5  0x0000555555aa844b in qemu_peek_byte (f=0x5555765b1fd0, offset=0) at /home/jonathan/repos/s2e/qemu/savevm.c:725
#6  0x0000555555aa83da in qemu_get_byte (f=0x5555765b1fd0) at /home/jonathan/repos/s2e/qemu/savevm.c:738
#7  0x0000555555aac7b9 in qemu_get_8s (f=0x5555765b1fd0, pv=0x7fffaeb49072 "") at /home/jonathan/repos/s2e/qemu/qemu-file.h:163
#8  0x0000555555aa8bf9 in get_uint8 (f=0x5555765b1fd0, pv=0x7fffaeb49072, size=1) at /home/jonathan/repos/s2e/qemu/savevm.c:1015
#9  0x0000555555aaa009 in vmstate_load_state (f=0x5555765b1fd0, vmsd=0x55555736d428, opaque=0x5555764dd148, version_id=1) at /home/jonathan/repos/s2e/qemu/savevm.c:1520
#10 0x0000555555aaa289 in vmstate_subsection_load (f=0x5555765b1fd0, vmsd=0x55555736d500, opaque=0x5555764dd148) at /home/jonathan/repos/s2e/qemu/savevm.c:1910
#11 0x0000555555aaa062 in vmstate_load_state (f=0x5555765b1fd0, vmsd=0x55555736d500, opaque=0x5555764dd148, version_id=3) at /home/jonathan/repos/s2e/qemu/savevm.c:1530
#12 0x0000555555aa9fe7 in vmstate_load_state (f=0x5555765b1fd0, vmsd=0x55555736d740, opaque=0x5555764dc5e0, version_id=3) at /home/jonathan/repos/s2e/qemu/savevm.c:1518
#13 0x0000555555aaae31 in vmstate_load (f=0x5555765b1fd0, se=0x5555764f9c90, version_id=3) at /home/jonathan/repos/s2e/qemu/savevm.c:1598
#14 0x0000555555aaadaf in s2e_qemu_load_state (f=0x5555765b1fd0, se=0x5555764f9c90) at /home/jonathan/repos/s2e/qemu/savevm.c:1973
#15 0x0000555555a663a8 in s2e::S2EDeviceState::restoreDeviceState (this=0x55557fad9af0) at /home/jonathan/repos/s2e/qemu/s2e/S2EDeviceState.cpp:229
#16 0x0000555555a8d932 in s2e::S2EExecutor::doStateSwitch (this=0x555557e5b580, oldState=0x555558663e40, newState=0x55557fad98e0) at /home/jonathan/repos/s2e/qemu/s2e/S2EExecutor.cpp:1412
#17 0x0000555555a8cea2 in s2e::S2EExecutor::selectNextState (this=0x555557e5b580, state=0x555558663e40) at /home/jonathan/repos/s2e/qemu/s2e/S2EExecutor.cpp:1530
#18 0x0000555555a8cbf7 in s2e::S2EExecutor::stateSwitchTimerCallback (opaque=0x555557e5b580) at /home/jonathan/repos/s2e/qemu/s2e/S2EExecutor.cpp:1299
#19 0x0000555555829d0f in qemu_run_timers (clock=0x555557e03160) at /home/jonathan/repos/s2e/qemu/qemu-timer.c:420
#20 0x0000555555829f94 in qemu_run_all_timers () at /home/jonathan/repos/s2e/qemu/qemu-timer.c:484
#21 0x00005555557f2f49 in main_loop_wait (nonblocking=0) at /home/jonathan/repos/s2e/qemu/main-loop.c:492
#22 0x0000555555b275c9 in main_loop () at /home/jonathan/repos/s2e/qemu/vl.c:1608
#23 0x0000555555b2271b in main (argc=9, argv=0x7fffffffe5a8, envp=0x7fffffffe5f8) at /home/jonathan/repos/s2e/qemu/vl.c:3863

The patch I applied to S2EDeviceState::getBuffer() simply makes sure that memcpy() isn't called with bad bounds for the buffer. This avoids the segfault that was crashing S2E but does not solve the issue that the function is called with bad parameters.

Finally, the third error is a free() on an invalid pointer. I set MALLOC_CHECK_ to 3, such that the stack trace would show where the error occurred (that's why SIGABRT was raised). The stack trace follows:
(gdb) back
#0  0x00007ffff43a10d5 in __GI_raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007ffff43a483b in __GI_abort () at abort.c:91
#2  0x00007ffff43de32e in __libc_message (do_abort=2, fmt=0x7ffff44e85d8 "*** glibc detected *** %s: %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:201
#3  0x00007ffff43e8b26 in malloc_printerr (action=3, str=0x7ffff44e4f09 "free(): invalid pointer", ptr=<optimized out>) at malloc.c:5051
#4  0x0000555555a7b730 in __gnu_cxx::new_allocator<std::tr1::__detail::_Hash_node<std::pair<klee::ObjectState* const, llvm::SmallVector<std::pair<unsigned int, unsigned int>, 8u> >, false>*>::deallocate (this=0x7fffffffda50, __p=0x55558273a1e0)
    at /usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../include/c++/4.6/ext/new_allocator.h:98
#5  0x0000555555a7b6a3 in std::tr1::_Hashtable<klee::ObjectState*, std::pair<klee::ObjectState* const, llvm::SmallVector<std::pair<unsigned int, unsigned int>, 8u> >, std::allocator<std::pair<klee::ObjectState* const, llvm::SmallVector<std::pair<unsigned int, unsigned int>, 8u> > >, std::_Select1st<std::pair<klee::ObjectState* const, llvm::SmallVector<std::pair<unsigned int, unsigned int>, 8u> > >, std::equal_to<klee::ObjectState*>, std::tr1::hash<klee::ObjectState*>, std::tr1::__detail::_Mod_range_hashing, std::tr1::__detail::_Default_ranged_hash, std::tr1::__detail::_Prime_rehash_policy, false, false, true>::_M_deallocate_buckets (this=0x555582736e40, __p=0x55558273a1e0, __n=15173) at /usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../include/c++/4.6/tr1/hashtable.h:508
#6  0x0000555555a85191 in std::tr1::_Hashtable<klee::ObjectState*, std::pair<klee::ObjectState* const, llvm::SmallVector<std::pair<unsigned int, unsigned int>, 8> >, std::allocator<std::pair<klee::ObjectState* const, llvm::SmallVector<std::pair<unsigned int, unsigned int>, 8> > >, std::_Select1st<std::pair<klee::ObjectState* const, llvm::SmallVector<std::pair<unsigned int, unsigned int>, 8> > >, std::equal_to<klee::ObjectState*>, std::tr1::hash<klee::ObjectState*>, std::tr1::__detail::_Mod_range_hashing, std::tr1::__detail::_Default_ranged_hash, std::tr1::__detail::_Prime_rehash_policy, false, false, true>::~_Hashtable (this=0x555582736e40) at /usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../include/c++/4.6/tr1/hashtable.h:640
#7  0x0000555555a85155 in std::tr1::__unordered_map<klee::ObjectState*, llvm::SmallVector<std::pair<unsigned int, unsigned int>, 8>, std::tr1::hash<klee::ObjectState*>, std::equal_to<klee::ObjectState*>, std::allocator<std::pair<klee::ObjectState* const, llvm::SmallVector<std::pair<unsigned int, unsigned int>, 8> > >, false>::~__unordered_map (this=0x555582736e40) at /usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../include/c++/4.6/tr1/unordered_map.h:43
#8  0x0000555555a85135 in std::tr1::unordered_map<klee::ObjectState*, llvm::SmallVector<std::pair<unsigned int, unsigned int>, 8>, std::tr1::hash<klee::ObjectState*>, std::equal_to<klee::ObjectState*>, std::allocator<std::pair<klee::ObjectState* const, llvm::SmallVector<std::pair<unsigned int, unsigned int>, 8> > > >::~unordered_map (this=0x555582736e40) at /usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../include/c++/4.6/tr1/unordered_map.h:180
#9  0x0000555555a746b5 in std::tr1::unordered_map<klee::ObjectState*, llvm::SmallVector<std::pair<unsigned int, unsigned int>, 8>, std::tr1::hash<klee::ObjectState*>, std::equal_to<klee::ObjectState*>, std::allocator<std::pair<klee::ObjectState* const, llvm::SmallVector<std::pair<unsigned int, unsigned int>, 8> > > >::~unordered_map (this=0x555582736e40) at /usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../include/c++/4.6/tr1/unordered_map.h:180
#10 0x0000555555a6881b in s2e::S2EExecutionState::~S2EExecutionState (this=0x555582736b40) at /home/jonathan/repos/s2e/qemu/s2e/S2EExecutionState.cpp:146
#11 0x0000555555a684a8 in s2e::S2EExecutionState::~S2EExecutionState (this=0x555582736b40) at /home/jonathan/repos/s2e/qemu/s2e/S2EExecutionState.cpp:124
#12 0x0000555555a8d0d7 in s2e::S2EExecutor::selectNextState (this=0x555557e5b550, state=0x555582736b40) at /home/jonathan/repos/s2e/qemu/s2e/S2EExecutor.cpp:1540
#13 0x0000555555a8ccd7 in s2e::S2EExecutor::stateSwitchTimerCallback (opaque=0x555557e5b550) at /home/jonathan/repos/s2e/qemu/s2e/S2EExecutor.cpp:1299
#14 0x0000555555829d0f in qemu_run_timers (clock=0x555557e03160) at /home/jonathan/repos/s2e/qemu/qemu-timer.c:420
#15 0x0000555555829f94 in qemu_run_all_timers () at /home/jonathan/repos/s2e/qemu/qemu-timer.c:484
#16 0x00005555557f2f49 in main_loop_wait (nonblocking=0) at /home/jonathan/repos/s2e/qemu/main-loop.c:492
#17 0x0000555555b276a9 in main_loop () at /home/jonathan/repos/s2e/qemu/vl.c:1608
#18 0x0000555555b227fb in main (argc=10, argv=0x7fffffffe588, envp=0x7fffffffe5e0) at /home/jonathan/repos/s2e/qemu/vl.c:3863

You can find the test image here. If the snapshot doesn't work, you can create one after running the "run.sh" script in /root/. The user and password are "root" and "password" respectively (no quotes).

Any suggestions on how to address these issues would be appreciated.

Thanks,
Jonathan

Vitaly Chipounov

unread,
Oct 26, 2015, 5:28:43 PM10/26/15
to s2e...@googlegroups.com
Hi,

Thanks for the detailed report. I could reproduce some of the crashes. I will look into them.

Vitaly
--
--
You received this message because you are a member of the S2E Developer Forum.
To post to this group, send email to s2e...@googlegroups.com
To unsubscribe from this group, send email to s2e-dev+u...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/s2e-dev

---
You received this message because you are subscribed to the Google Groups "S2E Developer Forum" group.
To unsubscribe from this group and stop receiving emails from it, send an email to s2e-dev+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Vitaly Chipounov

unread,
Oct 26, 2015, 6:05:01 PM10/26/15
to s2e...@googlegroups.com
Hi,

I pushed a fix. Let me know if it works.

Vitaly

0xefbeefbe

unread,
Oct 27, 2015, 1:13:44 PM10/27/15
to s2e...@googlegroups.com
Hi Vitally,

Jonathan and I have been looking into this for a bit now. I believe your patches are helping; however, there are a bunch of 64-bit patches from the QEMU that might need to be applied. (http://git.qemu.org/?p=qemu.git&a=search&h=refs%2Fheads%2Fmaster&st=commit&s=w64%3A)

It appears that you applied a set of them up to a given date. This is the first patch that we observed to be missing: http://git.qemu.org/?p=qemu.git;a=commitdiff;h=27b0dc16b4e50b59aeefff2bfa73cf2eb8c7a70d

Jonathan is currently applying the patches to S2E. 

We also believe the patches will help the "Assertion `tlbIt != m_tlbMap.end()’ failed” problem that Иван Писарев has been experiencing. We’ve also encountered the same assertion while trying to analyze 64-bit binaries. When we saw that assertion we traced back into tlb_set_page (exec.c) and saw that the there was an incorrect data type for addend (unsigned long instead of uintptr_t). We believe the data type of addend is important as it is passed to s2e_update_tlb_entry as the argument uint64_t hostAddr.


We’ll keep you posted on our patching and testing results. If things go well, we’ll put together a push request.

Kind regards,
Shawn

Jonathan Einstoss

unread,
Oct 27, 2015, 4:13:56 PM10/27/15
to S2E Developer Forum
Vitaly,
Thanks for patching the buffer overflow so fast. The fixes you pushed did fix the buffer overflow problems. The "Assertion `tlbIt != m_tlbMap.end()' failed" error still persists.

Shawn and myself looked into the issue to try to solve assertion error. We thought we had tracked it down to some old QEMU bugs that had to do with using incorrect data types for some x64 systems. The incorrect data types were related to the QEMU TLB, so we thought that if we patched QEMU we could get rid of the error. Sadly, that was not the case.

We did manage to consistently reproduce the error using the same minimal example in the documentation and the image that is attached to the original post in this thread. I compiled S2E with the S2E_DEBUG_TLBCACHE flag, but the output wasn't very helpful.

Here is the stack trace that we got:
360 [State 8] Switching from state 8 to state 2
qemu
-system-x86_64: /home/jonathan/repos/s2e/qemu/s2e/S2EExecutionState.cpp:2023: void s2e::S2EExecutionState::flushTlbCachePage(klee::ObjectState *, int, int): Assertion `tlbIt != m_tlbMap.end()' failed.

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x7fffb87a8700 (LWP 16227)]

0x00007ffff43a10d5 in __GI_raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64      ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.

(gdb) back
#0  0x00007ffff43a10d5 in __GI_raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007ffff43a483b in __GI_abort () at abort.c:91
#2  0x00007ffff4399d9e in __assert_fail_base (fmt=<optimized out>, assertion=0x555556a01733 "tlbIt != m_tlbMap.end()", file=0x555556a00076 "/home/jonathan/repos/s2e/qemu/s2e/S2EExecutionState.cpp", line=<optimized out>, function=<optimized out>) at assert.c:94
#3  0x00007ffff4399e42 in __GI___assert_fail (assertion=0x555556a01733 "tlbIt != m_tlbMap.end()", file=0x555556a00076 "/home/jonathan/repos/s2e/qemu/s2e/S2EExecutionState.cpp", line=2023,
    function=0x555556a0174b "void s2e::S2EExecutionState::flushTlbCachePage(klee::ObjectState *, int, int)") at assert.c:103
#4  0x0000555555a737f2 in s2e::S2EExecutionState::flushTlbCachePage (this=0x55557ae0b0b0, objectState=0x555579c20300, mmu_idx=0, index=170) at /home/jonathan/repos/s2e/qemu/s2e/S2EExecutionState.cpp:2023
#5  0x0000555555a73d68 in s2e::S2EExecutionState::updateTlbEntry (this=0x55557ae0b0b0, env=0x7ffff7e78010, mmu_idx=0, virtAddr=18446612132419097856, hostAddr=140736240571648) at /home/jonathan/repos/s2e/qemu/s2e/S2EExecutionState.cpp:2095
#6  0x0000555555a92bf3 in s2e_update_tlb_entry (state=0x55557ae0b0b0, env=0x7ffff7e78010, mmu_idx=0, virtAddr=18446612132419096576, hostAddr=140736240570368) at /home/jonathan/repos/s2e/qemu/s2e/S2EExecutor.cpp:2623
#7  0x00005555558a2ab9 in tlb_set_page (env=0x7ffff7e78010, vaddr=18446612132419096576, paddr=104878080, prot=3, mmu_idx=0, size=2097152) at /home/jonathan/repos/s2e/qemu/exec.c:2449
#8  0x00005555558c7da1 in cpu_x86_handle_mmu_fault (env=0x7ffff7e78010, addr=18446612132419099392, is_write1=0, mmu_idx=0) at /home/jonathan/repos/s2e/qemu/target-i386/helper.c:844
#9  0x00005555559039a8 in tlb_fill (env1=0x7ffff7e78010, addr=18446612132419099392, page_addr=18446612132419099392, is_write=0, mmu_idx=0, retaddr=0x400532ff) at /home/jonathan/repos/s2e/qemu/target-i386/op_helper.c:5379
#10 0x0000555555909be1 in __ldl_mmu_symb (addr=18446612132419099392, mmu_idx=0) at /home/jonathan/repos/s2e/qemu/softmmu_template.h:353
#11 0x0000000040053300 in ?? ()
#12 0x000055557ae545c0 in ?? ()
#13 0x00007fffb87a7750 in ?? ()
#14 0x00007fffb87a78a0 in ?? ()
#15 0x0000555555a74be5 in std::tr1::__detail::_Map_base<klee::ObjectState*, std::pair<klee::ObjectState* const, llvm::SmallVector<std::pair<unsigned int, unsigned int>, 8u> >, std::_Select1st<std::pair<klee::ObjectState* const, llvm::SmallVector<std::pair<unsigned int, unsigned int>, 8u> > >, true, std::tr1::_Hashtable<klee::ObjectState*, std::pair<klee::ObjectState* const, llvm::SmallVector<std::pair<unsigned int, unsigned int>, 8u> >, std::allocator<std::pair<klee::ObjectState* const, llvm::SmallVector<std::pair<unsigned int, unsigned int>, 8u> > >, std::_Select1st<std::pair<klee::ObjectState* const, llvm::SmallVector<std::pair<unsigned int, unsigned int>, 8u> > >, std::equal_to<klee::ObjectState*>, std::tr1::hash<klee::ObjectState*>, std::tr1::__detail::_Mod_range_hashing, std::tr1::__detail::_Default_ranged_hash, std::tr1::__detail::_Prime_rehash_policy, false, false, true> >::operator[] (this=<error reading variable: Cannot access memory at address 0xffff880006405af0>, this@entry=<error reading variable: Cannot access memory at address 0xffff880006405b08>,
    __k=<error reading variable: Cannot access memory at address 0xffff880006405ae8>) at /usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../include/c++/4.6/tr1/hashtable_policy.h:530
(gdb)

As you can see in #10: __ldl_mmu_symb (addr=18446612132419099392, mmu_idx=0), the addr parameter doesn't seem like a valid memory region. This is one of the things that stood out for us, but we are not sure if it's the reason why the assertion error occurs.

Another suspicious artifact is the retaddr parameter in #9. It looks like a 32bit address, but the host OS, guest OS, guest binary, and S2E QEMU are all 64bit. We also see the forked PC addresses to be 32bit. Is this expected?

Another thing we considered was just removing the assertion and ignoring any requests to flush a TLB page that does not exist. Do you think this would introduce problems to the state that is currently in context?

Thanks,
Jonathan

Vitaly Chipounov

unread,
Oct 27, 2015, 5:08:27 PM10/27/15
to s2e...@googlegroups.com
Hi,


On 27.10.2015 21:13, Jonathan Einstoss wrote:

Shawn and myself looked into the issue to try to solve assertion error. We thought we had tracked it down to some old QEMU bugs that had to do with using incorrect data types for some x64 systems. The incorrect data types were related to the QEMU TLB, so we thought that if we patched QEMU we could get rid of the error. Sadly, that was not the case.

unsigned long and uint64_t should are both 64-bits on Linux, so that shouldn't be a problem.



We did manage to consistently reproduce the error using the same minimal example in the documentation and the image that is attached to the original post in this thread. I compiled S2E with the S2E_DEBUG_TLBCACHE flag, but the output wasn't very helpful.

When I tried the device state fix on the image you provided, I haven't run in the TLB issue even after many runs. I'll try in VirtualBox to see if it helps.



As you can see in #10: __ldl_mmu_symb (addr=18446612132419099392, mmu_idx=0), the addr parameter doesn't seem like a valid memory region. This is one of the things that stood out for us, but we are not sure if it's the reason why the assertion error occurs.

Another suspicious artifact is the retaddr parameter in #9. It looks like a 32bit address, but the host OS, guest OS, guest binary, and S2E QEMU are all 64bit. We also see the forked PC addresses to be 32bit. Is this expected?

retaddr is the address of the instruction that follows the call to the helper from dynamically generated code. It should be within the dynamically generated code area. It's unrelated to guest addresses.



Another thing we considered was just removing the assertion and ignoring any requests to flush a TLB page that does not exist. Do you think this would introduce problems to the state that is currently in context?

The invariant is that if an objectstate is referenced in the TLB, then it must also be referenced in m_tlbMap (and vice versa). The purpose of this map is to keep track of all the entries where an objectstate is referenced, so that all entries can be updated when an execution state gets a private copy of that object. The assertion indicates that the TLB entries and the the map got out of sync. Removing the assert might cause other problems later.

Vitaly

Jonathan Einstoss

unread,
Oct 27, 2015, 8:00:07 PM10/27/15
to S2E Developer Forum
Vitaly,

I will clean up my testing environment and try again. I will run the test with a new snapshot to see if the TLB problem persists. Other than that, I can't think of any reason you would not run into the TLB issue with the image. We run into the TLB issue around 99% of the times we run the experiment.

We are using 64-bit Ubuntu Precise (12.04 LTS) as the host environment, in case that information matters.

Can you think of any reasons why the TLB and the map would get out of sync? Do you have any general recommendations on how we can approach on debugging this issue? 

Also, what arguments are you passing to S2E? Is your config file the same as the one used in the example?

Finally, could the type of CPU used for QEMU have anything to do with this issue?

We will continue to investigate and hopefully find the source of the problem.

Thanks for your help,
Jonathan

Vitaly Chipounov

unread,
Oct 28, 2015, 6:32:10 AM10/28/15
to s2e...@googlegroups.com
On 28.10.2015 01:00, Jonathan Einstoss wrote:
Vitaly,

I will clean up my testing environment and try again. I will run the test with a new snapshot to see if the TLB problem persists. Other than that, I can't think of any reason you would not run into the TLB issue with the image. We run into the TLB issue around 99% of the times we run the experiment.

We are using 64-bit Ubuntu Precise (12.04 LTS) as the host environment, in case that information matters.

I use 14.04 LTS, maybe that's the reason I can't reproduce it. If you could set up a VirtualBox VM that reproduces the bug, that would be useful.



Can you think of any reasons why the TLB and the map would get out of sync? Do you have any general recommendations on how we can approach on debugging this issue?

You could have an audit method called for every TLB operation (set/reset), something like this (tweaking needed):

bool audit()

{
    /**
     * Go through the TLB and make sure that all object states are
     * properly referenced.
     */
    CPUX86State* env = ...;

    foreach2(tlbIt, m_tlbMap.begin(), m_tlbMap.end()) {
        const ObjectState *os = (*tlbIt).first;
        const ObjectStateTlbReferences &vec = (*tlbIt).second;
        foreach2(vit, vec.begin(), vec.end()) {
            unsigned mmu_idx = (*vit).first;
            unsigned index = (*vit).second;
            CPUTLBEntry *entry = &env->tlb_table[mmu_idx][index];
            assert(entry->objectState == os);
        }
    }

    for(unsigned i = 0; i < CPU_TLB_SIZE; i++) {
        for (unsigned mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
            CPUTLBEntry *entry = &env->tlb_table[mmu_idx][i];
            if (!entry->objectState) {
                continue;
            }

            TlbMap::const_iterator it = m_tlbMap.find((ObjectState*) entry->objectState);
            assert(it != m_tlbMap.end());

            const ObjectStateTlbReferences &vec = (*it).second;
            unsigned foundCount = 0;
            foreach2(vit, vec.begin(), vec.end()) {
                if ((*vit).first == mmu_idx && (*vit).second == i) {
                    foundCount++;
                }
            }
            assert(foundCount == 1);
        }
    }

    return true;

}


Also, what arguments are you passing to S2E? Is your config file the same as the one used in the example?

I used the scripts that I found in your image.



Finally, could the type of CPU used for QEMU have anything to do with this issue?

I don't know.

Vitaly

Vitaly Chipounov

unread,
Oct 28, 2015, 10:35:23 AM10/28/15
to s2e...@googlegroups.com
I managed to reproduce it after many runs.
I suspect that flushTlbCachePage (S2EExecutionState.cpp:2094) is called with a stale ObjectState pointer.
The call to addressSpace.getWriteable (S2EExecutionState.cpp:2094) might have triggered S2EExecutionState::addressSpaceChange() which already updated the tlb entry, so there shouldn't be any need to flush it again.

State of variables at S2EExecutionState.cpp:2094:
(gdb) p oldObjectState
$9 = (klee::ObjectState *) 0x7f43089627d0
(gdb) p ros
$10 = (klee::ObjectState *) 0x7f4313862810
(gdb) p /x *entry
$11 = {objectState = 0x7f42a742f110, addend = 0xf7429809c8f1}
(gdb)


Vitaly

Jonathan Einstoss

unread,
Oct 29, 2015, 1:04:05 PM10/29/15
to S2E Developer Forum
Vitaly,

Thanks for sharing your findings.

We verified that in Ubuntu 14.04 LTS the TLB assertion error does not occur often. In fact, I have yet to see it... but since you were able to run into it, I'm sure it will appear eventually.

I will test a patch for the assert error, based on your findings, and let you know of the results.
...

Jonathan Einstoss

unread,
Oct 30, 2015, 3:55:33 PM10/30/15
to S2E Developer Forum
Vitaly,

I have verified that the TLB assertion error occurs because the "oldObjectState" is not in "m_tlbMap" when updateTlbEntry() is called and not because of addressSpaceChange() removing an entry from m_tlbMap.

I have confirmed that whenever we see the TLB assert error it's referencing the same parameter values every time.

Here is an example of the output, look for the lines starting with "Skipping flushTlbCachePage":
(gdb) r
Starting program: /home/jonathan/repos/s2e/build/qemu-debug/x86_64-s2e-softmmu/qemu-system-x86_64 centos7-amd64.img.s2e -loadvm 1 -net none -s2e-config-file s2e_config.lua -s2e-verbose -nographic
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7ffff7ffa000
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
S2E: output directory = "/home/jonathan/s2e/centos-minimal-example/s2e-out-18"
WARNING: Linking two modules of different data layouts!
Creating plugin CorePlugin
Creating plugin BaseInstructions
2 [State 0] Created initial state
[New Thread 0x7fffb87a8700 (LWP 60947)]
Adding CPU (addr = 0x7ffff7e78010, size = 0x4eec0)
Initing initial device state.
WARNING!!! All writes to disk will be lost after shutdown.
s2e-block: dirty sectors on close:0
s2e-block: dirty after restore: 37577
37 [State 0] Inserting symbolic data at 0x7fffffffe560 of size 0x2 with name 'str'
37 [State 0] Forking state 0 at pc = 0x40050d
37 [State 0] Forking state 0 at pc = 0x400518
38 [State 0] Forking state 0 at pc = 0x400523
40 [State 0] Forking state 0 at pc = 0x40053d
40 [State 0] Forking state 0 at pc = 0x400556
KLEE: ERROR: failed external call: int32_to_floatx80
KLEE: NOTE: now ignoring this error at this location
42 [State 0] Switching from state 0 to state 5
47 [State 5] Switching from state 5 to state 4
Skipping flushTlbCachePage... index = 170; mmu_idx = 0; virtAddr = 0xffff880006405500; hostAddr = 0x7fffb5a05500
49 [State 4] Forking state 4 at pc = 0x400556
52 [State 4] Switching from state 4 to state 6
57 [State 6] Switching from state 6 to state 3
62 [State 3] Forking state 3 at pc = 0x40053d
64 [State 3] Forking state 3 at pc = 0x400556
65 [State 3] Switching from state 3 to state 8
69 [State 8] Switching from state 8 to state 2
Skipping flushTlbCachePage... index = 170; mmu_idx = 0; virtAddr = 0xffff880006405500; hostAddr = 0x7fffb5a05500
75 [State 2] Switching from state 2 to state 1
Skipping flushTlbCachePage... index = 170; mmu_idx = 0; virtAddr = 0xffff880006405500; hostAddr = 0x7fffb5a05500
All states were terminated
s2e-block: dirty sectors on close:37577
[Thread 0x7fffb87a8700 (LWP 60947) exited]
[Inferior 1 (process 60944) exited normally]
(gdb) 

As you can see, the updateTlbEntry() finds an entry in the s2e_tlb_table that is not cached in the m_tlbMap data member.

At this point I'm lost on how to proceed debugging this issue. Clearly an entry in the s2e_tlb_table is not being inserted into the m_tlbMap, any ideas on how we can go about fixing this?

Thanks,
Jonathan


On Wednesday, October 28, 2015 at 10:35:23 AM UTC-4, Vitaly Chipounov wrote:
...

leckerd...@googlemail.com

unread,
Jan 18, 2016, 12:48:50 PM1/18/16
to S2E Developer Forum
I ran into the same bug just following the basic tutorial with a 64bit host (in a vmware) and a 64bit guest.
I submitted an issue here:
...
Reply all
Reply to author
Forward
0 new messages