Windows CFFI ABI Non-Deterministic

Owl Owl

Feb 7, 2016, 12:02:03 AM
to python-cffi
Hey all,

I'm brand new to CFFI, so this may be a simple question. I've been attempting to get UCSB's angr tool to work on Windows. For the most part, I can get it compiling and working correctly. However, a module called pyVEX uses CFFI to load the libvex library. Now on to the strange part...

When I attempt to use the library to, say, change some basic assembly into VEX, it usually crashes. However, every so often (maybe 5 times out of 25) it will actually work. I'm not doing anything differently between runs, and in fact can just use a script. Using some print statements I was able to track down the segfaulting to one line:

(line 137 pyvex\__init__.py)
c_irsb = pvc.vex_block_bytes(vex_arch, arch.vex_archinfo, c_bytes + bytes_offset, mem_addr, num_bytes, 1)

where pvc was loaded via the ffi.dlopen call.

I've gone so far as to print out all the arguments to the function and a pickle of the calling class, and none of that differs between the runs where the call succeeds and the runs where it fails. It almost feels like ASLR or something is causing the DLL to be loaded differently each time? I don't know.

For reference, the code can be found here:
https://github.com/Owlz/pyvex/tree/master/pyvex

And my installable wheel files for Windows can be found here:
https://github.com/Owlz/angr-Windows/blob/master/README.md

Any help would be appreciated as the non-deterministic nature is driving me bonkers.

Thanks!

Armin Rigo

Feb 10, 2016, 10:11:07 AM
to pytho...@googlegroups.com
Hi,

On Sun, Feb 7, 2016 at 6:02 AM, Owl Owl <whoota...@gmail.com> wrote:
> Any help would be appreciated as the non-deterministic nature is driving me
> bonkers.

As far as I can tell, it can be (1) a failure to keep some object
alive, resulting in the C function being called with pointers that
point to deallocated memory; (2) a CFFI bug with dlopen on Windows;
or (3) the C library really not expecting to be called the way you
call it. Of these, I'd say (1) is the most likely.
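
To illustrate possibility (1): in CFFI, ffi.new() returns an owning object, and the memory is released once that object is garbage-collected. A pointer derived from it (for example with `c_bytes + bytes_offset`) does not keep the buffer alive. The following is a minimal sketch of the keep-alive pattern, not pyvex's actual code; the `Block` class and the `dangling_pointer` function are hypothetical names used only for illustration.

```python
from cffi import FFI

ffi = FFI()


class Block(object):
    """Hypothetical wrapper that keeps its C buffer alive."""

    def __init__(self, data, offset=0):
        # ffi.new() returns an *owning* cdata object: the memory is
        # freed when this Python object is garbage-collected.
        self._c_bytes = ffi.new("unsigned char[]", data)
        # Pointer arithmetic yields a non-owning pointer. It stays valid
        # only while self._c_bytes is still referenced.
        self.ptr = self._c_bytes + offset
        self.size = len(data) - offset

    def read(self):
        # Safe: the owner is stored on self, so the memory is still there.
        return bytes(ffi.buffer(self.ptr, self.size))


def dangling_pointer(data):
    # Anti-pattern, shown for contrast and never called here: the owner
    # goes out of scope on return, so the returned pointer refers to
    # freed memory. Reading it may appear to work or may crash.
    c_bytes = ffi.new("unsigned char[]", data)
    return c_bytes + 0


block = Block(b"\x90\x90\x90\x90\x90")
print(repr(block.read()))
```

If a C function stores or uses such a pointer after the owning object has been collected, the outcome depends on what the allocator did with the freed memory, which would be consistent with a crash that happens on some runs but not others.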

Sorry, I can't help much. Maybe if you give detailed step-by-step
instructions on how to reproduce the problem, I could try it on a
Windows virtual machine.


A bientôt,

Armin.

Owl Owl

Feb 10, 2016, 9:27:34 PM
to pytho...@googlegroups.com
Should be pretty easy to recreate. First you need to install angr on Windows:

https://github.com/Owlz/angr-Windows

The Readme on that walks through the install. The following code will cause this problem quite reliably:

import pyvex
import archinfo

# translate an AMD64 basic block (of nops) at 0x400400 into VEX
irsb = pyvex.IRSB("\x90\x90\x90\x90\x90", 0x400400, archinfo.ArchAMD64())

# pretty-print the basic block
irsb.pp()


The part of the code that is causing it will be at pyvex/__init__.py line 136:

c_irsb = pvc.vex_block_bytes(vex_arch, arch.vex_archinfo, c_bytes + bytes_offset, mem_addr, num_bytes, 1)

Let me know if you need anything else to recreate it.

Thanks!



Armin Rigo

Feb 11, 2016, 11:36:44 AM
to pytho...@googlegroups.com
Hi,

On Thu, Feb 11, 2016 at 3:27 AM, Owl Owl <whoota...@gmail.com> wrote:
> Should be pretty easy to recreate. First you need to install angr on
> Windows:

I followed your instructions, and running the code you give works fine
for me. It prints something, and I added a ``print "ok"`` afterwards
to check that it really reaches that point, and it does. I repeated
10 or 20 times without getting any crash...


A bientôt,

Armin.

Owl Owl

Feb 12, 2016, 11:45:07 AM
to pytho...@googlegroups.com
Weird. I did figure out one difference: EMET appears to have been part of the problem (for the initial script I gave you). However, I've now installed everything on a clean Windows 7 box, and here's how I can reproduce it:

The test file is: https://github.com/ctfs/write-ups-2016/blob/master/su-ctf-2016/reverse/serial-150/serial?raw=true

import angr
proj = angr.Project("serial")
block = proj.factory.block(proj.entry)
block.capstone.pp()

This has the same "usually crashing" behavior at the same location. When execution does reach pp(), I've only ever gotten it to return an empty string, which is incorrect. Also, you will need c++filt.exe inside your Scripts dir for this to load correctly.

Thanks




Armin Rigo

Feb 12, 2016, 2:00:40 PM
to pytho...@googlegroups.com
Hi,

On Fri, Feb 12, 2016 at 5:45 PM, Owl Owl <whoota...@gmail.com> wrote:
> import angr
> proj = angr.Project("serial")
> block = proj.factory.block(proj.entry)
> block.capstone.pp()

This now fails for me with:

...in ctypes\__init__.py...
WindowsError: [Error 126] The specified module could not be found

The module in question is a full path to libz3.dll, which exists
despite the error message. Maybe it's some issue with this DLL
depending on some other DLLs?


A bientôt,

Armin.

Owl Owl

Feb 13, 2016, 10:51:12 AM
to pytho...@googlegroups.com
Could you tell me what your setup looks like? When it fails for me it simply crashes out of the python session without any information. I've tried installing from a blank Win7 install and still don't receive any errors.

My baseline setup is:
* On Windows 7 x64
* Install Python 2.7 x64
* Following instructions on github page for the rest

Racking my brain to figure out what would be different in our setups... Oddly enough, I would love to be getting errors at this point.




Armin Rigo

Feb 14, 2016, 3:27:55 AM
to pytho...@googlegroups.com
Hi Owl,

Looking in more detail, I think that all the cffi code involved is in
IRSB.__init__() in the case that you reported first. (At first I
thought that 'bytes' could be passed in from outside, and I couldn't
find from where, but it seems that you're only calling IRSB("some string").)
This code looks fine to me. At this point I would suspect that it's
bad usage of the C library.

To be sure, you should try to write a tiny C program that does the
same: allocate 13 bytes, fill the first 5 with 0x90 and the last 8
with 0x00, and call vex_block_bytes(&arch, &archinfo, bytes, 0x400400,
5, 1). The first two parameters should be built to contain the exact
same value as in Python, too. Does this work?

If it does, you should also try to use LoadLibrary() to load
pyvex_static.dll (equivalent to ffi.dlopen()), instead of the more
usual "dllimport" way in C. (I remember some DLL that would not work
correctly when loaded with LoadLibrary().)
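
As a rough cross-check that stays in Python, the 13-byte test buffer described above can be laid out with the standard-library ctypes module instead of cffi. This is a swapped-in technique, not the C program suggested above, and it only builds the buffer. The helper name `make_test_buffer` and the constants are hypothetical, and the exact argument types of vex_block_bytes are not given in this thread, so the call itself is left as a comment.

```python
import ctypes

NOP = 0x90        # AMD64 nop opcode used in the test case
CODE_LEN = 5      # number of instruction bytes
PADDING_LEN = 8   # trailing zero bytes


def make_test_buffer():
    """Return a 13-byte buffer: five 0x90 bytes followed by eight 0x00."""
    # ctypes arrays are zero-initialised, so only the code bytes are set.
    buf = (ctypes.c_ubyte * (CODE_LEN + PADDING_LEN))()
    for i in range(CODE_LEN):
        buf[i] = NOP
    return buf


buf = make_test_buffer()
print([hex(b) for b in buf])

# Assumption: with the DLL loaded via ctypes.CDLL (which uses LoadLibrary
# on Windows) and with arch/archinfo values built to match the Python
# side, the call would look roughly like:
#   lib.vex_block_bytes(arch, archinfo, buf, 0x400400, CODE_LEN, 1)
# The real argument types must come from the library's headers.
```

The buffer must stay referenced for as long as the C side may read from it, for the same keep-alive reason that applies to cffi.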


A bientôt,

Armin.