Audio

Chris Hanson

Feb 24, 2011, 1:22:05 PM
to openk...@googlegroups.com
I know nobody has really done much with the Kinect's microphones, but I've
had a hard time even getting a snapshot of the current situation.

Did I read somewhere that there was some authentication or encryption
that might be a barrier to accessing it?

Sebastian Ortiz

Feb 24, 2011, 2:22:43 PM
to openk...@googlegroups.com, Chris Hanson
Here are some notes on what I've been able to figure out so far and how:
http://www.keyboardmods.com/2011/02/kinect-audio-reverse-engineering.html

I returned my Xbox a while ago and I've been rather busy lately, so I haven't been able to make any progress beyond what I've described in the link above.

drew.m...@gmail.com

Feb 24, 2011, 2:36:19 PM
to openk...@googlegroups.com
On Thu, Feb 24, 2011 at 11:31 AM, Chris Hanson
<xenonof...@gmail.com> wrote:

> On Thu, Feb 24, 2011 at 12:22 PM, Sebastian Ortiz <sebo...@gmail.com> wrote:
>> Here's some notes on what I've been able to figure out so far and how:
>> http://www.keyboardmods.com/2011/02/kinect-audio-reverse-engineering.html
>> I returned my xbox a while ago and I've been rather busy lately, so I
>> haven't been able to make any progress beyond what I've described in the
>> link above.
>
>  Wow. That's awesome. It looks very do-able then.
>
>  Are you able to share the bulk transfer data and code you used to
> prod the Kinect into getting this far? I'm sure someone else might be
> interested in picking it up from here.

Be careful. We probably don't have the rights to redistribute the
firmware blob itself, but you can grab and extract it from the MS
firmware update [1]. Let's avoid legal trouble. :)

Sebastian: can you say whether the firmware found in that update
matches what you upload in the bulk transfers?

-Drew

[1] - http://groups.google.com/group/openkinect/browse_thread/thread/17d96d9c36e3effc/df0a76abb4fd8414

Sebastian Ortiz

Feb 24, 2011, 2:34:55 PM
to Chris Hanson, openk...@googlegroups.com
I'll try to post up some more info when I get home from work.

On Thu, Feb 24, 2011 at 11:31 AM, Chris Hanson <xenonof...@gmail.com> wrote:
On Thu, Feb 24, 2011 at 12:22 PM, Sebastian Ortiz <sebo...@gmail.com> wrote:
> Here's some notes on what I've been able to figure out so far and how:
> http://www.keyboardmods.com/2011/02/kinect-audio-reverse-engineering.html
> I returned my xbox a while ago and I've been rather busy lately, so I
> haven't been able to make any progress beyond what I've described in the
> link above.

Sebastian Ortiz

Feb 24, 2011, 3:17:23 PM
to openk...@googlegroups.com, drew.m...@gmail.com, mi...@whitewing.co.uk
It's probably audios.bin or some variant of that. At the start of the binary file that I extracted from the beagle dumps I see the string "Audios" and near 0x005bb60 I see something like: 

AudiosFakeMdd...M.i.c.r.o.s.o.f.t...X.b.o.x. .N.U.I. .A.u.d.i.o.....X.b.o.x....X.b.o.x. .S.e.c.u.r.i.t.y. .M.e.t.h.o.d. .3.,. .V.e.r.s.i.o.n. .2...0.0.,. ... .2.0.0.9. .M.i.c.r.o.s.o.f.t. .C.o.r.p.o.r.a.t.i.o.n.. .A.l.l. .r.i.g.h.t.s .r.e.s.e.r.v.e.d

The size of the bin file is 911580. I don't know if this matches the size of one of those update files, maybe Mike Harrison can comment?

Chris Hanson

Feb 24, 2011, 2:31:56 PM
to sebo...@gmail.com, openk...@googlegroups.com
On Thu, Feb 24, 2011 at 12:22 PM, Sebastian Ortiz <sebo...@gmail.com> wrote:
> Here's some notes on what I've been able to figure out so far and how:
> http://www.keyboardmods.com/2011/02/kinect-audio-reverse-engineering.html
> I returned my xbox a while ago and I've been rather busy lately, so I
> haven't been able to make any progress beyond what I've described in the
> link above.

Wow. That's awesome. It looks very do-able then.

drew.m...@gmail.com

Feb 24, 2011, 3:51:53 PM
to openk...@googlegroups.com
On Thu, Feb 24, 2011 at 12:17 PM, Sebastian Ortiz <sebo...@gmail.com> wrote:
> It's probably audios.bin or some variant of that. At the start of the binary
> file that I extracted from the beagle dumps I see the string "Audios" and
> near 0x005bb60 I see something like:
> AudiosFakeMdd...M.i.c.r.o.s.o.f.t...X.b.o.x. .N.U.I.
> .A.u.d.i.o.....X.b.o.x....X.b.o.x. .S.e.c.u.r.i.t.y. .M.e.t.h.o.d. .3.,.
> .V.e.r.s.i.o.n. .2...0.0.,. ... .2.0.0.9. .M.i.c.r.o.s.o.f.t.
> .C.o.r.p.o.r.a.t.i.o.n.. .A.l.l. .r.i.g.h.t.s .r.e.s.e.r.v.e.d

Interesting.

audios.bin contains the same string starting at offset 0x0005e148.
2bl.bin contains the same string at 0x00009838. I'd guess this is the
bootloader.
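
For anyone who wants to repeat this kind of offset lookup, here is a minimal sketch. It assumes the dotted rendering above ("M.i.c.r.o.s.o.f.t") is a hex-dump view of UTF-16LE text, which is a guess based on the pattern and not something confirmed in this thread. The helper name `find_utf16_string` and the demo blob are made up for illustration; the real firmware files are not reproduced here.

```python
def find_utf16_string(blob: bytes, text: str) -> list:
    """Return every offset at which `text`, encoded as UTF-16LE, occurs in `blob`."""
    needle = text.encode("utf-16-le")  # assumption: the strings are stored as UTF-16LE
    offsets = []
    start = blob.find(needle)
    while start != -1:
        offsets.append(start)
        start = blob.find(needle, start + 1)
    return offsets


# Synthetic stand-in for a firmware image (hypothetical data, not the real file).
demo_blob = b"Audios" + b"\x00" * 10 + "Xbox NUI Audio".encode("utf-16-le") + b"\xff" * 4
print([hex(o) for o in find_utf16_string(demo_blob, "Xbox NUI Audio")])
```

To try it on a real file, read it with `open(path, "rb").read()` and pass the bytes in place of `demo_blob`.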

>
> The size of the bin file is 911580. I don't know if this matches the size of
> one of those update files, maybe Mike Harrison can comment?

The audios.bin in the firmware update is 512544 bytes.
2bl.bin in the firmware update is 196608 bytes.

So, I don't know exactly what's being sent and what isn't.

Chris 'Xenon' Hanson

Feb 24, 2011, 6:16:44 PM
to OpenKinect
> audios.bin contains the same string starting at offset 0x0005e148.
> 2bl.bin contains the same string at 0x00009838. I'd guess this is the
> bootloader.
> The audios.bin in the firmware update is 512544 bytes.
> 2bl.bin in the firmware update is 196608 bytes.
> So, I don't know exactly what's being sent and what isn't.

Maybe some sort of binary diff algorithm could be used to extract a
set of ranges of data from the firmware update. Then, at runtime the
same ranges could be extracted and sent to the Kinect.
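
A rough sketch of that range idea, using Python's standard `difflib.SequenceMatcher` as a stand-in for a proper binary diff. The function names and the demo byte strings are hypothetical. `SequenceMatcher` is far too slow for files of several hundred kilobytes, so a real tool would want a dedicated binary diff algorithm; this only shows the shape of the approach.

```python
from difflib import SequenceMatcher


def matching_ranges(update_blob: bytes, captured: bytes, min_size: int = 4):
    """List (offset_in_update, offset_in_capture, length) for shared byte runs."""
    matcher = SequenceMatcher(None, update_blob, captured, autojunk=False)
    return [
        (m.a, m.b, m.size)
        for m in matcher.get_matching_blocks()
        if m.size >= min_size  # also drops the zero-length sentinel block
    ]


def rebuild_from_ranges(update_blob: bytes, ranges, total_len: int) -> bytes:
    """Reassemble the captured payload at runtime from the user's own update file."""
    out = bytearray(total_len)  # bytes not covered by any range stay zero
    for src, dst, size in ranges:
        out[dst:dst + size] = update_blob[src:src + size]
    return bytes(out)


# Hypothetical toy data: the "update" wraps two payload pieces in extra bytes.
update = b"HEADER" + b"firmware-part-one" + b"PADDING" + b"firmware-part-two"
captured = b"firmware-part-one" + b"firmware-part-two"

ranges = matching_ranges(update, captured)
print(ranges)
print(rebuild_from_ranges(update, ranges, len(captured)) == captured)
```

Only the list of ranges would need to be distributed; the firmware bytes themselves would come from the file each user extracts locally.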

drew.m...@gmail.com

Mar 4, 2011, 4:38:18 AM
to sebo...@gmail.com, openk...@googlegroups.com
On Thu, Feb 24, 2011 at 12:17 PM, Sebastian Ortiz <sebo...@gmail.com> wrote:
> It's probably audios.bin or some variant of that. At the start of the binary
> file that I extracted from the beagle dumps I see the string "Audios" and
> near 0x005bb60 I see something like:
> AudiosFakeMdd...M.i.c.r.o.s.o.f.t...X.b.o.x. .N.U.I.
> .A.u.d.i.o.....X.b.o.x....X.b.o.x. .S.e.c.u.r.i.t.y. .M.e.t.h.o.d. .3.,.
> .V.e.r.s.i.o.n. .2...0.0.,. ... .2.0.0.9. .M.i.c.r.o.s.o.f.t.
> .C.o.r.p.o.r.a.t.i.o.n.. .A.l.l. .r.i.g.h.t.s .r.e.s.e.r.v.e.d
>
> The size of the bin file is 911580. I don't know if this matches the size of
> one of those update files, maybe Mike Harrison can comment?

I haven't seen anyone post details on how the firmware upload sequence
actually works, so I looked at the adafruit dumps until it made sense.
I was able to successfully extract a byte-for-byte identical copy of
audios.bin from the USB logs. Here's what I think happens:

The transfers that begin "09 20 02 06" appear to be commands. The
first one probably means "start a firmware download". The next 32
such commands seem to be "upload firmware block" - they specify how
many more bytes to expect to receive (0x4000, except the last one,
which is 0x1220), and where in memory to put them (first one is
0x00080000, next is 0x00084000, and it grows by the byte count each
time). After each of these commands, a series of transfers containing
the next set of bytes from the firmware are sent. The 34th command
appears to be "jump to this address" which jumps to the beginning of
the just-uploaded code.
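
As a sanity check on the numbers above, here is a small sketch that lays out the block addresses and lengths for a 512544-byte image. It only models the address and length bookkeeping described in this message; the byte layout of the "09 20 02 06" commands is not modelled, and the function name `upload_plan` is made up for illustration.

```python
FIRMWARE_SIZE = 512544     # size of audios.bin in the firmware update
BLOCK_SIZE = 0x4000        # bytes announced by each "upload firmware block" command
BASE_ADDRESS = 0x00080000  # load address given in the first block command


def upload_plan(total_size=FIRMWARE_SIZE, block_size=BLOCK_SIZE, base=BASE_ADDRESS):
    """Return a list of (target_address, byte_count) pairs, one per block command."""
    plan = []
    offset = 0
    while offset < total_size:
        length = min(block_size, total_size - offset)
        plan.append((base + offset, length))
        offset += length
    return plan


plan = upload_plan()
print(len(plan))                               # number of block commands
print(hex(plan[1][0]))                         # address of the second block
print(hex(plan[-1][0]), hex(plan[-1][1]))      # address and size of the final block
```

The arithmetic agrees with the capture: 31 blocks of 0x4000 bytes plus a final block of 0x1220 bytes add up to exactly 512544 bytes, in 32 block commands.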

Then the device disconnects and re-enumerates; I haven't looked past that point.

I'll write this up on the wiki later, but the initial firmware payload
is precisely the 512544 bytes found in audios.bin in the firmware
update, so it may be possible to have users extract the firmware from
this update.

-Drew

Carlos Roberto

Mar 4, 2011, 6:41:40 AM
to openk...@googlegroups.com

Chris 'Xenon' Hanson

Mar 4, 2011, 11:14:27 AM
to OpenKinect
Nice work, Drew.

An audio hardware engineer friend of mine did some calculations for
me, based on the layout and spacing of the mics in the Kinect mic
array, assuming a nominal 1 kHz human speech frequency and an operational
distance of about 8 feet from the Kinect. We were trying to calculate
how directional the steerable virtual array was, for isolating voice
of one user from another.

 In order to achieve a -3 dB difference (enough to tell whether one person
or another is the primary source of the sound), the two speakers need to
be about 5 feet apart.

This is largely limited by the low frequency of human speech
combined with the lack of spacing from one end of the array to the
other (around a foot). These two result in a fairly wide steerable
cone.

I can probably supply the actual calculations if anyone feels them
necessary.
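
In the meantime, a back-of-the-envelope version can be sketched as follows. This is not my friend's calculation: it assumes an idealized continuous, uniformly weighted line aperture (whose amplitude pattern falls to -3 dB where D*sin(theta)/lambda is about 0.443), a speed of sound of 343 m/s, and the rough figures quoted above (1 kHz, about a foot of aperture, 8 feet away). The real array has a handful of discrete mics, so treat the result as an order-of-magnitude check only. The function name is made up for illustration.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # dry air at roughly 20 degrees C
FOOT_M = 0.3048


def half_power_offset_ft(freq_hz: float, aperture_ft: float, distance_ft: float) -> float:
    """Sideways distance from the beam centre to the -3 dB point, in feet.

    Assumes a continuous uniform line aperture steered broadside, whose
    amplitude pattern is sinc(D * sin(theta) / wavelength).
    """
    wavelength_m = SPEED_OF_SOUND_M_S / freq_hz
    aperture_m = aperture_ft * FOOT_M
    sin_theta = 0.443 * wavelength_m / aperture_m  # half-power point of the sinc pattern
    if sin_theta >= 1.0:
        return math.inf  # aperture too small: the response never drops by 3 dB
    theta = math.asin(sin_theta)
    return distance_ft * math.tan(theta)


offset = half_power_offset_ft(freq_hz=1000.0, aperture_ft=1.0, distance_ft=8.0)
print(f"-3 dB point is roughly {offset:.1f} ft off axis at 8 ft")
```

Under those assumptions the -3 dB point lands between 4 and 5 feet off axis, which is in the same ballpark as the figure above.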