I did some work on this with XSL a few years ago, which luckily I seem to have saved. The transform was designed to replace the XSL at the heart of SharePoint's Word-to-HTML converter. That converter handled extracting the XML from the docx (which is really just a zip file), so you'd have to build a bit of infrastructure around it, but it could get you started:
Note that this was designed for a particular use case: letting business users update website content by uploading Word docs. We wanted the HTML to be semantic (e.g. a bulleted list in Word became a <UL> in HTML and a paragraph became a <P>, whereas the original DocX2HTML just used DIVs for everything). It doesn't try to preserve all inline formatting; instead it relies on mapping Word style names to CSS classes (the CSS stylesheets themselves were created by hand).
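For anyone building the surrounding infrastructure, here is a rough Python sketch of the two ideas described above: pulling the main XML part out of the docx zip, and emitting semantic HTML with the Word style name as the CSS class. It is not the XSL transform and not SharePoint's code. The function names are hypothetical, it assumes the body lives at `word/document.xml` (the usual location), and it makes a loud simplification: any paragraph with list numbering properties is treated as a bulleted `<ul>` item, without consulting the numbering definitions to tell bullets from numbers.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from html import escape

# WordprocessingML main namespace used by elements in word/document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def read_document_xml(docx_bytes):
    """Open the .docx as a zip archive and parse its main document part."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        return ET.fromstring(archive.read("word/document.xml"))


def paragraphs_to_html(root):
    """Turn top-level Word paragraphs into <p>/<ul><li> markup.

    The Word style id is copied straight into the class attribute
    (an assumed convention; the CSS itself still has to be written by hand).
    """
    parts = []
    in_list = False
    body = root.find("w:body", NS)
    for para in body.findall("w:p", NS):
        style = para.find("w:pPr/w:pStyle", NS)
        style_id = style.get(f"{{{W}}}val") if style is not None else None
        # Concatenate all text runs; inline formatting is deliberately dropped.
        text = "".join(t.text or "" for t in para.iterfind(".//w:t", NS))
        # Simplification: numbering properties present => list item.
        is_item = para.find("w:pPr/w:numPr", NS) is not None

        if is_item and not in_list:
            parts.append("<ul>")
            in_list = True
        elif not is_item and in_list:
            parts.append("</ul>")
            in_list = False

        tag = "li" if is_item else "p"
        cls = f' class="{escape(style_id)}"' if style_id else ""
        parts.append(f"<{tag}{cls}>{escape(text)}</{tag}>")

    if in_list:
        parts.append("</ul>")
    return "\n".join(parts)
```

Calling `paragraphs_to_html(read_document_xml(data))` on the raw bytes of an uploaded file returns an HTML fragment. Tables, images, nested lists, and headings would all need extra handling in a real converter.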
--Ken