
is there a debian utility for this?


Karen Lewellen

Jun 17, 2013, 11:50:02 PM
Hi folks,
it may be a part of laytext, but we do not have that here on shellworld.
Still I am wondering if there is a utility in debian that will convert the
ms word .docx file format into anything else? antiword will not do this
because technically .docx is not word so to speak.
for those who do not know Microsoft created the .docx format in word 2007,
and almost nothing else can read them smiles.
ideas?
Karen


--
To UNSUBSCRIBE, email to debian-us...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listm...@lists.debian.org
Archive: http://lists.debian.org/Pine.BSF.4.64.13...@server1.shellworld.net

Zenaan Harkness

Jun 18, 2013, 12:30:02 AM
On 6/18/13, Karen Lewellen <klew...@shellworld.net> wrote:
> Hi folks,
> it may be a part of laytext, but we do not have that here on shellworld.
> Still I am wondering if there is a utility in debian that will convert the
> ms word .docx file format into anything else? antiword will not do this
> because technically .docx is not word so to speak.
> for those who do not know Microsoft created the .docx format in word 2007,
> and almost nothing else can read them smiles.
> ideas?

catdoc ?

Otherwise libreoffice from the command line. GIYF (google is your friend).
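
For reference, a minimal sketch of the "libreoffice from the command line" route, wrapped in Python. It assumes a LibreOffice build that accepts the --headless, --convert-to and --outdir options and that the binary is reachable as "libreoffice" (some installs call it "soffice"). The function names are made up for illustration.

```python
import subprocess


def build_convert_command(src, target_format="pdf", outdir="."):
    """Return the argument list for a headless LibreOffice conversion."""
    return [
        "libreoffice",            # assumption: binary name on PATH (may be "soffice")
        "--headless",             # run without opening a GUI window
        "--convert-to", target_format,
        "--outdir", outdir,       # directory where the converted file is written
        src,
    ]


def convert(src, target_format="pdf", outdir="."):
    """Run the conversion; raises CalledProcessError if LibreOffice fails."""
    subprocess.run(build_convert_command(src, target_format, outdir), check=True)


if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1:
        convert(sys.argv[1])
```

Passing "txt" or "odt" as target_format instead of "pdf" should work the same way, subject to which export filters the installed version ships.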



Lisi Reisz

Jun 18, 2013, 3:30:03 AM
On Tuesday 18 June 2013 05:25:46 Zenaan Harkness wrote:
> On 6/18/13, Karen Lewellen <klew...@shellworld.net> wrote:
> > Hi folks,
> > it may be a part of laytext, but we do not have that here on shellworld.
> > Still I am wondering if there is a utility in debian that will convert
> > the ms word .docx file format into anything else? antiword will not do
> > this because technically .docx is not word so to speak.
> > for those who do not know Microsoft created the .docx format in word
> > 2007, and almost nothing else can read them smiles.
> > ideas?
>
> catdoc ?
>
> Otherwise libreoffice from the command line. GIYF (google is your friend).

I have not had any difficulty reading docx for some years now. First
OpenOffice.org and now LibreOffice Just Work. Find file. Open it. Read to
your heart's content. Save in one of the many formats that OOo/LO can use,
or export as pdf.

You say neither how experienced you are nor which version you are running.
Which version of the two applications you can install, and how, depends on
these factors.

Lisi



Michael Anckaert

Jun 18, 2013, 3:40:03 AM
On 18/06/13 05:42, Karen Lewellen wrote:
> Hi folks,
> it may be a part of laytext, but we do not have that here on shellworld.
> Still I am wondering if there is a utility in debian that will convert
> the ms word .docx file format into anything else? antiword will not
> do this because technically .docx is not word so to speak.
> for those who do not know Microsoft created the .docx format in word
> 2007, and almost nothing else can read them smiles.
> ideas?
> Karen
>
>
While I'm not a fan of the OOXML format (official docx name), it's
actually an improvement over the binary doc format. A .docx is actually
a zip file containing HTML and CSS files. So it's really much easier to
read than the old binary format.

The reason why so few programs can read it correctly is because MS
doesn't really follow the published specification of how to
create/display an OOXML file. Pretty much the same issue as how IE
displays certain CSS properties.

Regarding your issue: you could write a script that does something like
unzip, parse with lynx, display the text contents.

Kind regards
Michael


Joe

Jun 18, 2013, 4:20:01 AM
On Tue, 18 Jun 2013 08:27:07 +0100
Lisi Reisz <lisi....@gmail.com> wrote:

> On Tuesday 18 June 2013 05:25:46 Zenaan Harkness wrote:
> > On 6/18/13, Karen Lewellen <klew...@shellworld.net> wrote:
> > > Hi folks,
> > > it may be a part of laytext, but we do not have that here on
> > > shellworld. Still I am wondering if there is a utility in debian
> > > that will convert the ms word .docx file format into anything
> > > else? antiword will not do this because technically .docx is not
> > > word so to speak. for those who do not know Microsoft created
> > > the .docx format in word 2007, and almost nothing else can read
> > > them smiles. ideas?
> >
> > catdoc ?
> >
> > Otherwise libreoffice from the command line. GIYF (google is your
> > friend).
>
> I have not had any difficulty reading docx for some years now. First
> OpenOffice.org and now LibreOffice Just Work. Find file. Open it.
> Read to your heart's content. Save in one of the many formats that
> OOo/LO can use, or export as pdf.
>

Bearing in mind that anything more complex than a business letter is
unlikely to be displayed as the author intended. But then that's true
between different versions of Word, or even the same version if a
printer is selected which has a different printable area.

Word is really a document processor, eminently suitable for writing a
novel, but many people use it for DTP and complain about how poorly it
does the job.

--
Joe



David Goodenough

Jun 18, 2013, 5:30:01 AM
On Tuesday 18 Jun 2013, Karen Lewellen wrote:
> Hi folks,
> it may be a part of laytext, but we do not have that here on shellworld.
> Still I am wondering if there is a utility in debian that will convert the
> ms word .docx file format into anything else? antiword will not do this
> because technically .docx is not word so to speak.
> for those who do not know Microsoft created the .docx format in word 2007,
> and almost nothing else can read them smiles.
> ideas?
> Karen
While OpenOffice and LibreOffice will read simple documents reasonably well,
I have found that Calligra Suite (the old KOffice) makes a better job of
more complicated documents. All three can then save the document in another
format.

David



Greg

Jun 18, 2013, 10:40:01 AM
On Tue, 2013-06-18 at 14:25 +1000, Zenaan Harkness wrote:

> Otherwise libreoffice from the command line. GIYF (google is your friend).
>
>

No multi-billion dollar corporation is your friend.



Kelly Clowers

Jun 18, 2013, 3:10:02 PM
On Tue, Jun 18, 2013 at 12:35 AM, Michael Anckaert
<michael....@sinax.be> wrote:
> On 18/06/13 05:42, Karen Lewellen wrote:
>> Hi folks,
>> it may be a part of laytext, but we do not have that here on shellworld.
>> Still I am wondering if there is a utility in debian that will convert
>> the ms word .docx file format into anything else? antiword will not
>> do this because technically .docx is not word so to speak.

Antiword will not do it because it is a .doc converter, not a .docx
converter. .docx is absolutely a Word document.

>> for those who do not know Microsoft created the .docx format in word
>> 2007, and almost nothing else can read them smiles.
>> ideas?

Unoconv

>>
> While I'm not a fan of the OOXML format (official docx name), it's
> actually an improvement over the binary doc format. A .docx is actually
> a zip file containing HTML and CSS files.

It is absolutely not (X)HTML, nor CSS (it is a zip file though). It is
XML. You might be thinking of .epub ebook format...
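
Since the container is a zip archive of XML parts, the unzip-and-extract idea from earlier in the thread can be sketched in a few lines of Python using only the standard library. This is a rough sketch, not a full converter: it assumes the main body lives in the word/document.xml part and that text sits in w:t elements inside w:p paragraphs, and it ignores formatting, headers, footnotes and the like. The function name is made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements such as w:p and w:t
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_to_text(path):
    """Return the plain text of a .docx, one line per paragraph."""
    # A .docx is a zip archive; the main body is assumed to be word/document.xml
    with zipfile.ZipFile(path) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # Join the text runs (w:t) that belong to this paragraph
        runs = [node.text or "" for node in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)


if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1:
        print(docx_to_text(sys.argv[1]))
```

Saved under a name of your choosing and run with a .docx path as its only argument, it prints the text to standard output, which can then be piped into a pager or a file.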

Cheers,
Kelly Clowers



Benedict Verheyen

Jun 19, 2013, 4:30:02 AM
On 18/06/2013 21:04, Kelly Clowers wrote:
> On Tue, Jun 18, 2013 at 12:35 AM, Michael Anckaert
<snip>

>>> for those who do not know Microsoft created the .docx format in word 2007, and almost nothing else can read them smiles. ideas?
>
> Unoconv

A good suggestion by Kelly. We use it in-house for a webapp that creates an
OpenOffice document, then automatically converts it to something Office can read.
Works great so far.

--
Benedict Verheyen Debian, Python and Django user
GnuPG Public Key 0x712CBB8D

