Does anyone know if there is an Emacs mode to read MS Word documents
(sent by colleagues), as ASCII. I'm thinking of something along the
lines of antiword, which produces text from MS Word (and even keeps
tables and so on).
Cheers
Tiarnan
--
Tiarnán Ó Corráin CMG-WDSC
Sysadmin Cork.
tiarnan.o'cor...@cmg.com +353-21-4933200
> does anyone know is there is an emacs mode to read MS word documents
> (sent by colleagues), as ASCII. I'm thinking of something along the
> lines of antiword, which produces text from MS Word (and even keeps
> tables and so on).
There's catdoc.el (an Emacs interface to catdoc), which I've used in the
past. It doesn't support later versions of MS Word (97/2000) well,
though. Now I use wv to translate MS Word and Excel files. A helper
script (wvMime) included with the program can be called from Emacs (and
Gnus, VM, etc.) to present the formatted documents as PostScript, or you
can use wvText to strip the formatting and get ASCII. If I knew how to
write Lisp, I'd like to write an elisp interface for wv.
--
Bruce Mobarry
There was a question about this recently on this forum. Look for
undoc.el, I got it from the wiki (I think). It has worked very well for
me to date, although I have not attempted to read complex documents.
Roger Mason
On 7 Nov 2002, Tiarnan wrote:
> Hi--
>
> does anyone know is there is an emacs mode to read MS word documents
> (sent by colleagues), as ASCII. I'm thinking of something along the
> lines of antiword, which produces text from MS Word (and even keeps
> tables and so on).
>
> Cheers
>
> Tiarnan
> There was a question about this recently on this forum. Look for
> undoc.el, I got it from the wiki (I think). It has worked very well for
> me to date, although I have not attempted ro read complex documents.
Well, it makes things readable, but it is far from perfect -- it seems
to just delete any non-ascii characters, such that sometimes you will
see words such as "Alex8" where "8" is some garbage that just looked
like being part of a real word... In other words, interfacing to
something like catdoc, antiword, or wvText (included with AbiWord)
might be cool. Actually all you need is this:
(add-to-list 'auto-mode-alist '("\\.doc\\'" . no-word))
(defun no-word ()
  "Run antiword on the entire buffer."
  (shell-command-on-region (point-min) (point-max) "antiword - " t t))
Alex.
[...]
> does anyone know is there is an emacs mode to read MS word documents
> (sent by colleagues), as ASCII. I'm thinking of something along the
> lines of antiword, which produces text from MS Word (and even keeps
> tables and so on).
A friend of mine does:
C-u M-! strings toto.doc
That's a bit ugly ;-), but it works when the Word document is simple...
The MS Word format is not publicly documented, which means that programs
that read it had to be built by reverse engineering. That explains the
lack of a good UNIX viewer.
IMO, Emacs programmers are very involved in Free Software and use free
software themselves, so I think most of them aren't concerned with
reading Word documents at all.
--
Julien ``Eole'' Avarre
jul...@avarre.com
Just a small note and self-advertisement: filesets.el uses antiword for
displaying (nothing more) "*.doc" files in an emacs buffer. Having it
properly configured and having antiword or a similar program installed,
the command "filesets-find-or-display-file" would do the job.
Cheers,
Thomas.
Yup, works for me:
- installed wvWare
- found some emacs code for using wvText within Gnus at
http://216.239.37.100/search?q=cache:RW5fo8yVQSgC:www.rhodesmill.org/brandon/notes/emacs.txt+using+wvText+emacs&hl=en&ie=UTF-8
- used code above to modify auto-mode-alist
Works smoothly in dired-mode and Gnus ...
For your convenience, here are the assorted code snippets (Disclaimer:
all just borrowed from others, none of this is mine ...):
Of course, one could play the same trick with wvHtml and use an emacs
browser to view the resulting HTML ... hm ... I think I have to get
this wmf2png business working ...
----------------
bin/tempfile
----------------
perl -MPOSIX -e 'print tmpnam()'
----------------
bin/wvTextStdin:
----------------
#!/bin/bash
# Allow wvText to read from the standard input.
# thanks to brandon from rhodesmill.org
# Keep only the file name; all work happens inside /tmp.
f=$(basename "$(tempfile)")
cat "$@" > "/tmp/$f.doc"
cd /tmp || exit 1
wvText "$f.doc" "$f.txt"
cat "$f.txt"
rm -f "$f.doc" "$f.txt"
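A possible variant of the wrapper above, offered only as a sketch: it uses mktemp(1) instead of the Perl tmpnam() helper (which, as far as I know, newer Perl releases no longer ship) and cleans up its scratch directory even when the conversion fails. The function name doc_to_text and the WV_CONVERTER override are made up for illustration; the only assumption about wvText is that it is called as "wvText INPUT OUTPUT", as in the script above.

```shell
#!/bin/sh
# Sketch: read a Word document from stdin (or from the files given as
# arguments) and print the converted text on stdout.
# WV_CONVERTER is a hypothetical override; it defaults to wvText, which is
# invoked as "wvText INPUT OUTPUT", the same way the wrapper above calls it.
doc_to_text() {
    converter=${WV_CONVERTER:-wvText}
    # mktemp -d creates a private scratch directory with an unpredictable name.
    dir=$(mktemp -d "${TMPDIR:-/tmp}/wvtext.XXXXXX") || return 1
    cat "$@" > "$dir/in.doc"
    # Run in a subshell so the cd does not leak into the caller.
    ( cd "$dir" && "$converter" in.doc out.txt >/dev/null && cat out.txt )
    status=$?
    # Remove the scratch directory whether or not the conversion worked.
    rm -rf "$dir"
    return "$status"
}
```

To use it as a drop-in bin/wvTextStdin, the script would end with a line calling doc_to_text "$@". The non-zero exit status on failure lets the caller tell an empty document from a failed conversion.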
----------------
emacs/my-mime-types.el
----------------
;; thanks to brandon from rhodesmill.org
(defun mm-inline-msword (handle)
  "Insert the MS Word part HANDLE inline as text converted by wvTextStdin."
  (let (text)
    (with-temp-buffer
      (mm-insert-part handle)
      (call-process-region (point-min) (point-max) "wvTextStdin" t t nil)
      (setq text (buffer-string)))
    (mm-insert-inline handle text)))
(setq mm-automatic-display
      (append mm-automatic-display
              '("application/msword")))
(setq mm-inlined-types
      (append mm-inlined-types
              '("application/msword" "application/octet-stream")))
(setq mm-inline-media-tests
      (append mm-inline-media-tests
              '(("application/msword" mm-inline-msword identity))
              '(("application/octet-stream" mm-inline-msword
                 (lambda (handle)
                   (let* ((type (mm-handle-type handle))
                          (name-pair (assq 'name type))
                          (name (cdr name-pair)))
                     ;; Match on the suffix; safe for names shorter than 4 chars.
                     (and name (string-match "\\.doc\\'" name))))))))
----------------
emacs/my-automodes.el
----------------
;;; automodes
(setq auto-mode-alist
      (append
       '(("\\.\\([pP][Llm]\\|al\\)$" . cperl-mode)
         ("\\.\\([xX][sS][dD]\\)$" . xml-mode)
         ("\\.\\([xX][mM][lL]\\)$" . xml-mode)
         ("\\.[jJ][sS]$" . javascript-mode)
         ("\\.[pP][hH][pP]$" . php-mode)
         ("\\.doc\\'" . my-word-converter))
       auto-mode-alist))
;; thanks to Alex Schroeder
(defun my-word-converter ()
  "Run wvTextStdin on the entire buffer."
  (shell-command-on-region (point-min) (point-max) "wvTextStdin" t t))
--
Christian Lemburg, <lem...@aixonix.de>, http://www.clemburg.com/
43rd Law of Computing:
Anything that can go wr
fortune: Segmentation violation -- Core dumped
AS> (add-to-list 'auto-mode-alist '("\\.doc\\'" . no-word))
AS> (defun no-word () "Run antiword on the entire buffer."
AS> (shell-command-on-region (point-min) (point-max) "antiword - "
AS> t t))
Perfect. Just what I was looking for, since antiword makes a
reasonable stab at doing tables.
Many thanks...
Tiarnan
--
Tiarnán Ó Corráin CMG-WDSC
Sysadmin Cork.
tiarnan.o'cor...@cmg.com +353-21-4933200
"Iraq: incredible weapons - incredible weapons." How do you know that?
"Uh, well... We looked at the receipt." -- Bill Hicks, 1992
> Actually all you need is this:
>
> (add-to-list 'auto-mode-alist '("\\.doc\\'" . no-word))
>
> (defun no-word ()
> "Run antiword on the entire buffer."
> (shell-command-on-region (point-min) (point-max) "antiword - " t t))
On my system there are lots of filenames ending in .doc whose files
are not Word files. So I modified your function thusly
(defun no-word ()
  "Run antiword on the entire buffer if `file' reports a Microsoft document."
  (if (string-match "Microsoft "
                    (shell-command-to-string
                     (concat "file " (shell-quote-argument buffer-file-name))))
      (shell-command-on-region (point-min) (point-max) "antiword - " t t)))
Works in Solaris and Linux, and should work on other unixes as well.
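Matching on the word "Microsoft" depends on how the local file(1) magic database words its output. A rough alternative, sketched here under stated assumptions, is to look at the first eight bytes directly: binary .doc files from Word 97-2003 are OLE2 compound files and start with the signature D0 CF 11 E0 A1 B1 1A E1. The helper name is_ole2 is made up for this example. Excel and PowerPoint files of the same era use the same container, so a match only says "OLE2 file", not "Word file".

```shell
#!/bin/sh
# Sketch: succeed if FILE starts with the 8-byte OLE2 compound-file
# signature (D0 CF 11 E0 A1 B1 1A E1) used by Word 97-2003 .doc files.
# is_ole2 is a hypothetical helper name, not part of antiword or file(1).
is_ole2() {
    # od prints the first 8 bytes as hex; tr strips blanks and the newline.
    sig=$(od -An -tx1 -N8 "$1" 2>/dev/null | tr -d ' \t\n')
    [ "$sig" = "d0cf11e0a1b11ae1" ]
}
```

Saved as a small script that ends by calling is_ole2 "$1", it could be run from Emacs with call-process, treating an exit status of 0 as "go ahead and run antiword".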
am
> (defun no-word ()
> "Run antiword on the entire buffer."
> (if (string-match "Microsoft "
> (shell-command-to-string (concat "file " buffer-file-name)))
> (shell-command-on-region (point-min) (point-max) "antiword - " t t)))
Cool. I did not know about "file"... :)
My stuff is on the wiki, btw:
* http://www.emacswiki.org/cgi-bin/wiki.pl?AntiWord
Alex.
-kin