Word document accessing using python

Hari

Feb 13, 2008, 8:40:12 AM

Hello,
I want to fetch some data from a Word document and fill it into an
Excel sheet. To do this I need to open Word documents, do some string
manipulation to extract the data, and write the results into the
Excel sheet.
Can anyone help by explaining how to do this, or by pointing me to a
link where I can find some hints?

Thanks in advance,
Hari

Juan_Pablo

Feb 13, 2008, 9:14:58 AM

Reedick, Andrew

Feb 13, 2008, 10:37:47 AM
to Hari, pytho...@python.org


Google for sample scripts. Check the Python documentation on
"Quick-Starts to Python and COM" and makepy.py.


Word Object Model Reference:
http://msdn2.microsoft.com/en-us/library/bb244515.aspx

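import win32com.client

# Start Word and open the document
word = win32com.client.Dispatch("Word.Application")
word.Visible = True
word.Documents.Open('c:\\some\\where\\foo.doc')

# COM collections are 1-based, so this is the first open document
doc = word.Documents(1)
tables = doc.Tables
for table in tables:
    for row in table.Rows:
        for cell in row.Cells:
            # the cell's text is available as cell.Range.Text
            ...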
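
One detail worth knowing when reading table cells this way: Word ends the text of every table cell with an end-of-cell marker (a carriage return followed by chr(7)), so the raw cell.Range.Text string usually needs cleaning before it goes into Excel. A minimal sketch of that cleanup follows; clean_cell_text is a hypothetical helper name, not part of pywin32 or the Word object model.

```python
def clean_cell_text(raw):
    """Strip Word's end-of-cell marker and surrounding whitespace."""
    # Word terminates table cell text with "\r\x07"; rstrip drops both characters
    return raw.rstrip("\r\x07").strip()
```

Inside the innermost loop you would call it as clean_cell_text(cell.Range.Text).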


Excel reference: http://msdn2.microsoft.com/en-us/library/bb149081.aspx

import os
import win32com.client

# EnsureDispatch builds the makepy wrappers that win32com.client.constants relies on
excel = win32com.client.gencache.EnsureDispatch("Excel.Application")
excel.Visible = True

cwd = os.getcwd()
book = excel.Workbooks.Open(os.path.join(cwd, "test.xls"))
sheet = book.Worksheets(1)

# Print each cell in the range
for cell in sheet.Range("A8:B9"):
    print(cell)
print("active chart = " + str(excel.ActiveChart))
print("active sheet = " + str(excel.ActiveSheet))
print("\t" + str(excel.ActiveSheet.Name))
print("active workbook = " + str(excel.ActiveWorkbook))
print("\t" + str(excel.ActiveWorkbook.Name))

# Add a new worksheet and name it
new_sheet = excel.Sheets.Add(None, None, None,
                             win32com.client.constants.xlWorksheet)
new_sheet.Name = "foo"

# Import from a CSV file into the new sheet, starting at cell A1
query_results = new_sheet.QueryTables.Add("TEXT;" + os.path.join(cwd, "data.csv"),
                                          new_sheet.Cells(1, 1))
query_results.TextFileParseType = win32com.client.constants.xlDelimited
query_results.TextFileCommaDelimiter = True
query_results.Refresh()
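
The QueryTables import above expects a data.csv file to exist already. If the values come out of the Word document as Python lists, something like the following could produce that file. This is only a sketch using the standard csv module; write_rows_to_csv is a made-up helper name, and the UTF-8 encoding is an assumption (Excel's text import may want a different encoding for non-ASCII data).

```python
import csv


def write_rows_to_csv(rows, path):
    """Write an iterable of row sequences to a comma-delimited file."""
    # newline="" lets the csv module control the line endings itself
    with open(path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(rows)
```

For example, write_rows_to_csv([["name", "qty"], ["widget", 3]], "data.csv") before running the import.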

*****

The information transmitted is intended only for the person or entity to which it is addressed and may contain confidential, proprietary, and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon this information by persons or entities other than the intended recipient is prohibited. If you received this in error, please contact the sender and delete the material from all computers. GA623


Juan_Pablo

Feb 13, 2008, 1:07:19 PM

> import win32com.client

But win32com.client only works on Windows.

casti...@gmail.com

Feb 13, 2008, 5:19:05 PM

On Feb 13, 12:07 pm, Juan_Pablo <jabar...@gmail.com> wrote:
> > import win32com.client
>
> But win32com.client only works on Windows.

Excel can read XML.

<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
  <Worksheet ss:Name="Sheet1">
    <Table>
      <Row>
        <Cell><Data ss:Type="String">abc</Data></Cell>
        <Cell><Data ss:Type="Number">123</Data></Cell>
      </Row>
    </Table>
  </Worksheet>
</Workbook>
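
A file like the one above can be written from Python on any platform, with no COM involved. Here is a rough sketch that builds the same structure from a list of rows using only the standard library. The names workbook_xml and cell_xml are invented for illustration, and the type rule (int and float become "Number", everything else "String") is an assumption, so check the result in your version of Excel.

```python
from xml.sax.saxutils import escape, quoteattr

HEADER = (
    '<?xml version="1.0"?>\n'
    '<?mso-application progid="Excel.Sheet"?>\n'
    '<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"\n'
    '          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">'
)


def cell_xml(value):
    """Return one <Cell> element; numbers get ss:Type="Number"."""
    # bool is a subclass of int, so exclude it explicitly
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    kind = "Number" if is_number else "String"
    # escape() protects &, < and > inside the cell text
    return '<Cell><Data ss:Type="%s">%s</Data></Cell>' % (kind, escape(str(value)))


def workbook_xml(rows, sheet_name="Sheet1"):
    """Return a one-sheet workbook as an XML string."""
    lines = [HEADER, "<Worksheet ss:Name=%s>" % quoteattr(sheet_name), "<Table>"]
    for row in rows:
        lines.append("<Row>" + "".join(cell_xml(v) for v in row) + "</Row>")
    lines += ["</Table>", "</Worksheet>", "</Workbook>"]
    return "\n".join(lines)
```

Writing workbook_xml([["abc", 123]]) to a file with an .xml extension should give a document equivalent to the hand-written one.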

For the Word side: is there a Word command line to save the document as text, or could you use the reader from Microsoft?

Reedick, Andrew

Feb 13, 2008, 5:50:04 PM
to Juan_Pablo, pytho...@python.org

Correct. Microsoft tries very hard to make sure that its applications
and their saved data are only readable, usable, and automatable using
MS Office and MS Windows (and preferably the current versions of each).
If you can't create a quality product that people will willingly pay
money for, then use lock-in to squeeze money out of them. =/

As suggested by castironpi, you could save the Word docs in a neutral
format such as text or XML. Another idea is to find a module or
commercial product that will read Word files. In theory, from what I've
heard, OpenOffice can read some (most?) Word formats, and since OO has
a scriptable API (does it?), you could use OO to read the Word doc and
use OO's API to get the relevant data. You would need to test it out.
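
If you go the plain-text route, the extraction step becomes ordinary string handling that runs anywhere. As an illustration only, suppose the saved text contains lines shaped like "Label: value". That layout is a guess, since the original question doesn't describe the document, and extract_fields is a hypothetical name.

```python
import re

# Matches "Label: value", tolerating spaces around the label and the value
FIELD_PATTERN = re.compile(r"^\s*([^:]+?)\s*:\s*(.+?)\s*$")


def extract_fields(text):
    """Return (label, value) pairs for lines shaped like 'Label: value'."""
    pairs = []
    for line in text.splitlines():
        match = FIELD_PATTERN.match(line)
        if match:  # lines without a colon are skipped
            pairs.append((match.group(1), match.group(2)))
    return pairs
```

The resulting pairs can then be written out as CSV or XML for Excel to open.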
