ArcGIS (arcpy module) and openpyxl

307 views
Skip to first unread message

schiffba...@gmail.com

unread,
Aug 20, 2015, 1:07:07 AM8/20/15
to openpyxl-users
Hello openpyxl-users.
I am a GIS Analyst with a very large task of replacing all data sources with new data sources within over 1000 map documents. I have spoken with an ESRI rep regarding the issue and have been told that I am not the only one, it is a known bug, and the rep is working on a solution - but I believe my deadline is shorter than the time it will take for the solution. So, I am looking for other potential solutions that might be able to work with the arcpy module. I was wondering if anyone has a solution using openpyxl (I don't even know if it's possible, just came across this while searching for python solutions) to the following scenario:

I have a spreadsheet with columns identifying the old datasource and old data name, and the new datasource with the new data name. I am wondering if there is a way to iterate through a folder of map documents (this part I can do - next I have been unable to) and then using the spreadsheet name/source crosswalk, have python identify datasource == a in column c and replace it with datasource = b in column d.

I appreciate your help... we have been working with code that should successfully complete, but something with the changing from one Workspace to an SDE Workspace is causing it to halt after running through 3 datasources.

Thanks!
Crystal Schiffbauer-Bowles

Charlie Clark

unread,
Aug 20, 2015, 4:58:30 AM8/20/15
to openpyx...@googlegroups.com
Am .08.2015, 07:07 Uhr, schrieb <schiffba...@gmail.com>:

> Hello openpyxl-users.

Hi Crystal,

> I am a GIS Analyst with a very large task of replacing all data sources
> with new data sources within over 1000 map documents. I have spoken with
> an ESRI rep regarding the issue and have been told that I am not the
> only one it is a known bug, and the rep is working on a solution - but I
> believe my deadline is shorter than the time it will take for the
> solution. So, I am
> looking for other potential solutions that might be able to work with the
> arcpy module. I was wondering if anyone has a solution using openpyxl (I
> don't even know if it's possible, just came across this while searching
> for python solutions) to the following scenario:
>
> I have a spreadsheet with columns identifying the old datasource and old
> data name, and the new datasource with the new data name. I am wondering
> if there is a way to iterate through a folder of map documents (this
> part I
> can do - next I have been unable to) and then using the spreadsheet
> name/source crosswalk, have python identify datasource == a in column c
> and replace it with datasource = b in column d.
>
> I appreciate your help... we have been working with code that should
> successfully complete, but something with the changing from one Workspace
> to an SDE Workspace is causing it to halt after running through 3
> datasources.

I suspect that it's currently not possible to do this due to the way data
sources are handled by Excel but I'd need a sample file and code to check.
If this is the case then it's probably possible to add support reasonably
easily. What's your deadline?

Charlie
--
Charlie Clark
Managing Director
Clark Consulting & Research
German Office
Kronenstr. 27a
Düsseldorf
D- 40217
Tel: +49-211-600-3657
Mobile: +49-178-782-6226

Adam Morris

unread,
Aug 20, 2015, 10:09:20 AM8/20/15
to openpyxl-users, schiffba...@gmail.com
If it's a simple case of replacing data in cells, you can do that like this:


from openpyxl import load_workbook

input_filename
= 'data.xlsx'
output_filename
= 'data_out.xlsx'

wb
= load_workbook(input_filename)
ws
= wb.get_sheet_by_name('Sheet1')
for row in ws.iter_rows('C1:D21'):
   
if row[0].value and row[0].value == 'a':
        row
[1].value = 'b'

wb
.save(output_filename)

        
In addition, I wrote a small utility with an interface based on openpxyl, which edits existing cell values quickly (https://bitbucket.org/amorris/editpyxl) -- but it requires that the cells being edited exist, and the strings being changed are are unique.  (Excel has a shared string table -- and unlike openpyxl, if a string is edited, it will change that value in the shared table, effectively changing all occurrences of that string in the entire workbook.  Also, it removes formulas when updating values.  It just updates the xml inside the excel sheet, and doesn't touch what it doesn't modify, so it is significantly faster but extremely limited compared to openpyxl.)


from editpyxl import Workbook

input_filename
= 'data.xlsx'
output_filename
= 'data_out.xlsx'

wb
= Workbook()
wb
.open(input_filename)
ws
= wb.get_sheet_by_name('Sheet1')
       
for row in range(1, 21):
   
if ws.cell(row=row, column=3).value == 'a':
        ws
.cell(row=row, column=4).value = 'b'
       
wb
.save(output_filename)
wb
.close()

data.xlsx

schiffba...@gmail.com

unread,
Aug 20, 2015, 1:29:41 PM8/20/15
to openpyxl-users, schiffba...@gmail.com
Hello Adam. Thank you for the response. What I am trying to do is replace the data within Map Documents, not within excel (that's already done). I just want to be able to use the spreadsheet as a cross-reference for what the datasources should be in all of my map documents. I appreciate your time and response.

~Crystal

Charlie Clark

unread,
Aug 20, 2015, 1:45:29 PM8/20/15
to openpyx...@googlegroups.com
Am .08.2015, 19:29 Uhr, schrieb <schiffba...@gmail.com>:

> Hello Adam. Thank you for the response. What I am trying to do is replace
> the data within Map Documents, not within excel (that's already done). I
> just want to be able to use the spreadsheet as a cross-reference for what
> the datasources *should* be in all of my map documents. I appreciate your
> time and response.

Can you provide a sample file? (remove any sensitive data, of course).

Excel implements data sources using definedNames. Our support for this is
currently limited to ranges and constants within workbook and we
explicitly drop some names including data sources (you should get a
warning informing you about this if you open one of the files).

A substantial rewrite of this code is slated for 2.4 but we need to finish
2.3 first. Might have some time at our sprint in September. But it might
be possible to create a temporary workaround so that you can do what you
need.

schiffba...@gmail.com

unread,
Aug 20, 2015, 2:36:05 PM8/20/15
to openpyxl-users
Hello Charlie. Well, our deadline... as soon as it can possibly be done - budgeting can be tricky you know, and we have lots to do. I don't have any sample code for the excel python, but I do have sample code for a single map document arcpy module replaceDataSource. The indentation in my actual module comes out right - so that's not the issue, just how it pasted in here. I have attached a sample of the spreadsheet (this is truncated and edited so that sensitive information has been removed). I appreciate your help/comments very much.

arcpy module code for replacing data sources within a map document:
>>>import arcpy
>>>mxd = arcpy.mapping.MapDocument (r"CURRENT")
>>> for x,y,z in zip (a,d,g):
>>>   for lyr in arcpy.mapping.ListLayers(mxd):
...           if lyr.supports ("DATASOURCE"):
...              if lyr.dataSource == x:
...                lyr.replaceDataSource (r"Database Connections\My data.sde", "SDE_WORKSPACE", y)
...                  lyr.name = z      
sampleDataForPyxlHelp.xlsx

Charlie Clark

unread,
Aug 20, 2015, 3:38:55 PM8/20/15
to openpyx...@googlegroups.com
Am .08.2015, 20:36 Uhr, schrieb <schiffba...@gmail.com>:

> Hello Charlie. Well, our deadline... as soon as it can possibly be done -
> budgeting can be tricky you know, and we have lots to do. I don't have
> any
> sample code for the excel python, but I do have sample code for a single
> map document arcpy module replaceDataSource. The indentation in my actual
> module comes out right - so that's not the issue, just how it pasted in
> here. I have attached a sample of the spreadsheet (this is truncated and
> edited so that sensitive information has been removed). I appreciate your
> help/comments very much.

So, this file just contains the data source listed in the cells and not
connected to the file?

The second row looks like this:

Database Connections\My
data.sde project.SDE.i15_USGS_FedIndLand_03 "HCP_Basemap.gdb\Administrative\fedlanp020"

This is an edited file because it has New_SDE_Connection and NewFCName?

But based on what you said earlier, this isn't the case. You want to use
this file to manage edits to the myterious mxd files, right? And this is
what is currently broken.

Please write some pseudo to show what you want to with the data from this
file.

In openpyxl the following might be useful:

wb = load_workbook("SampleDataForPyxlHelp.xlsx")
ws = wb['Sheet1']
for row in ws.iter_rows():
new_conn, new_name, old_path = row[:3]

schiffba...@gmail.com

unread,
Aug 21, 2015, 7:29:32 PM8/21/15
to openpyxl-users
Hello Charlie. After reading through your post and a little more of the pyxl documentation, I think that what I am trying to do is NOT what pyxl will be able to. I need to find a way to use the arcpy module (since that is what can talk to ArcGIS) to cross-reference the spreadsheet - NOT using pyxl to cross-reference a spreadsheet and talk with ArcGIS. Does that make sense? I appreciate your replies, because you definitely helped me understand where I need to direct my attention.
Thanks again and have a great weekend - you've already started yours... jealous.
Crystal
Reply all
Reply to author
Forward
0 new messages