[postgis-users] pgsql2shp dbf file encoding


Denis Rykov

Oct 24, 2010, 11:15:02 PM
to PostGIS Users Discussion
I am trying to export PostGIS data to shapefiles with pgsql2shp (pgsql2shp-core.h 5870 2010-08-28 09:16:32Z mcayland).
When I open the *.dbf file, I see that the value at byte 29 is 0x57. Is 0x57 the default? Why not 0x00?
With the 0x57 encoding, my shapefiles do not display correctly in any GIS software.
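
For reference, the byte in question can be inspected without a hex editor. The sketch below is a minimal example, assuming the usual dBASE header layout in which the byte at zero-based offset 29 holds the language driver ID (LDID). The function name `read_ldid` is hypothetical and not part of pgsql2shp or shapelib.

```python
def read_ldid(dbf_path):
    """Return the language driver ID stored at offset 29 of a DBF header."""
    with open(dbf_path, "rb") as f:
        header = f.read(32)  # the fixed part of the DBF header is 32 bytes
    if len(header) < 32:
        raise ValueError("file is too short to contain a DBF header")
    return header[29]  # indexing bytes yields an int in Python 3


if __name__ == "__main__":
    import sys
    # Usage (hypothetical script name): python read_ldid.py path/to/file.dbf
    if len(sys.argv) == 2:
        print(f"LDID = 0x{read_ldid(sys.argv[1]):02X}")
```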

Mark Cave-Ayland

Oct 25, 2010, 5:05:58 AM
to PostGIS Users Discussion
Denis Rykov wrote:

I don't think it's currently set to anything, so I guess this would be
the default? Perhaps we should provide a mapping from PostgreSQL
database encoding names to shapefile encoding values in a table somewhere?

Does anyone know which encoding 0x57 represents?


ATB,

Mark.

--
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063

Sirius Labs: http://www.siriusit.co.uk/labs
_______________________________________________
postgis-users mailing list
postgi...@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users

Francis Markham

Oct 25, 2010, 6:42:32 AM
to PostGIS Users Discussion
0x57 is the dreaded Windows-1252 codepage. I believe newer versions of
shapelib allow this to be set when the shapefile is created.

Cheers,

Francis Markham
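
The mapping table Mark suggests could look roughly like the sketch below. The keys are PostgreSQL encoding names and the values are the commonly documented DBF language driver IDs for the matching Windows code pages. The table is illustrative and incomplete, the names `PG_ENCODING_TO_LDID` and `ldid_for` are hypothetical, and falling back to 0x00 for unlisted encodings is an assumption rather than anything pgsql2shp does.

```python
# Illustrative, incomplete table: PostgreSQL encoding name -> DBF language driver ID.
PG_ENCODING_TO_LDID = {
    "WIN1250": 0xC8,  # Windows Eastern European
    "WIN1251": 0xC9,  # Windows Cyrillic
    "WIN1252": 0x57,  # Windows ANSI, the value discussed in this thread
    "WIN1253": 0xCB,  # Windows Greek
    "WIN1254": 0xCA,  # Windows Turkish
}


def ldid_for(pg_encoding, default=0x00):
    """Look up the LDID for a PostgreSQL encoding name, case-insensitively.

    Encodings that are not in the table return `default` (assumed here to be
    0x00, meaning no code page is declared in the header).
    """
    return PG_ENCODING_TO_LDID.get(pg_encoding.upper(), default)
```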


Denis Rykov

Oct 25, 2010, 11:24:25 AM
to PostGIS Users Discussion
I don't quite understand why pgsql2shp is writing this encoding to our shapes. Our database is in UTF-8 and we never use WIN1252.

Mark Cave-Ayland

Oct 25, 2010, 12:57:51 PM
to PostGIS Users Discussion
Denis Rykov wrote:

> I don't quite understand why pgsql2shp is writing this encoding to our
> shapes, our database is in UTF-8 and we never use win1252

Well pgsql2shp has never contained any code to set the encoding field
(mainly because until recently the version of shapelib included with
PostGIS didn't support the encoding field), so I guess WIN1252 must be
the shapelib default.

Paul Ramsey

Oct 25, 2010, 1:03:25 PM
to PostGIS Users Discussion
And that DBF field dates from the Time Before UTF-8, so there won't be
a "UTF8" number to put in it, in any event. DBF files with UTF in
them (OSM!) are scary scary scary (for example, should your code for
reading a CHAR(8) field in DBF expect 8 bytes, or 8 characters? yay!)
It would be nice to support transcoding down to the code pages that
*are* supported in that field, I suppose. I wonder how much software
actually supports it.

P.
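
The bytes-versus-characters ambiguity Paul describes is easy to demonstrate. The sketch below assumes a fixed-width 8-byte field and uses a hypothetical helper, `fit_to_field`, that truncates UTF-8 text on a character boundary. It only illustrates the problem and does not describe how any particular shapefile reader or writer behaves.

```python
def fit_to_field(text, width):
    """Encode text as UTF-8, truncated to at most `width` bytes
    without splitting a multi-byte character."""
    encoded = text.encode("utf-8")
    if len(encoded) <= width:
        return encoded
    # Cutting at a byte offset may leave a partial character at the end;
    # decoding with errors="ignore" drops that incomplete tail.
    return encoded[:width].decode("utf-8", errors="ignore").encode("utf-8")


name = "Москва"  # each Cyrillic letter takes 2 bytes in UTF-8
print(len(name), "characters,", len(name.encode("utf-8")), "bytes")
print(fit_to_field(name, 8).decode("utf-8"))
```

A field declared as CHAR(8) therefore holds eight ASCII letters but only four Cyrillic ones when it is measured in bytes.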

Denis Rykov

Oct 25, 2010, 1:25:35 PM
to PostGIS Users Discussion
Would it make sense to set 0 as the default? As it stands, some software (ArcGIS) does not override the obviously incorrect 1252 from the header with the correct CPG setting.

Paul Ramsey

Oct 25, 2010, 1:30:36 PM
to PostGIS Users Discussion
Can you hexedit it and see if it works better?

Denis Rykov

Oct 25, 2010, 1:53:12 PM
to PostGIS Users Discussion
After editing the dbf file in a hex editor and setting the value at byte 29 to 0x00, the shapefile opens in ArcGIS without
encoding problems (it takes the codepage value from the *.cpg file).
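
The same hex edit can be scripted. This is a minimal sketch that assumes the LDID byte sits at zero-based offset 29 of the DBF header, as in the messages above. The function name `set_ldid` is hypothetical, and it modifies the file in place, so it is safer to run it on a copy.

```python
def set_ldid(dbf_path, value=0x00):
    """Overwrite the language driver ID at offset 29 of a DBF file in place."""
    if not 0 <= value <= 0xFF:
        raise ValueError("LDID must fit in one byte")
    with open(dbf_path, "r+b") as f:  # read/write without truncating the file
        f.seek(29)
        f.write(bytes([value]))
```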

Mark Cave-Ayland

Oct 26, 2010, 5:11:32 AM
to PostGIS Users Discussion
Denis Rykov wrote:

> After editing dbf file in hex editor and set value at byte 29 to 00h
> shapefile opens in ArcGIS without
> encoding troubles (get codepage value from *.cpg file).

That's strange. Does anyone know how the psDBF->iLanguageDriver field
is supposed to interact with a .cpg file? Frank?
