floats truncated by strtod() if decimal point character differs

Michael Weiser

Jul 1, 2019, 11:15:02 AM
to dbd...@perl.org
Hello,

I've run into a problem where floats lose their fractions on the way
from the database to a perl script using DBD::Pg. A minimal reproducer:

Database content:

test=> select * from foo;
foo
-----
1.1
(1 row)

Perl script:

#!/usr/bin/perl

use DBI;
use DBD::Pg;
use POSIX;

POSIX::setlocale(&POSIX::LC_NUMERIC, "");
$dbh = DBI->connect("dbi:Pg:dbname=test", '', '', {AutoCommit => 0});
$sth = $dbh->prepare("SELECT * FROM foo;");
$result = $sth->execute();

print($sth->fetchrow_array(), "\n");

Output:
# LANG=C perl ~/t.pl
FOO: 1.1:1.100000 C
1.1
# LANG=de_DE.UTF-8 perl ~/t.pl
FOO: 1.1:1,000000 de_DE.UTF-8
1

A number of factors seem to have to come together to trigger the
problem:
- the client locale's decimal point character must be something other
than what the database (or the on-wire protocol?) is using, in my
case comma (,) from locale de_DE
- the script needs to call setlocale(LC_NUMERIC, "") so that LC_NUMERIC is
taken from the environment; the original reason for that call is lost to
the depths of time. I guess it's there because, as a side effect, it
enables locale-awareness in perl, in my case commas being used when
printing float values.
- so far only macOS's system perl seems to be affected. A perl 5.29.9
compiled myself on the same system does not exhibit the problem and
neither does a Debian testing box.

I've traced the problem to the following code in dbdimp.c:dbd_st_fetch,
which is also where I've added the above FOO debug output:

AV * dbd_st_fetch (SV * sth, imp_sth_t * imp_sth)
{
    [...]
    switch (type_info->type_id) {
    [...]
    case PG_FLOAT4:
    case PG_FLOAT8:
        TRC(DBILOGFP, "FOO: %s:%2f %s\n", value, strtod((char *)value, NULL), getenv("LANG"));
        sv_setnv(sv, strtod((char *)value, NULL));
        break;

Where the database is delivering strings with dots as decimal point
character, strtod() in my case expects commas. It appears that macOS's
system perl uses a strtod() implementation that respects the locale
settings as per documentation in the man page:

The decimal point character is defined in the program's locale
(category LC_NUMERIC).

Since I've not been able to reproduce this with my self-compiled perl or
Debian perl I can only assume that they're using another implementation
and not the one from libc.
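
The locale dependence of strtod() described above can be shown outside of
perl and DBD::Pg. The following is only a minimal sketch: the helper name
parse_in_locale() is made up for this illustration and is not part of
DBD::Pg or libc. It switches LC_NUMERIC, calls the libc strtod(), reports
how many characters were consumed and restores the previous setting.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, for illustration only.
 * Parses `text` with strtod() while LC_NUMERIC is set to `locale_name` and
 * stores the number of consumed characters in *consumed.
 * Returns 0 with *consumed == 0 if the locale cannot be selected. */
double parse_in_locale(const char *locale_name, const char *text, size_t *consumed)
{
    char saved[128];
    const char *current = setlocale(LC_NUMERIC, NULL);
    char *end;
    double value;

    *consumed = 0;

    /* setlocale() may reuse its return buffer, so copy the name first */
    strncpy(saved, current ? current : "C", sizeof saved - 1);
    saved[sizeof saved - 1] = '\0';

    if (setlocale(LC_NUMERIC, locale_name) == NULL)
        return 0.0;               /* locale not installed on this system */

    value = strtod(text, &end);   /* uses the decimal point of LC_NUMERIC */
    *consumed = (size_t)(end - text);

    setlocale(LC_NUMERIC, saved); /* put the previous setting back */
    return value;
}
```

With the "C" locale the whole string "1.1" is consumed. On a system where
de_DE.UTF-8 is installed, parse_in_locale("de_DE.UTF-8", "1.1", &n) would be
expected to stop at the dot and return 1, which is the truncation seen in
the FOO trace line.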

Software versions:

affected system perl: v5.18.4
unaffected vanilla perl: 5.29.9
DBD::Pg: 3.7.4
Postgres client library: vanilla 11.2 compiled myself
Postgres server: Debian testing 11.4-1, locale en_GB.UTF-8

Any advice on how to proceed here would be highly appreciated,
particularly regarding:

- Is it at all valid to tweak LC_NUMERIC using POSIX::setlocale() when
using DBD::Pg?

- Could the macOS system perl at runtime (without recompilation) be
convinced to use another strtod() implementation/behaviour? (... which
doesn't involve mucking about with LD_PRELOAD/DYLD_INSERT_LIBRARIES :)

- Could this reliably be fixed by switching to a non-locale-aware strtod()
implementation in DBD::Pg? Based on the protocol specification
(https://www.postgresql.org/docs/current/protocol-overview.html#PROTOCOL-FORMAT-CODES)
I suspect that the on-wire representation could be similarly affected by
server-side locale settings:

The text representation of values is whatever strings are
produced and accepted by the input/output conversion functions
for the particular data type.

I haven't yet looked at the Postgres code to find out for sure.

Thanks in advance,
--
Michael

Michael Weiser

Aug 5, 2019, 9:30:03 AM
to dbd...@perl.org
Hi,

On Mon, Jul 01, 2019 at 03:56:28PM +0200, Michael Weiser wrote:

> I've run into a problem where floats lose their fractions on the way
> from the database to a perl script using DBD::Pg. A minimal reproducer:

Is this the wrong list for my question, or was there just too much, too
little or too specific information at once? :)

I'd appreciate any advice since I'm at a loss as to how to proceed here.

Thanks,
Michael
Bye, Micha
Against it!

Michael Weiser

Aug 27, 2019, 8:15:03 AM
to Noah Misch, dbd...@perl.org
Hello Noah,

On Mon, Aug 26, 2019 at 11:28:09PM -0700, Noah Misch wrote:

> > > I've run into a problem where floats lose their fractions on the way
> > > from the database to a perl script using DBD::Pg. A minimal reproducer:
> > Am I wrong on this list with my question or was there just too much or
> > too little or too specific information at once? :)
> It's the right list. The issue was complex enough to require some
> uninterrupted study time.

Thanks for getting back to me!

> > > # LANG=C perl ~/t.pl
> > > FOO: 1.1:1.100000 C
> > > 1.1
> > > # LANG=de_DE.UTF-8 perl ~/t.pl
> > > FOO: 1.1:1,000000 de_DE.UTF-8
> > > 1
> > > - so far only macOS's system perl seems to be affected. A perl 5.29.9
> > > compiled myself on the same system does not exhibit the problem and
> > > neither does a Debian testing box.
> I can reproduce it on RHEL 7 with DBD::Pg git head.

That's a relief for sure. :)

> That strtod() call is a
> somewhat-recent addition (added in DBD::Pg 3.6.0, from 2017). What is the
> output of these two commands with each of those Perl installations?

> perl -MDBD::Pg -e 'print $DBD::Pg::VERSION, "\n"'
> LANG=de_DE.UTF-8 perl -MPOSIX -e 'use strict; use warnings; setlocale(LC_NUMERIC, ""); print join " ", strtod("1.1"), "\n"'

Here's the output plus some unsolicited version info:

# /usr/bin/perl -V
Summary of my perl5 (revision 5 version 18 subversion 4) configuration:

Platform:
osname=darwin, osvers=18.0, archname=darwin-thread-multi-2level
uname='darwin osx316.apple.com 18.0 darwin kernel version 17.0.0: fri may 4 10:33:38 pdt 2018; root:xnu-4570.1.46.100.2~1development_x86_64 x86_64 '
config_args='-ds -e -Dprefix=/usr -Dccflags=-g -pipe -Dldflags= -Dman3ext=3pm -Duseithreads -Duseshrplib -Dinc_version_list=none -Dcc=cc'
hint=recommended, useposix=true, d_sigaction=define
useithreads=define, usemultiplicity=define
useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
use64bitint=define, use64bitall=define, uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc='cc', ccflags =' -g -pipe -fno-common -DPERL_DARWIN -fno-strict-aliasing -fstack-protector',
optimize='-Os',
cppflags='-g -pipe -fno-common -DPERL_DARWIN -fno-strict-aliasing -fstack-protector'
ccversion='', gccversion='4.2.1 Compatible Apple LLVM 10.0.1 (clang-1001.0.37.14)', gccosandvers=''
intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
alignbytes=8, prototype=define
Linker and Libraries:
ld='cc', ldflags =' -fstack-protector'
libpth=/usr/lib /usr/local/lib
libs=
perllibs=
libc=, so=dylib, useshrplib=true, libperl=libperl.dylib
gnulibc_version=''
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=bundle, d_dlsymun=undef, ccdlflags=' '
cccdlflags=' ', lddlflags=' -bundle -undefined dynamic_lookup -fstack-protector'


Characteristics of this binary (from libperl):
Compile-time options: HAS_TIMES MULTIPLICITY PERLIO_LAYERS
PERL_DONT_CREATE_GVSV
PERL_HASH_FUNC_ONE_AT_A_TIME_HARD
PERL_IMPLICIT_CONTEXT PERL_MALLOC_WRAP
PERL_PRESERVE_IVUV PERL_SAWAMPERSAND USE_64_BIT_ALL
USE_64_BIT_INT USE_ITHREADS USE_LARGE_FILES
USE_LOCALE USE_LOCALE_COLLATE USE_LOCALE_CTYPE
USE_LOCALE_NUMERIC USE_PERLIO USE_PERL_ATOF
USE_REENTRANT_API
Locally applied patches:
/Library/Perl/Updates/<version> comes before system perl directories
installprivlib and installarchlib points to the Updates directory
Built under darwin
Compiled at Apr 1 2019 13:12:58
@INC:
/Library/Perl/5.18/darwin-thread-multi-2level
/Library/Perl/5.18
/Network/Library/Perl/5.18/darwin-thread-multi-2level
/Network/Library/Perl/5.18
/Library/Perl/Updates/5.18.4
/System/Library/Perl/5.18/darwin-thread-multi-2level
/System/Library/Perl/5.18
/System/Library/Perl/Extras/5.18/darwin-thread-multi-2level
/System/Library/Perl/Extras/5.18
.
# /usr/bin/perl -I/tmp/dbdpg/lib/perl5 -MDBD::Pg -e 'print $DBD::Pg::VERSION, "\n"'
3.7.4
# /usr/bin/perl -MPOSIX -e 'use strict; use warnings; setlocale(LC_NUMERIC, ""); print join " ", strtod("1.1"), "\n"'
1.1 0
# LANG=de_DE.UTF-8 /usr/bin/perl -MPOSIX -e 'use strict; use warnings; setlocale(LC_NUMERIC, ""); print join " ", strtod("1.1"), "\n"'
1 2

# /tmp/myperl/bin/perl5.29.9 -V
Summary of my perl5 (revision 5 version 29 subversion 9) configuration:

Platform:
osname=darwin
osvers=18.5.0
archname=darwin-thread-multi-2level
uname='darwin mac.fritz.box 18.5.0 darwin kernel version 18.5.0: mon mar 11 20:40:32 pdt 2019; root:xnu-4903.251.3~3release_x86_64 x86_64 '
config_args='-ds -e -Dprefix=/tmp/myperl -Dcc=clang -Dccflags= -mmacosx-version-min=10.11 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk -pipe -fno-common -DPERL_DARWIN -fno-strict-aliasing -fstack-protector-strong -Adefine:cppflags=-no-cpp-precomp -mmacosx-version-min=10.11 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk -pipe -fno-common -DPERL_DARWIN -fno-strict-aliasing -fstack-protector-strong -Adefine:ld=clang -Aappend:ldflags= -Wl,-syslibroot,/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk -mmacosx-version-min=10.11 -Aappend:lddlflags= -Wl,-syslibroot,/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk -mmacosx-version-min=10.11 -Dman3ext=3pm -Duseithreads -Dinc_version_list=none -Duserelocatableinc -Dusedevel'
hint=recommended
useposix=true
d_sigaction=define
useithreads=define
usemultiplicity=define
use64bitint=define
use64bitall=define
uselongdouble=undef
usemymalloc=n
default_inc_excludes_dot=define
bincompat5005=undef
Compiler:
cc='clang'
ccflags ='-mmacosx-version-min=10.11 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk -pipe -fno-common -DPERL_DARWIN -fno-strict-aliasing -fstack-protector-strong -DPERL_USE_SAFE_PUTENV'
optimize='-O3'
cppflags='-no-cpp-precomp -mmacosx-version-min=10.11 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk -pipe -fno-common -DPERL_DARWIN -fno-strict-aliasing -fstack-protector-strong'
ccversion=''
gccversion='4.2.1 Compatible Apple LLVM 10.0.1 (clang-1001.0.46.3)'
gccosandvers=''
intsize=4
longsize=8
ptrsize=8
doublesize=8
byteorder=12345678
doublekind=3
d_longlong=define
longlongsize=8
d_longdbl=define
longdblsize=16
longdblkind=3
ivtype='long'
ivsize=8
nvtype='double'
nvsize=8
Off_t='off_t'
lseeksize=8
alignbytes=8
prototype=define
Linker and Libraries:
ld='clang'
ldflags =' -mmacosx-version-min=10.14 -Wl,-syslibroot,/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk -mmacosx-version-min=10.11 -fstack-protector-strong'
libpth=/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/10.0.1/lib /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk/usr/lib /usr/lib
libs=-lpthread -ldbm -ldl -lm -lutil -lc
perllibs=-lpthread -ldl -lm -lutil -lc
libc=
so=dylib
useshrplib=false
libperl=libperl.a
gnulibc_version=''
Dynamic Linking:
dlsrc=dl_dlopen.xs
dlext=bundle
d_dlsymun=undef
ccdlflags=' '
cccdlflags=' '
lddlflags=' -mmacosx-version-min=10.14 -bundle -undefined dynamic_lookup -Wl,-syslibroot,/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk -mmacosx-version-min=10.11 -fstack-protector-strong'


Characteristics of this binary (from libperl):
Compile-time options:
HAS_TIMES
MULTIPLICITY
PERLIO_LAYERS
PERL_COPY_ON_WRITE
PERL_DONT_CREATE_GVSV
PERL_IMPLICIT_CONTEXT
PERL_MALLOC_WRAP
PERL_OP_PARENT
PERL_PRESERVE_IVUV
PERL_USE_DEVEL
PERL_USE_SAFE_PUTENV
USE_64_BIT_ALL
USE_64_BIT_INT
USE_ITHREADS
USE_LARGE_FILES
USE_LOCALE
USE_LOCALE_COLLATE
USE_LOCALE_CTYPE
USE_LOCALE_NUMERIC
USE_LOCALE_TIME
USE_PERLIO
USE_PERL_ATOF
USE_REENTRANT_API
USE_THREAD_SAFE_LOCALE
Built under darwin
Compiled at Apr 3 2019 22:40:04
@INC:
/tmp/myperl/lib/site_perl/5.29.9/darwin-thread-multi-2level
/tmp/myperl/lib/site_perl/5.29.9
/tmp/myperl/lib/5.29.9/darwin-thread-multi-2level
/tmp/myperl/lib/5.29.9
# /tmp/myperl/bin/perl5.29.9 -MDBD::Pg -e 'print $DBD::Pg::VERSION, "\n"'
3.7.4
# /tmp/myperl/bin/perl5.29.9 -MPOSIX -e 'use strict; use warnings; setlocale(LC_NUMERIC, ""); print join " ", strtod("1.1"), "\n"'
1.1 0
# LANG=de_DE.UTF-8 /tmp/myperl/bin/perl5.29.9 -MPOSIX -e 'use strict; use warnings; setlocale(LC_NUMERIC, ""); print join " ", strtod("1.1"), "\n"'
1 2

While this would seem to suggest that both use a locale-aware strtod in the
POSIX module, the reproducer still holds:

# /usr/bin/perl -I/tmp/dbdpg/lib/perl5 t.pl
1.1
# LANG=de_DE.UTF-8 /usr/bin/perl -I/tmp/dbdpg/lib/perl5 t.pl
1
# /tmp/myperl/bin/perl5.29.9 t.pl
1.1
# LANG=de_DE.UTF-8 /tmp/myperl/bin/perl5.29.9 t.pl
1.1

That would seem to suggest that DBD::Pg in my self-compiled perl uses another
strtod than the POSIX module. The same seems to be the case on Debian testing:

# perl -MDBD::Pg -e 'print $DBD::Pg::VERSION, "\n"'
3.9.1
# perl -MPOSIX -e 'use strict; use warnings; setlocale(LC_NUMERIC, ""); print join " ", strtod("1.1"), "\n"'
1.1 0
# LANG=de_DE.UTF-8 perl -MPOSIX -e 'use strict; use warnings; setlocale(LC_NUMERIC, ""); print join " ", strtod("1.1"), "\n"'
1 2
# perl t.pl
1.1
# LANG=de_DE.UTF-8 perl t.pl
1.1
# perl -V
Summary of my perl5 (revision 5 version 28 subversion 1) configuration:

Platform:
osname=linux
osvers=4.9.0
archname=x86_64-linux-gnu-thread-multi
uname='linux localhost 4.9.0 #1 smp debian 4.9.0 x86_64 gnulinux '
config_args='-Dusethreads -Duselargefiles -Dcc=x86_64-linux-gnu-gcc -Dcpp=x86_64-linux-gnu-cpp -Dld=x86_64-linux-gnu-gcc -Dccflags=-DDEBIAN -Wdate-time -D_FORTIFY_SOURCE=2 -g -O2 -fdebug-prefix-map=/build/perl-5WfRyb/perl-5.28.1=. -fstack-protector-strong -Wformat -Werror=format-security -Dldflags= -Wl,-z,relro -Dlddlflags=-shared -Wl,-z,relro -Dcccdlflags=-fPIC -Darchname=x86_64-linux-gnu -Dprefix=/usr -Dprivlib=/usr/share/perl/5.28 -Darchlib=/usr/lib/x86_64-linux-gnu/perl/5.28 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/x86_64-linux-gnu/perl5/5.28 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.28.1 -Dsitearch=/usr/local/lib/x86_64-linux-gnu/perl/5.28.1 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Duse64bitint -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Ud_ualarm -Uusesfio -Uusenm -Ui_libutil -Ui_xlocale -Uversiononly -DDEBUGGING=-g -Doptimize=-O2 -dEs -Duseshrplib -Dlibperl=libperl.so.5.28.1'
hint=recommended
useposix=true
d_sigaction=define
useithreads=define
usemultiplicity=define
use64bitint=define
use64bitall=define
uselongdouble=undef
usemymalloc=n
default_inc_excludes_dot=define
bincompat5005=undef
Compiler:
cc='x86_64-linux-gnu-gcc'
ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fwrapv -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'
optimize='-O2 -g'
cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fwrapv -fno-strict-aliasing -pipe -I/usr/local/include'
ccversion=''
gccversion='8.3.0'
gccosandvers=''
intsize=4
longsize=8
ptrsize=8
doublesize=8
byteorder=12345678
doublekind=3
d_longlong=define
longlongsize=8
d_longdbl=define
longdblsize=16
longdblkind=3
ivtype='long'
ivsize=8
nvtype='double'
nvsize=8
Off_t='off_t'
lseeksize=8
alignbytes=8
prototype=define
Linker and Libraries:
ld='x86_64-linux-gnu-gcc'
ldflags =' -fstack-protector-strong -L/usr/local/lib'
libpth=/usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/8/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib
libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt
perllibs=-ldl -lm -lpthread -lc -lcrypt
libc=libc-2.28.so
so=so
useshrplib=true
libperl=libperl.so.5.28
gnulibc_version='2.28'
Dynamic Linking:
dlsrc=dl_dlopen.xs
dlext=so
d_dlsymun=undef
ccdlflags='-Wl,-E'
cccdlflags='-fPIC'
lddlflags='-shared -L/usr/local/lib -fstack-protector-strong'


Characteristics of this binary (from libperl):
Compile-time options:
HAS_TIMES
MULTIPLICITY
PERLIO_LAYERS
PERL_COPY_ON_WRITE
PERL_DONT_CREATE_GVSV
PERL_IMPLICIT_CONTEXT
PERL_MALLOC_WRAP
PERL_OP_PARENT
PERL_PRESERVE_IVUV
USE_64_BIT_ALL
USE_64_BIT_INT
USE_ITHREADS
USE_LARGE_FILES
USE_LOCALE
USE_LOCALE_COLLATE
USE_LOCALE_CTYPE
USE_LOCALE_NUMERIC
USE_LOCALE_TIME
USE_PERLIO
USE_PERL_ATOF
USE_REENTRANT_API
Locally applied patches:
[...]
Built under linux
Compiled at Mar 31 2019 11:51:22
@INC:
/etc/perl
/usr/local/lib/x86_64-linux-gnu/perl/5.28.1
/usr/local/share/perl/5.28.1
/usr/lib/x86_64-linux-gnu/perl5/5.28
/usr/share/perl5
/usr/lib/x86_64-linux-gnu/perl/5.28
/usr/share/perl/5.28
/usr/local/lib/site_perl
/usr/lib/x86_64-linux-gnu/perl-base

> > > The text representation of values is whatever strings are
> > > produced and accepted by the input/output conversion functions
> > > for the particular data type.
> PostgreSQL always prints ".", not a locale-specific radix character. A
> locale-ignorant strtod() would suffice in DBD::Pg.

Awesome. Is there any way I can help with this?
--
Thanks,
Michael

Michael Weiser

Sep 2, 2019, 5:45:02 AM
to Noah Misch, dbd...@perl.org
Hi Noah,

On Sat, Aug 31, 2019 at 12:59:43AM -0700, Noah Misch wrote:

> > That would seem to suggest that DBD::Pg in my self-compiled perl uses another
> > strtod than the POSIX module. The same seems to be the case on Debian testing
> I now find https://perldoc.perl.org/perlxs.html explains this. It writes,
> 'starting in v5.22, perl tries to keep LC_NUMERIC always set to "C"'. Hence,
> one will see the problem only with DBD::Pg >= 3.6.0 and Perl < 5.22.

Ah.

> > > PostgreSQL always prints ".", not a locale-specific radix character. A
> > > locale-ignorant strtod() would suffice in DBD::Pg.
> > Awesome. Is there any way I can help with this?
> Would you like to write a patch?

Happy to. While pondering whether to fiddle with LANG/setlocale() around
a standard strtod() or to add a custom locale-unaware strtod(), I found
the STORE_*_LC_NUMERIC macros in perl.h
(https://github.com/Perl/perl5/blob/blead/perl.h#L6361), particularly
STORE_LC_NUMERIC_SET_STANDARD. This is used at least in Perl's own sv.c
and dump.c to get dotted decimals in debug output. More details are
given in perlxs CAVEATS (https://perldoc.perl.org/perlxs.html#CAVEATS)
and perlapi
(https://perldoc.perl.org/perlapi.html#STORE_LC_NUMERIC_SET_TO_NEEDED).

Unfortunately, it seems the whole mechanism was introduced only in perl
5.20, so would be unavailable in macOS's system perl 5.18, making the
point moot.

Perl itself does not seem to provide a locale-unaware strtod() that we
could use.

So we're back to the first two choices of fiddling with LANG/setlocale()
or implementing a custom strtod(). I'd prefer the latter. What are your
thoughts on this, particularly license-wise where to grab an
implementation from (BSD?)?
--
Thanks,
Michael

Michael Weiser

Sep 17, 2019, 9:15:03 AM
to Noah Misch, dbd...@perl.org
Hello Noah,

On Mon, Sep 02, 2019 at 10:06:00AM -0700, Noah Misch wrote:

> > What are your
> > thoughts on this, particularly license-wise where to grab an
> > implementation from (BSD?)?
> BSD is a good bet. More often than not, I use NetBSD as a source of library
> code like this.

I've finally had some time to look into this. The NetBSD as well as
glibc implementations of strtod() are huge. Duplicating them in DBD::Pg
just for this single call would be a nightmare IMO.

I did however stumble across strtod_l() which takes a locale as a third
argument. On OS X and the BSDs a NULL for this argument stands for the C
locale which would be exactly what we need here.

Unfortunately, glibc requires a valid locale_t pointer for the third
argument. This would lead to code like this:

diff --git a/dbdimp.c b/dbdimp.c
index c88321f..f261a86 100644
--- a/dbdimp.c
+++ b/dbdimp.c
@@ -3739,8 +3739,15 @@ AV * dbd_st_fetch (SV * sth, imp_sth_t * imp_sth)
                     break;
                 case PG_FLOAT4:
                 case PG_FLOAT8:
-                    sv_setnv(sv, strtod((char *)value, NULL));
-                    break;
+                    {
+                        /* use extended locale version of strtod with C
+                         * locale argument to force . as decimal point
+                         * character for perl versions < 5.19 */
+                        locale_t cloc = newlocale(LC_ALL_MASK, "C", NULL);
+                        sv_setnv(sv, strtod_l((char *)value, NULL, cloc));
+                        freelocale(cloc);
+                        break;
+                    }
                 default:
                     sv_setpvn(sv, (char *)value, value_len);
                 }

It should be possible to keep the C locale_t around in a static local or
global variable for performance. As a further optimization we could check
the decimal point character of the current locale and just keep using
strtod() if it's the dot already. This is discussed in the Lua thread
linked below as well as here: https://github.com/nlohmann/json/issues/302.
I'm not quite sure which direction they went in the end.
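
The two optimizations mentioned above (a cached C locale and a fast path
when the decimal point is already a dot) could look roughly like the sketch
below. It is not the patch from this thread: the function name strtod_c()
is made up, and it wraps a plain strtod() in uselocale() from POSIX.1-2008
instead of calling strtod_l(), so it carries its own portability
assumptions. The lazy initialization of the static is not thread-safe.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for locale_t, newlocale(), uselocale() */
#endif
#include <locale.h>
#include <stdlib.h>
#if defined(__APPLE__)
#include <xlocale.h>              /* macOS declares the locale_t API here */
#endif

/* Hypothetical helper: strtod() that always expects "." as decimal point. */
double strtod_c(const char *text, char **end)
{
    static locale_t c_locale = (locale_t)0;   /* created once, never freed */
    const struct lconv *lc = localeconv();
    locale_t previous;
    double value;

    /* Fast path: the active locale already uses "." as decimal point */
    if (lc->decimal_point[0] == '.' && lc->decimal_point[1] == '\0')
        return strtod(text, end);

    if (c_locale == (locale_t)0)
        c_locale = newlocale(LC_ALL_MASK, "C", (locale_t)0);
    if (c_locale == (locale_t)0)
        return strtod(text, end);             /* best effort on failure */

    previous = uselocale(c_locale);           /* switch this thread to "C" */
    value = strtod(text, end);
    uselocale(previous);                      /* restore the caller's locale */
    return value;
}
```

Switching the thread locale around every call is more intrusive than
strtod_l(), which is why this is only meant as a sketch of the caching and
fast-path idea and not as a recommendation.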

I was not able to find any better overview on standards conformance and
availability than this discussion of the same problem by the lua folks:
http://lua-users.org/lists/lua-l/2016-04/msg00215.html. Seems to me this
could severely constrain portability of DBD::Pg. Since only old perl
versions are affected, we could use ifdefs to conditionalise it and
maybe adopt a best-effort approach.

And finally: Without _GNU_SOURCE defined, the above code still compiles
and links with glibc but returns 0 with my 1,1 test case. So we'd likely
need to do some more testing and sanity checks here.

What do you think?
--
Michael

Michael Weiser

Sep 23, 2019, 6:15:03 AM
to Noah Misch, David Christensen, dbd...@perl.org
Hello Noah,

On Tue, Sep 17, 2019 at 11:07:21PM -0700, Noah Misch wrote:

> > I've finally had some time to look into this. The NetBSD as well as
> > glibc implementations of strtod() are huge. Duplicating them in DBD::Pg
> > just for this single call would be a nightmare IMO.
> That is unfortunate. How huge?

The NetBSD implementation itself is about 24KiB but isn't
self-contained. It depends to an unknown extent on a whole floating
point library gdtoa:

# du -sb src/lib/libc/gdtoa/strtod.c
23576 src/lib/libc/gdtoa/strtod.c
# wc -l src/lib/libc/gdtoa/strtod.c
1117 src/lib/libc/gdtoa/strtod.c
# du -sk src/lib/libc/gdtoa
1004 src/lib/libc/gdtoa
# wc -l src/lib/libc/gdtoa/*.[ch]
[...]
11107 total

Separating it out of that would be a medium-sized trial-and-error
project potentially introducing new bugs. And each change removing a
dependency would incur additional syncing effort in the future -
assuming that this library isn't totally bug-free and gets updates. cvs
log says that it's been quite stable since 2014 but there were some
updates this year.

The glibc implementation is twice the size but may be a bit more
self-contained (just a feeling based on a blog post I read about it).
However, it comes with a licensing issue.

> > I did however stumble across strtod_l() which takes a locale as a third
> > argument.
> That function is not portable enough:
> https://www.gnu.org/software/gnulib/manual/html_node/strtod_005fl.html

Found that as well and wanted to check how gnulib implements
replacements on those systems. But it simply doesn't. Duh.

On Wed, Sep 18, 2019 at 09:52:03AM -0500, David Christensen wrote:

> > That function is not portable enough:
> > https://www.gnu.org/software/gnulib/manual/html_node/strtod_005fl.html
> This code claims to be in the public domain; no idea if it’s any good, but looks like a reasonable size at least:

> https://gist.github.com/mattn/1890186

There's a comment from end of last year pointing out misbehaviour on a
specific input format.

OTOH: If we can be *really* certain about the format generated by
Postgres we might be able to get away with a very reduced version of
strtod_l(). A generic implementation deals with a number of different
formats such as 1.2e12, 0x prefix for hex base, infinity, nan and so on
which postgres might never generate.
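
To give an idea of the size of such a reduced parser, here is a rough
sketch. The name parse_pg_float() and the set of accepted spellings
(optional sign, decimal digits with an optional "." fraction, an optional
exponent, plus "NaN" and "Infinity") are assumptions made for this
illustration and have not been checked against the Postgres source. The
conversion is naive and not guaranteed to be correctly rounded for long
mantissas or large exponents, which is exactly what the full strtod()
implementations spend their size on.

```c
#include <math.h>
#include <string.h>

/* Hypothetical reduced parser: always uses "." as decimal point.
 * Returns 1 and stores the value in *out if all of `text` was accepted,
 * returns 0 otherwise. */
int parse_pg_float(const char *text, double *out)
{
    const char *p = text;
    double mantissa = 0.0;
    int negative = 0, digits = 0, frac_digits = 0;
    int exponent = 0, exp_negative = 0;

    if (*p == '-' || *p == '+')
        negative = (*p++ == '-');

    /* assumed special spellings */
    if (strcmp(p, "NaN") == 0) { *out = NAN; return 1; }
    if (strcmp(p, "Infinity") == 0) {
        *out = negative ? -INFINITY : INFINITY;
        return 1;
    }

    for (; *p >= '0' && *p <= '9'; p++, digits++)
        mantissa = mantissa * 10.0 + (*p - '0');
    if (*p == '.') {
        for (p++; *p >= '0' && *p <= '9'; p++, digits++, frac_digits++)
            mantissa = mantissa * 10.0 + (*p - '0');
    }
    if (digits == 0)
        return 0;                       /* no digits at all */

    if (*p == 'e' || *p == 'E') {
        int exp_digits = 0;
        p++;
        if (*p == '-' || *p == '+')
            exp_negative = (*p++ == '-');
        for (; *p >= '0' && *p <= '9'; p++, exp_digits++) {
            if (exponent < 10000)       /* clamp to avoid int overflow */
                exponent = exponent * 10 + (*p - '0');
        }
        if (exp_digits == 0)
            return 0;                   /* "1e" without digits */
    }
    if (*p != '\0')
        return 0;                       /* trailing garbage such as ",1" */

    /* fold the fraction length into the exponent and scale naively */
    exponent = (exp_negative ? -exponent : exponent) - frac_digits;
    for (; exponent > 0; exponent--) mantissa *= 10.0;
    for (; exponent < 0; exponent++) mantissa /= 10.0;

    *out = negative ? -mantissa : mantissa;
    return 1;
}
```

Unlike strtod(), this rejects "1,1" outright instead of silently returning
1, so a caller could fall back to storing the value as a string.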

I'm really starting to wonder if solving a problem that's already been
solved by perl itself is worth the effort. If only Apple were to update
their system perl once every five years or so we'd be golden.
--
Thanks,
Michael

Michael Weiser

Sep 26, 2019, 1:30:03 PM
to Noah Misch, David Christensen, dbd...@perl.org
Hello Noah,

On Tue, Sep 24, 2019 at 12:06:33AM -0700, Noah Misch wrote:

> Another possibility is to disable the strtod() call when building against older
> Perl; instead, use sv_setpv() to store the float as a string. Perl would

> Does that approach have material disadvantages?

I like this approach since it aligns with what a pre-3.6.0 version of
DBD::Pg silently does with a current perl as well. I have been using
it this way for a couple of years and have not noticed any problems.

BTW: What *was* the trigger for adding strtod() in the first place if
perl does the conversion on the fly?

Should I try my luck at a patch or is it too trivial now?
--
Thanks, Michael