Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

LOAD TABLE (or ISQL INPUT) handling of quotes

140 views
Skip to first unread message

davidwinkler

unread,
Nov 17, 2009, 4:06:44 PM11/17/09
to
Running SQL Anywhere 11.0.1.2308, I am unable to
successfully load pipe-delimited, unquoted data via the LOAD
TABLE statement or INPUT ISQL command when the data value
contains an initial apostrophe immediately following a
delimiter. Here's the statement:

load table foo using client file 'C:\\bar.txt' delimited by
'|' quotes off;

If the apostrophe or apostrophes are in any other position
than the first character in a data element they all load
fine; when it's the first position, I'll typically get a
right-truncation error as the remainder of the line of text
is apparently trying to get stuffed into a single field.
Curiously, when an initial apostrophe is used, a second
apostrophe is treated as the closing quote (even with the
QUOTES OFF option) and remaining characters after the
closing quote but before the next delimiter are "dropped on
the floor" by LOAD TABLE. To try the same test (iAnywhere
folks), the table data I'm trying to load is here:

http://wireless.fcc.gov/uls/data/complete/l_ship.zip
(problem data lines are in the larger files)

as defined in DDL here:

http://wireless.fcc.gov/uls/data/documentation/pa_ddef44.txt

Maybe a bug, maybe I'm nuts. Confirmation either way is
appreciated!

John Smirnios [Sybase]

unread,
Nov 17, 2009, 5:04:29 PM11/17/09
to
The problem is not the quotes. Lines 275680 & 275681 of en.dat (for
example) are actually a single record:

EN|2373743|||WDA6635|L|L00115054|Academy of Natural Sciences
Estuarine Research Center|||||| (...)

LOAD TABLE and INPUT will interpret them as separate records so it is
trying to cram 'Estuarine Research Center' into a "record_type" column
which is char(2).

You need to do one of the following:
1. Verify whether any of these fields actually should/can contain
carriage returns. Maybe there shouldn't be a newline character after
"Academy of Natural Sciences". If carriage returns are not allowed,
cleanse your original data.
2. Use escapes in your original data file so that carriage returns don't
occur inside column data (eg. Academy of Natural Sciences \x0aEstuarine
Research Center). Remember to double-up backslashes in the original data
too.
3. Use a different end-of-record marker via "ROW DELIMITED BY" and make
sure your source files are corrected to use that row delimiter too.

-john.
--
John Smirnios
Senior Software Developer
iAnywhere Solutions Engineering

Whitepapers, TechDocs, bug fixes are all available through the iAnywhere
Developer Community at http://www.ianywhere.com/developer

davidwinkler

unread,
Nov 18, 2009, 10:03:15 AM11/18/09
to
Thanks, John for the quick response. I apologize that I
didn't point out a specific file & line number in my earlier
post, but try importing the file PUBACC_SH, which contains
the following on line # 24173:

SH|1330925|||WAL9515|R||PL|PH|'LECTRIC
LADY|WN2160U|Y|N||||8||| ||366051610|||||

This line exhibits the behavior I described in my original
post which I believe could be a bug.

The line you had identified below is certainly bad data
formatting on the FCC's part by not escaping CRLF's within
varchar fields for an ASCII export, but any problems with
trying to import it would be expected as normal behavior of
the DBMS.

Thanks again for looking into this!

Dave Winkler
Data Architect, Stanley Associates

John Smirnios [Sybase]

unread,
Nov 18, 2009, 2:29:42 PM11/18/09
to
The problem with that file isn't the quotes either. Line 303464 contains
the following (all on one line -- the newsgroup editor is word-wrapping):

SH|2956069|||WDE2090|R||FV|PH|Bay Fisher ||
(2)|1196757|N|N|||32|15|||||367320280|N|N|Y|Y|N

So that's
'Bay Fisher ' as ship_name varchar(35)
'' as ship_number char(12)
' (2)' as international_voyages char(2)
which, of course, doesn't fit.

Looking at the rest of the lines, I'd guess that the ship's name is
supposed to be 'Bay Fisher || (2)' which is a bit strange. So the source
data is either invalid or, at the very least, contains strings with
unescaped column delimiters.

I will take from this an implicit enhancement request to report the line
number on which a LOAD TABLE fails as well as the value & column that
caused the problem. :)

Breck Carter [TeamSybase]

unread,
Nov 18, 2009, 3:21:12 PM11/18/09
to
On 18 Nov 2009 11:29:42 -0800, "John Smirnios [Sybase]"
<smirnios_at_sybase.com> wrote:

>I will take from this an implicit enhancement request to report the line
>number on which a LOAD TABLE fails as well as the value & column that
>caused the problem. :)

Ohhhhh yeah, me too! I want to vote yes!

Breck

--
Breck Carter - Blog: http://sqlanywhere.blogspot.com/

SQLA questions and answers: http://sqla.stackexchange.com

RisingRoad helps SQL Anywhere developers make better databases
http://www.risingroad.com/
Breck.Carter at gmail

davidwinkler

unread,
Nov 18, 2009, 6:27:57 PM11/18/09
to
I wonder if it's possible that the behavior which I've very
specifically described regarding the apostrophe immediately
following the delimiter is actually only occurring when
using the INPUT command in ISQL:

input into FCC.PUBACC_SH from C:\SH.dat delimited by '|';

since it reports the line which it fails on, which is
whatever the first line with a delimiter then an apostrophe
(in this case the 'LECTRIC LADY). Because INPUT INTO will
not roll back the preceding lines it has imported, I can
observe the aberrant behavior I described, and maybe LOAD
TABLE is successfully processing that line?

OK, I just tried another test with the line in question &
the lines just before and after it (so my source file to
test is just 3 lines). You're correct; LOAD TABLE is
failing for other reasons, it is only INPUT INTO which
behaves badly, just as I'd described. Excellent suggestion
though about the LOAD TABLE enhancement, because if it
reported which line it was failing on I wouldn't have leaped
to the conclusion that LOAD and INPUT were failing on the
same line. Since you've spent this much time with this file
(sorry!) could you test out the INPUT command? Throw an old
programmer a bone and tell me I'm right about this? Thanks,
John!

anil k goel [Sybase]

unread,
Nov 19, 2009, 12:58:11 PM11/19/09
to
Also, if it is a problem getting the load file to the server machine, take a
look at a new feature in SA11 called client file loading.

LOAD TABLE USING CLIENT FILE 'abc' ...

makes the server transparently fetch the contents of file 'abc' from the
client machine instead of looking for the file on the server machine.

In that sense, LOAD TABLE USING CLIENT FILE can be a server side alternative
to DBISQL's INPUT statement.

-anil

"John Smirnios [Sybase]" <smirnios_at_sybase.com> wrote in message
news:4b057bc4@forums-1-dub...
> Apparently, dbisql's INPUT statement doesn't have a QUOTES OFF option so
> that would explain the behaviour. It is always behaving the same as LOAD
> TABLE with QUOTES ON and so the error you are seeing is expected.
>
> I have no idea why such an important option would have been left off of
> the INPUT statement implementation. You should make an enhancement request
> for dbisql to support the QUOTES option. Dbisql is handled by another team
> so it's better if you make the request.
>
> In the mean time, you can use LOAD TABLE or get the data properly
> sanitized (escapes where necessary, use quotes, etc) so that it can be
> used by LOAD or INPUT. You should be able to throw together a quick Python
> or Perl script to sanitize the data into a format appropriate for either
> input method.
>
> If the FCC provides the raw data in any other format, that format may be
> more convenient. For example if the data is in a format that can be
> accessed via an ODBC driver, you could also use remote tables to import
> the data.

0 new messages