
bulk insert: ignore blank rows


John A Grandy

Sep 11, 2008, 10:41:03 PM
My testing shows that for a table with at least one non-nullable column,
BULK INSERT ignores blank rows. (Note that I am not using a format file.)

Is BULK INSERT itself ignoring blank rows before it attempts to insert
them into the table?

Or is BULK INSERT attempting to load each blank row, failing on the
attempt to insert NULL into a non-nullable column, and moving on?
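For reference, a sketch of the kind of test I ran (the table, file
path, and file contents here are made up for illustration):

-- Data file c:\temp\blanks.csv (hypothetical), with a blank line
-- between the two records:
--
--   AAA,1
--
--   BBB,2

CREATE TABLE dbo.BlankRowTest (
    col1 varchar(20) NOT NULL,   -- at least one non-nullable column
    col2 int NULL
)

BULK INSERT dbo.BlankRowTest
FROM 'c:\temp\blanks.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n')

SELECT * FROM dbo.BlankRowTest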


Erland Sommarskog

Sep 12, 2008, 5:54:46 PM
I would seriously doubt that BULK INSERT ignores any bytes in the file.
And, yes, I said bytes. BULK INSERT reads bytes, not rows. If you have
run with the defaults, and you have a file that goes:

ABC 123 DEF

XYZ 99 POL

(Assume tabs where the spaces are.)

I expect this to load two rows into the table. The first row will have
the tuple ('ABC', 123, 'DEF') and the second ('\r\nXYZ', 99, 'POL').
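In T-SQL, that test would look something like this (the table name
and file path are made up):

create table tabtest (a varchar(10) NULL,
                      b int NULL,
                      c varchar(10) NULL)
go
-- Run with the defaults (tab between fields, newline at the end
-- of the last field).
bulk insert tabtest from 'c:\temp\tabs.txt'
go
-- datalength() exposes the line break that the blank line left
-- at the start of the second row's first field.
select datalength(a) as len_a, * from tabtest
go
drop table tabtest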

--
Erland Sommarskog, SQL Server MVP, esq...@sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

John A Grandy

Sep 12, 2008, 8:34:32 PM
Ok, but I have

FIELDTERMINATOR = ','

and

ROWTERMINATOR = '\n'

so it knows the blank rows are there ...


"Erland Sommarskog" <esq...@sommarskog.se> wrote in message
news:Xns9B17F343A...@127.0.0.1...

Erland Sommarskog

Sep 13, 2008, 5:24:16 AM
John A Grandy (johnagrandy-at-gmail-dot-com) writes:
> Ok, but I have
>
> FIELDTERMINATOR = ','
>
> and
>
> ROWTERMINATOR = '\n'
>
> so it knows the blank rows are there ...

I'm not sure that I understand what you are trying to say here. As I said,
BULK INSERT does not know about rows. What you specify as the row
terminator is simply the terminator for the last field. If BULK INSERT
sees a line break while it is reading the bytes for another field, that
line break is regarded as data.

Take this file (the hyphens are not part of the file):

--------------
ABC,122,DEF

XYZ,989,PLJ
--------------

And this SQL:

create table gurka (first varchar(20) NULL,
                    second int NULL,
                    third varchar(29) NULL)
go
bulk insert gurka from 'c:\temp\slask.txt'
with (fieldterminator=',', rowterminator='\n')
go
select datalength(first),* from gurka
go
drop table gurka

You get this output in text mode:


(2 row(s) affected)
            first                second      third
----------- -------------------- ----------- ----------------------------
3           ABC                  122         DEF
5
XYZ                  989         PLJ

(2 row(s) affected)
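If you want the blank lines gone from the result anyway, one option is
to repair the first column after the load; a sketch, assuming at most
one blank line between any two records:

-- Strip the CR+LF that a preceding blank line leaves at the start
-- of the first column.
update gurka
set    first = stuff(first, 1, 2, '')
where  left(first, 2) = char(13) + char(10)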

Roy Harvey (SQL Server MVP)

Sep 13, 2008, 6:46:04 AM
On Fri, 12 Sep 2008 17:34:32 -0700, "John A Grandy"
<johnagrandy-at-gmail-dot-com> wrote:

>Ok, but I have
>
>FIELDTERMINATOR = ','
>
>and
>
>ROWTERMINATOR = '\n'
>
>so it knows the blank rows are there ...

Another way to say what Erland was explaining....

It is easy to think that BULK INSERT first parses the input data into
rows, and then parses the rows into columns. The problem is, that
isn't what happens. It treats the file as a stream of columns, and
parses each column in turn. So the first column ends with the first
comma and the second column ends with the second comma and the third
column ends with the first row terminator after the second comma, and
the fourth column ends with the first comma after the row terminator
after the second comma, and on and on.

What this means is that any line that does not have EXACTLY the right
combination of terminators screws up at least two lines. If the
second comma is missing, the parser finds one on the next line of the
file, and then everything else on that line becomes part of the column
ended by the row terminator. If a blank line (with both commas
missing) is followed by a normal line, the entire blank line becomes
part of the first column of the next row.
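To make that concrete, here is a sketch of the missing-comma case
(the file contents, table, and path are invented for illustration):

-- Data file c:\temp\bad.txt (hypothetical); the second line is
-- missing its second comma:
--
--   A,1,X
--   B,2Y
--   C,3,Z

create table demo (c1 varchar(20) NULL,
                   c2 varchar(20) NULL,
                   c3 varchar(20) NULL)
go
bulk insert demo from 'c:\temp\bad.txt'
with (fieldterminator = ',', rowterminator = '\n')
go
-- I would expect two rows, not three: ('A', '1', 'X'), and then
-- ('B', '2Y<CR><LF>C', '3,Z') -- the second column swallows the
-- line break plus the next line's 'C', and the third column keeps
-- '3,Z' because it only ends at the row terminator, not at a comma.
select * from demo
go
drop table demo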

BCP worked this way when Sybase wrote it for UNIX (which is very
"stream" oriented) back before release 1.1, and BULK INSERT was
written to follow the established BCP conventions.

Roy Harvey
Beacon Falls, CT
