
best practice for loading large files into SQL Server database


Phil Johnson

Nov 23, 2007, 7:33:00 AM
Hello,

I am working on a piece of code that takes a file and loads it into a SQL
Server DB.

When I load large files (over 70 MB) I get a system out of memory exception.
My machine has 2 GB of RAM.

I think the problem is that a file stream pointing to the file on the file
system is loaded into a memory stream one byte at a time. Does anybody have
any pointers on doing this type of thing and how to do this so that the file
is actually streamed from the file system right into the SQL Server DB?

Any advice or links to articles on how to do this best etc would be really
appreciated.

Thanks in advance for any help

--
Regards,

Phil Johnson (MCAD)

Marc Gravell

Nov 23, 2007, 7:42:37 AM
Well, what is the file and what code are you using to look at it?

To insert lots of rows (i.e. a huge CSV) you should be using
SqlBulkCopy with a streaming IDataReader; I recommend the one here:
http://www.codeproject.com/cs/database/CsvReader.asp
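The key point of the SqlBulkCopy + streaming-reader approach is that only one batch of rows is ever held in memory. A minimal Python sketch of that idea (Python rather than the C# the thread implies, with the database call stubbed out as a comment):

```python
import csv
import io

def stream_rows(fileobj, batch_size=1000):
    """Yield batches of parsed CSV rows without loading the whole file.

    Mirrors the idea behind SqlBulkCopy with a streaming IDataReader:
    the loader only ever holds one batch of rows in memory at a time.
    """
    reader = csv.reader(fileobj)
    batch = []
    for row in reader:
        batch.append(row)
        if len(batch) >= batch_size:
            yield batch
            batch = []
    if batch:
        yield batch

# Demo with an in-memory "file"; a real loader would hand each batch
# to the database's bulk-insert API instead of collecting it.
data = io.StringIO("\n".join(f"{i},name{i}" for i in range(2500)))
batches = list(stream_rows(data, batch_size=1000))
```

With 2500 rows and a batch size of 1000, peak memory is one 1000-row batch rather than the whole file, which is what avoids the out-of-memory exception for large inputs.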

To insert a single huge BLOB, you must use chunking techniques - i.e.
you need to allocate a buffer (8040 bytes is optimal for SQL Server, I
believe) and read buffer-fulls of info, and pass this down to the db
in chunks, using the appropriate BLOB SQL commands [the commands
change between SQL 2000 with "image", and SQL 2005 with
"varbinary(max)"].

If you try and load the entire BLOB into memory (as a huge byte[]),
yes it will fail.

If you try and load the entire file into memory (as a DOM, DataTable,
etc), yes it will fail.
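The chunked read Marc describes can be sketched like this (again in Python, with the per-chunk SQL command left as a comment; the 8040-byte figure is Marc's):

```python
import io

def read_chunks(fileobj, chunk_size=8040):
    """Read a file in fixed-size chunks instead of one huge byte array.

    Each chunk would then be appended to the BLOB column in its own
    round trip, e.g. with UPDATETEXT (SQL Server 2000, "image") or
    UPDATE t SET col.WRITE(@chunk, NULL, NULL) WHERE ...
    (SQL Server 2005, "varbinary(max)").
    """
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk

# Stand-in for a large file on disk; only one chunk is in memory at a time.
blob = io.BytesIO(b"x" * 20000)
sizes = [len(c) for c in read_chunks(blob)]
```

A 20,000-byte input comes through as chunks of 8040, 8040, and 3920 bytes; the same loop handles a 70 MB file without ever materialising it in memory.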

Marc


Marc Gravell

Nov 23, 2007, 7:47:10 AM

Phil Johnson

Nov 23, 2007, 10:21:00 AM
Thanks Marc,

It is indeed a BLOB that I am loading, not a csv.

Those links look ideal.

Thanks again.
--
Regards,

Phil Johnson (MCAD)

Marc Gravell

Nov 23, 2007, 10:45:37 AM
No problem. If you need fewer round trips when saving, I believe that
any multiple of 8040 will work; likewise, if you are using SQL Server
2005 you should definitely use the varbinary(max) and associated
commands (instead of UPDATETEXT) - but the "read" stays the same.
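The round-trip saving from a larger multiple-of-8040 buffer is easy to quantify. A quick arithmetic sketch (the multiple of 4 is an arbitrary illustrative choice, not a recommendation from the thread):

```python
import math

CHUNK = 8040                     # Marc's per-chunk figure for SQL Server
MULTIPLE = 4                     # hypothetical choice; any multiple works
buffer_size = CHUNK * MULTIPLE   # 32,160 bytes sent per round trip

file_size = 70 * 1024 * 1024     # the ~70 MB file from the original post
round_trips_small = math.ceil(file_size / CHUNK)
round_trips_big = math.ceil(file_size / buffer_size)
```

For a 70 MB file that is 9130 round trips at 8040 bytes versus 2283 at four times that, so batching chunks cuts the chatter to the server roughly in proportion to the multiple chosen.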

For info, SQL Server 2008 has a new file-stream data type / attribute
(or something!). I haven't played with it yet, but that may also be
useful for file-based scenarios. YMMV.

Marc

