Number of Header Lines

42 views
Skip to first unread message

GeoK

unread,
Aug 20, 2009, 11:13:41 AM8/20/09
to CSVChat
I am working on a project at the moment where most of the csv files
have two text header lines.

Can you tell csvdatareader to skip two headers lines instead of one ?

Thanks
George

shriop

unread,
Aug 20, 2009, 11:31:13 AM8/20/09
to CSVChat
Yes, there's a solution I think if I understand the problem. No, you
can't directly tell CsvDataReader that there's two header lines. I'm
making the assumption that the first header line is some bogus data
that you just need to skip, so my solution may not work if that
assumption is incorrect. If you create an instance of StreamReader
passing in the filename, then call ReadLine on the StreamReader, then
pass the StreamReader into the constructor overload for CsvDataReader
that takes a TextReader, CsvDataReader will pick up after the first
line. Does this accomplish what you needed? If not, can you give some
more details about what you need to do with the two header lines and
how they're formatted?

Bruce Dunwiddie

GeoK

unread,
Aug 20, 2009, 2:20:41 PM8/20/09
to CSVChat
Hi Bruce,

In the CSV file it is the first line that is required, the second is
not.

The header format is along the lines of:

(line1) date-time,interval,speed,voltage
(line2) ,,cm-sec,

The "cm-sec" corresponds to "speed". The system that generates the csv
file puts it on two lines instead of "speed cm-sec". So, need to keep
the first line as it contains the headers and discard the second.

Thanks
George

shriop

unread,
Aug 20, 2009, 5:27:37 PM8/20/09
to CSVChat
Since it looks like the second line is formatted as an actual data
line, you can attach to the ReadRecord event of CsvDataReader, and in
that callback you can check the values of columns as strings before
they are parsed into the destination data type, and the event
arguments support a Skip property that can be set to true. So you can
check to see if the speed column has a value of "cm-sec", and if so,
skip that record.

Bruce Dunwiddie
> > > George- Hide quoted text -
>
> - Show quoted text -

GeoK

unread,
Aug 31, 2009, 9:18:15 AM8/31/09
to CSVChat
Do you have an example on how I might use the ReadRecord event ?

shriop

unread,
Sep 1, 2009, 1:28:21 PM9/1/09
to CSVChat
Try this earlier post, http://groups.google.com/group/CSVChat/browse_thread/thread/972d75d46da34c24
. Let me know if you need something more drawn out. I need to probably
put up some more complicated examples on the site, I just need to find
good examples that don't just add to the confusion.

Bruce Dunwiddie
> > > - Show quoted text -- Hide quoted text -

bellsy

unread,
Sep 7, 2009, 10:13:12 PM9/7/09
to CSVChat
I've been using the demo and have wondered how to do this, but I can't
quite figure out what you're trying to do - I'm fairly new to VB so
you'll have to bear with me!

I was originally declaring my Excel reader like this (where sourcePath
is the filename) :

Dim reader As New XlsReader(sourcePath)

reader.Settings.HasHeaders = True

This works fine, however as you've stated this means that the headers
must be on the first row. Some of the files here at work have 1 or
more lines prior to the header row, and I'd like not to have to
manually remove those lines or change other systems to accomodate
(where possible). I've tried to follow your solution, however the
declaration of the XlsDataReader returns an error 'Overload Resolution
Failed':

Dim strRead As New StreamReader(sourcePath)

strRead.ReadLine()

Dim xlsDR As New XlsDataReader(strRead)

xlsDR.Settings.HasHeaders = True


Where am I going wrong?

Thanks!
bellsy

shriop

unread,
Sep 8, 2009, 11:13:08 AM9/8/09
to CSVChat
It's the difference between CsvDataReader that works with CSV data and
XlsDataReader that works with Excel files. The strategy I was
suggesting won't work with Excel files. You didn't say what you were
needing to do with the data in the file. If you need to load the data
into SQL Server, here's my best suggestion. You would use XlsReader
instead of XlsDataReader. You would set the file as not having
headers. You would grab the DataTable from the Table property. You
would then be able to delete certain records from the DataTable that
you don't need. You can then grab an IDataReader instance from the
DataTable's CreateReader method and use it with SqlBulkCopy to bulk
insert into the table. You'll either have to map the columns by
ordinal, or change the DataColumn's names when you find the header row
to be able to map by name.

Bruce Dunwiddie
> > - Show quoted text -- Hide quoted text -

bellsy

unread,
Sep 8, 2009, 7:09:00 PM9/8/09
to CSVChat
Sorry I should have been more descriptive in what I'm doing.

I'm feeding XlsReader (defined as reader) with the path to a file that
I'm then searching for a particular record on a particular date, with
unique records on each row and dates across in columns.

eg. I am searching for record 'def' on the '3/1/2009', which returns
me the value 5.

Record 1/1/2009 2/1/2009 3/1/2009
abc 1 1.5 2.1
def 3 4 5
ghi 2 4.1 10

I have this working successfully by storing the headers in an array
using reader.Headers, which I use it to find the correct column. I
then use reader.ReadRecord() to loop through each row until I find the
datapoint in the first column, which then gives me the row number. To
get the value I then use reader.Item(row#, column#) to give me the
value I'm looking for.

Everything works 100% fine, except a number of the files I need to
import have one or more rows above the 'header row'. In my development
I've just removed the rows manually to get it working, however in
practise I want the user to specify on which row the headers reside.

Any ideas?

It would be great if you could set a parameter such as reader.HeaderRow
(row#), and then the reader would just pick up the data from the rows
below it.

Thanks!
bellsy

shriop

unread,
Sep 9, 2009, 12:36:53 PM9/9/09
to CSVChat
reader.Headers is settable. reader.CurrentRecord is also settable. You
can also get an array of the current record's values from
reader.Values. When you combine all of these I'm seeing something
like: set reader.CurrentRecord to the row number that you're expecting
the headers to be on. Then set reader.Headers = reader.Values. Then
you can just start your loop like you have been looping over
reader.ReadRecord unless there are also non data rows between the
header row and first data row, which you could just again use
reader.CurrentRecord to jump to the first data row.

Bruce Dunwiddie

bellsy

unread,
Sep 9, 2009, 9:21:24 PM9/9/09
to CSVChat
Works a treat - thanks!

For anyone else trying to do the same, here is the code for setting up
the headers:

Dim reader As New XlsReader(sourcePath)

reader.CurrentRecord = sourceExcelHeaderRow - 1
reader.Headers = reader.Values
Dim headers As String() = reader.Headers

With sourcePath being the filename, and sourceExcelHeaderRow being the
row number specified by the user.
Reply all
Reply to author
Forward
0 new messages