1GB bloat while importing

17 views
Skip to first unread message

George Nicholson

unread,
Oct 18, 2007, 1:24:00 PM10/18/07
to
I've run into a problem that I don't quite understand, so at a loss to a
solution.

SETUP:
I have a mdb file which includes 2 tables: datOrderHeader * datOrderDetail.

I have weekly csv files that need to be imported into these tables. The 1st
35 values in the csv go into datOrderHeader. The remaining csv values go
into datOrderDetail in groups of 3 per record. i.e., values 36, 37, 38 make
one detail record. 39, 40, 41 (if they exist) make a second detail record
for that order, etc., etc. No limit to the number of detail records per
order (so no telling how long my longest csv record will be either and the
import wizard is useless unless your longest record is included in 1st 20 or
so records, which is why I'm doing it this way).

I have working code that prompts user for file location, opens the file,
reads a line, adds the data to my tables, reads next line, etc. and it works
fine.

HERE IS MY PROBLEM: I start with a compacted mdb of 354mb. I try to import a
csv file with 434k records and my code crashes at record 409k because the
mdb is now 2gb (Jet file size limit). That (incomplete) file compacts to
514mb.

What the *!#% is causing bloat of 1.5GB???? No temp tables. No field
modifications. No deletions of anything. Simply open a text file and add
records.

Here is most of the code I'm using (sorry for length). There are a couple of
"GetYadayada" functions which are all very straightforward "evaluate &
return" functions. One of them has a Dlookup, otherwise they are all self
contained, so I seriously doubt they are causing any bloat.

*Any* insights into the bloat issue would be appreciated. I'd love to be
able to eliminate as much the bloat as I can so I can import complete files
in one pass.

TIA,

****************************

Public Function ReadCSVData(Optional strFilename As String) As Boolean
' Purpose: read comma-delimited data from text file and store in Access
tables

' In: name of text file containing information
' Out:
On Error GoTo ErrHandler

Dim db As DAO.Database
Dim rsHeader As DAO.Recordset
Dim rsDetail As DAO.Recordset

Dim str As String
Dim strArray() As String
Dim intE As Integer 'E for Element(s) of Array
Dim strDescription As String
Dim iOrdCount As Long
Dim iBatchLineItems As Long
Dim lngOrderID As Long
Dim iDetailFields As Long
Dim varSysCmd As Variant
Dim dtmStart As Date
Dim strElapsed As String
Dim iLineItems As Long
Dim iTotalItems As Long


ReadCSVData = False

If Len(strFilename) = 0 Then
strFilename = CStr(CommonFileOpenSave)
If Len(Dir(strFilename)) = 0 Then
MsgBox "Cancelling: User canceled or file does not exist"
GoTo ExitHere
End If
End If

Do Until Len(strDescription) > 0
strDescription = InputBox("Enter the DataBatchID # for this data or
type 'CANCEL'.")
Select Case strDescription
Case ""
'Do nothing. Loop
Case "Cancel", "CANCEL"
GoTo ExitHere
Case Else
If IsNumeric(strDescription) Then
'Continue
Else
' invalid entry
strDescription = ""
End If
End Select
Loop

dtmStart = Now()
Set db = CurrentDb
Set rsHeader = db.OpenRecordset("datOrderSummary")
Set rsDetail = db.OpenRecordset("datOrderDetail")

Open strFilename For Input As #1
Do While Not EOF(1)
Line Input #1, str

strArray = Split(str, ",", -1)
iLineItems = 0
iTotalItems = 0
If UBound(strArray) < 35 Then
MsgBox "Logic Error in code for 'ReadCSVData'. Record is shorter
than expected."
GoTo NextRecord
End If

With rsHeader
.AddNew

iOrdCount = iOrdCount + 1
For intE = LBound(strArray) To 35
Select Case intE + 1
' The + 1 is strictly for convenience and ease of
debugging/reading,
' so the case statements correspond to the documented
field numbering.
Case 1
!StoreNo = CLng(strArray(intE))
Case 2
!TCDate = CDate(strArray(intE))
Case 3
!TCTime = CDate(strArray(intE))
Case 4
!StorewideTCNum = CLng(strArray(intE))
Case 5
!KSNo = CInt(strArray(intE))
Case 6
!KSOrdNo = CLng(strArray(intE))
Case 7
!NetAmount = CCur(strArray(intE))
Case 8
!Tax = CCur(strArray(intE))
Case 9
!NonProductAmt = CCur(strArray(intE))
Case 10
!DiscAmount = CCur(strArray(intE))
Case 11
!GCertRedeemedAmt = CCur(strArray(intE))
Case 12
!GCardRedeemedAmt = CCur(strArray(intE))
Case 13
!GCardRedeemedQty = CInt(strArray(intE))
Case 14
!GCertSoldAmt = CCur(strArray(intE))
Case 15
!GCardSoldAmt = CCur(strArray(intE))
Case 16
!GCardSoldQty = CInt(strArray(intE))
Case 17
!Tendered = CCur(strArray(intE))
Case 18
!PaymentType = CByte(strArray(intE))
Case 19
!DTFlag = CByte(strArray(intE))
If !DTFlag = 0 Then !DTFlag = 2
Case 20
!CarryOutFlag = CByte(strArray(intE))
Case 21
!RefundFlag = CByte(strArray(intE))
Case 22
!EmpDiscFlag = CByte(strArray(intE))
Case 23
!MgrDiscFlag = CByte(strArray(intE))
Case 24
!OtherDiscFlag = CByte(strArray(intE))
Case 25
!OverringFlag = CByte(strArray(intE))
Case 26
!OtherReceiptFlag = CByte(strArray(intE))
Case 27
![Stored/HeldFlag] = CByte(strArray(intE))
Case 28
!KioskFlag = CByte(strArray(intE))
Case 29
!KVSPrepLine = CByte(strArray(intE))
Case 30
!TotalServiceTime = CLng(strArray(intE))
Case 31
!OrderTime = CLng(strArray(intE))
Case 32
!LineOrAssemblyTime = CLng(strArray(intE))
Case 33
!WindowOrCashierTime = CLng(strArray(intE))
Case 34
!ServeOrStorageTime = CLng(strArray(intE))
Case 35
!HoldOrGlobalTime = CLng(strArray(intE))
Case 36
!POSItemCount = CInt(strArray(intE))
Case Else
MsgBox "Error: Logic error in code 'ReadCSVData'.
Incorrect Header field numbering."
End Select
Next intE

lngOrderID = !OrderID
!AddedOn = Now()
!DataBatchID = CLng(strDescription)
!Weekending = GetWeekending(!TCDate)
!Weekpart = GetWeekPart(!TCDate)
!Daypart = GetDayPart(!TCDate, !TCTime)
!QHour = GetQHour(!TCTime)
.Update
End With

If UBound(strArray) = 35 Then
' No detail records apparently
GoTo NextRecord
End If
iDetailFields = UBound(strArray) - 35
If iDetailFields Mod 3 <> 0 Then
MsgBox "Error: Logic error in code 'ReadCSVData'. Incorrect
LineItem field count."
End If

With rsDetail
For intE = 36 To UBound(strArray)
Select Case intE Mod 3
Case 0
'MenuItemID
.AddNew
iBatchLineItems = iBatchLineItems + 1
!Referenced = False
!OriginalSequence = CByte((intE - 33) / 3)
!OrderID = lngOrderID
!MenuItemID = Format(strArray(intE), "00000")
Case 1
'QtyServed
!QtyServed = CInt(strArray(intE))
If !QtyServed <> 0 Then
iLineItems = iLineItems + 1
iTotalItems = iTotalItems + !QtyServed
End If
Case 2
'QtyPromo
!QtyPromo = CInt(strArray(intE))
.Update
Case Else
MsgBox "Error: Logic error in code 'ReadCSVData'.
Incorrect Detail field numbering."
End Select
Next intE
End With

With rsHeader
.Bookmark = .LastModified
.Edit
!LineItems = iLineItems
!TotalItems = iTotalItems
!Complexity = GetComplexity(iLineItems, iTotalItems)
.Update
End With

If iOrdCount Mod 1000 = 0 Then
varSysCmd = SysCmd(acSysCmdSetStatus, Format(iOrdCount,
"#,###,###") & " orders and " & Format(iBatchLineItems, "#,###,###") & "
line items have been processed...")
DoEvents
End If
NextRecord:
Loop

strElapsed = Format(Now() - dtmStart, "hh:nn:ss")
MsgBox "File processed in " & strElapsed & vbCrLf & vbCrLf & _
Format(iOrdCount, "#,###,###") & " orders added." & vbCrLf & _
Format(iBatchLineItems, "#,###,###") & " line items added."

ReadCSVData = True
ExitHere:
On Error Resume Next
varSysCmd = SysCmd(acSysCmdSetStatus, " ")
Close #1
Set rsDetail = Nothing
Set rsHeader = Nothing
Set db = Nothing
Exit Function
ErrHandler:
Select Case Err
Case Else
MsgBox "UNLOGGED ERROR: " & vbCrLf & vbCrLf & Err.Number & " - "
& Err.Description
Resume ExitHere
End Select
End Function


Dennis

unread,
Oct 18, 2007, 2:39:01 PM10/18/07
to
First, the way I understand it, Access needs "scratch-pad space" while doing
"stuff." Unfortunately, it doesn't release that space, ever, unless you do a
Compact and Repair operation. You have two choices:

- Put the TABLES into SQL, which doesn't have the limitation

- Stop the import after "X" records have been imported, and prompt the user
to do a Compact and Repair. Then resume the import, but remember where you
were, so that you skip those first "X" records. Repeat as necessary.

George Nicholson

unread,
Oct 18, 2007, 3:29:21 PM10/18/07
to
Thanks for the response, but I'm having trouble swallowing a 1.5gb "scratch
pad" while doing a straightforward append of approx 200mb of compacted data.
I've got to be doing something that could be done more efficiently.

"Dennis" <Den...@discussions.microsoft.com> wrote in message
news:19E17735-B5AF-4EEB...@microsoft.com...

Dennis

unread,
Oct 18, 2007, 5:06:04 PM10/18/07
to
You could always call MS & complain (heh). Look, Access does lots of stuff
that I hate. However, since my app development is in that venue, I have
little choice but to go with it. I really suggest that you spoon-feed it 100k
recs at a time, then compact & repair, and repeat until all the recs are in.
Then do a final C&R and see what your final application size is. There IS
overhead in terms of creating keys and links while adding records, and you
clearly stated that you're adding 400+k recs. That's a butt-load of recs for
the Access Jet database engine to handle.

John W. Vinson

unread,
Oct 18, 2007, 5:21:15 PM10/18/07
to
On Thu, 18 Oct 2007 12:24:00 -0500, "George Nicholson"
<Georg...@Junkmsn.com> wrote:

>I have working code that prompts user for file location, opens the file,
>reads a line, adds the data to my tables, reads next line, etc. and it works
>fine.
>

I suspect that Access may be reserving a big block of scratch space at every
addition, not knowing in advance how much will be needed and making a
pessimistic assumption.

Why go to all the elaboration, though? Could you not use File... Get External
Data... Link to link to the file; and then simply run one or two Append
queries to get it all into the table in one operation?

John W. Vinson [MVP]

Albert D. Kallal

unread,
Oct 18, 2007, 5:43:45 PM10/18/07
to
You don't mention what version of ms-access.

make sure you installed the SP updates to JET, there is a number of bloat
issues fixed.

TURN OFF ROW LOCKING!!!!!!!

row locking is faked in ms-access and is achieved by PADDING the record to
fill a frame (2000 characters).

so, DO NOT use row locking unless you ABSOLUTE must use it.


Check the above two issues, You not toast on this just yet!!!

--
Albert D. Kallal (Access MVP)
Edmonton, Alberta Canada
pleaseNOO...@msn.com


George Nicholson

unread,
Oct 18, 2007, 5:55:13 PM10/18/07
to
> Why go to all the elaboration, though? Could you not use File... Get
> External
> Data... Link to link to the file; and then simply run one or two Append
> queries to get it all into the table in one operation?

Nope, at least not without putzing with the raw csv file first. The
import/link wizard functionality is too limited to rely on in this case.
That's how I was originally doing it until I realized (in a very nasty
meeting) that I was missing 10% of my Order Detail.

>>...no telling how long my longest csv record will be either and the


>>import wizard is useless unless your longest record is included in 1st 20
>>or
>>so records, which is why I'm doing it this way

Example: when using the wizard (to either import or link), it volunteers to
bring in 51 fields.
If I move the longest/widest record to the top of the file, it volunteers to
bring in 96 fields. (A difference of 15 line items for that order alone).

However, I may not have any choice but to go back to that method. I could
create a dummy single record in a notepad file with more line items detail
than I would ever expect and then add that record to every csv file before
trying to link/import it. That would force the wizard to "see" all the
necessary fields.

It just seems like such a kludge. But so does Access' need for a 1.5gb
scratch pad to import a 63mb csv file (which would amount to 150mb of data
after import and compacting)..

Thanks for the assist.

HTH,

"John W. Vinson" <jvinson@STOP_SPAM.WysardOfInfo.com> wrote in message
news:cfjfh35afi395lp6a...@4ax.com...

George Nicholson

unread,
Oct 18, 2007, 7:21:57 PM10/18/07
to
WOOT! Thank you Albert! That was it.

Turned OFF "Open databases using record-level locking"
("Default record locking" was already set to "No Locks")

404mb compacted file + (434k orders w/detail in a 65mb csv file) = 604mb
uncompacted = 566mb compacted
Now *that* seems like a reasonable amount of overhead. Better than I would
expect, actually.

While turning off Row Locking I realized AutoCorrect Track/Perform (but not
Log) was on (not my usual machine) and turned it off as well. Will do a test
tommorow to see if that was a contributing factor, but I suspect Row Locking
was the biggest culprit.

I'm not familiar with Row vs Page Locking (which seem to be the only
options). What are their pros & cons? (Other than the one I just learned
about)?

btw, Access 2003 with all updates - even SP3. My bad for not mentioning it.

Thank you Thank you again


"Albert D. Kallal" <PleaseNOOO...@msn.com> wrote in message
news:uZo51$cEIHA...@TK2MSFTNGP02.phx.gbl...

Albert D. Kallal

unread,
Oct 18, 2007, 8:16:02 PM10/18/07
to
"George Nicholson" <Georg...@Junkmsn.com> wrote in message

>
> I'm not familiar with Row vs Page Locking (which seem to be the only
> options). What are their pros & cons? (Other than the one I just learned
> about)?
>

Row locking was a very common request,and MS added this feature to access
2000
(and jet 4.0).

What happens is that when ms-access locks a reocrd, no other usrees can edit
this record. Access was based on a "frame" system (2000, or 4000 charanters
..can't remember which right now). So, what would happen is if you had two
users working on a adatbase, if they were editign reocrds *right* beside
each other, one would get locked out (assuming your using opportunistic
locking).

If your using pessimistic locking it was generally not a problem because the
lock only occurs only at write time.

The simple solution was to change jet to expand the size of a reocrd to that
of a database frame, and presto - locking was solved because you could never
have two or more records in the same frame.

In fact the locking problem was usually worse when you're using sub forms
because often those sub form records are really small.

if you're not using opportunistic locking (that means the instant one uses
starts editing the record others are 100% locked out of that record - you'll
even see a Ghostbusters lock symbol on the left side of the form when you
use this feature). if you're not using locking and are not having problems,
then I would simply use page locking all the time.

> btw, Access 2003 with all updates - even SP3. My bad for not mentioning
> it.

Ah, very good it seems you are very much up to date....

I do appreciate your posting back on this issue, as then others can search
and find this message (we all benefit here).

Albert D. Kallal

unread,
Oct 18, 2007, 9:17:44 PM10/18/07
to
"Dennis" <Den...@discussions.microsoft.com> wrote in message
news:2FF6129F-59A4-4455...@microsoft.com...

>That's a butt-load of recs for
> the Access Jet database engine to handle.

Actually, I don't consider 400k records that much at all.

See my other post, we resolved the bloat problem...and the poster is having
no problems at all....

Tony Toews [MVP]

unread,
Oct 20, 2007, 9:47:51 PM10/20/07
to
"Albert D. Kallal" <PleaseNOOO...@msn.com> wrote:

>What happens is that when ms-access locks a reocrd, no other usrees can edit
>this record. Access was based on a "frame" system (2000, or 4000 charanters
>..can't remember which right now).

Actually it's called pages not frames. A97 and earlier (actually Jet
3.5) pages use 2048 bytes while A2000 and later (actually Jet 4.0)
pages use 4096 bytes.

>The simple solution was to change jet to expand the size of a reocrd to that
>of a database frame, and presto - locking was solved because you could never
>have two or more records in the same frame.

Actually I suspect this is more when adding records compared to
updating them. Furthermore I think if you are the only user in the
MDB Access/Jet is smart enough to not use one page per record.

Although it is quite possible that Access/Jet, if you are updating
records that it moves the updated record to a page all by itself and
deletes the data out of the page that it was on.

Tony
--
Tony Toews, Microsoft Access MVP
Please respond only in the newsgroups so that others can
read the entire thread of messages.
Microsoft Access Links, Hints, Tips & Accounting Systems at
http://www.granite.ab.ca/accsmstr.htm
Tony's Microsoft Access Blog - http://msmvps.com/blogs/access/

Reply all
Reply to author
Forward
0 new messages