Also, check that you aren't swapping out to disk. On lower-memory
systems, you'll see the pagefile/swapfile usage increase, indicating
that your memory accesses are going to disk. This can also happen if
you have many other programs using memory. Disk is orders of magnitude
slower than RAM, so a program that swaps will slow to a crawl.
Scanner is also slower than the other input methods, since it
tokenizes its input with regular expressions.
-Robert
- Use an initially large heap size (the JVM's -Xms option sets the
initial size). As your program reaches the heap limit it will become
slow because the garbage collector needs to find blocks of memory that
are no longer in use and recycle them.
- Related to the above: make sure that you store only what's
absolutely necessary. I can comfortably load twitterLarge using -Xmx512m.
- Use top (Mac, Linux) or the Windows Task Manager while your program
is running to see how fast memory is being used and whether you quickly
reach figures close to the heap limit, which will make your program
slow down. You can also periodically print your progress through the
file (the number of lines you have parsed so far).
- Read the file using a BufferedReader with a large buffer. You can
specify the size of the buffer as a second argument in the constructor
of BufferedReader. I use 8192 (the size is counted in chars, so 8K
chars).
- Do not use Scanner. Instead, split each line yourself into three
strings (I use the indexOf, lastIndexOf and substring methods; there's
also a method called split) and use Integer.parseInt().
-Nikos
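
For anyone who wants to see the tips above in one place, here is a
rough sketch, not anyone's actual solution. It assumes each line has
exactly three fields separated by single spaces and that the first two
are integers; the real file format may differ. The class name, the
method names and the reporting interval are made up for illustration.
Passing -Xms is one way to read "initially large heap size"; -Xmx512m
is the limit quoted above.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;

// Hypothetical example class; names and file layout are assumptions.
class FastLineParser {
    static final int BUFFER_CHARS = 8192;        // buffer size mentioned in the post
    static final long REPORT_EVERY = 1_000_000;  // assumed progress interval

    // Splits "a b c" into three strings without Scanner or regular expressions.
    static String[] splitThree(String line) {
        int first = line.indexOf(' ');
        int last = line.lastIndexOf(' ');
        if (first < 0 || first == last) {
            throw new IllegalArgumentException("expected three fields: " + line);
        }
        return new String[] {
            line.substring(0, first),
            line.substring(first + 1, last),
            line.substring(last + 1)
        };
    }

    // Prints the number of lines parsed so far and the current heap usage.
    static void report(long lines) {
        Runtime rt = Runtime.getRuntime();
        long usedMb = (rt.totalMemory() - rt.freeMemory()) >> 20;
        long maxMb = rt.maxMemory() >> 20;
        System.err.println(lines + " lines, heap " + usedMb + " / " + maxMb + " MB");
    }

    // Reads every line through a BufferedReader and returns the line count.
    static long parseAll(Reader source) throws IOException {
        long lines = 0;
        long maxId = 0; // stand-in for "store only what's absolutely necessary"
        try (BufferedReader in = new BufferedReader(source, BUFFER_CHARS)) {
            String line;
            while ((line = in.readLine()) != null) {
                String[] f = splitThree(line);
                int a = Integer.parseInt(f[0]); // assumed: first field is an int
                int b = Integer.parseInt(f[1]); // assumed: second field is an int
                maxId = Math.max(maxId, Math.max(a, b));
                lines++;
                if (lines % REPORT_EVERY == 0) {
                    report(lines);
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java -Xms512m -Xmx512m FastLineParser <file>");
            return;
        }
        report(parseAll(new FileReader(args[0])));
    }
}
```

If the heap figure printed by report() climbs toward the limit early
in the run, that is the same symptom you would see in top or the task
manager.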