I was very interested to see a cookbook for C++, because I really like
the Perl and Python cookbooks and C++ is my native tongue. Flipping
randomly through it, the first example I ran across was your code to
count lines, words, etc. Is this really the best way to count things in
a file?
I would have thought that reading from the streambuf directly, rather
than going through the stream, would be much more efficient.
A cursory web search
http://groups.google.co.uk/group/alt.comp.lang.learn.c-c++/browse_thr...
finds this from Dietmar Kuehl:
    std::ifstream in("large.file");
    std::count(std::istreambuf_iterator<char>(in),
               std::istreambuf_iterator<char>(), '\n');
James Kanze is right that memory-mapped files are much faster (3-4
times at least, on my system). Happily, there is now a
system-independent I/O library from boost.org (Boost.Iostreams) which
provides this facility.
Either way, the code is considerably faster than counting using get().
Am I being pedantic?
Is the point of the C++ cookbook to present easy-to-understand code
rather than practical/efficient code?
Thanks
Llew Goodstadt
My version of the code would be:
#include <boost/iostreams/code_converter.hpp>
#include <boost/iostreams/device/mapped_file.hpp>
#include <algorithm>   // std::count
#include <iostream>
#include <fstream>
#include <iterator>    // std::istreambuf_iterator
#include <stdexcept>
#include <string>
#include <boost/filesystem/path.hpp>
#include <boost/filesystem/convenience.hpp>
#include "progress_indicator.h"
#include "count_lines_in_file.h"

namespace io = boost::iostreams;

// Counts occurrences of eol by memory-mapping the file. If the mapping
// fails, falls back to count_lines_in_file_std() (defined below and
// presumably declared in count_lines_in_file.h).
std::streamsize count_lines_in_file(
    std::string& file_name,
    char eol,
    const t_progress_indicator& seq_progress)
{
    try
    {
        io::mapped_file_source mapfile;
        mapfile.open(file_name);
        return std::count(mapfile.data(),
                          mapfile.data() + mapfile.size(), eol);
    }
    catch (const std::ios::failure&)
    {
        // probably file too large, or it does not map for whatever reason
    }
    // fall back to the safer but slower code
    return count_lines_in_file_std(
        file_name,
        eol,
        seq_progress);
}

// Portable fallback: counts occurrences of eol by iterating over the
// file's stream buffer. seq_progress is not used in this version.
std::streamsize count_lines_in_file_std(
    std::string& file_name,
    char eol,
    const t_progress_indicator& seq_progress)
{
    std::ifstream in(file_name.c_str());
    if (!in)
        throw std::runtime_error("Could not open " + file_name);
    return std::count(std::istreambuf_iterator<char>(in),
                      std::istreambuf_iterator<char>(), eol);
}