Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Saving a binary file into a string

3 views
Skip to first unread message

Dominik Schmidt

unread,
Dec 27, 2009, 10:39:22 AM12/27/09
to
Hi,

I'm new to C++, so I have a very basic question.
I wrote a function which opens a file and saves it into a string variable.
Another function can save a string variable into a file.
I tried to combine those two functions, because in combination they should
make an exact copy of a file. And it seems to work, in every case the copy
had the exact same hash value (MD5) as the original file.

I'd like to know if I made an error, so if there's any possibility that my
functions won't work in some case?

Just to be sure: 1 Byte can have 2^8 = 256 different states, so each
"character" of a string variable can have those 256 states, right? Every
byte of a file can be saved in a string variable? If this is true, there
can be no file which can *not* be copied by my functions (except it's too
big)?

Thanks in advance for any tip!

================
Here is my code:
================

int main()
{
string var1;

var1 = OpenFile("D:\\myfile1.bin");
SaveFile("D:\\myfile1-copy.bin", var1);

return 0;
}

string OpenFile(string FilePath)
{
string Out1;
ifstream file1;
char Chr1;
string Str1;
long long filesize;

filesize = GetFileSize(FilePath);
file1.open (FilePath.c_str(), ios::binary);
if (file1.is_open())
{
for (long long i = 0; i < filesize; i++)
{
file1.read (&Chr1, 1);
Str1 = Chr1;
Attach (&Out1, Str1);
}
}
file1.close ();

return Out1;
}

void SaveFile(string FilePath, string FileContent)
{
ofstream file1;
long long filesize;
char Chr1;

filesize = FileContent.length();

if (FileExists(FilePath)) KillFile(FilePath);
if (FileExists(FilePath)) return;
if (filesize < 0) return;

file1.open (FilePath.c_str(), ios::binary);
if (file1.is_open())
{
for (long long i = 0; i < filesize; i++)
{
Chr1 = FileContent[i];
file1.write (&Chr1, 1);
}
}
file1.close();
}

//Here are the dependencies:

long long GetFileSize(string FilePath)
{
long long Out1 = -1;
ifstream file1;

file1.open (FilePath.c_str());
if (file1.is_open())
{
file1.seekg(0, ios::end);
Out1 = file1.tellg();
}
file1.close();

return Out1;
}

bool FileExists(string FilePath)
{
//http://msdn.microsoft.com/en-us/library/aa365740%28VS.85%29.aspx

WIN32_FIND_DATA FindFileData;
HANDLE hFind = FindFirstFile(FilePath.c_str(), &FindFileData);

if (hFind == INVALID_HANDLE_VALUE || (FindFileData.dwFileAttributes &
FILE_ATTRIBUTE_DIRECTORY))
{
return false;
}
else
{
return true;
}
}

void Attach(string *OldString, string AddString, bool InNewLine)
{
if (InNewLine)
{
*OldString += Cr;
*OldString += Lf;
}
*OldString += AddString;
}

bool KillFile(string FilePath)
{
int Out1;

Out1 = remove(FilePath.c_str());

return Out1 == 0;
}

Rune Allnor

unread,
Dec 27, 2009, 12:39:12 PM12/27/09
to
On 27 Des, 16:39, Dominik Schmidt <dominikschmi...@gmx.net> wrote:
> Hi,
>
> I'm new to C++, so I have a very basic question.
> I wrote a function which opens a file and saves it into a string variable.
> Another function can save a string variable into a file.
> I tried to combine those two functions, because in combination they should
> make an exact copy of a file. And it seems to work, in every case the copy
> had the exact same hash value (MD5) as the original file.
>
> I'd like to know if I made an error, so if there's any possibility that my
> functions won't work in some case?

There is the possibility, yes.

Some byte values or sequences of byte values take on special
meanings like 'end of line' or 'end of string' in the context
of text files and strings.

If you want to work with binary files, use std::vector<char>
or something like that, instead of std::string. That way all
characters are treated as arbitrary numbers, with no special
significance attached to any of them.

Rune

Dominik Schmidt

unread,
Dec 27, 2009, 1:11:06 PM12/27/09
to
On Sun, 27 Dec 2009 09:39:12 -0800 (PST), Rune Allnor wrote:

> Some byte values or sequences of byte values take on special
> meanings like 'end of line' or 'end of string' in the context
> of text files and strings.

You mean bytes like CR (carriage return, 13), LF (line feed, 10) or null
(0)?

I've copied many files with my code, including text files using only CR
*or* LF as line break (insted of CR LF) and files including null characters
or other bytes which don't represent a visible character.
And all of those copies had the same hashes as the original files. So it
worked (so far). Even with large files > 2 GB.

> If you want to work with binary files, use std::vector<char>
> or something like that, instead of std::string. That way all
> characters are treated as arbitrary numbers, with no special
> significance attached to any of them.

Actually I thought my code would already do this by using "ios::binary" in
both functions?
Could you give me an example (file) that doesn't work with my code?

Thanks for your help

Rune Allnor

unread,
Dec 27, 2009, 1:38:55 PM12/27/09
to
On 27 Des, 19:11, Dominik Schmidt <dominikschmi...@gmx.net> wrote:

> > If you want to work with binary files, use std::vector<char>
> > or something like that, instead of std::string. That way all
> > characters are treated as arbitrary numbers, with no special
> > significance attached to any of them.
>
> Actually I thought my code would already do this by using "ios::binary" in
> both functions?

It might have done, but by using std::string you confuse readers
of your code about what you are doing. std::string is intended
for human-readable text, and anywhere one sees it, one expects
to handle human-readable text. You break with that expectation,
which will only cause confusion later on. You also expose your
data to functions and manipulations intended for strings.

If you use std::vector<char> you first of all make it clear to
the reader that these are binary data, and the reader of your
code will relate to your code accordingly. Secondly you prevent
your data from being accessed through string-manipulating
functions. Which would only have caused mayhem and misery,
if it occured.

The key to C++ is to express the data in terms of the type
or class that best represents the properties of the data.

Since you do *not* have text data, std::string, which is
intended for precisely text data, is the *wrong* container
to use.

Rune

Jonathan Lee

unread,
Dec 27, 2009, 1:43:10 PM12/27/09
to
On Dec 27, 10:39 am, Dominik Schmidt <dominikschmi...@gmx.net> wrote:
> Just to be sure: 1 Byte can have 2^8 = 256 different states, so each
> "character" of a string variable can have those 256 states, right?

Strictly speaking, char may be larger than an octet in C++. The
value of CHAR_BIT in <climits> will tell you exactly how big. I
think both POSIX and Windows require it to be 8, though. And even
if it weren't, I don't think your code would be affected as is.

> string OpenFile(string FilePath)

Nothing major.
- Str1 seems completely unnecessary. You could write the attach
function to work with chars.
- long long isn't an official type in C++ yet. C++0x will add
it as a type.
- I don't see why you don't just read the whole file instead of
a char at a time. That's gotta be slow. If you aren't
expecting the files to be large, do whatever processing you
have to in memory.

> void SaveFile(string FilePath, string FileContent)

Again, you could write all of FileContent.data() at once.

> long long GetFileSize(string FilePath)

I would probably use the return type of tellg() (i.e.,
std::streampos). No biggy.

--Jonathan

Dominik Schmidt

unread,
Dec 27, 2009, 3:03:47 PM12/27/09
to
On Sun, 27 Dec 2009 10:38:55 -0800 (PST), Rune Allnor wrote:

OK, this is what I wanted to know.
Thank you very much for your explanation.
I won't use the data type std::string for binary data anymore.

Dominik Schmidt

unread,
Dec 27, 2009, 3:40:53 PM12/27/09
to
On Sun, 27 Dec 2009 10:43:10 -0800 (PST), Jonathan Lee wrote:

> Strictly speaking, char may be larger than an octet in C++. The
> value of CHAR_BIT in <climits> will tell you exactly how big. I
> think both POSIX and Windows require it to be 8, though. And even
> if it weren't, I don't think your code would be affected as is.

Right now, I'm working on a Windows only application, so this shouldn't be
a problem.
But could this cause trouble if I'm writing an application for UN*X (or
another common OS)? I guess no?

>> string OpenFile(string FilePath)
>
> Nothing major.
> - Str1 seems completely unnecessary. You could write the attach
> function to work with chars.

Yes, you're right, and I've done this by now.

> - long long isn't an official type in C++ yet. C++0x will add
> it as a type.

long long isn't an official type?
What alternative do I have to save a large integer number? As far as I know
the largest possible number in a long long variable is 9223372036854775807,
so which official data type could save this number?
Can I get into trouble if I continue using long long?

> - I don't see why you don't just read the whole file instead of
> a char at a time. That's gotta be slow. If you aren't
> expecting the files to be large, do whatever processing you
> have to in memory.

Is there a significant time difference when using large (maybe > 500 MB)
files?

> I would probably use the return type of tellg() (i.e.,
> std::streampos). No biggy.

OK, I assume this would be a better way of coding?
Can I do anything wrong by using long long here?

Jonathan Lee

unread,
Dec 27, 2009, 4:06:03 PM12/27/09
to
On Dec 27, 3:40 pm, Dominik Schmidt <dominikschmi...@gmx.net> wrote:
> But could this cause trouble if I'm writing an application for UN*X (or
> another common OS)? I guess no?

Linux and BSDs follow POSIX which requires CHAR_BIT == 8. I don't
really
know about other OS's.

> > - long long isn't an official type in C++ yet. C++0x will add
> >   it as a type.
> long long isn't an official type?

Not in the current version of C++. Though most compilers support it
as an extension since it's in C99. See here:

http://en.wikipedia.org/wiki/C%2B%2B0x#Type_long_long_int
Or http://www2.research.att.com/~bs/C++0xFAQ.html#long-long

> What alternative do I have to save a large integer number?

long, or unsigned long. If you're on a 64-bit OS these will
be 64-bits anyway. But see below about streampos/tellg()

> so which official data type could save this number?

None. Though long *could* be large enough.

> Can I get into trouble if I continue using long long?

Not really. It'll be in C++0x which is on the horizon, and
I think most/all compilers support it as an extension. I
don't think it's a problem most people get upset about. I
was just giving you a heads up, really.

> Is there a significant time difference when using large
> (maybe > 500 MB) files?

Probably not significant. Your OS should cache the reads
from disk so that you aren't going to disk for each char.
That would be the big performance penalty. You'll just
incur the call overhead. This is all platform dependent.
You could test to find out the difference on your
computer.

But still, it just seems odd. If you follow Rune's advice
and use a std::vector, just resize it to the file size
and read directly into the vector:

yourvector.resize(filesize) // make room
yourfile.read(&yourvector[0], filesize); // done.

> OK, I assume this would be a better way of coding?
> Can I do anything wrong by using long long here?

I'm just anal about types. tellg() returns a streampos,
which is defined by the standard to be a signed integral
type. So it will fit into long long; you won't have a
problem with that.

But also keep in mind that you don't really need
long long, then. The value in your long long vars will
never be larger than what can fit in a streampos (which
in turn will be no larger than a long). So if it were
me, I'd just get rid of long long and use streampos.
I'd kill two "problems" with one stone.

--Jonathan

Jorgen Grahn

unread,
Dec 27, 2009, 5:54:16 PM12/27/09
to
On Sun, 2009-12-27, Dominik Schmidt wrote:
> Hi,
>
> I'm new to C++, so I have a very basic question.
> I wrote a function which opens a file and saves it into a string variable.
> Another function can save a string variable into a file.

...


> void SaveFile(string FilePath, string FileContent)

You should learn to use references; that's better written as

void SaveFile(const string& path, const string& s)

Having to fit the whole file in memory *once* is problematic enough;
here you double the memory needed.

You also almost always need a way to report errors.

> {
> ofstream file1;
> long long filesize;
> char Chr1;
>
> filesize = FileContent.length();
>
> if (FileExists(FilePath)) KillFile(FilePath);
> if (FileExists(FilePath)) return;

Don't ask for permission; just try to do it (truncating the existing
file as you go). You ask the wrong question anyway -- you might just
as well fail because you don't have write permissions, or the disk is
almost full, or ...

The checks are also leaky because the thing you try to check for and
make sure is true, can become false a microsecond later, when some
other process creates that file.

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

Kaz Kylheku

unread,
Dec 27, 2009, 11:21:00 PM12/27/09
to
On 2009-12-27, Dominik Schmidt <dominik...@gmx.net> wrote:
> Hi,
>
> I'm new to C++, so I have a very basic question.
> I wrote a function which opens a file and saves it into a string variable.
> Another function can save a string variable into a file.
> I tried to combine those two functions, because in combination they should
> make an exact copy of a file.

Can't be done in standard C++. To make an exact copy of a file requires
platform-specific functions. You have to retrieve all of the file's
OS-specific attributes like ownership, permissions, modification
timestamps, and other features like ``forks'', etc.

Kaz Kylheku

unread,
Dec 28, 2009, 1:07:44 AM12/28/09
to
On 2009-12-27, Jorgen Grahn <grahn...@snipabacken.se> wrote:
> On Sun, 2009-12-27, Dominik Schmidt wrote:
>> Hi,
>>
>> I'm new to C++, so I have a very basic question.
>> I wrote a function which opens a file and saves it into a string variable.
>> Another function can save a string variable into a file.
>
> ...
>> void SaveFile(string FilePath, string FileContent)
>
> You should learn to use references; that's better written as
>
> void SaveFile(const string& path, const string& s)

Under any sanely implemented compiler, there is no difference.

So this is like writing x >> 2 instead of x / 2.

The caller passes pointers to the objects; the callee will make copy
if it modifies the arguments, otherwise it works with the caller's
objects directly.

> Having to fit the whole file in memory *once* is problematic enough;
> here you double the memory needed.

Even if the compiler is stupid to actually copy the objects in argument
passing, the implementation of std::string can prevent copying the
actual data by managing references.

Joshua Maurice

unread,
Dec 28, 2009, 1:31:03 AM12/28/09
to
On Dec 28, 1:07 am, Kaz Kylheku <kkylh...@gmail.com> wrote:
> void SaveFile(string FilePath, string FileContent)
> void SaveFile(const string& path, const string& s)
>
> Under any sanely implemented compiler, there is no difference.
>
> So this is like writing x >> 2 instead of x / 2.
>
> The caller passes pointers to the objects; the callee will make copy
> if it modifies the arguments, otherwise it works with the caller's
> objects directly.

Really? This surprises me, considering this is the first I've heard
about such magics, and many longtime posters here seem to assume /
know otherwise. Division by two vs bitshifting is very simply,
provably equivalent code, but changing a pass by value to a pass by
const ref is much more involved, and I believe the analogy does not
work.

So, how would this be implemented and still have a separate
compilation model?

Ex:

//
#include <iostream>
using namespace std;

class foo
{
public:
foo() { cout << "foo()" << endl; }
foo(const foo& ) { cout << "foo(const foo& )" << endl; }
foo& operator= (const foo& ) { cout << "foo& operator= (const foo& )"
<< endl; return *this; }
~foo() { cout << "~foo()" << endl; }
};

void passByValue(foo ) {}

int main()
{
foo x;
passByValue(x);
}
//

I think that this example must have exactly one ctor call, one copy
ctor call, and 2 dtor calls. I don't see how you can change it and
still have it be standard compliant.

On a more general program, barring whole program optimization with the
standard's "as if" rule, that is still with separate compilation, I
don't see how you can do it.

> Even if the compiler is stupid to actually copy the objects in argument
> passing, the implementation of std::string can prevent copying the
> actual data by managing references.

Except it might not do copy on write strings because that can be more
expensive for certain applications. That's a tradeoff optimization,
not a globally better one.

Dominik Schmidt

unread,
Dec 28, 2009, 5:14:59 AM12/28/09
to
On Mon, 28 Dec 2009 04:21:00 +0000 (UTC), Kaz Kylheku wrote:

> Can't be done in standard C++. To make an exact copy of a file requires
> platform-specific functions. You have to retrieve all of the file's
> OS-specific attributes like ownership, permissions, modification
> timestamps, and other features like ``forks'', etc.

Misunderstanding.
I don't want to copy the file header (like ownership, permissions,
whatever). I only want to copy the body of the file.
If the hash values of my original file and its copy match, I'm happy.

Fred Zwarts

unread,
Dec 28, 2009, 8:09:52 AM12/28/09
to

Even de separation between "body" and "meta-data" is not completely
OS independent. E.g., on some systems the separation between
"records" is part of the meta-data", whereas on others it is part
of the body.
So, even making an exact copy of the body may be OS specific.
In practice, it may work as long as you stick to one file-type on
one OS.
In today's most popular OSs, Windows and Unix/Linux,
file bodies have almost no structure at OS level. (The structure is
determined at application level.) For those OSs a copy of the
body is made very easily.

James Kanze

unread,
Dec 28, 2009, 3:53:46 PM12/28/09
to
On Dec 27, 3:39 pm, Dominik Schmidt <dominikschmi...@gmx.net> wrote:

> I'm new to C++, so I have a very basic question.
> I wrote a function which opens a file and saves it into a
> string variable. Another function can save a string variable
> into a file. I tried to combine those two functions, because
> in combination they should make an exact copy of a file. And
> it seems to work, in every case the copy had the exact same
> hash value (MD5) as the original file.

> I'd like to know if I made an error, so if there's any
> possibility that my functions won't work in some case?

The question is: how portable do you want to be?

> Just to be sure: 1 Byte can have 2^8 = 256 different states,
> so each "character" of a string variable can have those 256
> states, right?

On most machines. (I know of at least one where bytes have 9
bits, and 512 possible states. And more than one where a signed
char would have 2^n-1 possible states, rather than 2^n; but on
all of those machines, plain char is unsigned---probably in
order to avoid such problems.)

> Every byte of a file can be saved in a string variable?

Yes. Getting it from the file into the string variable isn't
necessarily trivial, however. In fact, it's impossible to do
"portably", if by portably, you mean for all legal
implementations of C++.

For this sort of thing, I'd generally prefer vector< char >, but
string will work too.

> If this is true, there can be no file which can *not* be
> copied by my functions (except it's too big)?

> ================


> Here is my code:
> ================

> int main()
> {
> string var1;

> var1 = OpenFile("D:\\myfile1.bin");
> SaveFile("D:\\myfile1-copy.bin", var1);

> return 0;

Not related to your question, but if either OpenFile or SaveFile
fails, you shouldn't return 0.

> }

> string OpenFile(string FilePath)
> {
> string Out1;
> ifstream file1;
> char Chr1;
> string Str1;
> long long filesize;

Strictly speaking, long long isn't portable (yet). Also
strictly speaking, it may not be large enough to hold the size
of a file (but practically, I can't imagine this being a problem
anytime soon).

> filesize = GetFileSize(FilePath);
> file1.open (FilePath.c_str(), ios::binary);
> if (file1.is_open())
> {
> for (long long i = 0; i < filesize; i++)
> {
> file1.read (&Chr1, 1);
> Str1 = Chr1;
> Attach (&Out1, Str1);
> }
> }
> file1.close ();

All of which is doing things the hard way. Why not simply:

char tmp;
while ( file1.get( tmp ) )
Out1 += tmp;
if ( !file1.eof() )
// read error...

(once you've verified that the open has succeeded, of course)?
Or even:

string Out1( (std::istreambuf_iterator< char >( file1 )),
(std::istreambuf_iterator< char >()) );

(About the only way you'll get a read error here is through an
exception.)

> return Out1;
> }

> void SaveFile(string FilePath, string FileContent)
> {
> ofstream file1;
> long long filesize;
> char Chr1;

> filesize = FileContent.length();

> if (FileExists(FilePath)) KillFile(FilePath);
> if (FileExists(FilePath)) return;
> if (filesize < 0) return;

If filesize < 0, here, you've got a very serious bug in your
implementation.

> file1.open (FilePath.c_str(), ios::binary);
> if (file1.is_open())
> {
> for (long long i = 0; i < filesize; i++)
> {
> Chr1 = FileContent[i];
> file1.write (&Chr1, 1);
> }
> }

Again, what's wrong with:

for ( std::string::const_iterator iter = FileContent.begin();
iter != FileContent.end();
++ iter )
file1.put( *iter );

? Or just:

std::copy( FileContent.begin(), FileContent.end(),
std::ostreambuf_iterator< char >( file1 ) );

> file1.close();

And you have to check the status of file1 after the close, and
report an error.

(While I'm at it, what's with these file1? It's output, or what
ever, or even file, given the context, but without the 1.)

> }

> //Here are the dependencies:

> long long GetFileSize(string FilePath)
> {
> long long Out1 = -1;
> ifstream file1;

> file1.open (FilePath.c_str());
> if (file1.is_open())
> {
> file1.seekg(0, ios::end);
> Out1 = file1.tellg();

Formally, this is not guaranteed to work. Practically, it
doesn't work under Windows if the file is opened in text mode
(as you've done). Nor under most other systems: Unix is the
exception.

> }
> file1.close();

> return Out1;
> }

> WIN32_FIND_DATA FindFileData;
> HANDLE hFind = FindFirstFile(FilePath.c_str(), &FindFileData);

> if (hFind == INVALID_HANDLE_VALUE || (FindFileData.dwFileAttributes &
> FILE_ATTRIBUTE_DIRECTORY))
> {
> return false;
> }
> else
> {
> return true;
> }
> }

If you're using system specific code... Use system specific
versions of open/read/write/close. They should offer the
necessary options for causing open to fail if the file doesn't
exist (input) or causing your file to replace any existing file
(output). For that matter, IIRC, this is guaranteed when you
open an ifstream or an ofstream, so you don't need any of this
anywhere.

> void Attach(string *OldString, string AddString, bool InNewLine)
> {
> if (InNewLine)
> {
> *OldString += Cr;
> *OldString += Lf;
> }
> *OldString += AddString;

And what on earth is this? (You only pass two arguments to
Attach, so your code shouldn't even compile.)

> }

> bool KillFile(string FilePath)
> {
> int Out1;

> Out1 = remove(FilePath.c_str());

> return Out1 == 0;
> }

Just for the record, this can have interesting consequences
under Unix, and might not be what you want.

--
James Kanze

James Kanze

unread,
Dec 28, 2009, 3:56:26 PM12/28/09
to
On Dec 27, 6:11 pm, Dominik Schmidt <dominikschmi...@gmx.net> wrote:
> On Sun, 27 Dec 2009 09:39:12 -0800 (PST), Rune Allnor wrote:

[...]


> > If you want to work with binary files, use std::vector<char>
> > or something like that, instead of std::string. That way all
> > characters are treated as arbitrary numbers, with no special
> > significance attached to any of them.

> Actually I thought my code would already do this by using
> "ios::binary" in both functions?

It should. There could still formally be a problem, since
formally, implementations aren't required to recognize the end
of file correctly for binary files; e.g. you might read 128
bytes, when the input really only contained 10. But in
practice, systems where this would occur are almost non-existent
today.

--
James Kanze

James Kanze

unread,
Dec 28, 2009, 4:00:46 PM12/28/09
to
On Dec 28, 6:07 am, Kaz Kylheku <kkylh...@gmail.com> wrote:
> On 2009-12-27, Jorgen Grahn <grahn+n...@snipabacken.se> wrote:

[...]


> >> void SaveFile(string FilePath, string FileContent)

> > You should learn to use references; that's better written as

> > void SaveFile(const string& path, const string& s)

> Under any sanely implemented compiler, there is no difference.

So g++, VC++ and Sun CC aren't sanely implemented. (Avoiding
the copy is very tricky, since the legality of the optimization
cannot be determined until link time.)

--
James Kanze

Kaz Kylheku

unread,
Dec 29, 2009, 6:14:41 PM12/29/09
to
On 2009-12-28, Joshua Maurice <joshua...@gmail.com> wrote:
> On Dec 28, 1:07 am, Kaz Kylheku <kkylh...@gmail.com> wrote:
>> void SaveFile(string FilePath, string FileContent)
>> void SaveFile(const string& path, const string& s)
>>
>> Under any sanely implemented compiler, there is no difference.
>>
>> So this is like writing x >> 2 instead of x / 2.

Of course, that should have been x >> 1; sorry.

>> The caller passes pointers to the objects; the callee will make copy
>> if it modifies the arguments, otherwise it works with the caller's
>> objects directly.
>
> Really? This surprises me, considering this is the first I've heard
> about such magics, and many longtime posters here seem to assume /
> know otherwise.

Re-reading paragraph 15 of 12.8 of ISO/IEC 14882-2003, I'm not so sure any
more.

It's quite certain that copy construction can be elided when a class value is
returned, but it's not so clear that the optimization is allowed in parameter
passing, except when a temporary is involved.

That is to say, if we have a function void f(string s); and we call it such
that the parameter is initialized with a temporary object, for instance:

f(string("abc"))

the copying of the temporary can be elided; the parameter is simply
constructed from "abc". Argument passing of non-temporaries does not meet the
listed criteria for elision. I.e. it really is the case that the C++
programmer must manually arrange this by using the const reference syntax.

The optimization can only be done for classes with implicitly declared copy
constructors and destructors, in keeping with the as-if principle.

Dominik Schmidt

unread,
Dec 31, 2009, 5:01:40 PM12/31/09
to
On Sun, 27 Dec 2009 13:06:03 -0800 (PST), Jonathan Lee wrote:

> But still, it just seems odd. If you follow Rune's advice
> and use a std::vector, just resize it to the file size
> and read directly into the vector:

> I'm just anal about types. tellg() returns a streampos,


> which is defined by the standard to be a signed integral
> type. So it will fit into long long; you won't have a
> problem with that.

Thank you all for your many tips.

Here is my code (the new part). Error handling is still missing.


std::streampos GetFileSize(string FilePath)
{
std::streampos Out1 = -1;
ifstream file1;

file1.open (FilePath.c_str(), ios::binary);
if (file1.is_open())
{


file1.seekg(0, ios::end);
Out1 = file1.tellg();
}
file1.close();

return Out1;
}

std::vector<char> OpenFile(string FilePath)
{
std::vector<char> Out1;
ifstream file1;
std::streampos filesize;

filesize = GetFileSize(FilePath);
file1.open (FilePath.c_str(), ios::binary);
if (file1.is_open())
{

Out1.resize(filesize);
file1.read(&Out1[0], filesize);
}
file1.close ();

return Out1;
}

void SaveFile(string FilePath, std::vector<char> FileContent)
{
ofstream file1;
std::streampos filesize;

filesize = FileContent.size();

if (FileExists(FilePath)) KillFile(FilePath);
if (FileExists(FilePath)) return;

file1.open (FilePath.c_str(), ios::binary);
if (file1.is_open())
{
file1.write(&FileContent[0], filesize);
}
file1.close();
}

Dominik Schmidt

unread,
Dec 31, 2009, 5:04:42 PM12/31/09
to
On Mon, 28 Dec 2009 12:53:46 -0800 (PST), James Kanze wrote:

>> I'd like to know if I made an error, so if there's any
>> possibility that my functions won't work in some case?
>
> The question is: how portable do you want to be?

Currently I'm working on a Windows application, but I'm sure I'll use those
functions in other projects in the future.
So it should at least work on Windows and on common Linux distributions
(like Debian).

> Not related to your question, but if either OpenFile or SaveFile
> fails, you shouldn't return 0.

Yes, I'll change this...

>> void SaveFile(string FilePath, string FileContent)
>> {
>> ofstream file1;
>> long long filesize;
>> char Chr1;
>
>> filesize = FileContent.length();
>
>> if (FileExists(FilePath)) KillFile(FilePath);
>> if (FileExists(FilePath)) return;
>> if (filesize < 0) return;
>
> If filesize < 0, here, you've got a very serious bug in your
> implementation.

This line must have been a typo.

> (While I'm at it, what's with these file1? It's output, or what
> ever, or even file, given the context, but without the 1.)

I like names with numbers!? Is there something wrong about calling this
variable "file1"? Would you prefer "Johnson"? :-)

>> file1.open (FilePath.c_str());


>
> Formally, this is not guaranteed to work. Practically, it
> doesn't work under Windows if the file is opened in text mode
> (as you've done). Nor under most other systems: Unix is the
> exception.

Thanks, see my answer to Jonathan...

> If you're using system specific code... Use system specific
> versions of open/read/write/close. They should offer the
> necessary options for causing open to fail if the file doesn't
> exist (input) or causing your file to replace any existing file
> (output). For that matter, IIRC, this is guaranteed when you
> open an ifstream or an ofstream, so you don't need any of this
> anywhere.

Yes, I'm using system specific code in FileExists().
The only reason is that I'm not sure yet how to do this in a better way.
I want to get a false (as return value) only if the file does not exist,
but a true, if it exists, even if it's locked or hidden.

>> void Attach(string *OldString, string AddString, bool InNewLine)
>> {
>> if (InNewLine)
>> {
>> *OldString += Cr;
>> *OldString += Lf;
>> }
>> *OldString += AddString;
>
> And what on earth is this? (You only pass two arguments to
> Attach, so your code shouldn't even compile.)

My code does compile, because I have this in the header:
void Attach(string *OldString, string AddString, bool InNewLine = false);

>> bool KillFile(string FilePath)
>> {
>> int Out1;
>
>> Out1 = remove(FilePath.c_str());
>
>> return Out1 == 0;
>> }
>
> Just for the record, this can have interesting consequences
> under Unix, and might not be what you want.

What (consequences) do you mean by that?

James Kanze

unread,
Jan 1, 2010, 8:08:02 AM1/1/10
to
On Dec 31 2009, 10:04 pm, Dominik Schmidt <dominikschmi...@gmx.net>
wrote:

> On Mon, 28 Dec 2009 12:53:46 -0800 (PST), James Kanze wrote:
> >> I'd like to know if I made an error, so if there's any
> >> possibility that my functions won't work in some case?

> > The question is: how portable do you want to be?

> Currently I'm working on a Windows application, but I'm sure
> I'll use those functions in other projects in the future. So
> it should at least work on Windows and on common Linux
> distributions (like Debian).

In sum, very limited portability.

> > (While I'm at it, what's with these file1? It's output, or
> > what ever, or even file, given the context, but without the
> > 1.)

> I like names with numbers!? Is there something wrong about
> calling this variable "file1"? Would you prefer "Johnson"? :-)

There's definitely something wrong with names with no semantic
content. If the function is clearly dealing with only one file,
then "file" may be sufficient. Generally, however, you'd want
"inputfile" and "outputfile" (or simply "source" and
"destination", if it's clear that they are files). About the
only time something like "file1" and "file2" would be
appropriate is if the function outputs exactly the same thing to
two different files.

[...]


> > If you're using system specific code... Use system specific
> > versions of open/read/write/close. They should offer the
> > necessary options for causing open to fail if the file doesn't
> > exist (input) or causing your file to replace any existing file
> > (output). For that matter, IIRC, this is guaranteed when you
> > open an ifstream or an ofstream, so you don't need any of this
> > anywhere.

> Yes, I'm using system specific code in FileExists(). The only
> reason is that I'm not sure yet how to do this in a better
> way. I want to get a false (as return value) only if the file
> does not exist, but a true, if it exists, even if it's locked
> or hidden.

Hidden only affects things like dir, I think; open shouldn't
care if a file is hidden or not. But for getting any more
information than whether you can write it or not, you will have
to use system level reqests.

[...]


> >> bool KillFile(string FilePath)
> >> {
> >> int Out1;

> >> Out1 = remove(FilePath.c_str());

> >> return Out1 == 0;
> >> }

> > Just for the record, this can have interesting consequences
> > under Unix, and might not be what you want.

> What (consequences) do you mean by that?

The semantics when hard links are involved.

Not knowing the application, it's hard to know what the
"correct" solution is, but typically, just opening an existing
file for writing results in the correct semantics. If there are
reasons you don't want to do this (transactional integrity, for
example), then the solution is more complex, and typically
involves using a temporary file (necessary for transactional
integrity anyway) and possibly physical copying; the important
thing is that if the number of hard links to the file is greater
than one, you can't delete and recreate it---you have to
overwrite the original, or some of the hard links will continue
to refer to the old version.

--
James Kanze

Richard Herring

unread,
Jan 15, 2010, 9:02:22 AM1/15/10
to
In message
<c949edc9-dd5a-4d1a...@k17g2000yqh.googlegroups.com>,
Rune Allnor <all...@tele.ntnu.no> writes

>On 27 Des, 16:39, Dominik Schmidt <dominikschmi...@gmx.net> wrote:
>> Hi,
>>
>> I'm new to C++, so I have a very basic question.
>> I wrote a function which opens a file and saves it into a string variable.
>> Another function can save a string variable into a file.
>> I tried to combine those two functions, because in combination they should
>> make an exact copy of a file. And it seems to work, in every case the copy
>> had the exact same hash value (MD5) as the original file.
>>
>> I'd like to know if I made an error, so if there's any possibility that my
>> functions won't work in some case?
>
>There is the possibility, yes.
>
>Some byte values or sequences of byte values take on special
>meanings like 'end of line' or 'end of string' in the context
>of text files

Yes. But the OP is using binary files, so that's irrelevant.

>and strings.

No. std::string can contain arbitrary sequences of characters without
any constraints on their values, and doesn't impose any semantics on
them.

(The only potential problem is that '\0' is treated as a terminator when
assigning or constructing std::strings from C-style null-terminated
strings, i.e. when calling member functions or operators which take a
single const char * argument. The corresponding functions which take two
iterators (or pointers) don't share this problem.)

>
>If you want to work with binary files, use std::vector<char>
>or something like that, instead of std::string.

Probably good advice, since it makes the intent clearer, but not for the
stated reason:

> That way all
>characters are treated as arbitrary numbers, with no special
>significance attached to any of them.
>

--
Richard Herring

0 new messages