
way to make std::string use a backing char* without allocating?


L.Suresh

Aug 24, 2005, 5:28:47 AM
Hi,
I have a char buffer whose length I know. I need to do some string
manipulation on it. I do not want to use C-style string manipulation
because my buffer does not have a trailing null.

Is there any way to make std::string use my buffer as a backing
array? (The constructed string is constant, and my program is single
threaded.)

for example,

{
    char *buf = ...; // (I also know len)
    const std::string s(buf, len);
    // do string calculations
}

Here I do not want the string to copy the contents of the buffer. Is
there a way to do it? I was convinced that it's not possible, but I am
posting in the hope that someone has an idea.

thanks
--lsu


[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Yuri Khan

Aug 24, 2005, 8:45:24 AM
No, you cannot. But you can use many standard algorithms such as
std::copy and std::find, giving to them pointers within your buffer.

What kind of calculations do you want to do?
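
A minimal sketch of this suggestion, assuming the buffer is described by a pointer and a length. The helper names count_char and find_sub are hypothetical, chosen only for illustration; the algorithms themselves work directly on the range [buf, buf + len), so nothing is copied and no trailing null is needed.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper: count occurrences of ch in buf[0..len) without copying.
std::size_t count_char(const char* buf, std::size_t len, char ch)
{
    return static_cast<std::size_t>(std::count(buf, buf + len, ch));
}

// Hypothetical helper: locate the first occurrence of a sub-sequence.
// Returns buf + len when the needle is not present.
const char* find_sub(const char* buf, std::size_t len,
                     const char* needle, std::size_t needle_len)
{
    return std::search(buf, buf + len, needle, needle + needle_len);
}
```

Because the functions take plain pointers, they work equally well on a sub-range of the buffer.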

tony_i...@yahoo.co.uk

Aug 24, 2005, 8:51:23 AM
Hi. The only potentially portable solution I can envisage is to
instantiate basic_string<> with a trivial custom allocator that wraps
your buffer. Still, custom STL allocators are notoriously finicky, and
not guaranteed to actually be used in all the situations you'd expect.
I think there's an article on implementation issues in one of the
Exceptional C++ series of books, and there are examples on the web.
Alternatively, you could consider using a vector<char>, and then
algorithms to modify it. FWIW, we might be able to offer other ways to
improve the efficiency if you described how/why you are using the char
buffer. Cheers, Tony

Ismail Pazarbasi

Aug 24, 2005, 10:57:58 AM
I thought that, too. But it's not guaranteed to work properly (AFAIK).
After all, string is not designed for that, and neither is the
allocator. It may also differ from one STL implementation to another.

Ismail

Bharat Karia

Aug 25, 2005, 8:15:37 AM
Hi Suresh,

You can try using vector<char> if your needs can be satisfied with the
standard algorithms.

i.e. if u have a char buffer
char* buf; // len of this buf is 10

vector<char> vec(&buf[0], &buf[10]);

... now use algorithms from <algorithm> etc.

Thanks
Bharat Karia

> Hi,
> I have a char buffer whose length i know. I need to do some string
> manipulation on that. I do not want to use C style string manipulation
> because my buffer does not have a trailing null.

-- snip --

Joey

Aug 25, 2005, 11:15:02 AM
I'm not quite sure I understand what you're trying to do, but I think
there are three ways to do what you are looking for.

First off, you can do a fair number of string manipulations with the C
API by using the strn<whatever> functions. Almost all str functions have
a strn equivalent (strncpy, strncmp, strncat, etc.), so you can do your
manipulations using them, or use the standard str functions and then
strncpy to copy the results into the storage buffer.

Second, the contents of a std::string can be copied into a buffer of
arbitrary length using the copy(char *dest, size_type count) member.

Of course, both of these require deliberate action to set the data into
the buffer.
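
A minimal sketch of the second option, assuming a caller-owned destination of known capacity. The wrapper name copy_out is hypothetical; std::string::copy itself takes a destination, a count and an optional starting position, returns the number of characters written, and does not append a '\0'.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: copy at most cap characters of s into buf.
// Returns the number of characters actually written; no '\0' is appended.
std::size_t copy_out(const std::string& s, char* buf, std::size_t cap)
{
    return s.copy(buf, cap);  // copy(dest, count, pos = 0)
}
```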

If you want it to happen automagically, then one possible answer is
using std::stringstream and not std::string

Given a buffer and its size you would need to create a stringstream
like so
std::stringstream myString;
myString.rdbuf()->psetbuf(buf, size-1);

then you can issue a command like
myString << "Hello World"

and if you were to print out the contents of buf, it would
automagically yield the "Hello World" string. and the standard
std::string methods can be applied as well by using the
stringstream::str() member function.

Oh, and if your manipulations exceed the size of the buffer, then an
exception will be thrown, so you can check for that too.

Maybe that gets close to what you're after; if not, then I believe a
little creative thinking will get you the rest of the way.

Yuri Khan

Aug 25, 2005, 7:08:36 PM
If you were to use a custom allocator to make a std::string operate on
an external buffer, the pointer to the buffer and its size would be the
allocator's state. However, allocators with state are almost
prohibited, or rather made useless, by the 20.1.5/4 requirement that
"All instances of a given allocator type are required to be
interchangeable and always compare equal to each other". That is, you
may have state but you may not depend on it.

Martin Bonner

Aug 25, 2005, 7:15:22 PM
Bharat Karia wrote:

> Suresh wrote:
> > I have a char buffer whose length i know. I need to do some string
> > manipulation on that. I do not want to use C style string manipulation
> > because my buffer does not have a trailing null.
> -- snip --

You snipped the vital restriction that makes this solution invalid:

> > I do not want string to copy contents from buffer.

> Hi Suresh,
>
> You can try using vector<char> if your needs can be satisfied with the
> standard algorithms.

Or even if you want to write your own.

> i.e. if u have a char buffer
> char* buf; // len of this buf is 10

Why not use "len", as in the original example?


>
> vector<char> vec(&buf[0], &buf[10]);

Um. I am pretty sure that in C++, if buf is actually of type char[10],
&buf[10] is undefined behaviour. Why?

- The definition of a[b] is that it is equivalent to *(a+b).
- Thus &buf[10] is equivalent to &( * (buf+10) )
- The problem is that this involves indirecting through the pointer to
the one-past-the-end element of the array, and THAT involves undefined
behaviour.

Note that a) I don't know of an implementation where &array[len] will
behave differently to array+len; b) C99 has added special case wording
to make this expression legal - it is possible that C++ has been so
changed (although I don't believe so).
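
A minimal sketch that sidesteps the question by forming the end pointer with plain pointer arithmetic, assuming the buffer is given as a pointer and a length. The helper name to_vector is hypothetical. Note that this still copies the data, so it does not satisfy the original no-copy requirement.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: copy buf[0..len) into a vector.
// buf + len forms the one-past-the-end pointer without dereferencing it.
std::vector<char> to_vector(const char* buf, std::size_t len)
{
    return std::vector<char>(buf, buf + len);
}
```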

Carl Barron

Aug 26, 2005, 5:40:35 AM
In article <1124911129.7...@g14g2000cwa.googlegroups.com>,
Joey <joey.r...@gmail.com> wrote:

> I'm not quite sure if I understand what your trying to do, but I think
> there are 3 ways to do what you are looking for.
>
> First off, you can do a fair number of string manipulations using the C
> api by using the strn<whatever> commands, almost all str functions have
> a strn equivilant, strncpy, strncmp, strncat, etc... so you do your
> manipulations using them, or using standard str commands and use
> strncpy to copy the results into the storage buffer.
>

strn* functions treat '\0' as a string terminator in addition to
honouring the buffer size:
char buf[10];
strncpy(buf,"Hello",10) copies only the 5 chars of "Hello" and fills
the rest of buf with '\0'. If that is what you want, fine, but
std::string does nothing special with '\0' in its content.
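
A minimal sketch of the difference, assuming a five-byte payload with a '\0' in the middle. Both helper names are hypothetical and exist only to make the contrast visible.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// The (pointer, length) constructor keeps all five bytes, including the '\0'.
std::string with_embedded_null()
{
    return std::string("ab\0cd", 5);
}

// strncpy stops reading src at the first '\0' and zero-fills the rest of dest,
// so anything after an embedded '\0' in src is lost.
void copy_with_strncpy(char* dest, const char* src, std::size_t n)
{
    std::strncpy(dest, src, n);
}
```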

> Second. the contents of an std::string can be coppied into a buffer of
> arbitrary length using the copy(char *buf, int size) member.
>

That is double allocation: one for the string and one for the buffer.

> Of course, both of these require deliberate action to set the data into
> the buffer.
>
> If you want it to happen automagically, then one possible answer is
> using std::stringstream and not std::string
>
> Given a buffer and its size you would need to create a stringstream
> like so
> std::stringstream myString;
> myString.rdbuf()->psetbuf(buf, size-1);

psetbuf? It is not in <streambuf>. The only functions that actually set
the buffer are protected: setp and setg, which set the output and input
buffer pointers respectively.

Depending on how general a stream you will use with this stream buffer
class, it can be as simple as providing only a constructor that sets
the buffer used. Perhaps boost::iostreams can be of assistance for a
more general streambuf, but I have not used this new Boost library.

#include <cstddef>
#include <streambuf>

struct membuf : std::streambuf
{
    membuf(char *p, std::size_t n)
    {
        setg(p, p, p + n); // set the get buffer
        setp(p, p + n);    // set the put buffer
    }
};

This will allow writing or reading the buffer [use with an istream or
ostream, but not an iostream].
See also the boost::iostreams library, a recent addition to Boost.
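
A minimal usage sketch for a stream buffer of this kind, assuming output only. The names fixed_membuf and format_id are hypothetical. The stream writes straight into the caller's memory; once the buffer is full, further output fails and the stream's fail() becomes true, and no exception is thrown unless the stream's exception mask has been set.

```cpp
#include <cstddef>
#include <ostream>
#include <streambuf>

// Stream buffer that writes into caller-owned memory (no allocation).
struct fixed_membuf : std::streambuf
{
    fixed_membuf(char* p, std::size_t n) { setp(p, p + n); }

    // Number of characters written so far.
    std::size_t written() const
    {
        return static_cast<std::size_t>(pptr() - pbase());
    }
};

// Hypothetical helper: format "id=<value>" into buf; returns chars written.
std::size_t format_id(char* buf, std::size_t cap, int value)
{
    fixed_membuf sb(buf, cap);
    std::ostream out(&sb);
    out << "id=" << value;
    return sb.written();
}
```

The result is not null-terminated; the returned count says how much of the buffer is valid.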

ka...@gabi-soft.fr

Aug 26, 2005, 5:53:27 AM
Martin Bonner wrote:

[...]


> > vector<char> vec(&buf[0], &buf[10]);

> Um. I am pretty sure that in C++, if buf is actually of type
> char[10], &buf[10] is undefined behaviour. Why?

> - The definition of a[b] is that it is equivalent to *(a+b).
> - Thus &buf[10] is equivalent to &( * (buf+10) )
> - The problem is that this involves indirecting through the
> pointer to the one-past-the-end element of the array, and THAT
> involves undefined behaviour.

> Note that a) I don't know of an implementation where
> &array[len] will behave differently to array+len;

If we're talking about std::vector, I don't know of an
implementation where they will behave the same, at least at the
lowest level. &v[10] will call vector<>::operator[], and
calculate the address based on the returned reference. In a
debugging implementation, calling operator[] with an index >=
size() will likely cause an assertion failure.

> b) C99 has added special case wording to make this expression
> legal - it is possible that C++ has been so changed (although
> I don't believe so).

The C99 special case only applies to C style arrays (obviously),
and not to std::vector. And C++ does not have the special
wording, so the case remains formally undefined in C++, even
with C style arrays.

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Tony Delroy

Aug 27, 2005, 3:20:09 PM
A vital consideration is whether you can own the buffer and know the
maximum size it might need to be, or whether you're simply handed a
pointer and length to some pre-existing buffer.

If the former is in fact the case, then the trick is to make the string
as large as your needs, then populate directly into the string. The
custom allocator approach can work, but may not be portable. A hassle
is that there's no way to pass a buffer address to the allocator
object's constructor, so the buffer address needs to be either a
template parameter or a static that the class is hard-coded to see.
You can pass const char*s as function parameters, but I'm not sure
about char*!

Frankly, if you're just hacking something up that you want to be
lightning quick (why else refuse to copy the character buffer?), then
you might try your luck at resizing the string and handing the
populating function const_cast<char*>(s.c_str()). Note that this is
not likely to be the same thing as reinterpret_cast<char*>(&s).

Similarly, using a vector requires resizing, population (at least you
can legitimately expose a char* perspective on it to the populating
function), then use.
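
A minimal sketch of the vector variant, assuming you own the storage and know an upper bound on the size. Both fill_buffer (a stand-in for whatever produces the data) and read_into_vector are hypothetical names.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the populating function: writes up to cap bytes
// into dest and returns how many it wrote.
std::size_t fill_buffer(char* dest, std::size_t cap)
{
    static const char payload[] = {'d', 'a', 't', 'a'};
    std::size_t n = cap < sizeof payload ? cap : sizeof payload;
    std::memcpy(dest, payload, n);
    return n;
}

// Size the vector first, let the producer write straight into its storage,
// then shrink to the number of bytes actually written.
std::vector<char> read_into_vector(std::size_t max_len)
{
    std::vector<char> v(max_len);
    if (!v.empty()) {
        std::size_t n = fill_buffer(&v[0], v.size());
        v.resize(n);
    }
    return v;
}
```

The empty check matters because &v[0] is only meaningful when the vector has at least one element.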

In the latter case, you're up the creek without a paddle (if you want
to use string). You could still use STL algorithms over the array
though.

A hacked up allocator implementation follows... apologies to the
purists.

Tony

#include <cstddef>
#include <iostream>
#include <memory>
#include <new>
#include <string>

using namespace std;

static char g_buffer[128];

template <typename T>
class Allocator;

template <>
class Allocator<void>
{
public:
    typedef void value_type;
    typedef size_t size_type;
    typedef ptrdiff_t difference_type;
    typedef void* pointer;
    typedef const void* const_pointer;

    template <typename U>
    struct rebind
    {
        typedef Allocator<U> other;
    };
};

template <typename T>
class Allocator
{
public:
    typedef T value_type;
    typedef size_t size_type;
    typedef ptrdiff_t difference_type;

    typedef T* pointer;
    typedef const T* const_pointer;

    typedef T& reference;
    typedef const T& const_reference;

    pointer address(reference r) const { return &r; }
    const_pointer address(const_reference r) const { return &r; }

    Allocator() throw()
    {
    }

    template <class U>
    Allocator(const Allocator<U>&) throw()
    {
    }

    ~Allocator() throw()
    {
    }

    pointer allocate(size_type n, Allocator<void>::const_pointer hint = 0)
    {
        // space for n Ts: always hands out the single static buffer
        return reinterpret_cast<pointer>(g_buffer);
    }

    void deallocate(pointer p, size_type n)
    {
        // deallocate n Ts, don't destroy (nothing to do for a static buffer)
    }

    void construct(pointer p, const T& val)
    {
        new(p) T(val); // initialise *p by val
    }

    void destroy(pointer p)
    {
        p->~T(); // destroy *p but don't deallocate
    }

    size_type max_size() const throw()
    {
        // number of Ts that fit in the static buffer
        return sizeof g_buffer / sizeof(T);
    }

    // in effect: typedef Allocator<U> other
    // (way to specify same type of allocator for another data type)
    template <class U>
    struct rebind
    {
        typedef Allocator<U> other;
    };
};

template <typename T>
bool operator==(const Allocator<T>& t1, const Allocator<T>& t2) throw()
{
    return true;
}

template <typename T>
bool operator!=(const Allocator<T>& t1, const Allocator<T>& t2) throw()
{
    return !(t1 == t2);
}

int main()
{
    basic_string<char, std::char_traits<char>, Allocator<char> > s;

    s = "hello world!";
    // g_buffer[0] = 'H'; <--- don't do this! (see below)

    size_t n = s.size();
    cerr << "size " << n << endl;
    string s2(s.c_str(), n);

    cerr << "&g_buffer " << (void*)&g_buffer
         << ", s.c_str() " << (void*)s.c_str() << endl;

    cerr << '"' << s2 << '"' << endl;
}

ap...@student.open.ac.uk

Sep 16, 2005, 11:53:47 AM

Tony Delroy wrote:
> A vital consideration is whether you can own the buffer and know the
> maximum size it might need to be, or whether you're simply handed a
> pointer and length to some pre-existing buffer.
>
> If the former is in fact the case, then the trick is to make the string
> as large as your needs, then populate directly into the string.

John Panzer did this in the C/C++ Users Journal, July 2001. The buffer
size was a template parameter to a buffer allocator policy, itself a
template parameter. The interface is fully compatible with std::string.
See http://johnpanzer.com/tsc_cuj/ToolboxOfStrings.html for details.

Regards,

Andrew Marlow

Allan W

Sep 17, 2005, 9:58:42 AM
Bharat Karia wrote:
> i.e. if u have a char buffer
> char* buf; // len of this buf is 10
> vector<char> vec(&buf[0], &buf[10]);
>
> ... now use algorithms from <algorithm> etc.

Martin Bonner retorted:


> Um. I am pretty sure that in C++, if buf is actually of
> type char[10], &buf[10] is undefined behaviour. Why?
>
> - The definition of a[b] is that it is equivalent to *(a+b).
> - Thus &buf[10] is equivalent to &( * (buf+10) )
> - The problem is that this involves indirecting through the pointer to
> the one-past-the-end element of the array, and THAT involves undefined
> behaviour.
>
> Note that a) I don't know of an implementation where &array[len] will
> behave differently to array+len; b) C99 has added special case wording
> to make this expression legal - it is possible that C++ has been so
> changed (although I don't believe so).

Then James Kanze said:
> If we're talking about std::vector, I don't know of an
> implementation where they will behave the same, at least at the
> lowest level. &v[10] will call vector<>::operator[], and
> calculate the address based on the returned reference. In a
> debugging implementation, calling operator[] with an index >=
> size() will likely cause an assertion failure.

But we're NOT talking about std::vector, we're talking about an
array of char.

At least one of you didn't read carefully enough...
