how to convert char[2] to a short

Aaron Prillaman

Mar 6, 2003, 12:25:46 PM

I'm writing a sockets program that sends long strings of data, so I have
the first two bytes designated to hold the size of the entire packet.
when I receive this data, I need to look at the size to figure out when
I should stop receiving.

seems like there should be an easy way to do this without having to
multiply the high byte and add the two.

Greg Comeau

Mar 6, 2003, 12:32:19 PM

In article <3E6632D4...@cox.net>,

If reader and writer are not the same platform, the above math
may not work and may involve even additional gyrations.

Assuming it's written on the same platform, and it was written
out as say an int, then just read it back in (properly aligned)
as an int. If not, post a small example, and folks can
advise from there.
--
Greg Comeau/ 4.3.0.1: 6 New Windows Backends + export now shipping
Comeau C/C++ ONLINE ==> http://www.comeaucomputing.com/tryitout
World Class Compilers: Breathtaking C++, Amazing C99, Fabulous C90.
Comeau C/C++ with Dinkumware's Libraries... Have you tried it?

Aaron Prillaman

Mar 6, 2003, 12:58:06 PM

Greg Comeau wrote:
> In article <3E6632D4...@cox.net>,
> Aaron Prillaman <april...@cox.net> wrote:
>
>>I'm writing a sockets program that sends long strings of data, so I have
>>the first two bytes designated to hold the size of the entire packet.
>>when I receive this data, I need to look at the size to figure out when
>>I should stop receiving.
>>
>>seems like there should be an easy way to do this without having to
>>multiply the high byte and add the two.
>
>
> If reader and writer are not the same platform, the above math
> may not work and may involve even additional gyrations.
>
> Assuming it's written on the same platform, and it was written
> out as say an int, then just read it back in (properly aligned)
> as an int. If not, post a small example, and folks can
> advise from there.

The sender and receiver will not be on the same platform. server is on
FreeBSD and client is windows.

example:

void send_message(){
//this is a very simple version of what i'm doing-- so as not to offend
//any who might think this is a socket-related question.

char msg[500];
short size = 300;

msg[0] = (high byte here);
msg[1] = (low byte here);
msg[2-299] = /* message */;

send(socket_descriptor, msg, size, 0);

}

I just need to fill in the (high/low byte here) parts and be able to put
them back together on the other side.

Neil Butterworth

Mar 6, 2003, 1:06:01 PM

"Aaron Prillaman" <april...@cox.net> wrote in message
news:3E663A68...@cox.net...

What is your reason for not wanting to use the multiply and add method?

NeilB


Aaron Prillaman

Mar 6, 2003, 1:10:57 PM

Maybe I'm wrong, but it just seems like there should be some way to
stick those two bytes together or pull them apart without doing any math.

Alan Krueger

Mar 6, 2003, 1:12:25 PM

"Aaron Prillaman" <april...@cox.net> wrote in message
news:3E663A68...@cox.net...
> The sender and receiver will not be on the same platform. server is on
> FreeBSD and client is windows.
>
> example:
>
> void send_message(){
> //this is a very simple version of what i'm doing-- so as not to offend
> //any who might think this is a socket-related question.
>
> char msg[500];
> short size = 300;
>
> msg[0] = (high byte here);
> msg[1] = (low byte here);
> msg[2-299] = /* message */;
>
> send(socket_descriptor, msg, size, 0);
>
> }
>
> I just need to fill in the (high/low byte here) parts and be able to put
> them back together on the other side.

Although it's not part of standard C++, the htons/ntohs functions may be of
some use to you. They should be available in both FreeBSD and Winsock.
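
A minimal sketch of the sending side with htons, assuming a POSIX or Winsock environment. `put_length` is a hypothetical helper name, not something from this thread. It converts the size to network byte order and copies the two bytes to the front of the buffer:

```cpp
#include <cstdint>
#include <cstring>
#ifdef _WIN32
#include <winsock2.h>   // htons on Windows (link against ws2_32)
#else
#include <arpa/inet.h>  // htons on FreeBSD and other POSIX systems
#endif

// Hypothetical helper: store a 16-bit length in the first two bytes of msg,
// high byte first, regardless of the sender's own byte order.
inline void put_length(char* msg, std::uint16_t size)
{
    const std::uint16_t wire = htons(size);  // host order -> network order
    std::memcpy(msg, &wire, sizeof wire);    // byte copy, no alignment needs
}
```

With the example from the question, `put_length(msg, 300)` would leave 0x01 in msg[0] and 0x2C in msg[1].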

Neil Butterworth

Mar 6, 2003, 1:24:27 PM

"Aaron Prillaman" <april...@cox.net> wrote in message
news:3E663D6B...@cox.net...

Well, yes, you could use the shift operator instead, but I don't see any
advantage in this. I can't see why you want to avoid:

int n = msg[0] * BYTEBITS + msg[1];

C++ code doesn't get much simpler than this :-)
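
One caveat worth sketching: plain char may be signed, so a byte of 0x80 or above would come out negative in the expression above. The version below is only an illustration under that assumption; `length_from_bytes` is a hypothetical name, and the multiplier is written out as 256:

```cpp
// Hypothetical helper: combine two bytes with multiply and add.
// Each byte goes through unsigned char first so values 0x80..0xFF
// stay positive on platforms where plain char is signed.
inline unsigned int length_from_bytes(const char* msg)
{
    const unsigned int multiplier = 256;  // number of values one byte can hold
    const unsigned int high = static_cast<unsigned char>(msg[0]);
    const unsigned int low  = static_cast<unsigned char>(msg[1]);
    return high * multiplier + low;
}
```

For the bytes 0x01 and 0x2C this gives 1 * 256 + 44 = 300.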

NeilB

Ian

Mar 6, 2003, 1:53:07 PM

These functions/macros are intended for this purpose, to provide a
platform independent way of reading shorts over a network.


Ian

Aaron Prillaman

Mar 6, 2003, 1:56:57 PM

just curiosity I guess... So, to humor me how would I do this with the
shift operator?

..also anyone have recommendations for a good sockets message board?

Jerry Coffin

Mar 6, 2003, 3:59:52 PM

In article <3E6632D4...@cox.net>, april...@cox.net says...

char buffer[2];

ntohs(*reinterpret_cast<short *>(buffer));

The reinterpret_cast gets the compiler to treat that address as the
address of a short. You then dereference the pointer to short and pass
it to ntohs, which converts it to your local host's format.

Of course reinterpret_cast produces implementation defined results, so
this isn't guaranteed to work on all architectures, though it's likely
to do what you want on most common ones.

Likewise, ntohs is a part of the sockets library, so it's not portable
in the usual C sense, though if you're using sockets in the first place,
you probably have it.

--
Later,
Jerry.

The universe is a figment of its own imagination.

Vladimir Shiryaev

Mar 6, 2003, 4:14:15 PM

"Jerry Coffin" <jco...@taeus.com> wrote in message
news:MPG.18d16521f97d4e1989895@news...

> In article <3E6632D4...@cox.net>, april...@cox.net says...
> > I'm writing a sockets program that sends long strings of data, so I have
> > the first two bytes designated to hold the size of the entire packet.
> > when I receive this data, I need to look at the size to figure out when
> > I should stop receiving.
> >
> > seems like there should be an easy way to do this without having to
> > multiply the high byte and add the two.
>
> char buffer[2];
>
> ntohs(*reinterpret_cast<short *>(buffer));

Won't work on *many* platforms.

The address of the `buffer' array may be odd, but
the address of a short may be required to be even.

On x86 it'll work, but be a bit inefficient (require an
extra memory cycle). On some platforms it may work,
but be *terribly* inefficient -- it may generate an
invalid address interrupt, but the situation will be
fixed by a handler in the kernel to produce the
expected result.

Ron Natalie

Mar 6, 2003, 4:15:22 PM

"Jerry Coffin" <jco...@taeus.com> wrote in message news:MPG.18d16521f97d4e1989895@news...

> Of course reinterpret_cast produces implementation defined results, so


> this isn't guaranteed to work on all architectures, though it's likely
> to do what you want on most common ones.
>

It won't work on anything with alignment constraints (like a Sparc) if the 2 chars aren't
sufficiently aligned.
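
One way around the alignment constraint, sketched here as an assumption and not as anyone's posted code, is to copy the two bytes into a properly aligned local before calling ntohs. `get_length` is a hypothetical name:

```cpp
#include <cstdint>
#include <cstring>
#ifdef _WIN32
#include <winsock2.h>   // ntohs on Windows (link against ws2_32)
#else
#include <arpa/inet.h>  // ntohs on POSIX systems
#endif

// Hypothetical helper: read a 16-bit network-order length from any address.
inline std::uint16_t get_length(const char* buffer)
{
    std::uint16_t wire;                       // aligned local variable
    std::memcpy(&wire, buffer, sizeof wire);  // byte copy works at odd addresses
    return ntohs(wire);                       // network order -> host order
}
```

Because memcpy works a byte at a time as far as the language is concerned, the call is valid even when `buffer` points at an odd address.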

Jerry Coffin

Mar 7, 2003, 12:43:16 AM

In article <tSO9a.4198$6z.8...@news20.bellglobal.com>,
vladimir.sh...@NOSPAMsympatico.ca says...

[ ... ]

> > char buffer[2];
> >
> > ntohs(*reinterpret_cast<short *>(buffer));
>
> Won't work on *many* platforms.

Oops -- you're absolutely right. I started with something like this:

short buffer;

recv(s, reinterpret_cast<char *>(&buffer), sizeof(buffer), 0);

short length = ntohs(buffer);

but in trying to shorten it, I broke it.

As far as x86 goes: no, the code I originally posted can be slower if
the buffer is at an odd boundary, but the other methods that have been
posted will have exactly the same penalty under exactly the same
circumstances. The bus on something like a 486 simply doesn't allow you
to read two bytes starting at an odd boundary in a single cycle,
regardless. Using a shift/or, multiply/add, etc., won't help at all --
the only difference is that something like a multiply on one of these
older processors will be so slow that you'll get poor performance ALL
the time instead of only when the pair happen to be allocated
incorrectly.

-wiseguy

Mar 7, 2003, 12:57:09 AM

Jerry Coffin <jco...@taeus.com> wrote in
news:MPG.18d1ba85b243daf1989897@news:

Yes, alignment problems for integer data can be a problem so avoid them
altogether. Quit whining about the multiply/shift solution and just do it.

// Cast via unsigned char so bytes >= 0x80 are not sign-extended.
inline unsigned int size_of_msg(const char *msg)
{ return ((unsigned int)(unsigned char)msg[0]<<8) | (unsigned char)msg[1]; }

It doesn't get any faster than this on many platforms since the left and
right shift & store operations are often directly in microcode. Besides
which, if you are doing network socket apps then your bottleneck is gonna be
the transport media, not the time it takes to translate two bytes of data.
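
For the sending side, the same idea runs in reverse. This is a minimal sketch with a hypothetical `store_size` helper that fills in the "(high byte here)" and "(low byte here)" slots from the original question, assuming the size fits in 16 bits:

```cpp
// Hypothetical counterpart for the sender: split a size into two bytes,
// high byte first, using shift and mask only.
inline void store_size(char *msg, unsigned int size)
{
    msg[0] = static_cast<char>((size >> 8) & 0xFF);  // high byte
    msg[1] = static_cast<char>(size & 0xFF);         // low byte
}
```

Calling `store_size(msg, 300)` would put 0x01 in msg[0] and 0x2C in msg[1], which is what the receiving function expects.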

Karl Heinz Buchegger

Mar 7, 2003, 10:30:55 AM

Aaron Prillaman wrote:
>
> >
> > Well, yes, you could use the shift operator instead, but I don't see any
> > advantage in this. I can't see why you want to avoid:
> >
> > int n = msg[0] * BYTEBITS + msg[1];
> >
> > C++ code doesn't get much simpler than this :-)
> >
> > NeilB
> >
> >
> >
>
> just curiosity I guess... So, to humor me how would I do this with the
> shift operator?
>


n = ( msg[0] << BITS_PER_BYTE ) | msg[1];

If this is indeed faster on your platform, chances are high
that the compiler will transform Neil's suggestion into the
exact same thing internally, if it is possible at all.

Note: BYTEBITS in Neil's suggestion is not the same number
as BITS_PER_BYTE. The multiplier has to be 256, the shift count 8.
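
To make the relationship between the two constants concrete, here is a small sketch assuming 8-bit bytes. The function names `by_multiply` and `by_shift` are made up for the illustration:

```cpp
// Assumption: 8-bit bytes, as on all mainstream platforms.
const unsigned int BITS_PER_BYTE = 8;                    // shift count
const unsigned int BYTEBITS      = 1u << BITS_PER_BYTE;  // 256, the multiplier

// Multiply-and-add form.
inline unsigned int by_multiply(unsigned char hi, unsigned char lo)
{ return hi * BYTEBITS + lo; }

// Shift-and-or form; yields the same value for every pair of bytes.
inline unsigned int by_shift(unsigned char hi, unsigned char lo)
{ return (static_cast<unsigned int>(hi) << BITS_PER_BYTE) | lo; }
```

Both forms agree because shifting left by 8 is multiplying by 256, and the low byte never overlaps the shifted high byte, so `|` and `+` give the same result.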

--
Karl Heinz Buchegger
kbuc...@gascad.at

Neil Butterworth

Mar 7, 2003, 10:52:52 AM

"Karl Heinz Buchegger" <kbuc...@gascad.at> wrote in message
news:3E68BB2F...@gascad.at...

Yes, that was bad naming on my part. It should really have been something
like MAX_BYTE_VAL_PLUS_ONE :-)

NeilB

Gianni Mariani

Mar 7, 2003, 11:09:31 AM

It's NEVER this simple. Just like every piece of code ever written
you'll need to do more. Write it so that you can add to it.

At least start by using a struct like this:

struct Packet
{
    NetworkOrder< unsigned short > m_length;
    char m_rest_of_data[1];
};

and your serializer/deserializer becomes something like:

string serialize( const string & foo )
{
    Packet * l_packet;
    int l_size = foo.size();

    // offsetof (from <cstddef>) replaces the null-pointer cast idiom
    int packet_size =
        (int)( offsetof( Packet, m_rest_of_data ) + l_size );

    l_packet = ( Packet * ) alloca( packet_size );

    l_packet->m_length = l_size;
    memcpy( l_packet->m_rest_of_data, foo.data(), l_size );

    return string( ( char * ) l_packet, packet_size );
}

Where the NetworkOrder class does something like the one below.

But wait - there's more. You'll want to know other things
about your protocol. How about a magic number - to make sure you're
reading from who you thought you were reading from.


struct Packet
{
    NetworkOrder< unsigned > m_magic;
    NetworkOrder< unsigned short > m_length;
    char m_rest_of_data[1];
};


But wait - there's more. You'll want to expand your protocol to have
different messages. How about a message type - to make it so that you
can send different types of messages.

struct Packet
{
    NetworkOrder< unsigned > m_magic;
    NetworkOrder< unsigned short > m_message_type;
    NetworkOrder< unsigned short > m_length;
    char m_rest_of_data[1];
};

Then this becomes known as a message header and your
serializer/deserializer knows how to pull this out.
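
As a rough illustration of pulling such a header out of a received buffer, the sketch below decodes the three fields with explicit shifts instead of the NetworkOrder class. The `Header` struct, the `read_header` name, and the wire layout (4-byte magic, 2-byte type, 2-byte length, all big-endian, no padding) are assumptions made for the example:

```cpp
#include <cstdint>

// Hypothetical decoded form of the message header.
struct Header
{
    std::uint32_t magic;
    std::uint16_t message_type;
    std::uint16_t length;
};

// Assumed wire layout: bytes 0-3 magic, 4-5 message type, 6-7 length,
// each field most significant byte first.
inline Header read_header(const unsigned char* p)
{
    Header h;
    h.magic = (static_cast<std::uint32_t>(p[0]) << 24)
            | (static_cast<std::uint32_t>(p[1]) << 16)
            | (static_cast<std::uint32_t>(p[2]) << 8)
            |  static_cast<std::uint32_t>(p[3]);
    h.message_type = static_cast<std::uint16_t>((p[4] << 8) | p[5]);
    h.length       = static_cast<std::uint16_t>((p[6] << 8) | p[7]);
    return h;
}
```

A receiver could check `magic` first and drop the connection on a mismatch before trusting `length`.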

There are a few other helper classes you need.


BUT the interface should be a pure virtual abstract class.

The nice thing about C++ is that you can push the knowledge of how to
serialize and deserialize right into various member variables themselves
as demonstrated with the NetworkOrder class below. I've only used the
class below for integral types from 0 -> 16 bytes. But I'd suggest that
it's probably ok for some floating point types as well.


template <class base_type >
class NetworkOrder
{
public:

    base_type m_uav;

    // a good optimizing compiler will reduce this to a constant
    static inline bool IsBigEndianbool()
    {
        unsigned x = 1;
        return ! ( * ( char * )( & x ) );
    }

    static inline void OrderRead(
        const base_type & i_val,
        base_type & i_destination
    )
    {
        const unsigned char * src = ( const unsigned char * ) & i_val;
        unsigned char * dst = ( unsigned char * ) & i_destination;

        if (
            ( sizeof( base_type ) == 1 )
            || IsBigEndianbool()
        ) {
            //
            // Alignment is an issue on some architectures so
            // even for non-swapping we read a byte at a time

            if ( sizeof( base_type ) == 1 ) {
                dst[0] = src[0];
            } else if ( sizeof( base_type ) == 2 ) {
                dst[0] = src[0];
                dst[1] = src[1];
            } else if ( sizeof( base_type ) == 4 ) {
                dst[0] = src[0];
                dst[1] = src[1];
                dst[2] = src[2];
                dst[3] = src[3];
            } else {

                for (
                    int i = sizeof( base_type );
                    i > 0;
                    i --
                ) {
                    * ( dst ++ ) = * ( src ++ );
                }
            }

        } else {

            // little-endian host: copy the bytes in reverse order
            if ( sizeof( base_type ) == 2 ) {
                dst[1] = src[0];
                dst[0] = src[1];
            } else if ( sizeof( base_type ) == 4 ) {
                dst[3] = src[0];
                dst[2] = src[1];
                dst[1] = src[2];
                dst[0] = src[3];
            } else {
                dst += sizeof( base_type ) -1;
                for ( int i = sizeof( base_type ); i > 0; i -- ) {
                    * ( dst -- ) = * ( src ++ );
                }
            }
        }
    } // end of OrderRead (this closing brace was missing)

    static inline void OrderWrite(
        const base_type & i_val,
        base_type & i_destination
    )
    {
        // for the time being this is the same as OrderRead
        OrderRead( i_val, i_destination );
    }

    inline operator base_type () const
    {
        base_type l_value;
        OrderRead( m_uav, l_value );
        return l_value;
    }


    inline base_type operator=( base_type in_val )
    {
        OrderWrite( in_val, m_uav );
        return in_val;
    }
};


Now that was a ramble.

Aaron Prillaman

Mar 7, 2003, 1:49:33 PM

Thank you very very much for this ramble. With help from others in
previous threads, I have sort of worked out a system for adding new
messages ---

---out_messages.h---

#include <cstring> // for memcpy

namespace omsg{

const char invalid_login = 1; //one of these for every kind of message
                              //in the client version of this file

class OMessage{
protected:
    char *msg; //msg[0] holds the total length of the message in bytes
public:
    OMessage() : msg(0) {}
    OMessage(OMessage const &copy_from){
        msg = 0;
        if (copy_from.msg){
            msg = new char[*copy_from.msg];
            memcpy(msg, copy_from.msg, *copy_from.msg);
        }
    }
    ~OMessage(){
        if(msg) delete[] msg;
    }
    char *GetMsg(){ return msg; }
};

class Minvalid_login : public OMessage{
public:
    Minvalid_login(){
        msg = new char[2];
        msg[0] = 2;
        msg[1] = invalid_login;
    }
};

}

I also have a user class which handles the sending of these packets in a
template function... You've probably seen my questions about these in
previous posts.. what do you think?

I'm considering reworking my system with some of your ideas as well.

thanks again

Vladimir Shiryaev

Mar 8, 2003, 1:53:28 AM

"Jerry Coffin" <jco...@taeus.com> wrote in message
news:MPG.18d1ba85b243daf1989897@news...

> the only difference is that something like a multiply on one of these
> older processors will be so slow that you'll get poor performance ALL
> the time instead of only when the pair happen to be allocated
> incorrectly.

No compiler will ever generate a multiplication opcode
for an expression like `x * 256' (unless it's faster than a shift),
many of them will not even generate a shift :-)

Jerry Coffin

Mar 8, 2003, 9:59:57 AM

In article <rtgaa.7683$KJ3.1...@news20.bellglobal.com>,
vladimir.sh...@sympatico.ca says...

[ ... ]

> No compiler will ever generate a multiplication opcode
> for an expression like `x * 256' (unless it's faster than a shift),
> many of them will not even generate a shift :-)

While I have no difficulty agreeing that they _shouldn't_, experience
indicates that a number did anyway...
