
Writing an Emulator


Gerry Quinn

Nov 24, 2003, 11:40:26 AM
In article <Pine.LNX.4.58-035....@unix46.andrew.cmu.edu>, "Arthur J. O'Dwyer" <a...@nospam.andrew.cmu.edu> wrote:
>> I posted a newsgroup question here a few weeks back, asking some
>> questions that related to my 10 year quest (so far) to understand
>> pointers.

One may spend decades trying to understand quantum theory, or women.
Pointers aren't worth the effort. Though if you can't understand them,
programming may not be your metier.

But try this:

My house is a data structure.
Its address is a pointer.

An array of houses is a street.
An array of house-pointers (addresses) fits on a piece of paper.

Think of why this feature could be useful, and you'll have thought of
reasons why pointers are useful.

Gerry Quinn
--
http://bindweed.com
Screensavers and Games for Windows
Download free trial versions
New arcade-puzzler just out - "Volcano"

Corey Murtagh

Nov 24, 2003, 1:06:25 PM
Gerry Quinn wrote:

> In article <Pine.LNX.4.58-035....@unix46.andrew.cmu.edu>, "Arthur J. O'Dwyer" <a...@nospam.andrew.cmu.edu> wrote:
>
>>>I posted a newsgroup question here a few weeks back, asking some
>>>questions that related to my 10 year quest (so far) to understand
>>>pointers.
>
> One may spend decades trying to understand quantum theory, or women.
> Pointers aren't worth the effort. Though if you can't understand them,
> programming may not be your metier.

Pointers are simple creatures with simple rules. There /may/ be slight
variances of these rules depending on exactly which language you're
using, but the core concepts are the same. It's getting really boring
watching people make like pointers are some magical, mystical concept
that is far too difficult for anyone to really understand.

Here are some 'rules' that are fairly generic, although I'm using C or
C++ for examples:

1) A pointer holds the memory address of an object.

char c = 'A'; // an object - a character in this case
char* p = &c; // p holds address of object 'c'

2) Assigning to a pointer changes which object it points to:

char c2 = 'a';
p = &c2; // now points to c2

3) Dereferencing a pointer references the pointed-to object:

*p = 'B'; // assigns to the object p points to (c2, after rule 2)
*p == 'B'; // tests the value of the object p points to

4) Incrementing a pointer points to the next object:

int ia[10];
int* p = &ia[0]; // or: p = ia, if you prefer
p++; // increment, now points to ia[1]
*p = 1; // assign to ia[1]

5) Pointers can be indexed to access other objects in an array

p = &ia[0]; // again: p = ia if you prefer
p[2] = 2; // assign to ia[2]. Also 2[p] works.
p++; // now points to ia[1]
p[2] = 3; // assign to ia[1 + 2 = 3]

NB: there's nothing inherently stopping you from indexing or
incrementing the pointer beyond the ends of the array. Bad Things can
happen when you do this.

Also note: an array is /not/ a pointer, although it can be treated as
one in many cases. The expression "&ia[0]" is /roughly/ equivalent to
"ia" though.
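The rules above can be collected into one small C file. This is only a sketch: the function names are invented for illustration and are not from the post, and each function returns a value so the effect of each rule can be checked rather than taken on trust.

```c
#include <stddef.h>

/* Rules 1-3: take an address, re-point, and write through the pointer.
   Returns the value of c2 after the write; c itself is left unchanged. */
char rules_1_to_3(void)
{
    char c = 'A';
    char c2 = 'a';
    char *p = &c;   /* rule 1: p holds the address of c */

    p = &c2;        /* rule 2: p now points to c2 instead */
    *p = 'B';       /* rule 3: writes to c2, not to c */
    return (c == 'A') ? c2 : '?';
}

/* Rules 4-5: increment and index. Fills ia[1] and ia[3] as in the post
   and returns their sum. */
int rules_4_and_5(void)
{
    int ia[10] = {0};
    int *p = ia;    /* same as &ia[0] */

    p++;            /* rule 4: p now points to ia[1] */
    *p = 1;         /* ia[1] = 1 */
    p[2] = 3;       /* rule 5: p[2] is *(p + 2), i.e. ia[1 + 2] */
    return ia[1] + ia[3];
}

/* An array is not a pointer: sizeof sees the whole array... */
size_t array_bytes(void)
{
    int ia[10];
    return sizeof ia;
}

/* ...but only the pointer itself once the array has decayed. */
size_t pointer_bytes(void)
{
    int ia[10];
    int *p = ia;
    return sizeof p;
}
```

Called from a main(), rules_1_to_3() yields 'B' and rules_4_and_5() yields 4, while array_bytes() and pointer_bytes() differ on any platform where ten ints are larger than one pointer.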


There are a bunch of other simple, valid statements like the above that
you can make about pointers. The problem comes when people stop making
simple statements.

But if anyone can provide a pointer 'rule' that can't be expressed
simply, or can't be broken down into multiple simple statements, I'd
love to hear it. Now's your chance people... prove that pointers are
the complex, hairy beast that everyone's been claiming for so long that
they are!

Oh, and debunking misconception #1 for C (and C++) people: there is
nothing (and I do mean NOTHING) special about 'char*'. It's a pointer
to a char. Don't ever think it's anything else, no matter what your
books and professors tell you.

--
Corey Murtagh
The Electric Monk
"Quidquid latine dictum sit, altum viditur!"

Strangely Placed

Nov 24, 2003, 2:24:11 PM
[Google's news "editor" isn't thrillingly good. Apologies if the text
flow is munged.]

"Arthur J. O'Dwyer" <a...@nospam.andrew.cmu.edu> wrote in message news:<Pine.LNX.4.58-035....@unix46.andrew.cmu.edu>...

> On Sun, 23 Nov 2003, Douglas Garstang wrote:
> >
> > I posted a newsgroup question here a few weeks back, asking some
> > questions that related to my 10 year quest (so far) to understand
> > pointers.
> >

<snip>
>
> Basically, Richard is suggesting that you create an "imaginary
> computer" from scratch -- you can even do this on paper.

Right.

<snip>

> Personally, I think this is a bad way to learn pointers, although
> it's a nice exercise for general understanding of how computers
> work. See below.

I have to disagree with you here, Arthur. It is precisely because it's
a nice exercise for understanding how computers work that it is /also/
a good way to learn pointers. Even if the OP chooses to implement the
solution in an unpointery language (Visual Basic springs to mind as a
possible candidate), the very act of implementing a computer will,
effectively, lead him to /invent/ pointers. And I can certainly attest
to the fact that inventing something definitely gives you a much, much
clearer understanding of it than merely reading about it can ever do.

<snip>

> > This can be done in around 300 lines of C code. Less if you're a terse
> > coder, but mine is 300 lines (excluding a few printfs and comments),
> > and it even includes a rudimentary disassembler.
>
> I must be a very terse coder, then. :-)

Er, yes. Well, tersinosity has never been one of my objectives when
writing C. I tend to set my stall out rather verbosely.

<snip>

> Hmm. In my opinion, Richard speaks the truth, but only because
> you won't ever make it to eight bits without an understanding of
> pointers

I venture to suggest that the process of writing the machine is very
likely to produce that understanding.

> -- I don't see the C implementation of these computers
> really using too many pointers

No, not particularly, but that's not the point. The point is that the
"machine code interpreter" has to understand about how to read from
and write to the main memory of the simulation, and it is in designing
and writing that interpreter that the student's understanding dawns.

<snip>

> HTH, and please follow up to comp.programming with any questions
> not directly related to the C implementations,

Modulo those nits, I take my hat off to you for a superb answer. I
only hope the OP doesn't read it /too/ closely, since the whole idea
is for him to go through the design and development process himself.
It is on that road that he will encounter understanding.

--
Richard Heathfield
(Strangely Placed)

Arthur J. O'Dwyer

Nov 24, 2003, 4:00:01 PM

On Mon, 24 Nov 2003, Strangely Placed [Richard Heathfield] wrote:
>
> Arthur J. O'Dwyer wrote...

> >
> > Basically, Richard is suggesting that you create an "imaginary
> > computer" from scratch -- you can even do this on paper.

> > Personally, I think this is a bad way to learn pointers, although


> > it's a nice exercise for general understanding of how computers
> > work. See below.
>
> I have to disagree with you here, Arthur. It is precisely because it's
> a nice exercise for understanding how computers work that it is /also/
> a good way to learn pointers. Even if the OP chooses to implement the
> solution in an unpointery language (Visual Basic springs to mind as a
> possible candidate), the very act of implementing a computer will,
> effectively, lead him to /invent/ pointers. And I can certainly attest
> to the fact that inventing something definitely gives you a much, much
> clearer understanding of it than merely reading about it can ever do.

True. However, I will say that when I wrote the code for Computer
B, the one that includes the "LOAD" opcode, I was momentarily confused
by the parsing of the address operand to "LOAD"; see in the original
code how it has double indirection

b.R0 = b.M[b.M[b.IP].value].value;

where the "machine code" suggests single indirection

1 3 LOAD [M3]

IMHO that's going to be very confusing for anyone who's not already
confident enough in his pointer knowledge to plough ahead as I did,
trusting what I *knew* to be the right C expression even though it
*looked* kind of funny.
I do absolutely agree with you that the pencil-and-paper exercise
of designing your own computers is a great way to learn pointers
"by doing." But I respectfully submit that if you [meaning newbies]
try to *implement* these computers in C, you'll end up more horribly
confused than you started out. No matter how much more fun it is to
be able to see your imaginary programs executing on real machines.
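The double indirection in the LOAD expression may be easier to follow when split into two steps. The sketch below is an assumption about what such a toy machine could look like: only the names M, IP, R0 and value come from the expression quoted above; the struct layout, MEM_SIZE and load_r0 are hypothetical.

```c
#define MEM_SIZE 16

/* Hypothetical toy machine. The field names mirror the expression
   b.M[b.M[b.IP].value].value; everything else is an assumption. */
struct cell { unsigned value; };

struct machine {
    struct cell M[MEM_SIZE];   /* main memory of the imaginary computer */
    unsigned IP;               /* index of the cell holding LOAD's operand */
    unsigned R0;               /* the single register */
};

/* Carry out the operand part of "LOAD [Mn]" in two explicit steps. */
unsigned load_r0(struct machine *b)
{
    unsigned operand = b->M[b->IP].value;   /* first lookup: which cell? */
    b->R0 = b->M[operand].value;            /* second lookup: its contents */
    return b->R0;
}
```

With M[1] holding 3, M[3] holding 42 and IP equal to 1, load_r0 leaves 42 in R0. The "machine code" shows one level of indirection because the first lookup (fetching the operand through IP) belongs to the interpreter, not to the program being interpreted.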


> > > This can be done in around 300 lines of C code. Less if you're a terse
> > > coder, but mine is 300 lines (excluding a few printfs and comments),
> > > and it even includes a rudimentary disassembler.
> >
> > I must be a very terse coder, then. :-)
>
> Er, yes. Well, tersinosity has never been one of my objectives when
> writing C. I tend to set my stall out rather verbosely.

What *did* your code look like? You still have it around anywhere?


Maybe you could post a URL. (And when you write:

> The point is that the "machine code interpreter" has to understand
> about how to read from and write to the main memory of the simulation,
> and it is in designing and writing that interpreter that the student's
> understanding dawns.

it makes me think that your code might be horribly long because you
didn't use arrays or bit-fields for your memory cells; but if that's
the case, I want to see how you *did* do it. Morbid fascination,
y'know. ;-)


> Modulo those nits, I take my hat off to you for a superb answer. [...]

Thank you much!

-Arthur

seemanta dutta

Nov 24, 2003, 11:07:12 PM
"Arthur J. O'Dwyer" <a...@nospam.andrew.cmu.edu> wrote in message news:<Pine.LNX.4.58-035....@unix43.andrew.cmu.edu>...

I personally think it is how you start to code your emulator that
decides whether pointers will be used extensively in the software or
not, which will ultimately determine your level of understanding of
pointers. No doubt, writing an emulator for an imaginary computer will
lead to ideas like 'memory' and 'address'. But let's not forget that
these can be implemented using pointers as well as simple arrays in C.

When I created my simulator for the 8051 (http://gsim51.sourceforge.net)
I did use pointers, but hardly for memory addressing etc. For me,
pointers came in handy for calling the appropriate function for a
particular opcode (you can also have a look at the code). Although I
could have written every array access with pointer notation, I
did not do that, since I believe that pointers deserve a much more
elegant use than array access.
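Dispatching on an opcode through function pointers might look roughly like the following. This is a sketch under stated assumptions: the CPU state, the four opcodes and every name here are invented for illustration and are not taken from gsim51.

```c
#include <stddef.h>

/* Hypothetical CPU state: an accumulator and a halt flag. */
struct cpu { unsigned char acc; int halted; };

typedef void (*op_handler)(struct cpu *);

/* One small function per (made-up) opcode. */
static void op_nop(struct cpu *c)  { (void)c; }
static void op_inc(struct cpu *c)  { c->acc++; }
static void op_clr(struct cpu *c)  { c->acc = 0; }
static void op_halt(struct cpu *c) { c->halted = 1; }

/* The table is indexed by opcode; each entry points at a handler. */
static const op_handler handlers[] = { op_nop, op_inc, op_clr, op_halt };

/* Run a program until HALT or the end of the buffer.
   Unknown opcodes are skipped. Returns the accumulator. */
unsigned char run(struct cpu *c, const unsigned char *program, size_t len)
{
    size_t pc;

    for (pc = 0; pc < len && !c->halted; pc++) {
        unsigned char opcode = program[pc];
        if (opcode < sizeof handlers / sizeof handlers[0])
            handlers[opcode](c);   /* call through the function pointer */
    }
    return c->acc;
}
```

The table replaces a long switch statement: adding an instruction means writing one function and adding one table entry. Memory access in this sketch still uses plain array indexing, which matches the point being made in the post.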

So ultimately it comes down to individual preference. I
think K&R would be a good reference for understanding pointers. With 10
years of C experience you should not have any problems getting through
that book.

regards,
Semanta Dutta

Richard Heathfield

Nov 25, 2003, 2:36:10 AM
Arthur J. O'Dwyer wrote:

> What *did* your code look like? You still have it around anywhere?

I'd prefer not to post it (because I hope to publish it as part of a larger
work), but I will cheerfully email you a copy just as soon as I've finished
the project I'm currently working on (which will actually facilitate
sending the email!). In the meantime, here's a rough breakdown of the code,
just to give you an idea of the "shape" - i.e. how I approached the
problem:

Lines 1-27: Comment.
Lines 28-55: Preprocessor directives.
Lines 56-62: A typedef.
Lines 64-70: PrintBinary()
Lines 71-101: DisplayMachineState()
Lines 102-109: Fetch()
Lines 110-134: Execute()
Lines 135-139: GetPRN() - for simulating randomised memory at startup
Lines 140-149: DisplayPrompt()
Lines 150-159: GetENTER()
Lines 160-182: GetYesNo()
Lines 183-204: DoIntro()
Lines 205-217: UserWantsMore() - as in if(UserWantsMore())
Lines 218-269: GetProgram()
Lines 270-331: RunProgram()
Lines 332-358: main()

Nothing here is terribly non-obvious, of course.


>> The point is that the "machine code interpreter" has to understand
>> about how to read from and write to the main memory of the simulation,
>> and it is in designing and writing that interpreter that the student's
>> understanding dawns.
>
> it makes me think that your code might be horribly long because you
> didn't use arrays or bit-fields for your memory cells; but if that's
> the case, I want to see how you *did* do it. Morbid fascination,
> y'know. ;-)

Sorry to disappoint you.

typedef struct COMPUTER_
{
    int State; /* HALTED or RUNNING */
    unsigned char CurrentInstruction;
    unsigned int CpuRegister[REGISTER_COUNT];
    unsigned char Memory[ADDRESS_SPACE];
} COMPUTER;

--
Richard Heathfield : bin...@eton.powernet.co.uk
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton

Bill Godfrey

Nov 25, 2003, 7:23:53 AM
Corey Murtagh <em...@slingshot.co.nz.no.uce> wrote:
> Oh, and debunking misconception #1 for C (and C++) people: there is
> nothing (and I do mean NOTHING) special about 'char*'. It's a pointer
> to a char. Don't ever think it's anything else, no matter what your
> books and professors tell you.

Not quite... The internal representation of a char* has to be identical to
a void*. Unless you are interfacing with K&R1 code, you should not care
about this.

But that's not the same "specialness" that a lot of people tend to apply to
char*.

Bill, does a lot of work for charity.

Bill Godfrey

Nov 25, 2003, 7:33:07 AM
bin...@eton.powernet.co.uk wrote:
> typedef struct COMPUTER_
> {
> int State; /* HALTED or RUNNING */
> unsigned char CurrentInstruction;
> unsigned int CpuRegister[REGISTER_COUNT];
> unsigned char Memory[ADDRESS_SPACE];
> } COMPUTER;

Hmmm, I'd have made CurrentInstruction another register, and called it
NextInstruction. Fits in the way that real-world CPUs work.

Of course, it's only a game as a learning device. Not like any of these
emulators-of-imaginary-hardware will actually see life beyond being just a
game.

Bill, having a horrible vision of how Java may have started.

Calum

Nov 25, 2003, 9:22:07 AM
Arthur J. O'Dwyer wrote:
<snipped>

> Basically, Richard is suggesting that you create an "imaginary

> computer" from scratch -- you can even do this on paper. Once you
> have got the basic idea, you can write a C program to emulate the
> imaginary machine in software.


> Personally, I think this is a bad way to learn pointers, although
> it's a nice exercise for general understanding of how computers
> work. See below.

Why build a virtual machine, when you already have a virtual address
space given to you by your operating system? Surely, learning assembly
language would be a faster (and considerably more useful) route to
understanding pointers.

What may be eluding the OP is the concept of a flat address space**.
Once you understand that, C makes a lot of sense.

If you just think of a programming language as a black box, you'll never
be a good programmer. If you don't understand what happens underneath,
you won't appreciate the costs and trade-offs, or understand why some
things are fast and memory-efficient, while other things are not.


**Flat address spaces:
Your memory is a long list of bits each storing either a 0 or a 1.
These are grouped into blocks of 8 bits called bytes. Each byte has an
"address", a number in the range 0-4294967295 (on a 32-bit machine), and
each byte can store one of 2^8=256 combinations.

By grouping together bytes, even larger amounts of data can be stored.
For example, the five characters of the string "hello" (not counting its
terminating '\0') could be stored in the address range 100-104. An int
often takes 4 bytes (32 bits), so could be stored in the addresses
105-108. A double often takes 8 bytes, so could be stored in the
addresses 109-116.

A "pointer" is just the first address of a group of bytes. The "type"
of the pointer (e.g. double*) tells the compiler what is stored in the
bytes. So (double*)109 would point to the double stored in 109-116.
(int*)105 points to the integer at 105-108. sizeof(int) is 4, so the
compiler knows that an integer occupies the next 4 bytes.

We could even emulate a virtual address space using C.

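char virtual_addresses[200];
/* NB: offsets 109 and 105 need not be suitably aligned for double or
   int, so these casts are illustrative only; converting to or using a
   misaligned pointer is undefined behaviour in C. */
double *d = (double*)(virtual_addresses+109);
int *i = (int*)(virtual_addresses+105);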

If we wanted access to "raw" memory, it would be
double *d = (double*)109;
int *i = (int*)105;

If you want to write to an address
*d = 3.14;

If you want to read from an address
int e = *i;
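A more portable way to play with the emulated address space is to copy bytes in and out with memcpy instead of casting, since an arbitrary offset such as 109 may not be suitably aligned for a double. The sketch below makes that substitution; the peek/poke function names are invented for illustration and the caller is assumed to stay within the 200-byte buffer.

```c
#include <string.h>

/* The emulated address space: "addresses" are offsets into this buffer. */
static unsigned char virtual_addresses[200];

/* Store a double in the bytes starting at "address" addr. */
void poke_double(size_t addr, double value)
{
    memcpy(virtual_addresses + addr, &value, sizeof value);
}

/* Read back the double stored at "address" addr. */
double peek_double(size_t addr)
{
    double value;
    memcpy(&value, virtual_addresses + addr, sizeof value);
    return value;
}

/* The same pair of operations for int. */
void poke_int(size_t addr, int value)
{
    memcpy(virtual_addresses + addr, &value, sizeof value);
}

int peek_int(size_t addr)
{
    int value;
    memcpy(&value, virtual_addresses + addr, sizeof value);
    return value;
}
```

Here poke_double(109, 3.14) plays the role of *d = 3.14, and peek_int(105) the role of int e = *i. The type still tells the code how many bytes to move, which is the job the pointer type does for the compiler.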

A "segmentation fault" happens when the address you read or write to
has not been allocated to your application.

Simple?


Stephen Smith

Nov 25, 2003, 12:04:09 PM
Bill Godfrey wrote:

[snip]

> But that's not the same "specialness" that a lot of people tend to apply
> to char*.

Just curious, but what 'specialness', exactly?

Stephen.


Arthur J. O'Dwyer

Nov 25, 2003, 1:05:46 PM

On Tue, 25 Nov 2003, Stephen Smith wrote:
>
> Bill Godfrey wrote:

Incidentally, Bill, the Standard is slightly ambiguous on the real
specialness of 'char *'. I interpret its words to mean that the
type 'void *' must be identical to *some* pointer to character type,
but not necessarily 'char*' itself -- it might be 'signed'/'unsigned'
'char *', at the implementation's discretion.
And note that 'char *' does *not* have to have anything in common
with *either* 'signed char *' or 'unsigned char *', even though
'char' must be an "alias" for either signed or unsigned char.

> > But that's not the same "specialness" that a lot of people tend to apply
> > to char*.
>
> Just curious, but what 'specialness', exactly?

C (and to a certain extent C++) uses arrays of 'char' to provide
text strings. Arrays in C (and C++) decay into pointers in most
contexts; thus the C standard library has many functions such as

size_t strlen (const char *s);

'strlen' takes a "string" and returns its length. Strings can
also be assigned, compared with strcmp(), et cetera -- but since
they're just arrays and pointers, the "intuitive" methods rarely
work in C. Common newbie mistakes include

char *p = "foo";
char *q = "bar";

assert(q < p); /* WRONG! < is NOT alphabetical comparison! */

char *r = p+q; /* WRONG! + does NOT catenate strings! */

char a[] = "hello";
char b[] = "world";

strcat(a, b); /* WRONG! strcat() does NOT allocate memory! */

a = p; /* WRONG! Arrays cannot be assigned to! */


[and so on...]
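For contrast, here is one way the same four operations could be written so that they do what the newbie intended. It is a sketch only; the helper names are invented for illustration, and the bounds-checking policy (return -1 when the destination is too small) is an assumption, not something the standard library imposes.

```c
#include <stdio.h>
#include <string.h>

/* Alphabetical comparison uses strcmp, not <.
   Returns nonzero if a sorts before b. */
int sorts_before(const char *a, const char *b)
{
    return strcmp(a, b) < 0;
}

/* Concatenation into a caller-supplied buffer. Returns 0 on success,
   -1 if the result plus its terminator would not fit. */
int join(char *dst, size_t dst_size, const char *a, const char *b)
{
    int n = snprintf(dst, dst_size, "%s%s", a, b);
    return (n >= 0 && (size_t)n < dst_size) ? 0 : -1;
}

/* "Assigning" to an array means copying into it, after checking
   that the source (plus terminator) fits. */
int copy_into(char *dst, size_t dst_size, const char *src)
{
    if (strlen(src) + 1 > dst_size)
        return -1;
    strcpy(dst, src);
    return 0;
}
```

With a 16-byte buffer, join(buf, sizeof buf, "hello", "world") succeeds and leaves "helloworld" in buf; with a 6-byte buffer it reports failure instead of writing past the end, which is exactly what the strcat(a, b) example in the post would have done.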

-Arthur

Bill Godfrey

Nov 26, 2003, 8:30:16 AM
"Stephen Smith" <nu...@void.pointer> wrote:
> > But that's not the same "specialness" that a lot of people tend to
> > apply to char*.

> Just curious, but what 'specialness', exactly?

The misconception that char* can be used as a generic pointer. It may have
been the case in K&R1, but these days void* serves that role.

(See also the other response to this article.)

Bill, I'm special.

Douglas Garstang

Nov 27, 2003, 12:43:42 PM
Gerry,

That's the frustrating part! I was writing assembler for my Commodore
64 at the age of 10, and I've been doing various types of
non-pointer-based programming and scripting since that age. I use Perl
on a daily basis. So, it isn't programming I have an issue with. It's
pointers!


ger...@indigo.ie (Gerry Quinn) wrote in message news:<Vrqwb.1920$nm6....@news.indigo.ie>...

Douglas Garstang

Nov 28, 2003, 3:37:07 AM
Calum,

Completely simple until you got to the bit about pointers. As I said
somewhere else, I was programming assembler, sprites etc on my C64 at
the age of 12 so it isn't raw memory access I have issues with.

Doug.

Calum <calum...@ntlworld.com> wrote in message news:<bpvogr$l99$1...@newsg3.svr.pol.co.uk>...

Gerry Quinn

Nov 28, 2003, 5:00:51 PM
In article <10d46bdc.03112...@posting.google.com>, do...@pobox.com (Douglas Garstang) wrote:
>Gerry,
>
>Thats the frustrating part! I was writing assembler for my commodore
>64 at the age of 10, and I've been doing various types of non pointer
>based programming, and scripting since that age. I use perl on a daily
>basis. So, it isn't programming I have an issue with. Its pointers!

Well, obviously you have a blind spot! It is said that El Greco was
astigmatic and Beethoven was deaf, so perhaps a programmer can still be
made of you without them ;-)

What I said still applies - if programming works for you without
pointers, just do it! If pointers haven't come in ten years, they
probably won't...

Sheldon Simms

Nov 29, 2003, 4:07:47 PM
On Thu, 27 Nov 2003 09:43:42 -0800, Douglas Garstang wrote:

> Gerry,
>
> Thats the frustrating part! I was writing assembler for my commodore
> 64 at the age of 10, and I've been doing various types of non pointer
> based programming, and scripting since that age. I use perl on a daily
> basis. So, it isn't programming I have an issue with. Its pointers!

If you really were programming in assembler on your Commodore 64
then you must have used pointers.

Assume there is a list of two-byte numbers starting at $1000. To make
it easy, the numbers will simply be the numbers from 1000 to 1009:

1000: e8 03
1002: e9 03
...
1010: f0 03
1012: f1 03

Now assume we want to write a program that prints all of these numbers. We
will use two bytes of zero page at $80/$81. We have a subroutine called
'print' that pulls two bytes off the stack, treats them as a two byte
integer, and prints the value. The subroutine print doesn't destroy our
registers. We can print all of the numbers like this:

lda #$10 ;store the address $1000 in $80/$81
sta $81
lda #0
sta $80

tax ; initialize x and y to 0
tay

:more lda ($80),y ; push the low byte of the current number
pha
iny
lda ($80),y ; push the high byte of the current number
pha
dey

jsr print ; print the number

lda $80 ; update the address
clc ; ($1000 => $1002, $1002 => $1004, etc.)
adc #2
sta $80

inx ; if 10 numbers have been printed, then done
cpx #10
bcc :more

Now this is really awful 6502 code, but the idea is to relate it to
C code. The list of 10 numbers starting at $1000 is like a C array.
It might be declared:

short numbers[10] = {1000,1001,...,1008,1009};

The 6502 code uses the two zero page bytes $80/$81 to store the address
of the current number that will be printed. These two bytes are
initialized to contain the address $1000 and then are incremented by
two every time through the loop so that they contain the address of the
next number. These two bytes are a pointer, and the same can be declared
in C like this:

short * current_number;

In 6502 assembly we know that the list of numbers is located at $1000, so
we store $1000 directly into $80/$81. In C, we don't know at which address
our array numbers will be stored, so we use the "address-of" operator to
obtain the address and store it in the pointer, just as the address of the
array was stored in $80/$81 in 6502 assembly:

current_number = &numbers[0];

In 6502 assembly, it is necessary to retrieve the byte stored at the
address indicated in $80/$81. This is done by using the indirect indexed
addressing mode, written ($80),y, as in the line with the label :more.
This addressing mode says "get the two byte address stored at the given
zero page location, add Y to it to create the final source address, and
load the byte at that address into A". We can't load into "the A
register" in C, but we can load into a variable:

short a_number = *current_number;

The main difference between this line of C code and the 6502 code is that
C can load both bytes of the number at once, whereas in 6502 assembly, we
have to load each byte separately by loading the first byte, and then
incrementing Y to load the second byte.

Lastly, the 6502 code increments the address stored in $80/$81 by two so
that the address there is now the address of the next number in the list.
C can do this as well. The main difference is that since C "knows" that
things being accessed are two-byte integers, we only increment the C
pointer by one and the compiler makes it two. That is, in C, incrementing
a pointer increments the address by the size of the object being pointed
to, no matter what that size is. This looks like this:

current_number = current_number + 1;

So now here's the 6502 and C code side by side:

; short numbers[10]={1000,1001,...,1008,1009};
; short * next_number;
; int count;

lda #$10 ; next_number = &numbers[0];
sta $81
lda #0
sta $80

tax ; count = 0;

tay
:more lda ($80),y ; more:
pha
iny
lda ($80),y
pha
dey
jsr print ; print(*next_number);

lda $80 ; next_number = next_number + 1;
clc
adc #2
sta $80

inx ; count = count + 1;
cpx #10 ; if (count < 10) goto more;
bcc :more
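Written as a compilable C function, the pointer loop could look like the sketch below. It is not the post's code: the print subroutine is replaced by a running total so the result can be checked, and the names sum_numbers and step_in_bytes are invented for illustration.

```c
#include <stddef.h>

/* Walk an array of two-byte numbers with a pointer, as the 6502 loop
   does with $80/$81. Instead of printing, the values are summed. */
long sum_numbers(const short *numbers, size_t n)
{
    const short *current_number = &numbers[0];
    long total = 0;
    size_t count;

    for (count = 0; count < n; count++) {
        total += *current_number;             /* both bytes loaded at once */
        current_number = current_number + 1;  /* advances by sizeof(short) */
    }
    return total;
}

/* Distance in bytes between consecutive elements: sizeof(short), not 1. */
size_t step_in_bytes(void)
{
    short pair[2] = {0, 0};
    return (size_t)((const char *)&pair[1] - (const char *)&pair[0]);
}
```

For the ten numbers 1000 to 1009 the total is 10045. step_in_bytes() shows the scaling the compiler applies to "+ 1": on a platform with two-byte shorts it returns 2, matching the adc #2 in the assembly.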


goose

Dec 1, 2003, 8:53:58 AM
billg-...@bacchae.f9.co.uk.invalid (Bill Godfrey) wrote in message news:<20031125072351.887$e...@newsreader.com>...

> Corey Murtagh <em...@slingshot.co.nz.no.uce> wrote:
> > Oh, and debunking misconception #1 for C (and C++) people: there is
> > nothing (and I do mean NOTHING) special about 'char*'. It's a pointer
> > to a char. Don't ever think it's anything else, no matter what your
> > books and professors tell you.
>
> Not quite... The internal representation of a char* has to be identical to
> a void*.

are you sure? a pointer to void may be converted to a pointer to any
object type. a pointer to any object type may be converted to a
pointer to void, and back again *AND* the result must compare equal
to the original pointer.
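The round-trip guarantee described above can be written down directly. This is a minimal sketch with an invented function name; it says nothing about internal representation, only about the conversion rule.

```c
/* Convert an object pointer to void* and back again. The C standard
   guarantees the result compares equal to the original pointer.
   Returns 1 if it does. */
int round_trips(int *original)
{
    void *generic = original;   /* implicit conversion to void* */
    int *back = generic;        /* and back; no cast is needed in C */
    return back == original;
}
```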

I cant find the bits of the c99 standard that mandates anything
about the internal representation of *any* data type. C&V, please?


> Bill, does a lot of work for charity.

goose,
mostly a comma for a sig :-)

Bill Godfrey

Dec 1, 2003, 9:09:34 AM
ru...@webmail.co.za (goose) wrote:

> I cant find the bits of the c99 standard that mandates anything
> about the internal representation of *any* data type. C&V, please?

6.2.5 para 26

It does not specify the exact bit arrangements, but it does require that
void* has the same representation as a pointer to a character type.

I'll concede that it does not specify which of the three character types
is referred to, nor can I justify that the standard requires the three
types of pointer-to-character (char*, unsigned char* and signed char*) to
have the same representation.

In this day and age though, you shouldn't need to assume that void* and
char* have the same internal representation.

Bill, bad comma-dian.

Calum

Dec 1, 2003, 9:15:54 AM
Douglas Garstang wrote:
> Calum,
>
> Completely simple until you got to the bit about pointers. As I said
> somewhere else, I was programming assembler, sprites etc on my C64 at
> the age of 12 so it isn't raw memory access I have issues with.
>
> Doug.

Oh dear. A pointer is a memory address. Does that help? However you
also need to give the type of the pointer (int*, char*, double* etc) to
tell the compiler how to manipulate the data at the address.

Another way of thinking about things: I want to send you a package.
There are two ways of doing this. 1) You tell me the address of your
house, and I mail you the package. 2) You disassemble your house, put
it in the back of a large truck, and drive it over to mine. There, you
reassemble the house. I put the package in your house.

The address of your house is like a "pointer" to your house. The
difference between a pointer and a value is like the difference between
your address and your house. Or the difference between your name
(Doug), and you (meaty thing).

And with programming, there are two ways of passing an object to a
function: 1) by reference (which in C means passing a pointer to it),
and 2) by value. In C,

1) void deliver_package1(House *house);
2) void deliver_package2(House house);

House h;
deliver_package1(&h); // Address of h passed - efficient
deliver_package2(h); // Whole house is copied
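To make the difference observable, the two functions can be given bodies. The House type below is hypothetical (a single counter field is assumed purely for illustration); only the two function signatures come from the post.

```c
/* Hypothetical House type for the analogy: it just counts packages. */
typedef struct House { int packages; } House;

/* By pointer: only the address is passed, so the caller's house changes. */
void deliver_package1(House *house)
{
    house->packages++;
}

/* By value: the function works on its own copy of the house, so the
   caller's house is left untouched. */
void deliver_package2(House house)
{
    house.packages++;
}
```

Starting from a house with no packages, deliver_package1(&h) leaves h.packages at 1, while a following deliver_package2(h) leaves it at 1 as well. Unlike the truck in the analogy, C never drives the copied house back, so the second delivery is lost as well as expensive.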

Any help?

Calum

Mark McIntyre

Dec 1, 2003, 12:36:27 PM
On Tue, 25 Nov 2003 14:22:07 +0000, in comp.programming , Calum
<calum...@ntlworld.com> wrote:

>Arthur J. O'Dwyer wrote:
><snipped>
>
>> Basically, Richard is suggesting that you create an "imaginary
>> computer" from scratch -- you can even do this on paper. Once you
>> have got the basic idea, you can write a C program to emulate the
>> imaginary machine in software.
>> Personally, I think this is a bad way to learn pointers, although
>> it's a nice exercise for general understanding of how computers
>> work. See below.
>
>Why build a virtual machine, when you already have a virtual address
>space given to you by your operating system?

I'm not entirely sure you understand what's meant by an imaginary
computer in this context.

> Surely, learning assembly
>language would be a faster (and considerably more useful) route to
>understanding pointers.

Not likely !

>If you just think of a programming language as a black box, you'll never
>be a good programmer. If you don't understand what happens underneath,
>you won't appreciate the costs and trade-offs, or understand why some
>things are fast and memory-efficient, while other things are not.

This simply isn't true. It certainly helps to know the hardware if
you're doing low-level programming or have computationally intensive
routines, but for general programming, you no more need to know the
specifics of the machine memory or CPU than a tea leaf needs to know
the history of the East India Company.


--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.angelfire.com/ms3/bchambless0/welcome_to_clc.html>



goose

Dec 2, 2003, 4:20:19 AM
bill-g...@sunny-coventry.invalid (Bill Godfrey) wrote in message news:<20031201090934.416$M...@newsreader.com>...

> ru...@webmail.co.za (goose) wrote:
>
> > I cant find the bits of the c99 standard that mandates anything
> > about the internal representation of *any* data type. C&V, please?
>
> 6.2.5 para 26

thanks, will keep it in mind (again!) :-).

>
> It does not specify the exact bit arrangements, but it does require that
> void* has the same representation a pointer to a character type.

thats good enough.

<snipped>

>
> Bill, bad comma-dian.

goose,
bad end-ian

Corey Murtagh

Dec 2, 2003, 5:24:31 AM
Bill Godfrey wrote:

> ru...@webmail.co.za (goose) wrote:
>
>>I cant find the bits of the c99 standard that mandates anything
>>about the internal representation of *any* data type. C&V, please?
>
> 6.2.5 para 26
>
> It does not specify the exact bit arrangements, but it does require that
> void* has the same representation a pointer to a character type.

But does it matter? Is there any reason for a pointer to any type to
differ from another pointer? Apart from near- and far-pointers in
certain cases of course :)

In other words, the binary representation of a pointer to an arbitrary
address should be the same regardless of the actual type of pointer, no?

Calum

Dec 2, 2003, 6:29:50 AM
Mark McIntyre wrote:
> On Tue, 25 Nov 2003 14:22:07 +0000, in comp.programming , Calum
> <calum...@ntlworld.com> wrote:
>
>
>>Arthur J. O'Dwyer wrote:
>><snipped>
>>
>>> Basically, Richard is suggesting that you create an "imaginary
>>>computer" from scratch -- you can even do this on paper. Once you
>>>have got the basic idea, you can write a C program to emulate the
>>>imaginary machine in software.
>>> Personally, I think this is a bad way to learn pointers, although
>>>it's a nice exercise for general understanding of how computers
>>>work. See below.
>>
>>Why build a virtual machine, when you already have a virtual address
>>space given to you by your operating system?
>
>
> I'm not entirely sure you understand what's meant by an imaginary
> computer in this context.

A virtual machine. A program that emulates the actions of a CPU in its
own buffer in memory.
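
For concreteness, here is a minimal sketch of such an imaginary machine in C. The instruction set (HALT, LOADI, LOAD, STORE, ADD) and all the names are invented for this illustration; nobody in the thread posted them. Memory is a plain array, and an "address" is nothing more than an index into it, which is the whole point of the exercise.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy machine: one accumulator, 256 cells of memory. */
enum { OP_HALT, OP_LOADI, OP_LOAD, OP_STORE, OP_ADD };

#define MEM_SIZE 256

struct toy_cpu {
    unsigned char mem[MEM_SIZE]; /* program and data share this buffer */
    unsigned char acc;           /* accumulator */
    size_t pc;                   /* program counter: an index into mem */
};

/* Run two-byte instructions until OP_HALT or until pc leaves memory. */
static void toy_run(struct toy_cpu *cpu)
{
    assert(cpu != NULL);
    while (cpu->pc + 1 < MEM_SIZE) {
        unsigned char op  = cpu->mem[cpu->pc];
        unsigned char arg = cpu->mem[cpu->pc + 1];
        cpu->pc += 2;
        switch (op) {
        case OP_LOADI: cpu->acc = arg;           break; /* acc = constant */
        case OP_LOAD:  cpu->acc = cpu->mem[arg]; break; /* acc = *arg     */
        case OP_STORE: cpu->mem[arg] = cpu->acc; break; /* *arg = acc     */
        case OP_ADD:   cpu->acc = (unsigned char)(cpu->acc + cpu->mem[arg]);
                       break;                           /* acc += *arg    */
        default:       return;                          /* OP_HALT, or an unknown opcode */
        }
    }
}
```

In OP_LOAD and OP_STORE the operand plays the role of a pointer: a number that says where in mem to look. Because arg is an unsigned char and mem has 256 cells, every possible operand is a valid index in this sketch.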

>
>>Surely, learning assembly
>>language would be a faster (and considerably more useful) route to
>>understanding pointers.
>
>
> Not likely !

Assembly language is considerably simpler than most high level
languages. It just takes twenty times as long to do anything useful in
it. You can't program in assembly language without using pointers, so
it certainly wouldn't hurt, plus you'd add a useful skill to your armoury.

>
>>If you just think of a programming language as a black box, you'll never
>>be a good programmer. If you don't understand what happens underneath,
>>you won't appreciate the costs and trade-offs, or understand why some
>>things are fast and memory-efficient, while other things are not.
>
>
> This simply isn't true. It certainly helps to know the hardware if
> you're doing low-level programming or have computationally intensive
> routines, but for general programming, you no more need to know the
> specifics of the machine memory or CPU than a tea leaf needs to know
> the history of the East India Company.

I wasn't talking about the hardware. I was talking about how a compiler
turns your program into machine code. And given that C is a fairly low
level language, understanding memory is fairly essential, just to get it
working! Even something as trivial as a dangling pointer (to a
destroyed stack frame) could be very very confusing to someone
uninitiated, as "segmentation fault" is not the most helpful diagnostic
in the world.

So yes, you need to understand memory, even for "normal" tasks.
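
A sketch of the dangling-pointer case, assuming a made-up helper that formats a label. The broken version is left inside a comment because calling it is undefined behaviour; the function names and the "item %d" format are invented for the example.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Broken on purpose, so it stays in a comment: buf lives in the stack frame
 * of make_label_bad(), and that frame is gone once the function returns.
 *
 *   char *make_label_bad(int n)
 *   {
 *       char buf[32];
 *       sprintf(buf, "item %d", n);
 *       return buf;              // dangling: points into a dead frame
 *   }
 */

/* One possible fix: the caller owns the storage, so it outlives the call. */
static char *make_label(char *buf, size_t size, int n)
{
    assert(buf != NULL && size > 0);
    snprintf(buf, size, "item %d", n);   /* never writes past size bytes */
    return buf;
}
```

Whether the broken version crashes, prints garbage, or appears to work depends on the compiler and platform, which is part of what makes it so confusing.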

Another example: virtual functions in C++. To call one of these, the
location of the virtual function table (which typically bloats each
object by one pointer, 4 bytes on a 32-bit machine) is read from the
object. Then the entry point of the function is read from this table.
Finally, the function is called.

Knowing this, you can make an intelligent decision about whether a
virtual function is appropriate to use, or whether you should use
something more direct like a template or a switch statement.

Cal

Bill Godfrey

Dec 2, 2003, 6:29:56 AM

Corey Murtagh <em...@slingshot.no.uce> wrote:
> But does it matter? Is there any reason for a pointer to any type to
> differ from another pointer? Apart from near- and far-pointers in
> certain cases of course :)

On a "Harvard architecture" CPU such as Atmel's, data pointers may have an
extra field indicating if the pointee is in RAM or flash. A pointer to
function would not require this field, as it would always be in flash. Data
pointers might also optionally be restricted at compile time to be
flash-only or RAM-only.

A compiler for a CPU whose smallest addressable unit is 32 bits might have
a CHAR_BIT==8 mode. Pointers to all non-char types would be a simple address
pointer, but pointers to the char types would require an extra field
indicating which of the four octets is the pointee.
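
A rough sketch of what such a char pointer could look like, emulated in ordinary C. The struct, the function names, and the choice that octet 0 is the least significant one are all assumptions made for the example, not a description of any real compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "fat" char pointer for a machine that addresses 32-bit words. */
struct octet_ptr {
    uint32_t *word;   /* the ordinary word address */
    unsigned  octet;  /* extra field: which of the four octets, 0..3 */
};

/* Equivalent of *p: pick one octet out of the addressed word. */
static unsigned octet_load(struct octet_ptr p)
{
    assert(p.word != NULL && p.octet < 4);
    return (unsigned)((*p.word >> (8 * p.octet)) & 0xFFu);
}

/* Equivalent of p + n for n >= 0: carry from the octet field into the word address. */
static struct octet_ptr octet_add(struct octet_ptr p, size_t n)
{
    size_t total = p.octet + n;
    p.word += total / 4;
    p.octet = (unsigned)(total % 4);
    return p;
}
```

A pointer to int on the same imaginary machine would need only the word field, which is why the two kinds of pointer could end up with different sizes.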

A compiler with support for a "blitter" might have a special pointer-to-bit
type, which would require an extra field so that the pointer can address
individual bits.

A compiler with a bounds checking mode might have fields indicating the
upper and lower bounds. These would not be needed on pointers to functions.
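
As an illustration, a bounds-checked pointer could carry its limits along like this. It is a sketch with invented names, using assert() where a real bounds-checking mode would presumably raise its own diagnostic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bounds-checked int pointer: address plus the limits of its array. */
struct checked_ptr {
    int *cur;    /* where the pointer points now */
    int *lower;  /* first element of the array */
    int *upper;  /* one past the last element */
};

static struct checked_ptr checked_from_array(int *array, size_t count)
{
    struct checked_ptr p = { array, array, array + count };
    return p;
}

/* Nonzero if dereferencing p would stay inside the array. */
static int checked_ok(struct checked_ptr p)
{
    return p.cur >= p.lower && p.cur < p.upper;
}

/* Equivalent of *p, with the check a bounds-checking mode would insert. */
static int checked_load(struct checked_ptr p)
{
    assert(checked_ok(p));
    return *p.cur;
}
```

This pointer is three machine addresses wide, so it plainly cannot share a representation with a bare function pointer.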

Bill, Georgia on my mind.

Mel Wilson

Dec 2, 2003, 7:55:47 AM

In article <10703606...@radsrv1.tranzpeer.net>,

Corey Murtagh <em...@slingshot.no.uce> wrote:
>Bill Godfrey wrote:
>
>> ru...@webmail.co.za (goose) wrote:
>>
>>>I can't find the bits of the C99 standard that mandate anything
>>>about the internal representation of *any* data type. C&V, please?
>>
>> 6.2.5 para 26
>>
>> It does not specify the exact bit arrangements, but it does require that
>> void* has the same representation as a pointer to a character type.
>
>But does it matter? Is there any reason for a pointer to any type to
>differ from another pointer? Apart from near- and far-pointers in
>certain cases of course :)
>
>In other words, the binary representation of a pointer to an arbitrary
>address should be the same regardless of the actual type of pointer, no?

Not necessarily. On the big 36-bit iron I used to work
on, a pointer to a word fit in 18 bits. Therefore a pointer
to an int, long, float or double might be stored in an
18-bit short int. There were three different kinds of
character: 4, 6 and 9 bit, and a C implementation might
reasonably pick either the 6- or 9-bit form. Pointers to
these needed the word address, the character offset, and the
bit flags to choose 6- or 9-bit. All this took a full word.
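
To make the arithmetic concrete, here is one way the three fields might be packed into a single 36-bit word, held in a uint64_t for the emulation. The field layout (address in the low 18 bits, then a 3-bit character index, then a 1-bit size flag) is purely an assumption for this sketch; the post does not say how the real machine arranged them.

```c
#include <assert.h>
#include <stdint.h>

#define ADDR_BITS 18
#define ADDR_MASK ((UINT64_C(1) << ADDR_BITS) - 1)

/* Hypothetical character pointer for a 36-bit, word-addressed machine. */
struct char_ptr36 {
    uint32_t word_addr;  /* 18-bit word address */
    unsigned index;      /* which character within the word */
    unsigned nine_bit;   /* 0: 6-bit characters, 1: 9-bit characters */
};

/* Pack the fields into one word (assumed layout, see above). */
static uint64_t char_ptr36_pack(struct char_ptr36 p)
{
    unsigned per_word = p.nine_bit ? 4 : 6;   /* 36/9 or 36/6 characters per word */
    assert(p.word_addr <= ADDR_MASK && p.index < per_word && p.nine_bit <= 1);
    return (uint64_t)p.word_addr
         | ((uint64_t)p.index    << ADDR_BITS)
         | ((uint64_t)p.nine_bit << (ADDR_BITS + 3));
}

/* Recover the fields from a packed word. */
static struct char_ptr36 char_ptr36_unpack(uint64_t w)
{
    struct char_ptr36 p;
    p.word_addr = (uint32_t)(w & ADDR_MASK);
    p.index     = (unsigned)((w >> ADDR_BITS) & 7u);
    p.nine_bit  = (unsigned)((w >> (ADDR_BITS + 3)) & 1u);
    return p;
}
```

A pointer to a whole word needs only the word_addr field, which is how it can fit in 18 bits while the character pointer cannot.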

A C implementation would very possibly choose to store
these different pointers in a common 1-word format, but it
doesn't seem as though that would be absolutely required.

Regards. Mel.

Mel Wilson

Dec 2, 2003, 7:55:47 AM

In article <TvIz/ks/KbdA...@the-wire.com>, I said

>In article <10703606...@radsrv1.tranzpeer.net>,
>Corey Murtagh <em...@slingshot.no.uce> wrote:
>>Bill Godfrey wrote:
>>> It does not specify the exact bit arrangements, but it does require that
>>> void* has the same representation as a pointer to a character type.
>>
>>But does it matter? Is there any reason for a pointer to any type to
>>differ from another pointer? Apart from near- and far-pointers in
>>certain cases of course :)
>>In other words, the binary representation of a pointer to an arbitrary
>>address should be the same regardless of the actual type of pointer, no?

[ ... ]


> A C implementation would very possibly choose to store
>these different pointers in a common 1-word format, but it
>doesn't seem as though that would be absolutely required.

Sorry. Confused my categories there. Not required by
the hardware, in that you can save a word address in a
half-word and get it all back later. Of course the C spec
disambiguates by requiring that the full-word format be
used.

Regards. Mel.

goose

Dec 2, 2003, 12:38:45 PM

Corey Murtagh <em...@slingshot.no.uce> wrote in message news:<10703606...@radsrv1.tranzpeer.net>...

> Bill Godfrey wrote:
>
> > ru...@webmail.co.za (goose) wrote:
> >
> >>I can't find the bits of the C99 standard that mandate anything
> >>about the internal representation of *any* data type. C&V, please?
> >
> > 6.2.5 para 26
> >
> > It does not specify the exact bit arrangements, but it does require that
> > void* has the same representation as a pointer to a character type.
>
> But does it matter? Is there any reason for a pointer to any type to
> differ from another pointer? Apart from near- and far-pointers in
> certain cases of course :)
>
> In other words, the binary representation of a pointer to an arbitrary
> address should be the same regardless of the actual type of pointer, no?

err .. it depends ... on the PIC that I am busy with, an address
to an instruction is 21 bits, and an address to data is 14 (i think!).

obviously a void* must then be *at* *least* 21 bits wide, so that
it satisfies the bits of the standard which say that *any* pointer
can be converted to void* and back again _and_ must compare equal.

when a pointer to data is converted to void*, then the extra
bits *must* be ignored, and when the void* is converted to
data, the same must apply.

anyway, the representations of a pointer to an instruction and a pointer
to data are different on the PIC.
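
The round trip through void* can be sketched in a few lines of C. One caveat, going by the wording of C99 6.3.2.3: the convert-and-compare-equal guarantee is stated for pointers to object or incomplete types, so by itself it does not cover pointers to functions. The helper name round_trip_equal is made up for the example.

```c
#include <assert.h>

/* Convert an object pointer to void * and back, as the standard permits. */
static int round_trip_equal(int *p)
{
    void *v   = p;      /* implicit conversion to void *                */
    int *back = v;      /* and back to the original pointer type        */
    return back == p;   /* required to compare equal to the original    */
}
```

On a conforming implementation this returns 1 for any valid or null int pointer, whatever extra bits the void* representation happens to carry.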

goose,
of course, I use assembly, not C, for the PIC.

Martha H Adams

Dec 8, 2003, 6:30:29 AM

This is interesting. *This* is why I spend time scouting newsgroups
and coping with the rubbish out there.

There used to be, back in the TRS-80 days, a "Tiny Pascal" that ran in
a TRSDOS environment. I've searched and googled a few times and have
never found a version that runs in a FreeDOS, MS-DOS, or Linux environment.
If anyone knows of a small descendant of that original Tiny Pascal
around, please advise (here) where I could find it.

Meanwhile, up this thread is a strong pointer for me to go make one
for myself.

Cheers -- Martha Adams

Morris Dovey

Dec 8, 2003, 11:05:52 AM

Bill Godfrey wrote:

> Of course, it's only a game as a learning device. Not like any
> of these emulators-of-imaginary-hardware will actually see
> life beyond being just a game.

Hmm. A long time ago (before I started wearing glasses) I wrote
an APL emulator for a small 16-bit CPU with writable control
store (WCS). When it was finally working properly I liked it well
enough to build it on an Augat wire wrap panel. After dealing
with a couple of race conditions, I finally got it working.

'Twas fun. It had an LCS <size>,<addr> (load control store from
RAM) instruction and a BOOL <op>,<r1>,<r2> instruction that would
perform any of the sixteen possible Boolean <op>erations.

My "big idea" that I didn't ever manage to implement was to have
a compiler work up an optimized instruction set for each compiled
program, then produce the WCS load for that instruction set along
with the TU's object code. Somewhere along the line I got bogged
down in the functional analysis and abandoned the whole project.

It did see light ( But not much light )-:
--
Morris Dovey
West Des Moines, Iowa USA
C links at http://www.iedu.com/c
Read my lips: The apple doesn't fall far from the tree.

Willem

Dec 8, 2003, 11:18:12 AM

Morris wrote:
) 'Twas fun. It had an LCS <size>,<addr> (load control store from
) RAM) instruction and a BOOL <op>,<r1>,<r2> instruction that would
) perform any of the sixteen possible Boolean <op>erations.

<nitpick>
Don't you mean eight possible boolean operations ?
Although I recall the blitter chip in an Amiga also having sixteen possible
operations, but that had three inputs and one output.

) My "big idea" that I didn't ever manage to implement was to have
) a compiler work up an optimized instruction set for each compiled
) program, then produce the WCS load for that instruction set along
) with the TU's object code. Somewhere along the line I got bogged
) down in the functional analysis and abandoned the whole project.

I seem to recall reading a doctoral thesis about this idea...


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT

Morris Dovey

Dec 8, 2003, 11:53:03 AM

Willem wrote:

> Morris wrote:
> ) 'Twas fun. It had an LCS <size>,<addr> (load control store from
> ) RAM) instruction and a BOOL <op>,<r1>,<r2> instruction that would
> ) perform any of the sixteen possible Boolean <op>erations.
>
> <nitpick>
> Don't you mean eight possible boolean operations ?
> Although I recall the blitter chip in an Amiga also having sixteen possible
> operations, but that had three inputs and one output.

Nope. Imagine a truth table like (AND: <op> == 0x1):

A | 0 0 1 1
B | 0 1 0 1
--+--------
r | 0 0 0 1

Then if the same ordering of A and B values were used for all
sixteen tables, the corresponding r values would range from 0x0
to 0xF - with the first and last being "trivial" (always zero and
always one) operations.
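
A possible C rendering of such a BOOL instruction, assuming 16-bit operands and the column ordering of the table above, so that the 4-bit <op> is simply the r row read left to right. The function name and the bitwise, whole-word behaviour are assumptions for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Apply one of the sixteen two-input Boolean operations to every bit of a
 * and b. Columns are ordered (A,B) = 00, 01, 10, 11, so AND is 0x1.
 */
static uint16_t bool_op(unsigned op, uint16_t a, uint16_t b)
{
    uint16_t na = (uint16_t)~a, nb = (uint16_t)~b, r = 0;
    assert(op <= 0xF);
    if (op & 0x8) r |= (uint16_t)(na & nb);  /* column A=0, B=0 */
    if (op & 0x4) r |= (uint16_t)(na & b);   /* column A=0, B=1 */
    if (op & 0x2) r |= (uint16_t)(a  & nb);  /* column A=1, B=0 */
    if (op & 0x1) r |= (uint16_t)(a  & b);   /* column A=1, B=1 */
    return r;
}
```

With this ordering OR comes out as 0x7, XOR as 0x6 and NOR as 0x8. There are four columns and each can hold 0 or 1, which is where the count of 2^4 = 16 operations comes from.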

> ) My "big idea" that I didn't ever manage to implement was to have
> ) a compiler work up an optimized instruction set for each compiled
> ) program, then produce the WCS load for that instruction set along
> ) with the TU's object code. Somewhere along the line I got bogged
> ) down in the functional analysis and abandoned the whole project.
>
> I seem to recall reading a doctoral thesis about this idea...

Not guilty. But I still like both concepts (the dynamically
writable control store and the optimization).

Did the thesis writer develop the idea into anything
implementable? [If the answer is "yes" I need to track him/her
down and present a bottle of fine champagne!]

Willem

Dec 8, 2003, 1:11:33 PM

Morris wrote:
) Nope. Imagine a truth table like (AND: <op> == 0x1):
)
) A | 0 0 1 1
) B | 0 1 0 1
) --+--------
) r | 0 0 0 1
)
) Then if the same ordering of A and B values were used for all
) sixteen tables, the corresponding r values would range from 0x0
) to 0xF - with the first and last being "trivial" (always zero and
) always one) operations.

D'oh!

That must mean that the Amiga's blitter chip had 256 possible operations.
Which sounds plausible.
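
The count checks out: n inputs give 2^n truth-table columns, hence 2^(2^n) possible operations, which is 16 for two inputs and 256 for three. A small sketch, with a generic three-input selector whose bit ordering follows the same left-to-right convention as the table quoted above; the ordering is an assumption here, not a claim about how any real blitter numbers its operations.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct Boolean functions of n inputs: 2^(2^n), for n <= 5. */
static uint64_t boolean_function_count(unsigned n)
{
    assert(n <= 5);                       /* 2^(2^5) = 2^32 still fits */
    return UINT64_C(1) << (1u << n);
}

/*
 * Three-input selectable operation on 16-bit words. The 8-bit op is the r
 * row read left to right over columns (A,B,C) = 000, 001, ..., 111.
 */
static uint16_t bool_op3(unsigned op, uint16_t a, uint16_t b, uint16_t c)
{
    uint16_t r = 0;
    unsigned bit;
    assert(op <= 0xFF);
    for (bit = 0; bit < 16; bit++) {
        unsigned column = (((a >> bit) & 1u) << 2)
                        | (((b >> bit) & 1u) << 1)
                        |  ((c >> bit) & 1u);
        /* column 0 is the leftmost (most significant) bit of op */
        r |= (uint16_t)(((op >> (7u - column)) & 1u) << bit);
    }
    return r;
}
```

Under this ordering 0x01 is the three-way AND, and 0x53 selects B where A is 1 and C where A is 0.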

) Did the thesis writer develop the idea into anything
) implementable? [If the answer is "yes" I need to track him/her
) down and present a bottle of fine champagne!]

Err, I think so. But I might be bound by an NDA about that, because he was
my colleague at the time.

Morris Dovey

Dec 8, 2003, 6:28:49 PM

Willem wrote:

> Morris wrote:
> ) Did the thesis writer develop the idea into anything
> ) implementable? [If the answer is "yes" I need to track him/her
> ) down and present a bottle of fine champagne!]
>
> Err, I think so. But I might be bound by an NDA about that, because he was
> my colleague at the time.

Willem...

Then if/when you see him again please convey my admiration and at
least a /virtual/ magnum. He did a significant piece of work.

<rant>
Hrumph! This NDA thing is getting out of hand. I recall a time
when a thesis was supposed to be an original contribution to the
/common/ body of knowledge - for the betterment of the human
condition and all that.
</rant>

Willem

Dec 8, 2003, 9:58:13 PM

Morris wrote:
)<rant>
) Hrumph! This NDA thing is getting out of hand. I recall a time
) when a thesis was supposed to be an original contribution to the
) /common/ body of knowledge - for the betterment of the human
) condition and all that.
)</rant>

I *might* be bound. I'm pretty sure I'm not, because it's a thesis paper
that has been published and stuff, but I'd rather not mess with the big
corporations. <Cowers in a corner>

;-)

I found the book, by the way..

"Automatic Synthesis of Reconfigurable Instruction Set Accelerators"
by Bernardo Kastrup, 2001, ISBN 90-74445-50-0

Nice fellow, and a pretty good chess player as well.

Morris Dovey

Dec 9, 2003, 3:59:35 AM

Willem wrote:

> "Automatic Synthesis of Reconfigurable Instruction Set Accelerators"
> by Bernardo Kastrup, 2001, ISBN 90-74445-50-0
>
> Nice fellow, and a pretty good chess player as well.

Thanks! I just downloaded a copy from the Design Automation
Section at TU/e. Very nice of them to make it available online.

In case anyone else is interested, this thesis is available (in
English) at http://alexandria.tue.nl/extra2/200101304.pdf

Corey Murtagh

Dec 10, 2003, 2:52:45 PM

Martha H Adams wrote:

> This is interesting. *This* is why I spend time scouting newsgroups
> and coping with the rubbish out there.
>
> There used to be, back in the TRS-80 days, a "Tiny Pascal" that ran in
> a TRSDOS environment. I've searched and googled a few times and have
> never found a version that runs in a FreeDOS, MS-DOS, or Linux environment.
> If anyone knows of a small descendant of that original Tiny Pascal
> around, please advise (here) where I could find it.

You could try the "Tiny Pascal" mentioned here:

http://www.programmershelp.co.uk/pascalcompilers.php

It has source apparently, although I have no idea what the source is in,
or what platform it's for.

Total time: 30 seconds on Google :>
