Bits And Bytes Computer Book Pdf


Gaetan Boren

Aug 3, 2024, 4:21:23 PM
to penmiletni

Note: The following information is provided in part by the Extreme Science and Engineering Discovery Environment (XSEDE), a National Science Foundation (NSF) project that provides researchers with advanced digital resources and services that facilitate scientific discovery. For more, see the XSEDE website.

Because bits are so small, you rarely work with information one bit at a time. Bits are usually assembled into a group of eight to form a byte. A byte contains enough information to store a single ASCII character, like "h".
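To make that concrete, here is a minimal C sketch (C is used purely for illustration) that stores the ASCII character "h" in one byte and prints its eight bits:

    #include <stdio.h>

    int main(void) {
        unsigned char c = 'h';          /* one byte holds the ASCII character 'h' (code 104) */

        printf("'%c' = %d = ", c, c);
        for (int i = 7; i >= 0; i--)    /* print the eight bits, most significant bit first */
            putchar(((c >> i) & 1) ? '1' : '0');
        putchar('\n');                  /* prints: 'h' = 104 = 01101000 */
        return 0;
    }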

Many hard drive manufacturers use a decimal number system to define amounts of storage space. As a result, 1 MB is defined as one million bytes, 1 GB is defined as one billion bytes, and so on. Since your computer uses a binary system as mentioned above, you may notice a discrepancy between your hard drive's published capacity and the capacity acknowledged by your computer. For example, a hard drive that is said to contain 10 GB of storage space using a decimal system is actually capable of storing 10,000,000,000 bytes. However, in a binary system, 10 GB is 10,737,418,240 bytes. As a result, instead of acknowledging 10 GB, your computer will acknowledge 9.31 GB. This is not a malfunction but a matter of different definitions.
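As a quick worked check of that arithmetic, this small illustrative C program converts the advertised decimal capacity into binary gigabytes:

    #include <stdio.h>

    int main(void) {
        /* A drive advertised as "10 GB" using decimal (SI) prefixes: */
        unsigned long long decimal_bytes = 10ULL * 1000 * 1000 * 1000;   /* 10,000,000,000 bytes */

        /* The same byte count measured in binary gigabytes (2^30 bytes each): */
        double binary_gb = (double)decimal_bytes / (1024.0 * 1024.0 * 1024.0);

        printf("%llu bytes = %.2f GB (binary)\n", decimal_bytes, binary_gb);
        /* prints: 10000000000 bytes = 9.31 GB (binary) */
        return 0;
    }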

Note: The names and abbreviations for numbers of bytes are easily confused with the notations for bits. The abbreviations for numbers of bits use a lower-case "b" instead of an upper-case "B". Since one byte is made up of eight bits, this difference can be significant. For example, if a broadband Internet connection is advertised with a download speed of 3.0 Mbps, its speed is 3.0 megabits per second, or 0.375 megabytes per second (which would be abbreviated as 0.375 MBps). Bits and bit rates (bits over time, as in bits per second [bps]) are most commonly used to describe connection speeds, so pay particular attention when comparing Internet connection providers and services.
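The conversion is simply a division by eight; a tiny C sketch, for illustration:

    #include <stdio.h>

    int main(void) {
        double megabits_per_second  = 3.0;                        /* advertised: 3.0 Mbps   */
        double megabytes_per_second = megabits_per_second / 8.0;  /* one byte is eight bits */

        printf("%.1f Mbps = %.3f MBps\n", megabits_per_second, megabytes_per_second);
        /* prints: 3.0 Mbps = 0.375 MBps */
        return 0;
    }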

This document was developed with support from National Science Foundation (NSF) grants 1053575 and 1548562. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the NSF.

There are some types of data (booleans, small integers, etc.) that could be stored in fewer bits than a byte. Yet using less than a byte is not natively supported by any programming language I know of.

Why does this minimum of using 8 bits to store data exist? Why do we even need bytes? Why don't computers just use increments of bits (1 or more bits) rather than increments of bytes (multiples of 8 bits)?
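As an aside before the answers: even though the byte stays the smallest addressable unit, you can pack several booleans into one byte yourself with shifts and masks. A minimal, illustrative C sketch (the helper names and bit positions are invented for this example):

    #include <stdio.h>

    /* Eight boolean flags packed into one byte. */
    static unsigned char set_flag(unsigned char flags, int bit)   { return (unsigned char)(flags |  (1u << bit)); }
    static unsigned char clear_flag(unsigned char flags, int bit) { return (unsigned char)(flags & ~(1u << bit)); }
    static int           get_flag(unsigned char flags, int bit)   { return (flags >> bit) & 1; }

    int main(void) {
        unsigned char flags = 0;       /* all eight flags start cleared */

        flags = set_flag(flags, 0);    /* turn flag 0 on  */
        flags = set_flag(flags, 3);    /* turn flag 3 on  */
        flags = clear_flag(flags, 0);  /* turn flag 0 off */

        printf("flag 3 = %d, flag 0 = %d\n", get_flag(flags, 3), get_flag(flags, 0));
        /* prints: flag 3 = 1, flag 0 = 0 */
        return 0;
    }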

Because at the hardware level, memory is naturally organized into addressable chunks. Small chunks mean that you can have fine-grained things like 4-bit numbers; large chunks allow for more efficient operation (typically a CPU moves things around in 'chunks' or multiples thereof). In particular, larger addressable chunks make for bigger address spaces. If I have chunks that are 1 bit, then an address range of 1 - 500 only covers 500 bits, whereas 500 8-bit chunks cover 4,000 bits.

Get the picture? Americans figured that characters could be stored in only 6 bits.
Then we discovered that there was more in the world than just English.
So we floundered around with 7-bit ASCII and 8-bit EBCDIC.

For the purposes of explanation, let's imagine a fictitious machine with an architecture by the name of Bitel(TM) Inside or something of the like. The Bitel specifications mandate that the Central Processing Unit (CPU, i.e., microprocessor) shall access memory in one-bit units. Now, let's say a given instance of a Bitel-operated machine has a memory unit holding 32 billion bits (our fictitious equivalent of a 4 GB RAM unit).

In my opinion, it's an issue of addressing. To access individual bits of data, you would need eight times as many addresses (adding 3 bits to each address) compared to addressing individual bytes. The byte is generally going to be the smallest practical unit to hold a number in a program (with only 256 possible values).
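To make the "3 extra bits" point concrete, here is an illustrative C fragment that splits a hypothetical bit address into a byte address plus a 3-bit offset (the numbers are made up for the example):

    #include <stdio.h>

    int main(void) {
        /* Addressing an individual bit needs a byte address plus a 3-bit offset,
           because 2^3 = 8 bit positions fit in one byte. */
        unsigned long bit_address  = 12345;
        unsigned long byte_address = bit_address >> 3;    /* divide by 8       */
        unsigned int  bit_offset   = bit_address & 0x7u;  /* remainder, 0 to 7 */

        printf("bit %lu lives in byte %lu at bit position %u\n",
               bit_address, byte_address, bit_offset);
        /* prints: bit 12345 lives in byte 1543 at bit position 1 */
        return 0;
    }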

And, once a thing goes on for long enough, it becomes terribly hard to change. This is also why your hard drive or SSD likely still pretends to use 512-byte blocks, even though neither the disk hardware nor the OS actually uses 512-byte blocks. (Advanced Format drives have a software switch to disable 512-byte emulation, but generally only servers with RAID controllers turn it off.)

Also, Intel/AMD CPUs have so much extra silicon doing so much extra decoding work that the slight difference between 8-bit and 64-bit addressing does not add any noticeable overhead. The CPU's memory controller is certainly not using 8 bits. It pulls data into cache in long streams, and the minimum size is the cache line, often 64 bytes aka 512 bits. Often RAM hardware is slow to start but fast to stream, so the CPU reads kilobytes into L3 cache, much like how hard drives read an entire track into their caches because the drive head is already there, so why not?

Probably the answer lies in performance and memory alignment, and in the fact that the byte (which I reckon is partly why it is called char in C) is the smallest part of a machine word that can hold a 7-bit ASCII character. Text operations are common, so a special type for plain text has its benefits for a programming language.

What is so special about 8 bits that it deserves its own name? Computers do process all data as bits, but they prefer to process bits in byte-sized groupings. Or to put it another way: a byte is how much a computer likes to "bite" at once.

What's in a byte? A byte represents different types of information depending on the context. It might represent a number, a letter, or a program instruction. It might even represent part of an audio recording or a pixel in an image.
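A small illustrative C example of the same byte being read three different ways, depending on context (the "pixel" interpretation is just one hypothetical use):

    #include <stdio.h>

    int main(void) {
        unsigned char b = 0x41;   /* one byte; what it "means" depends on context */

        printf("as a number:        %d\n", b);                             /* 65   */
        printf("as an ASCII letter: %c\n", b);                             /* 'A'  */
        printf("as a grey pixel:    %.2f of full brightness\n", b / 255.0); /* 0.25 */
        return 0;
    }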

The size of the byte has historically been hardware-dependent and no definitive standards existed that mandated the size. Sizes from 1 to 48 bits have been used.[4][5][6][7] The six-bit character code was an often-used implementation in early encoding systems, and computers using six-bit and nine-bit bytes were common in the 1960s. These systems often had memory words of 12, 18, 24, 30, 36, 48, or 60 bits, corresponding to 2, 3, 4, 5, 6, 8, or 10 six-bit bytes. In this era, bit groupings in the instruction stream were often referred to as syllables[a] or slab, before the term byte became common.

The modern de facto standard of eight bits, as documented in ISO/IEC 2382-1:1993, is a convenient power of two permitting the binary-encoded values 0 through 255 for one byte, as 2 to the power of 8 is 256.[8] The international standard IEC 80000-13 codified this common meaning. Many types of applications use information representable in eight or fewer bits and processor designers commonly optimize for this usage. The popularity of major commercial computing architectures has aided in the ubiquitous acceptance of the 8-bit byte.[9] Modern architectures typically use 32- or 64-bit words, built of four or eight bytes, respectively.
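On a C implementation you can confirm these figures directly from the standard limits macros; a brief sketch (output shown for a typical modern 64-bit platform):

    #include <stdio.h>
    #include <limits.h>

    int main(void) {
        printf("bits per byte:         %d\n", CHAR_BIT);              /* 8 on modern platforms */
        printf("unsigned char max:     %d\n", UCHAR_MAX);             /* 255                   */
        printf("values per byte:       %d\n", UCHAR_MAX + 1);         /* 2^8 = 256             */
        printf("bytes per 64-bit word: %zu\n", sizeof(long long));    /* typically 8           */
        return 0;
    }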

The unit symbol for the byte was designated as the upper-case letter B by the International Electrotechnical Commission (IEC) and Institute of Electrical and Electronics Engineers (IEEE).[10] Internationally, the unit octet, symbol o, explicitly defines a sequence of eight bits, eliminating the potential ambiguity of the term "byte".[11][12]

The term byte was coined by Werner Buchholz in June 1956,[4][13][14][b] during the early design phase for the IBM Stretch[15][16][1][13][14][17][18] computer, which had addressing to the bit and variable field length (VFL) instructions with a byte size encoded in the instruction.[13] It is a deliberate respelling of bite to avoid accidental mutation to bit.[1][13][19][c]

Another origin of byte for bit groups smaller than a computer's word size, and in particular groups of four bits, is on record by Louis G. Dooley, who claimed he coined the term while working with Jules Schwartz and Dick Beeler on an air defense system called SAGE at MIT Lincoln Laboratory in 1956 or 1957, which was jointly developed by Rand, MIT, and IBM.[20][21] Later on, Schwartz's language JOVIAL actually used the term, but the author recalled vaguely that it was derived from AN/FSQ-31.[22][21]

Early computers used a variety of four-bit binary-coded decimal (BCD) representations and the six-bit codes for printable graphic patterns common in the U.S. Army (FIELDATA) and Navy. These representations included alphanumeric characters and special graphical symbols. These sets were expanded in 1963 to seven bits of coding, called the American Standard Code for Information Interchange (ASCII) as the Federal Information Processing Standard, which replaced the incompatible teleprinter codes in use by different branches of the U.S. government and universities during the 1960s. ASCII included the distinction of upper- and lowercase alphabets and a set of control characters to facilitate the transmission of written language as well as printing device functions, such as page advance and line feed, and the physical or logical control of data flow over the transmission media.[18] During the early 1960s, while also active in ASCII standardization, IBM simultaneously introduced in its product line of System/360 the eight-bit Extended Binary Coded Decimal Interchange Code (EBCDIC), an expansion of their six-bit binary-coded decimal (BCDIC) representations[d] used in earlier card punches.[23] The prominence of the System/360 led to the ubiquitous adoption of the eight-bit storage size,[18][16][13] while in detail the EBCDIC and ASCII encoding schemes are different.

In the early 1960s, AT&T introduced digital telephony on long-distance trunk lines. These used the eight-bit μ-law encoding. This large investment promised to reduce transmission costs for eight-bit data.

In Volume 1 of The Art of Computer Programming (first published in 1968), Donald Knuth uses byte in his hypothetical MIX computer to denote a unit which "contains an unspecified amount of information ... capable of holding at least 64 distinct values ... at most 100 distinct values. On a binary computer a byte must therefore be composed of six bits".[24] He notes that "Since 1975 or so, the word byte has come to mean a sequence of precisely eight binary digits...When we speak of bytes in connection with MIX we shall confine ourselves to the former sense of the word, harking back to the days when bytes were not yet standardized."[24]
