Every file on a computer uses a certain amount of resources when sent over the internet or stored. Keeping track of your kilobytes (kB) and megabytes (MB) can prevent problems and produce a smoother online experience. This GreenNet guide is here to help you tell the whales from the minnows.
Computer resources do have physical limits to their capacities, even if the idea of computer resources can be scaled up indefinitely. So we really want to think of the sizes of files in a tidy, minimalist way and thereby make the most of the resources we already have. Although most people nowadays seem to have internet connections which cope easily with audio, video and high-resolution images, it is worth remembering that many people do not. If care is not taken, it is possible to produce a large media file that actually conveys no more information to people than a file a tenth or a hundredth of the size.
Software packages that consume excessive memory and disk space for their function are sometimes called "bloatware", and one could apply a similar aesthetic to media files. For instance, making transcripts available on a web site might help people to find the information they are looking for more quickly than having audio or video interviews alone. Similarly, you might want to consider whether it's easier for people, including those with visual impairments, to read the date and time of an event from a text email, or to have to open a large PDF or image file of a poster. (By the way, the Microsoft term "document" for files never really caught on. The two words are synonymous in this context.)
So how big is too big? Obviously, it depends on the context. If you are signing off on a report that is intended to go to the printers, then emailing a 10MB PDF attachment to a few people asking for final comments is completely reasonable. What would be unreasonable is then to email the finished 10MB file to your list of 2000 supporters. Instead, you could create a lower-resolution or even text-only version of the PDF, put that on your website, and email a link to the file, perhaps with a little indicator of the file size (like "[1.2 MB PDF]") next to the download link.
Although the download might take 15 seconds for some people (e.g. on GreenNet ADSL2+ broadband offering speeds "up to" 12Mbps), as of 2009 10% of household internet connections in the UK are still dial-up, and the proportion is higher in many other countries. A 10MB download on dial-up might take nearly an hour. On older broadband connections or in rural areas the download speed might be 512kbps, and the transfer would still take several minutes. Even on the fastest broadband, uploading is often limited to 256kbps, so if you expect a 10MB file to be retransmitted, that is likely to be slower than expected.
A large file on its own may be no problem, but when multiplied by the size of the audience it can cause bandwidth problems that affect internet service providers and other users. Transmission also consumes a greater amount of energy, and it may result in having to upgrade hardware (up to 80% of energy over the lifetime of computer equipment is "embodied", that is, in its manufacture). GreenNet doesn't limit bandwidth, but it is subject to a "fair use" policy.
Once downloaded, larger files are harder to manipulate. Large emails can slow down access to an email inbox, and will increase the size of mailbox files on the recipients' computers. Large image files on a web page often have to be scaled by the browser software, which can make navigating and scrolling through the page slow and erratic. (There are other things that can cause slow "rendering" of a page, such as JavaScript or a complex website "back-end".)
Then there's the backup. If someone intends to keep the document or image or archives all email, it might be replicated on backup media many times over. People may also be reluctant to keep files that consume more storage than they are worth, and so delete them.
(To confuse matters, "1 KB" or "1K" is used by many computer people to mean 1024 bytes, which is a convenient number in binary, and memory or disk is often allocated by operating systems in units of 1024. To avoid this confusion with standard scientific usage of "mega-" and so on, the terms "kibibyte" (KiB), "mebibyte" (MiB), "gibibyte" (GiB) and "tebibyte" (TiB) are now recommended for these non-decimal technical units. You might still feel short-changed if you bought a 4GB flash drive and it's only 3.725GiB. For simplicity this article will stick to round 1000s and kilobytes [kB].)
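The flash-drive figure above can be checked with a few lines of Python. This is only an illustrative sketch; the function name gb_to_gib is made up for the example.

```python
GB = 1000 ** 3   # gigabyte: decimal (SI) prefix, 1,000,000,000 bytes
GIB = 1024 ** 3  # gibibyte: binary prefix, 1,073,741,824 bytes

def gb_to_gib(gigabytes: float) -> float:
    """Convert decimal gigabytes to binary gibibytes."""
    return gigabytes * GB / GIB

# A drive sold as "4GB" holds 4,000,000,000 bytes.
print(round(gb_to_gib(4), 3))  # 3.725
```

The gap between the two conventions grows with each prefix: about 2.4% at the kilo level, but over 7% at the giga level.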
File or attachment size is usually easily accessible, if not already prominent. In Windows, right-clicking on any file, folder, or drive and choosing "Properties..." will show the size. In an Explorer window, you can select "Details" from the "View" menu; or in a file open or save dialogue box there is a "View" button from which you can also choose "Details". If you then click on the word "Size" at the top of the column, you can group together the largest files in a folder. In Mac OS X, you can press Command+i to show details of an individual file, or Command+Option+i to show details of all selected items in an Inspector window. The Mac equivalent of Details view is "List" view, and Command+J gives you the option to "calculate all sizes" of folders as well as files.
Most email programs such as Windows Mail or Thunderbird always show the size of attachments next to the file name. In Thunderbird (and many other programs) you can click on the columns button at the top right of a list to add a column showing the size of each item. FTP programs, used to transfer files to websites, almost all show the size of files by default, although usually in bytes, so you need to split these large numbers by eye into groups of three digits to see which are measured in B or kB and which in MB.
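If you would rather not split long byte counts by eye, the grouping can be done mechanically. The following is a minimal sketch using the round 1000s convention adopted in this article; the function name human_size is an assumption for the example, not part of any FTP program.

```python
def human_size(n_bytes: int) -> str:
    """Express a byte count in B, kB, MB, GB or TB using decimal (1000-based) units."""
    if n_bytes < 1000:
        return f"{n_bytes} B"
    size = float(n_bytes)
    for unit in ("kB", "MB", "GB", "TB"):
        size /= 1000
        if size < 1000:
            break
    return f"{size:.1f} {unit}"

raw = 1234567                # a size as an FTP program might list it, in bytes
print(f"{raw:,}")            # 1,234,567  (digits grouped in threes)
print(human_size(raw))       # 1.2 MB
```

One group of three digits after the first means kB, two groups mean MB, and three mean GB.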
As you may gather, the quality or resolution of images is one of the main factors in determining how cumbersome a file is. A 300 dpi (dots or pixels per inch) image added to a word-processor or PDF file takes up about four times as much space as a 150 dpi image (because the resolution applies both horizontally and vertically). Now, if you need to share an image with someone online, either on a website or by email, and you're not expecting them to print it out, demand perfect reproduction, or zoom in to examine minute detail, then it's only going to be shown on the screen. So it's worth knowing a bit about screen resolutions. A typical flat-panel screen is 1280 pixels wide. However, some may be smaller or lower resolution, and allowing for navigation bars and margins at the side of the screen, and for the fact that a visitor's web browser might not occupy the full screen, there's probably little point in uploading an image that is wider than 800 pixels. Anything larger and the viewer may only see the top left-hand corner of the image and have to scroll to see the rest.
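The "four times" figure follows from simple arithmetic, sketched below. The 6 by 4 inch print size and the function name pixel_count are assumptions chosen for illustration, and the comparison is of raw pixel counts; compressed file sizes will vary with the image content.

```python
def pixel_count(width_inches: float, height_inches: float, dpi: int) -> int:
    """Number of pixels needed to cover a printed area at a given resolution."""
    return round(width_inches * dpi) * round(height_inches * dpi)

high = pixel_count(6, 4, 300)   # 1800 x 1200 pixels
low = pixel_count(6, 4, 150)    #  900 x  600 pixels
print(high, low, high / low)    # 2160000 540000 4.0
```

Halving the resolution halves both the width and the height in pixels, so the total falls to a quarter.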
Scans or digital photos may be 20 times that size and yet appear no sharper to the recipient. So if you have such an image, you will need to resize it or scale it down before you upload or publish it. A common mistake when creating a web page is to try to resize the image on the page by changing the image element properties. Some content-management systems, such as Drupal, may include an image module that automatically creates a scaled copy of the image at the size you specify, but if you're editing pages in a web authoring program like Dreamweaver or KompoZer, the chances are you're forcing every web site visitor to download far too much information and then making their computer work quite hard at the downscaling. So it's best to try to keep photographic images, even banners, to no more than 800 pixels across and perhaps no more than 50 kB. Any image-editing software, such as the open source GIMP, allows you to easily produce a smaller file. Simply open the large file, choose an "image size" or "scale image" function, select the width you want, remembering that 800px is often full-width, and save in an appropriate file format.
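To see what a "scale image" function works out for you, here is a rough sketch of the calculation. The function name scale_to_width and the 4000 by 3000 pixel example photo are assumptions for illustration; the code only computes the new dimensions and does not touch any image file.

```python
def scale_to_width(width: int, height: int, max_width: int = 800) -> tuple[int, int]:
    """Return new (width, height) no wider than max_width, keeping the aspect ratio."""
    if width <= max_width:
        return width, height          # already small enough, leave unchanged
    new_height = round(height * max_width / width)
    return max_width, new_height

original = (4000, 3000)               # hypothetical digital photo
scaled = scale_to_width(*original)
print(scaled)                                              # (800, 600)
print(original[0] * original[1] // (scaled[0] * scaled[1]))  # 25 (times fewer pixels)
```

In this example the scaled copy has 25 times fewer pixels to store and transmit, yet fills the same space on the page.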
The other thing to be aware of with images is the different advantages of the different kinds of compression and file format. As mentioned above, JPEG files (also called .jpg files because Windows was once limited to 3-character extensions) are most commonly used for photography, and JPEG is the format used by almost all digital cameras. They store a full range of colours but do lose a certain amount of fine detail; there is a balance between the file size and the acceptable amount of distortion. A highly compressed JPEG may show a Fourier fringe effect, but most people won't notice it. Mostly you will want a mid-range JPEG quality around 50 (out of 100). The other main formats used on the web are PNG or the older GIF, and these are "lossless" formats that are not suitable for photographs or full-colour scans of artwork. However, for images such as line drawings or logos that have been created on a computer in the first place, choosing PNG allows areas of flat colour to be compressed very efficiently and maintain the sharp edges of a design that JPEG would lose. PNG also tends to be used for smaller images, as for larger images the size reduction from using JPEG is much more important. The following images illustrate the reason JPG is not used for small files with only a handful of colours:
When you attach a file to an email, it will usually be converted to ("base 64") text, which can only represent 6 bits per character. This means a 1 MB file will produce an email of about 1.37 MB (including some additional overhead, the ratio works out at 26:19, 26 bytes of email for every 19 bytes of attachment).
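The 26:19 ratio can be reproduced with Python's standard base64 module. This sketch assumes the usual email convention of 76-character encoded lines, each ended by a two-byte line break, and it ignores message headers; the function name mime_attachment_size is made up for the example.

```python
import base64

def mime_attachment_size(data: bytes, line_length: int = 76) -> int:
    """Bytes taken up by data once base64-encoded and wrapped into email lines."""
    encoded = base64.b64encode(data)   # 4 output characters per 3 input bytes
    lines = [encoded[i:i + line_length] for i in range(0, len(encoded), line_length)]
    # Assumption: every line is terminated with CRLF (2 bytes).
    body = b"".join(line + b"\r\n" for line in lines)
    return len(body)

attachment = bytes(57_000)             # 57,000 bytes of dummy data
email_size = mime_attachment_size(attachment)
print(email_size)                      # 78000
print(round(email_size / len(attachment), 2))  # 1.37
```

Each 76-character line carries 57 bytes of the original file and costs 78 bytes in the email, and 78:57 reduces to 26:19.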
Data transfer speeds can be measured in bits (usually for the rating of the connection itself) or bytes (more commonly for actual download or upload speeds, and shown with a capital "B"). The conversion factor is usually 8 bits to 1 byte (excluding now-rare parity or stop bits). So an old dial-up modem might upload and download at 32kbps, but that is only 4 kBps or 4000 bytes per second. A broadband/DSL connection rated at 8 megabits per second (Mbps) actually only means an absolute maximum of 1MB/s, and a 100MB software package (like OpenOffice) will take at least 100 seconds to download, very possibly longer.
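The bits-to-bytes conversion makes it easy to estimate a best-case transfer time for any file. The sketch below assumes the connection runs at its full rated speed with no overhead, which real connections rarely manage; the function name transfer_seconds is an assumption for the example.

```python
BITS_PER_BYTE = 8

def transfer_seconds(size_bytes: int, link_bits_per_second: int) -> float:
    """Best-case time to move a file over a link running at its rated speed."""
    return size_bytes * BITS_PER_BYTE / link_bits_per_second

print(transfer_seconds(100_000_000, 8_000_000))   # 100.0   (100MB at 8Mbps)
print(transfer_seconds(10_000_000, 512_000))      # 156.25  (10MB at 512kbps)
print(transfer_seconds(10_000_000, 32_000) / 60)  # about 41.7 minutes (10MB at 32kbps)
```

Treat these as lower bounds: shared lines, distance from the exchange and protocol overhead all push actual times higher.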