Dynamic Link Libraries

0 views

Skip to first unread message

Isabella Kells

unread,

Aug 3, 2024, 2:05:53 PM8/3/24

to citelbartde

A DLL file often has file extension .dll, but can have any file extension. Developers can choose to use a file extension that describes the content of the file such as .ocx for ActiveX controls and .drv for a legacy device driver.

The file format of a DLL is the same as for an executable (a.k.a. EXE), but different versions of Windows use different formats. 32-bit and 64-bit Windows versions use Portable Executable (PE), and 16-bit Windows versions use New Executable (NE).

The main difference between DLL and EXE is that a DLL cannot be run directly since the operating system requires an entry point to start execution. Windows provides a utility program (RUNDLL.EXE/RUNDLL32.EXE) to execute a function exposed by a DLL.

The first versions of Microsoft Windows ran programs together in a single address space. Every program was meant to co-operate by yielding the CPU to other programs so that the graphical user interface (GUI) could multitask and be maximally responsive. All operating-system level operations were provided by the underlying operating system: MS-DOS. All higher-level services were provided by Windows Libraries "Dynamic Link Library". The Drawing API, Graphics Device Interface (GDI), was implemented in a DLL called GDI.EXE, the user interface in USER.EXE. These extra layers on top of DOS had to be shared across all running Windows programs, not just to enable Windows to work in a machine with less than a megabyte of RAM, but to enable the programs to co-operate with each other. The code in GDI needed to translate drawing commands to operations on specific devices. On the display, it had to manipulate pixels in the frame buffer. When drawing to a printer, the API calls had to be transformed into requests to a printer. Although it could have been possible to provide hard-coded support for a limited set of devices (like the Color Graphics Adapter display, the HP LaserJet Printer Command Language), Microsoft chose a different approach. GDI would work by loading different pieces of code, called "device drivers", to work with different output devices.

The same architectural concept that allowed GDI to load different device drivers also allowed the Windows shell to load different Windows programs, and for these programs to invoke API calls from the shared USER and GDI libraries. That concept was "dynamic linking".

In a conventional non-shared static library, sections of code are simply added to the calling program when its executable is built at the "linking" phase; if two programs call the same routine, the routine is included in both the programs during the linking stage of the two. With dynamic linking, shared code is placed into a single, separate file. The programs that call this file are connected to it at run time, with the operating system (or, in the case of early versions of Windows, the OS-extension), performing the binding.

For those early versions of Windows (1.0 to 3.11), the DLLs were the foundation for the entire GUI. As such, display drivers were merely DLLs with a .DRV extension that provided custom implementations of the same drawing API through a unified device driver interface (DDI), and the Drawing (GDI) and GUI (USER) APIs were merely the function calls exported by the GDI and USER, system DLLs with .EXE extension.

This notion of building up the operating system from a collection of dynamically loaded libraries is a core concept of Windows that persists as of 2015[update].DLLs provide the standard benefits of shared libraries, such as modularity. Modularity allows changes to be made to code and data in a single self-contained DLL shared by several applications without any change to the applications themselves.

Another benefit of modularity is the use of generic interfaces for plug-ins. A single interface may be developed which allows old as well as new modules to be integrated seamlessly at run-time into pre-existing applications, without any modification to the application itself. This concept of dynamic extensibility is taken to the extreme with the Component Object Model, the underpinnings of ActiveX.

In Windows 1.x, 2.x and 3.x, all Windows applications shared the same address space as well as the same memory. A DLL was only loaded once into this address space; from then on, all programs using the library accessed it. The library's data was shared across all the programs. This could be used as an indirect form of inter-process communication, or it could accidentally corrupt the different programs. With the introduction of 32-bit libraries in Windows 95, every process ran in its own address space. While the DLL code may be shared, the data is private except where shared data is explicitly requested by the library. That said, large swathes of Windows 95, Windows 98 and Windows Me were built from 16-bit libraries, which limited the performance of the Pentium Pro microprocessor when launched, and ultimately limited the stability and scalability of the DOS-based versions of Windows.

The executable code of a DLL runs in the memory space of the calling process and with the same access permissions, which means there is little overhead in their use, but also that there is no protection for the calling program if the DLL has any sort of bug.

The DLL technology allows for an application to be modified without requiring consuming components to be re-compiled or re-linked. A DLL can be replaced so that the next time the application runs it uses the new DLL version. To work correctly, the DLL changes must maintain backward compatibility.

The code in a DLL is usually shared among all the processes that use the DLL; that is, they occupy a single place in physical memory, and do not take up space in the page file. Windows does not use position-independent code for its DLLs; instead, the code undergoes relocation as it is loaded, fixing addresses for all its entry points at locations which are free in the memory space of the first process to load the DLL. In older versions of Windows, in which all running processes occupied a single common address space, a single copy of the DLL's code would always be sufficient for all the processes. However, in newer versions of Windows which use separate address spaces for each program, it is only possible to use the same relocated copy of the DLL in multiple programs if each program has the same virtual addresses free to accommodate the DLL's code. If some programs (or their combination of already-loaded DLLs) do not have those addresses free, then an additional physical copy of the DLL's code will need to be created, using a different set of relocated entry points. If the physical memory occupied by a code section is to be reclaimed, its contents are discarded, and later reloaded directly from the DLL file as necessary.

In contrast to code sections, the data sections of a DLL are usually private; that is, each process using the DLL has its own copy of all the DLL's data. Optionally, data sections can be made shared, allowing inter-process communication via this shared memory area. However, because user restrictions do not apply to the use of shared DLL memory, this creates a security hole; namely, one process can corrupt the shared data, which will likely cause all other sharing processes to behave undesirably. For example, a process running under a guest account can in this way corrupt another process running under a privileged account. This is an important reason to avoid the use of shared sections in DLLs.

If a DLL is compressed by certain executable packers (e.g. UPX), all of its code sections are marked as read and write, and will be unshared. Read-and-write code sections, much like private data sections, are private to each process. Thus DLLs with shared data sections should not be compressed if they are intended to be used simultaneously by multiple programs, since each program instance would have to carry its own copy of the DLL, resulting in increased memory consumption.

Like static libraries, import libraries for DLLs are noted by the .lib file extension. For example, kernel32.dll, the primary dynamic library for Windows's base functions such as file creation and memory management, is linked via kernel32.lib. The usual way to tell an import library from a proper static library is by size: the import library is much smaller as it only contains symbols referring to the actual DLL, to be processed at link-time. Both nevertheless are Unix ar format files.

Linking to dynamic libraries is usually handled by linking to an import library when building or linking to create an executable file. The created executable then contains an import address table (IAT) by which all DLL function calls are referenced (each referenced DLL function contains its own entry in the IAT). At run-time, the IAT is filled with appropriate addresses that point directly to a function in the separately loaded DLL.[3]

In Cygwin/MSYS and MinGW, import libraries are conventionally given the suffix .dll.a, combining both the Windows DLL suffix and the Unix ar suffix. The file format is similar, but the symbols used to mark the imports are different (_head_foo_dll vs __IMPORT_DESCRIPTOR_foo).[4] Although its GNU Binutils toolchain can generate import libraries and link to them, it is faster to link to the DLL directly.[5] An experimental tool in MinGW called genlib can be used to generate import libs with MSVC-style symbols.

Each function exported by a DLL is identified by a numeric ordinal and optionally a name. Likewise, functions can be imported from a DLL either by ordinal or by name. The ordinal represents the position of the function's address pointer in the DLL Export Address table. It is common for internal functions to be exported by ordinal only. For most Windows API functions only the names are preserved across different Windows releases; the ordinals are subject to change. Thus, one cannot reliably import Windows API functions by their ordinals.

Importing functions by ordinal provides only slightly better performance than importing them by name: export tables of DLLs are ordered by name, so a binary search can be used to find a function. The index of the found name is then used to look up the ordinal in the Export Ordinal table. In 16-bit Windows, the name table was not sorted, so the name lookup overhead was much more noticeable.