Like all other catalysts, enzymes are characterized by two fundamental properties. First, they increase the rate of chemical reactions without themselves being consumed or permanently altered by the reaction. Second, they increase reaction rates without altering the chemical equilibrium between reactants and products.
These principles of enzymatic catalysis are illustrated in the following example, in which a molecule acted upon by an enzyme (referred to as a substrate [S]) is converted to a product (P) as the result of the reaction. In the absence of the enzyme, the reaction can be written as follows:
S ⇌ P
The effect of the enzyme on such a reaction is best illustrated by the energy changes that must occur during the conversion of S to P (Figure 2.22). The equilibrium of the reaction is determined by the final energy states of S and P, which are unaffected by enzymatic catalysis. In order for the reaction to proceed, however, the substrate must first be converted to a higher energy state, called the transition state. The energy required to reach the transition state (the activation energy) constitutes a barrier to the progress of the reaction, limiting the rate of the reaction. Enzymes (and other catalysts) act by reducing the activation energy, thereby increasing the rate of reaction. The increased rate is the same in both the forward and reverse directions, since both must pass through the same transition state.
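The link between activation energy and rate can be made concrete with a small numerical sketch. It assumes the Arrhenius relation, k = A·exp(−Ea/RT), which is not stated in the text above, and it uses hypothetical energy values chosen only for illustration. Lowering the transition state by a fixed amount lowers the forward and reverse barriers equally, so both rates rise by the same factor and their ratio (the equilibrium constant) is unchanged.

```python
import math

R = 8.314  # gas constant, J/(mol*K)


def rate_constant(ea_j_per_mol, temperature_k=310.0, prefactor=1.0):
    """Arrhenius rate constant k = A * exp(-Ea / (R * T)) (assumed model)."""
    return prefactor * math.exp(-ea_j_per_mol / (R * temperature_k))


# Hypothetical energies in J/mol, for illustration only.
delta_g = -20_000.0          # energy of P minus energy of S
ea_forward_uncat = 80_000.0  # barrier from S to the transition state, no enzyme
lowering = 30_000.0          # how far the enzyme lowers the transition state


def forward_reverse(ea_forward):
    """Return (k_forward, k_reverse, K_eq) for a given forward barrier."""
    ea_reverse = ea_forward - delta_g  # barrier from P to the same transition state
    k_f = rate_constant(ea_forward)
    k_r = rate_constant(ea_reverse)
    return k_f, k_r, k_f / k_r


kf_u, kr_u, keq_u = forward_reverse(ea_forward_uncat)
kf_c, kr_c, keq_c = forward_reverse(ea_forward_uncat - lowering)

speedup_forward = kf_c / kf_u
speedup_reverse = kr_c / kr_u

print(f"forward speed-up: {speedup_forward:.3g}")
print(f"reverse speed-up: {speedup_reverse:.3g}")
print(f"K_eq uncatalyzed: {keq_u:.3g}, catalyzed: {keq_c:.3g}")
```

The equilibrium constant depends only on the energy difference between S and P, which the sketch never changes; only the barrier height differs between the two cases.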
The catalytic activity of enzymes involves the binding of their substrates to form an enzyme-substrate complex (ES). The substrate binds to a specific region of the enzyme, called the active site. While bound to the active site, the substrate is converted into the product of the reaction, which is then released from the enzyme. The enzyme-catalyzed reaction can thus be written as follows:
S + E ⇌ ES ⇌ E + P
Note that E appears unaltered on both sides of the equation, so the equilibrium is unaffected. However, the enzyme provides a surface upon which the reactions converting S to P can occur more readily. This is a result of interactions between the enzyme and substrate that lower the energy of activation and favor formation of the transition state.
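The point that the enzyme is not consumed can be checked with a minimal simulation of substrate binding, conversion, and product release. The sketch below assumes simple mass-action kinetics and integrates them with a fixed-step Euler method. The rate constants (`k_on`, `k_off`, `k_cat`, `k_rev`) and the starting concentrations are hypothetical values, not measurements for any real enzyme.

```python
def simulate(s0=1.0, e0=0.01, k_on=100.0, k_off=1.0,
             k_cat=10.0, k_rev=0.1, dt=1e-4, steps=50_000):
    """Euler integration of S + E <-> ES <-> E + P (mass-action, assumed)."""
    s, e, es, p = s0, e0, 0.0, 0.0
    for _ in range(steps):
        bind = k_on * e * s - k_off * es       # net flux of S + E -> ES
        release = k_cat * es - k_rev * e * p   # net flux of ES -> E + P
        s -= bind * dt
        e += (release - bind) * dt
        es += (bind - release) * dt
        p += release * dt
    return s, e, es, p


s0, e0 = 1.0, 0.01
s, e, es, p = simulate(s0=s0, e0=e0)

print(f"substrate left: {s:.4f}, product formed: {p:.4f}")
print(f"total enzyme (free + bound): {e + es:.6f} (started with {e0})")
```

Free enzyme and the ES complex interconvert throughout the run, but their sum stays at the starting amount, while substrate is steadily converted to product.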
The binding of a substrate to the active site of an enzyme is a very specific interaction. Active sites are clefts or grooves on the surface of an enzyme, usually composed of amino acids from different parts of the polypeptide chain that are brought together in the tertiary structure of the folded protein. Substrates initially bind to the active site by noncovalent interactions, including hydrogen bonds, ionic bonds, and hydrophobic interactions. Once a substrate is bound to the active site of an enzyme, multiple mechanisms can accelerate its conversion to the product of the reaction.
Although the simple example discussed in the previous section involved only a single substrate molecule, most biochemical reactions involve interactions between two or more different substrates. For example, the formation of a peptide bond involves the joining of two amino acids. For such reactions, the binding of two or more substrates to the active site in the proper position and orientation accelerates the reaction (Figure 2.23). The enzyme provides a template upon which the reactants are brought together and properly oriented to favor the formation of the transition state in which they interact.
In addition to bringing multiple substrates together and distorting the conformation of substrates to approach the transition state, many enzymes participate directly in the catalytic process. In such cases, specific amino acid side chains in the active site may react with the substrate and form bonds with reaction intermediates. The acidic and basic amino acids are often involved in these catalytic mechanisms, as illustrated in the following discussion of chymotrypsin as an example of enzymatic catalysis.
Substrates bind to the serine proteases by insertion of the amino acid adjacent to the cleavage site into a pocket at the active site of the enzyme (Figure 2.25). The nature of this pocket determines the substrate specificity of the different members of the serine protease family. For example, the binding pocket of chymotrypsin contains hydrophobic amino acids that interact with the hydrophobic side chains of its preferred substrates. In contrast, the binding pocket of trypsin contains a negatively charged acidic amino acid (aspartate), which is able to form an ionic bond with the lysine or arginine residues of its substrates.
Substrate binding positions the peptide bond to be cleaved adjacent to the active site serine (Figure 2.26). The proton of this serine is then transferred to the active site histidine. The conformation of the active site favors this proton transfer because the histidine interacts with the negatively charged aspartate residue. The serine reacts with the substrate, forming a tetrahedral transition state. The peptide bond is then cleaved, and the C-terminal portion of the substrate is released from the enzyme. However, the N-terminal peptide remains bound to serine. This situation is resolved when a water molecule (the second substrate) enters the active site and reverses the preceding reactions. The proton of the water molecule is transferred to histidine, and its hydroxyl group is transferred to the peptide, forming a second tetrahedral transition state. The proton is then transferred from histidine back to serine, and the peptide is released from the enzyme, completing the reaction.
This example illustrates several features of enzymatic catalysis: the specificity of enzyme-substrate interactions, the positioning of different substrate molecules in the active site, and the involvement of active-site residues in the formation and stabilization of the transition state. Although the thousands of enzymes in cells catalyze many different types of chemical reactions, the same basic principles apply to their operation.
In addition to binding their substrates, the active sites of many enzymes bind other small molecules that participate in catalysis. Prosthetic groups are small molecules bound to proteins in which they play critical functional roles. For example, the oxygen carried by myoglobin and hemoglobin is bound to heme, a prosthetic group of these proteins. In many cases metal ions (such as zinc or iron) are bound to enzymes and play central roles in the catalytic process. In addition, various low-molecular-weight organic molecules participate in specific types of enzymatic reactions. These molecules are called coenzymes because they work together with enzymes to enhance reaction rates. In contrast to substrates, coenzymes are not irreversibly altered by the reactions in which they are involved. Rather, they are recycled and can participate in multiple enzymatic reactions.
Several other coenzymes also act as electron carriers, and still others are involved in the transfer of a variety of additional chemical groups (e.g., carboxyl groups and acyl groups; Table 2.1). The same coenzymes function together with a variety of different enzymes to catalyze the transfer of specific chemical groups between a wide range of substrates. Many coenzymes are closely related to vitamins, which contribute part or all of the structure of the coenzyme. Vitamins are not required by bacteria such as E. coli but are necessary components of the diets of humans and other higher animals, which have lost the ability to synthesize these compounds.
An important feature of most enzymes is that their activities are not constant but instead can be modulated. That is, the activities of enzymes can be regulated so that they function appropriately to meet the varied physiological needs that may arise during the life of the cell.
One common type of enzyme regulation is feedback inhibition, in which the product of a metabolic pathway inhibits the activity of an enzyme involved in its synthesis. For example, the amino acid isoleucine is synthesized by a series of reactions starting from the amino acid threonine (Figure 2.28). The first step in the pathway is catalyzed by the enzyme threonine deaminase, which is inhibited by isoleucine, the end product of the pathway. Thus, an adequate amount of isoleucine in the cell inhibits threonine deaminase, blocking further synthesis of isoleucine. If the concentration of isoleucine decreases, feedback inhibition is relieved, threonine deaminase is no longer inhibited, and additional isoleucine is synthesized. By so regulating the activity of threonine deaminase, the cell synthesizes the necessary amount of isoleucine but avoids wasting energy on the synthesis of more isoleucine than is needed.
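The logic of feedback inhibition can be sketched as a one-variable model. The sketch assumes that the rate of the first pathway step falls as isoleucine accumulates, following a Hill-type inhibition term, and that the cell consumes isoleucine at a rate proportional to its concentration. The functional form and every parameter (`v_max`, `k_i`, `hill`, `k_use`) are hypothetical and do not describe the actual kinetics of threonine deaminase.

```python
def synthesis_rate(isoleucine, v_max=1.0, k_i=0.5, hill=2.0):
    """Pathway flux, reduced as the end product accumulates (assumed form)."""
    return v_max / (1.0 + (isoleucine / k_i) ** hill)


def steady_state(k_use, iso0=0.0, dt=0.01, steps=20_000):
    """Euler-integrate d[Ile]/dt = synthesis - k_use * [Ile] until it settles."""
    iso = iso0
    for _ in range(steps):
        iso += (synthesis_rate(iso) - k_use * iso) * dt
    return iso


low_demand = steady_state(k_use=0.5)   # cell uses isoleucine slowly
high_demand = steady_state(k_use=2.0)  # cell uses isoleucine quickly

for label, level in (("low demand", low_demand), ("high demand", high_demand)):
    print(f"{label}: [Ile] = {level:.3f}, synthesis rate = {synthesis_rate(level):.3f}")
```

When demand rises, the isoleucine level falls, inhibition is relieved, and the synthesis rate increases to match consumption; when demand is low, isoleucine accumulates and holds synthesis down.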
The activities of enzymes can also be regulated by their interactions with other proteins and by covalent modifications, such as the addition of phosphate groups to serine, threonine, or tyrosine residues. Phosphorylation is a particularly common mechanism for regulating enzyme activity; the addition of phosphate groups either stimulates or inhibits the activities of many different enzymes (Figure 2.30). For example, muscle cells respond to epinephrine (adrenaline) by breaking down glycogen into glucose, thereby providing a source of energy for increased muscular activity. The breakdown of glycogen is catalyzed by the enzyme glycogen phosphorylase, which is activated by phosphorylation in response to the binding of epinephrine to a receptor on the surface of the muscle cell. Protein phosphorylation plays a central role in controlling not only metabolic reactions but also many other cellular functions, including cell growth and differentiation.
The activities of most enzymes and drugs depend on interactions between proteins and small molecules. Accurate prediction of these interactions could greatly accelerate pharmaceutical and biotechnological research. Current machine learning models designed for this task have a limited ability to generalize beyond the proteins used for training. This limitation is likely due to a lack of information exchange between the protein and the small molecule during the generation of the required numerical representations. Here, we introduce ProSmith, a machine learning framework that employs a multimodal Transformer Network to simultaneously process protein amino acid sequences and small molecule strings in the same input. This approach facilitates the exchange of all relevant information between the two molecule types during the computation of their numerical representations, allowing the model to account for their structural and functional interactions. Our final model combines gradient boosting predictions based on the resulting multimodal Transformer Network with independent predictions based on separate deep learning representations of the proteins and small molecules. The resulting predictions outperform recently published state-of-the-art models for predicting protein-small molecule interactions across three diverse tasks: predicting kinase inhibition, inferring potential substrates for enzymes, and predicting Michaelis constants (K_M). The Python code provided can be used to easily implement and improve machine learning predictions involving arbitrary protein-small molecule interactions.
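For readers unfamiliar with the Michaelis constant, the short sketch below shows what the quantity means in the standard Michaelis-Menten rate law, v = Vmax·[S] / (K_M + [S]). It is not part of ProSmith and does not use its code; the function name and the numerical values are hypothetical and serve only to illustrate that K_M is the substrate concentration at which the rate is half of Vmax.

```python
import math


def michaelis_menten_rate(substrate, v_max, k_m):
    """Reaction rate under the Michaelis-Menten rate law."""
    return v_max * substrate / (k_m + substrate)


v_max = 10.0  # hypothetical maximal rate, arbitrary units
k_m = 2.0     # hypothetical Michaelis constant, same units as substrate

rate_at_km = michaelis_menten_rate(k_m, v_max, k_m)
rate_saturating = michaelis_menten_rate(1000.0 * k_m, v_max, k_m)

print(f"rate at [S] = K_M: {rate_at_km} (half of v_max = {v_max})")
print(f"rate at [S] = 1000 * K_M: {rate_saturating:.3f}")
```

A low K_M therefore means the enzyme reaches half of its maximal rate at a low substrate concentration.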