Method for selecting potential medicinal compounds

xyli83
May 2, 2017

Medicilon's structural biology department offers services supporting structure-based drug discovery from determination of novel targets to final structures. Our platform is one of the earliest established structural biology platforms in China and has been certified by the Shanghai Government. Email: marketing@medicilon.com.cn. Web: www.medicilon.com

A method is proposed for structure-based drug design and for searching for and selecting potential medicinal compounds. The method predicts ligand binding affinity from a score computed by a scoring function that takes into account the protein structure, the ligand structure, and the position of the ligand in the protein binding site. The scoring function is developed using information about known ligands, both active and inactive. The use of information about inactive ligands makes the proposed way of developing the scoring function fundamentally different from all known methods. It not only substantially improves the quality of the resulting scoring function but also allows that quality to keep improving as new experimental data become available.

The present invention relates to medicinal chemistry and may be used to search for medicinal substances having a required biological activity or function.

One whole group of drugs consists of relatively small chemical compounds that bind to specific proteins in an organism at a specific region of the protein, called the binding site. The quality of this interaction is known to be determined by the binding affinity or the binding free energy of the compound-protein interaction. The lower (more negative) the binding free energy, the stronger the interaction. All chemical substances that interact with a protein and may be candidates for drugs are called ligands. A ligand that interacts with a protein in the binding site with a binding free energy below -9 kcal/mol is referred to as active for that protein.
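To give the -9 kcal/mol threshold a more familiar scale, the sketch below converts a binding free energy into a dissociation constant. It assumes the standard thermodynamic relation ΔG = RT ln(K_d) with a 1 mol/L reference state, which the text does not state explicitly. The function names and the default temperature of 298.15 K are illustrative choices, not part of the described method.

```python
import math

R_KCAL_PER_MOL_K = 0.0019872  # gas constant in kcal/(mol*K)


def dissociation_constant(delta_g, temperature_k=298.15):
    """Return K_d in mol/L for a binding free energy in kcal/mol.

    Assumes delta_g = R * T * ln(K_d) with a 1 mol/L standard state.
    """
    return math.exp(delta_g / (R_KCAL_PER_MOL_K * temperature_k))


def is_active(delta_g, threshold=-9.0):
    """Apply the definition from the text: active if below the threshold."""
    return delta_g < threshold


kd = dissociation_constant(-9.0)
print(f"K_d at the activity threshold: {kd * 1e9:.0f} nM")
print(is_active(-9.5), is_active(-8.0))
```

Under these assumptions the threshold corresponds to a dissociation constant of roughly a quarter of a micromolar at 25 °C.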

One of the main goals of structure-based drug design is to predict and find active ligands for a given protein, using the structure of that protein's binding site. Solving this problem requires reliable and fast numerical methods for predicting the ligand-protein interaction.

In the search for new active ligands, the following technologies have gained wide recognition: de novo drug design, virtual screening, and docking.

De novo drug design consists of creating a virtual ligand with the minimum score and specifying its position in the binding site.

Virtual screening consists of docking a large number of ligands into the protein binding site and ranking them by the score obtained from docking, in order to select the ligands with the best scores.
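The screening loop described above can be sketched in a few lines. This is a minimal illustration, not a real screening pipeline: `virtual_screen` is a hypothetical name, the `dock` argument stands in for whatever docking program supplies the score, and the scores in `TOY_SCORES` are made up. Lower scores are treated as better, following the free-energy convention used in this article.

```python
def virtual_screen(ligands, dock, top_n=2):
    """Dock each ligand and return the top_n (score, ligand) pairs, best first.

    Lower scores are treated as better, matching the free-energy convention.
    """
    scored = [(dock(ligand), ligand) for ligand in ligands]
    scored.sort(key=lambda pair: pair[0])
    return scored[:top_n]


# Hypothetical stand-in for a docking program: a lookup table of made-up scores.
TOY_SCORES = {"ligand_a": -7.1, "ligand_b": -9.8, "ligand_c": -5.3, "ligand_d": -9.2}

hits = virtual_screen(TOY_SCORES, TOY_SCORES.get)
for score, name in hits:
    print(name, score)
```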

Provided that de novo drug design and virtual screening work correctly, the selected ligands should be the most active ones for the given protein.

Docking a ligand is the process of finding the position of the ligand in the protein binding site at which the ligand has the best score. The score is a number determined by the structure of the ligand and the structure of the binding site, and it depends on the position of the ligand in the binding site. The term "score" is also used for the set of methods that make it possible to calculate the score value. A correct score must be proportional to the binding affinity or the binding free energy of the ligand-protein interaction.
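As a toy illustration of docking in this sense, the sketch below enumerates candidate positions and keeps the one with the lowest score. Everything here is assumed for illustration: a real pose has many degrees of freedom (translation, rotation, torsions), while this example uses a single coordinate, and `toy_score` is a made-up function rather than any actual scoring function.

```python
def dock(candidate_poses, score):
    """Return (best_score, best_pose): the pose with the lowest score."""
    best_pose = min(candidate_poses, key=score)
    return score(best_pose), best_pose


def toy_score(pose):
    """Made-up score with its minimum of -9.0 at offset x = 2.0 along one axis."""
    x = pose
    return (x - 2.0) ** 2 - 9.0


# Enumerate positions from 0.0 to 4.0 in steps of 0.5 (a one-dimensional toy grid).
poses = [0.5 * step for step in range(9)]
best_score, best_pose = dock(poses, toy_score)
print(best_score, best_pose)
```

Exhaustive enumeration like this is only practical when each score evaluation is cheap, which is why scoring speed matters so much in practice.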

Scores, that is, approaches to predicting the ligand-protein interaction from the structures and positions of the protein and the ligand, may be divided into several groups: molecular dynamics, physical methods based on force fields, empirical methods, and knowledge-based methods.

Empirical approaches are the most widespread way of predicting the ligand-protein interaction from the structures of the protein and the ligand and from the position of the ligand in the protein binding site. These methods are the fastest and the simplest. Prediction speed is one of the decisive factors in structure-based drug design, because fast methods make it possible to enumerate exhaustively large sets of molecules and molecular positions in order to find the optimal molecule and its position.

Empirical methods for predicting the ligand-protein interaction are based on a set of protein structures, the structures of ligands in the binding sites of these proteins, and experimentally measured binding affinities for these protein-ligand pairs. An empirical method proposes a physically reasonable model of the ligand-protein interaction. Some parameters of this model are then selected (trained) so that the binding affinity or free energy predicted by the model for the known protein and ligand structures corresponds as closely as possible to the experimentally known binding affinities or free energies for these proteins and ligands.
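The training step can be illustrated with a linear model fitted by ordinary least squares. This is one common choice and an assumption here, since the text does not specify the model form or the fitting procedure. The three descriptors (hydrogen bonds, hydrophobic contacts, rotatable bonds) and all numbers are hypothetical: the "measured" free energies are synthetic values generated from known weights so that the fit can be checked, not experimental data.

```python
def fit_weights(features, targets):
    """Least-squares weights w minimising sum_i (w . f_i - target_i)^2.

    Solves the normal equations (F^T F) w = F^T y by Gaussian elimination.
    """
    n = len(features[0])
    a = [[sum(f[r] * f[c] for f in features) for c in range(n)] for r in range(n)]
    b = [sum(f[r] * y for f, y in zip(features, targets)) for r in range(n)]
    for col in range(n):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    w = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(a[row][k] * w[k] for k in range(row + 1, n))
        w[row] = (b[row] - tail) / a[row][row]
    return w


def predict(weights, feature_row):
    """Model prediction: weighted sum of the interaction descriptors."""
    return sum(w * f for w, f in zip(weights, feature_row))


# Hypothetical descriptors per complex: (hydrogen bonds, hydrophobic contacts, rotatable bonds).
complexes = [(3, 10, 2), (5, 14, 4), (2, 20, 1), (6, 8, 6), (4, 16, 3)]
TRUE_WEIGHTS = (-1.2, -0.3, 0.4)  # used only to generate synthetic targets
measured = [predict(TRUE_WEIGHTS, row) for row in complexes]

weights = fit_weights(complexes, measured)
print([round(w, 3) for w in weights])
```

Because the synthetic targets are exactly linear in the descriptors, the fit recovers the generating weights. With real measurements the residuals would be nonzero, and the quality of the fit would depend on how well the chosen model matches the data.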

The basic rule of empirical approaches is that an empirical model works correctly only if the problem to which it is applied is analogous to the problem on which the model was developed, and the objects to which it is applied are analogous to the objects used in developing the model.

The task of virtual screening and de novo drug design is to separate active ligands from inactive ones on one particular protein. Empirical scores, however, are currently developed using information about active ligands only, and for many different proteins simultaneously. In the judgment of the authors of the present invention, this inconsistency is responsible for all the main problems in using the known empirical scores for virtual screening and de novo drug design.

In the opinion of the authors, the quality of the scores, and therefore the quality of virtual screening and of the design of potentially active ligands, is at present not always acceptable. The quality of virtual screening carried out by the same docking program with different scores differs substantially. One and the same score may work adequately in virtual screening on one protein but poorly on another.
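One way to put a number on this quality is to measure how well a score separates known actives from known inactives on a single protein. The sketch below computes the fraction of active-inactive pairs in which the active ligand receives the better (lower) score, which is equivalent to the area under the ROC curve. This metric is an assumption chosen for illustration and is not named in the text, and the scores are made up.

```python
def separation_auc(active_scores, inactive_scores):
    """Fraction of (active, inactive) pairs where the active scores lower.

    Ties count as half. 1.0 means perfect separation, 0.5 means no better
    than random ordering.
    """
    wins = 0.0
    for active in active_scores:
        for inactive in inactive_scores:
            if active < inactive:
                wins += 1.0
            elif active == inactive:
                wins += 0.5
    return wins / (len(active_scores) * len(inactive_scores))


# Made-up docking scores for one protein.
actives = [-10.1, -9.4, -8.0]
inactives = [-8.5, -7.2, -6.9, -5.0]

auc = separation_auc(actives, inactives)
print(round(auc, 3))
```

Computing this value for the same ligand set under different scores, or for the same score on different proteins, gives a direct comparison of the kind discussed above.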

To improve the quality of virtual screening and of the design of potentially active ligands, many methods have been developed: additional filters that eliminate inherently wrong positions, the joint use of several scores in consensus scoring, and so on. All these methods try to find a universal solution that works adequately for all types of proteins and ligands. An alternative approach is to develop focused scores for virtual screening and the design of potentially active ligands on a specific target. Several procedures for creating focused scores are currently known and have performed well.
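Consensus scoring can be sketched as rank averaging: each score ranks the ligands independently, and the ranks are averaged so that scores on different numerical scales can be combined. Rank averaging is only one of several possible consensus schemes and is assumed here for illustration. The three score tables are hypothetical and contain made-up values.

```python
def ranks(scores):
    """Map each ligand to its rank under one score (1 = best, lowest score)."""
    ordered = sorted(scores, key=scores.get)
    return {ligand: position for position, ligand in enumerate(ordered, start=1)}


def consensus_order(score_tables):
    """Order ligands by their mean rank across all score tables."""
    per_table = [ranks(table) for table in score_tables]
    mean_rank = {
        ligand: sum(table[ligand] for table in per_table) / len(per_table)
        for ligand in per_table[0]
    }
    return sorted(mean_rank, key=mean_rank.get)


# Three hypothetical scores on different scales, for the same three ligands.
score_a = {"lig1": -9.0, "lig2": -7.5, "lig3": -8.2}
score_b = {"lig1": -52.0, "lig2": -40.0, "lig3": -55.0}
score_c = {"lig1": -3.1, "lig2": -2.0, "lig3": -2.5}

order = consensus_order([score_a, score_b, score_c])
print(order)
```

In this toy data `score_b` disagrees with the other two about the best ligand, and the averaged ranks settle the disagreement in favour of the majority.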

The development of focused empirical scores is also a promising technology because the body of data about proteins and their active ligands is growing rapidly, both within private pharmaceutical companies and in the academic community.