(Author: Prof. Dr. S. Leonov)

“The following description covers the preparation of a maximum-diversity set from our Partners’ in-stock collection of 1,550,000 compounds.

The compound set will complement the existing diversity of the current stock and will be combined with a focused set, designed by the Chemoinformatics (CI) and Medicinal Chemistry (MC) Team, directed toward the requested target(s).

Number of chemotypes

The team is known for its unparalleled ability to expand the portfolio of validated chemical library systems, develop new reactions, scale up intermediates, and innovate in diversity- and target-specific chemistry space design.

The compounds will be selected from our Partners’ collection of more than 1,550,000 diverse, drug-like small molecules. Selection can be made against the existing customer library and/or can take into account the customer's structural and functional considerations, so as to maximally increase the diversity of the original data set and to cover the chemical space left void between the libraries.

The collection consists of more than 15,000 chemical families (chemotypes), which provides the highest available diversity score and a reasonable number of analogs in each series.

Diversity calculations (described in Trepalin et al.).

Molecular diversity calculation procedures include assessment of 2D structural fragment descriptors, diversity of heterocycles, and maximal plate diversity. Each algorithm allows optimization of both intra- and interlibrary diversities.
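As a rough illustration of inter-library diversity optimization, the sketch below greedily picks candidates that are least similar to an existing library and to each other (MaxMin selection). It is not the ChemoSoft™ algorithm: the fingerprints are hypothetical sets of made-up fragment identifiers, and the Tanimoto distance and the function names `tanimoto_distance` and `maxmin_select` are assumptions made for this example.

```python
from typing import Dict, FrozenSet, List


def tanimoto_distance(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """1 - |a & b| / |a | b|; two empty fingerprints are treated as identical."""
    union = len(a | b)
    if union == 0:
        return 0.0
    return 1.0 - len(a & b) / union


def maxmin_select(
    existing: List[FrozenSet[str]],
    candidates: Dict[str, FrozenSet[str]],
    n_pick: int,
) -> List[str]:
    """Greedy MaxMin: repeatedly pick the candidate whose nearest neighbour
    in (existing library + already picked compounds) is farthest away."""
    reference = list(existing)
    remaining = dict(candidates)
    picked: List[str] = []
    while remaining and len(picked) < n_pick:
        def nearest(name: str) -> float:
            if not reference:
                return 1.0
            return min(tanimoto_distance(remaining[name], r) for r in reference)

        # Sort the names so that ties are broken deterministically.
        best = max(sorted(remaining), key=nearest)
        picked.append(best)
        reference.append(remaining.pop(best))
    return picked


# Hypothetical toy data: fragment identifiers f1..f9 are invented.
customer_library = [frozenset({"f1", "f2", "f3"}), frozenset({"f2", "f3", "f4"})]
partner_stock = {
    "cmpd_A": frozenset({"f1", "f2", "f3"}),  # duplicates an existing compound
    "cmpd_B": frozenset({"f7", "f8"}),        # shares nothing with the library
    "cmpd_C": frozenset({"f3", "f4", "f9"}),
    "cmpd_D": frozenset({"f7", "f8", "f9"}),
}
selection = maxmin_select(customer_library, partner_stock, n_pick=2)
print(selection)
```

In this toy case the duplicate `cmpd_A` is never chosen, and `cmpd_D` loses out in the second round because it is too close to the already picked `cmpd_B`. That is the intended effect of selecting against both the customer library and the growing selection.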

The ChemoSoft™ approach is used to dissect a molecule into structural fragments and calculate the diversity. Taking each atom in the molecule in turn as a centroid, the algorithm determines two structural fragments (“screens”) around it: one comprising the “nearest” atoms (captured in a sphere with a radius of one valence bond), and one comprising the “nearest” and “next-to-nearest” atoms (captured in a sphere with a radius of two valence bonds). After the screens are determined for each molecule, a combined “screen key” is created that contains all of the fragments describing the two (or more) molecules under consideration, including both the fragments they have in common and the unique ones. A binary code is then determined for each molecule: 1 or 0 is assigned to each screen in the key according to whether it is present or absent in the molecule.” The molecule keys are then compared using the following equation:
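The screen-key procedure can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the ChemoSoft™ implementation: a screen is reduced to the central atom's element plus the sorted element symbols of the atoms within one or two bonds (bond orders and hydrogens are ignored), and because the comparison equation is not reproduced in this text, the Tanimoto coefficient is used as a stand-in. The names `screens`, `binary_codes`, and `tanimoto` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Molecule = Tuple[List[str], List[Tuple[int, int]]]  # (atom symbols, bonds)


def screens(mol: Molecule) -> Set[str]:
    """Return the radius-1 and radius-2 screens centred on every atom."""
    atoms, bonds = mol
    neighbours: Dict[int, Set[int]] = {i: set() for i in range(len(atoms))}
    for a, b in bonds:
        neighbours[a].add(b)
        neighbours[b].add(a)
    found: Set[str] = set()
    for centre in range(len(atoms)):
        shell1 = neighbours[centre]  # atoms one bond away
        # atoms exactly two bonds away
        shell2 = set().union(*(neighbours[n] for n in shell1)) - shell1 - {centre}
        for radius, members in ((1, shell1), (2, shell1 | shell2)):
            label = "".join(sorted(atoms[m] for m in members))
            found.add(f"{atoms[centre]}|r{radius}|{label}")
    return found


def binary_codes(mols: List[Molecule]) -> Tuple[List[str], List[List[int]]]:
    """Build the combined screen key and one 0/1 code per molecule."""
    per_mol = [screens(m) for m in mols]
    key = sorted(set().union(*per_mol))
    codes = [[1 if s in found else 0 for s in key] for found in per_mol]
    return key, codes


def tanimoto(code_a: List[int], code_b: List[int]) -> float:
    """Assumed stand-in for the comparison equation: shared bits / set bits."""
    both = sum(1 for x, y in zip(code_a, code_b) if x and y)
    either = sum(1 for x, y in zip(code_a, code_b) if x or y)
    return both / either if either else 1.0


# Heavy-atom graphs only: ethanol (C-C-O) and ethylamine (C-C-N).
ethanol: Molecule = (["C", "C", "O"], [(0, 1), (1, 2)])
ethylamine: Molecule = (["C", "C", "N"], [(0, 1), (1, 2)])
key, (code_1, code_2) = binary_codes([ethanol, ethylamine])
similarity = tanimoto(code_1, code_2)
print(len(key), code_1, code_2, round(similarity, 3))
```

For this pair the combined key holds nine screens, each molecule sets five of them, and only one screen (the terminal carbon with its single carbon neighbour) is shared, so the assumed Tanimoto similarity is 1/9.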

Analog Series 

The team has developed and practices a strategy that combines the rigor of single-compound synthesis in liquid phase with the high throughput of the parallel synthesis and purification used in combinatorial methods. One promising way to enhance the efficiency of compound synthesis is the use of multicomponent reactions, in which several building blocks are brought together in a single step. We have applied this method to the Ugi, Biginelli, Passerini, Tsuge, and other reactions and have developed several significant modifications of known multicomponent reactions.

This approach gives our customers the opportunity to search for direct hit analogs among in-stock compounds.

Privileged structures / Focused Sets

Privileged structures are defined as chemical scaffolds that are present in many biologically active ligands and that determine the molecule’s specificity (Evans et al., 1988). Using “privileged” scaffolds as building blocks is advantageous in the synthesis of diverse sets of derivatives for discovery libraries, particularly when no small-molecule ligands are known for the target and no structural information is available. To enrich the IP potential of these libraries, we have applied the privileged-scaffold approach and implemented structural morphing of privileged structures based on functional equivalents of their constituent heteroatoms.

The team actively explores the space of natural and semi-natural scaffolds as an important library development strategy. When designing synthetic routes for new series, we maintain the “genetic originality” of the compounds by retaining their unique core scaffolds and applying focused modifications to the side chains.

Our strategy for designing lead generation (focused) libraries includes:

• Gathering project data (reference compounds, X-ray structure if available, literature, patents)
• Computational chemistry (methods depend on the target)
• Chemistry evaluation (investigate the feasibility of parallel synthesis)
• CCE database. Library ideas. Synthesis

The computational tools that we use on a regular basis to narrow a large chemical space down to a more relevant one are listed here.

• ChemoSoft™ (MIPT)
• Smart Mining (MIPT)
• Cerius2 (Accelrys)
• Discovery Studio (Accelrys)
• NeuroSolutions (NeuroDimension)
• ISIS Base (MDL)
• AutoDock (Scripps)
• Surflex (BioPharmics)
• MolSoft ICM (MolSoft LLC)

The first two programs were created here at MIPT/CDRI, and the last one comes from our partner, MolSoft LLC. All of these tools help maximize your chance of finding an active compound using such familiar approaches as docking and searches based on 3D pharmacophore models and shape similarity (the target-based strategy), and 2D fingerprint similarity, QSAR models, and substructure searches (the ligand-based strategy).

We also actively use a neural network (NN) approach, often termed an AI-based approach, to assess various library parameters that influence ADME/PK characteristics, such as potential interactions with P450, blood-brain barrier permeability, DMSO solubility, and the probability of being modulators of particular target classes.

For such libraries, using ChemoSoft™, we can efficiently predict the major physicochemical parameters that are routinely used to assess the compounds’ drug-likeness based on Lipinski's rule of 5 and certain ADME predictions.

Examples of available focused libraries platforms:

Kinase library (set of sublibraries), GPCR library, AGRO library, Apoptotic library, CNS library (set of sublibraries), Fragment based library, HSP90 library, ion channel library, proteases library, NHR library, phosphatases library, peptidomimetic library, focused diversity set.

Examples of custom-made focused sets:

• GPCRs: Serotonin 5-HT6, 5-HT7, 5-HT2C, Glutamate mGluR5, mGluR2/3, mGluR7, Galanin Gal-3, Dopamine D1, Histamine H1, H3, Neurotensin NT1/2, Bradykinin B1/2, Tachykinin NK1, NK3, Orexin OX-2, Opioid-like ORL-1, Urotensin, Bombesin brs3, Chemokine CXCR4, CXCR3, CCR5, CXCR1/2, CCR2, PAR-1, Oxytocin, Glucagon, Glucagon-like GLP-1, Cannabinoid CB-1/2, GPR30, GPR116, GPR119, TAAR1, GPR40, SNORF25, Niacin, etc.
• Kinases: VEGFR2, Raf-1, PI3Kalpha, CDK-1, Akt, mTOR, PKC-beta, GSK-3b, MK2, JNK1, IKKa, c-Met, CHK1, IGF1R, Rho, Aurora 1/2, etc.
• Ion Channels: P2X7, Vanilloid, NMDA, AMPA, iGluR3, Kv1-3, nAChR-alpha7, nAChR-alpha4beta2, etc.
• Nuclear Receptors: PR-2, RARgamma, ERRalpha, FXR, LXR, ERbeta, GR, etc.
• Others: MCL-1, BCL-2, HSP-90, GlyT1/2, PDZ, Sulfotransferase, etc.

For a project with a Customer, we propose a special set of 23,000 compounds directed toward the requested target(s) (see the description in a separate presentation).

Drug-Likeness/Potential IP

Most screening compounds in the collection satisfy stringent criteria, partly predictable by molecular properties. In establishing the diversity-oriented chemical space for discovery libraries, we use four types of Medicinal Chemistry Filters (MCF), which sieve out compounds containing chemical groups that are undesirable in drug development for various reasons.

The first filter (MCF-1) screens the compounds for the presence of 100 chemical groups considered reactive, unstable, or toxicophoric (e.g. haloanhydrides, hydrazines, aldehydes, etc.).

The second filter (MCF-2), based on 30 groups (e.g. naphthylamines, barbiturates, acetals, thiols, etc.), flags the chemotypes believed to be toxic, carcinogenic, mutagenic, etc. The applicability of this filter depends on the purpose of the library design. For example, a library targeted against metal-containing enzymes (MMP, PDF, HAD, etc.) should include compounds with all types of chelating groups (hydroxamic acids, thiols, oximes, etc.).

The third filter (MCF-3) evaluates the physicochemical parameters of the compounds and classifies them in accordance with Lipinski’s “rule of 5” (LR5) of drug-likeness (Lipinski et al., 2001) and Veber’s “rule of 2” (VR2) (Veber et al., 2002). However, this filter is not universal, as the rules relate to the compounds’ bioavailability rather than to their efficacy. For example, the majority of anti-infective and oncolytic drugs do not conform to either LR5 or VR2. We recommend using human medicinal chemistry expertise to further analyze the compounds flagged by MCF-3.
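A minimal sketch of an LR5/VR2 check is shown below. It is not the MCF-3 implementation: it assumes the descriptors have already been computed by some external tool, uses the commonly published thresholds (molecular weight ≤ 500, logP ≤ 5, H-bond donors ≤ 5, H-bond acceptors ≤ 10; rotatable bonds ≤ 10, polar surface area ≤ 140 Å²), and treats the tolerance of one Lipinski violation as an adjustable assumption. The class, function names, and example compounds are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Descriptors:
    """Precomputed properties; the values would come from a descriptor tool."""
    mol_weight: float          # g/mol
    logp: float                # calculated octanol/water partition coefficient
    h_donors: int
    h_acceptors: int
    rotatable_bonds: int
    polar_surface_area: float  # square angstroms


def lipinski_violations(d: Descriptors) -> List[str]:
    checks = [
        ("MW > 500", d.mol_weight > 500),
        ("logP > 5", d.logp > 5),
        ("H-bond donors > 5", d.h_donors > 5),
        ("H-bond acceptors > 10", d.h_acceptors > 10),
    ]
    return [name for name, failed in checks if failed]


def veber_violations(d: Descriptors) -> List[str]:
    checks = [
        ("rotatable bonds > 10", d.rotatable_bonds > 10),
        ("PSA > 140 A^2", d.polar_surface_area > 140),
    ]
    return [name for name, failed in checks if failed]


def mcf3_flag(d: Descriptors, max_lipinski_violations: int = 1) -> Dict[str, object]:
    """Flag a compound for expert review; the tolerance is an assumption."""
    lr5 = lipinski_violations(d)
    vr2 = veber_violations(d)
    flagged = len(lr5) > max_lipinski_violations or len(vr2) > 0
    return {"flagged": flagged, "LR5": lr5, "VR2": vr2}


# Hypothetical descriptor values, invented for illustration only.
small_hit = Descriptors(342.4, 2.9, 2, 5, 4, 78.0)
large_natural_product = Descriptors(812.0, 6.1, 6, 13, 14, 190.0)

report_small = mcf3_flag(small_hit)
report_large = mcf3_flag(large_natural_product)
print(report_small)
print(report_large)
```

A flag here only marks a compound for review. As noted above, many anti-infective and oncolytic drugs would be flagged, which is why the final decision is left to a medicinal chemist.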

The fourth filter (MCF-4) screens the compounds on the basis of their novelty and IP potential. For example, it would reject trivial compounds readily obtainable by coupling of two simple commercially available reagents. These simplistic, easy to synthesize “garbage” compounds are present in almost all random libraries from commercial sources, yet they pass through the general in silico filters (as a rule, these compounds lack reactive and toxicophoric groups, have low molecular weight, high hydrophilicity, low number of rotatable bonds, etc.). The MCF-4 filter detects such compounds and allows the library designer to make a decision on the desirability of having such compounds in the collection.

For IP assessment of novel compounds, we run similarity searches in SciFinder and use the scientific databases Beilstein, Integrity, and WOMBAT.

SAR support

All products resulting from customer screening projects can be supported with hit-to-lead development chemistry by the MIPT Medicinal Chemistry Team.

• typically, small teams of highly skilled synthetic chemists supervised by a project manager (2-4 chemistry FTEs per biological target)
• project timelines – 6-12 mo. (e.g. “Milestone X – improvement of in vitro potency 1 µM → <100 nM in enzymatic assay, 10 µM → <1 µM in cell-based assay – 3-4 months” requires 3-FTE synthetic support)
• frequent project team meetings with assay biology team and the client’s project team (project manager)
• highly flexible synthetic program, with frequent synthetic target changes
• the team is motivated by and rewarded upon achieving a particular contractual milestone

All project activities require a high level of data and information security, together with sound record-keeping practices, for adequate IP protection.

Re-synthesis and re-supply

All products can be supported by adequate replenishment and scale-up. Re-supply is subject to MIPT’s current stock availability. Compound re-synthesis (subject to the Customer’s confirmation) will be offered by MIPT if no adequate sample quantity is available for re-supply.

The CI Team's comprehensive chemistry skill set includes:

• Pd-catalyzed coupling reactions (Sonogashira, Buchwald, Heck, Suzuki, etc.) and Grubbs metathesis
• Large-scale (100-150 g per run) organometallic reactions (lithiation of aromatics and heteroaromatics; Cu, Mg, Zn, SM chemistry)
• Modern protective group strategies
• High-pressure and high-temperature reactions
• Microwave-assisted organic synthesis (set of Biotage and CEM automated reactors)
• Solid phase supported library synthesis (Teabag technology)
• Liquid phase parallel synthesis (Solid phase catalysts, Scavenger resins, SPE purification)

Quality Control

Not less than 90% purity (+/- 5% accuracy). We provide 100% quality control for all compounds and guarantee more than 95% purity (+/- 5% accuracy). Purity is confirmed by 1H NMR and/or LC (UV)/MS spectra, supplied in electronic format (MS TIF files), for all in-stock compounds.