Large Scale Virtual Screening

Bringing Billion Molecule Scale Searching To Your Browser


Figure 1: Speed of ROCS and FastROCS.

Figure 2: Virtual screening performance for ROCS and FastROCS on the DUD-E dataset.

Orion: Delivering scale easily and efficiently

Searching billions or tens of billions of molecules in 3D is computationally demanding. OpenEye’s cloud-native computational chemistry platform, Orion, provides the on-demand, fault-tolerant compute resources required for the intense, but irregular, demands of LSVS.

LSVS: Ligand-based

ROCS, OpenEye’s class-leading shape and chemical feature similarity tool,[6] is one of the most widely used tools in 3D lead discovery and lead hopping.[7] The CPU implementation is fast, exceeding 200 molecules/CPU/second, while the GPU-enabled version (FastROCS) is roughly 3,500 times faster per device (700,000 molecules/GPU/second) yet has identical overall performance in virtual screening (Figure 2). The high speed and performance of FastROCS make it ideally suited to LSVS at the billion-molecule scale.

Recent work from AstraZeneca[4] illustrates the immense power of coupling FastROCS to the compute resources of the cloud through Orion. Searching more than 12 billion molecules requires only an hour, with automatic scale-up and scale-down. These very large-scale searches produced more diverse, better-scoring hits than searches of smaller libraries (see Figure 3).[4]
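
The throughput figures quoted above can be turned into a rough back-of-the-envelope estimate. The sketch below is illustrative only: it assumes perfect linear scaling across devices and ignores I/O, scheduling, and conformer handling, and the function names are hypothetical rather than part of any OpenEye API.

```python
import math

# Throughput figures quoted in the text (molecules per device per second).
ROCS_CPU_RATE = 200
FASTROCS_GPU_RATE = 700_000


def search_hours(n_molecules: int, rate_per_device: float, n_devices: int = 1) -> float:
    """Idealized wall-clock hours for a search, assuming perfect linear scaling."""
    return n_molecules / (rate_per_device * n_devices) / 3600.0


def devices_needed(n_molecules: int, rate_per_device: float, target_hours: float) -> int:
    """Smallest device count that meets the target time under the same idealization."""
    return math.ceil(n_molecules / (rate_per_device * target_hours * 3600.0))


one_billion = 1_000_000_000
print(f"ROCS, 1 CPU:     {search_hours(one_billion, ROCS_CPU_RATE):,.0f} hours")
print(f"FastROCS, 1 GPU: {search_hours(one_billion, FASTROCS_GPU_RATE):.2f} hours")
print(f"GPUs for 12 billion molecules in 1 hour: "
      f"{devices_needed(12 * one_billion, FASTROCS_GPU_RATE, 1.0)}")
```

Under these assumptions a single CPU would need about 1,389 hours for one billion molecules, a single GPU about 0.4 hours, and five GPUs would be the idealized minimum for 12 billion molecules in an hour. Real searches will need more headroom than this lower bound suggests.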

Figure 3: Number of Bemis-Murcko scaffolds in the top 10,000 hits versus size of library searched.
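
The diversity metric plotted in Figure 3 can be sketched in a few lines. This is a minimal illustration, not the analysis used in the cited study: it assumes each hit already carries a similarity score (higher is better) and a precomputed Bemis-Murcko scaffold string from a cheminformatics toolkit, and the function name and toy data are hypothetical.

```python
from typing import Iterable, Tuple


def count_scaffolds_in_top_hits(hits: Iterable[Tuple[float, str]], top_n: int = 10_000) -> int:
    """Count distinct scaffolds among the top_n highest-scoring hits.

    hits: (score, scaffold_smiles) pairs; a higher score means a better hit.
    """
    ranked = sorted(hits, key=lambda hit: hit[0], reverse=True)[:top_n]
    return len({scaffold for _, scaffold in ranked})


# Toy data (hypothetical scores and scaffolds, for illustration only).
toy_hits = [
    (1.62, "c1ccccc1"),
    (1.58, "c1ccncc1"),
    (1.55, "c1ccccc1"),
    (1.40, "c1ccc2ccccc2c1"),
    (1.10, "C1CCCCC1"),
]
print(count_scaffolds_in_top_hits(toy_hits, top_n=4))  # 3
```

Repeating this count for libraries of increasing size gives one point per library on a plot like Figure 3.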


Figure 4: Retrospective virtual screening performance for FRED and HYBRID, compared on the DUD set. Performance of a null model, 2D fingerprint, is shown for reference.

LSVS: Structure-based

FRED is carefully designed to balance high performance with high speed,[8] allowing very large libraries to be handled with ease. Comparison with other tools, including HYBRID,[9] the ligand-guided companion tool to FRED, shows that FRED performs very well in virtual screening (see Figure 4).

Harnessing the power of Orion to deliver the compute resources required by both FastROCS and FRED has brought previously technically or financially intractable searches within reach. As a proof of concept, OpenEye has conducted both FastROCS and FRED virtual screens of the Enamine REAL database against heat shock protein 90 (HSP-90; PDB code 1UYG, see Figure 5) and submitted the highest-scoring molecules for biological testing.


Figure 5: Active site and cognate ligand of HSP-90, PDB code 1UYG.

Prospective ultra-large-scale virtual screening: FastROCS

FastROCS searched the Enamine REAL collection for molecules similar to the 1UYG ligand in less than 30 minutes. The most active hit (IC50 16 µM) and the query ligand (IC50 53.5 µM) are shown in Figure 6. FastROCS successfully identified a new chemical scaffold with slightly better activity than that of the query ligand.

Figure 6: 2D depictions of the 1UYG ligand (left, IC50 53.5 µM) and the most active hit from the FastROCS virtual screen (right, IC50 16 µM).


Figure 8: The most active hit from the FRED virtual screen (IC50 4 µM).

Prospective ultra-large-scale virtual screening: FRED

FRED was used to dock the entire Enamine REAL collection to the 1UYG receptor in less than 24 hours on Orion, using around 45,000 CPUs at peak capacity (Figure 7). To the best of our knowledge, this is the largest-scale docking study performed to date, almost an order of magnitude larger than the largest previous docking calculation (170 million compounds, reported by Lyu et al.).[5] The top-ranked molecule, which is also the most active hit (IC50 4 µM), is shown in Figure 8. The pose of this molecule recapitulates the key interactions of the cognate ligand while occupying different regions of the binding site. FRED successfully identified a new chemical scaffold with substantially better activity than that of the cognate ligand.
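
The compute figures above imply an upper bound on the docking budget. The sketch below is a rough estimate only: the peak CPU count and 24-hour limit come from the text, but the library size is an assumption (the text gives only "almost an order of magnitude larger" than 170 million), and because Orion scales up and down during a run, the true usage is below this ceiling. The function names are hypothetical.

```python
# Figures quoted in the text.
PEAK_CPUS = 45_000
MAX_WALL_HOURS = 24
PREVIOUS_LARGEST_LIBRARY = 170_000_000

# ASSUMPTION: the text does not state the library size; this value is chosen
# only to be "almost an order of magnitude" above 170 million compounds.
ASSUMED_LIBRARY_SIZE = 1_400_000_000


def cpu_hours_upper_bound(peak_cpus: int, wall_hours: float) -> float:
    """Ceiling on CPU-hours, as if every CPU ran for the whole wall-clock time."""
    return peak_cpus * wall_hours


def cpu_seconds_per_molecule(n_molecules: int, peak_cpus: int, wall_hours: float) -> float:
    """Ceiling on average CPU-seconds available per docked molecule."""
    return cpu_hours_upper_bound(peak_cpus, wall_hours) * 3600.0 / n_molecules


budget = cpu_hours_upper_bound(PEAK_CPUS, MAX_WALL_HOURS)
per_molecule = cpu_seconds_per_molecule(ASSUMED_LIBRARY_SIZE, PEAK_CPUS, MAX_WALL_HOURS)
ratio = ASSUMED_LIBRARY_SIZE / PREVIOUS_LARGEST_LIBRARY

print(f"CPU-hour ceiling: {budget:,.0f}")
print(f"CPU-seconds per molecule (ceiling): {per_molecule:.2f}")
print(f"Size relative to previous largest screen: {ratio:.1f}x")
```

With the assumed library size, the ceiling is about 1.08 million CPU-hours, or under 3 CPU-seconds per molecule.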

Figure 7: Compute resources used by Orion to dock the Enamine REAL collection to HSP-90.

Summary

Orion is revolutionizing large-scale computing on the cloud, enabling virtual screens of unprecedented scale.

FastROCS can search multi-million molecule databases in seconds and databases of billions of molecules in less than an hour, delivering multiple actionable hits rapidly.

FRED has brought docking to the billion-molecule level, the first tool to do so.

Want to know more about Large Scale Virtual Screening with Orion?

Citations

  1. Drew, K. L.; Baiman, H.; Khwaounjoo, P.; Yu, B.; Reynisson, J., Size estimation of chemical space: how big is it? J Pharm Pharmacol 2012, 64, 490-5.
  2. Bohacek, R. S.; McMartin, C.; Guida, W. C., The art and practice of structure-based drug design: A molecular modeling perspective. Med Res Rev 1996, 16, 3-50.
  3. Enamine REAL library. https://enamine.net/library-synthesis/real-compounds/real-database
  4. Grebner, C.; Malmerberg, E.; Shewmaker, A.; Batista, J.; Nicholls, A.; Sadowski, J., Virtual screening in the cloud: How big is big enough? J Chem Inf Model 2019. https://doi.org/10.1021/acs.jcim.9b00779
  5. Lyu, J.; Wang, S.; Balius, T. E.; Singh, I.; Levit, A.; Moroz, Y. S.; O'Meara, M. J.; Che, T.; Algaa, E.; Tolmachova, K.; Tolmachev, A. A.; Shoichet, B. K.; Roth, B. L.; Irwin, J. J., Ultra-large library docking for discovering new chemotypes. Nature 2019, 566, 224-229.
  6. Hawkins, P. C. D.; Skillman, A. G.; Nicholls, A., Comparison of Shape-Matching and Docking as Virtual Screening Tools. J Med Chem 2007, 50, 74-82.
  7. Kumar, A.; Zhang, K. Y. J., Advances in the Development of Shape Similarity Methods and Their Application in Drug Discovery. Front Chem 2018, 6, 315.
  8. McGann, M., FRED and HYBRID docking performance on standardized datasets. J Comput Aided Mol Des 2012, 26, 897-906.