If you are just getting started with BROOD, we highly recommend that you use the graphical interface. Even if you prefer the command line, walking through the graphical interface once will give you a good overview of the workflow involved in running BROOD.
The basic idea behind BROOD is that you have a lead molecule and would like to generate analogs with somewhat different properties by changing a portion of the molecule. Normally, you will specify both the lead molecule (with the -queryMol flag) and the fragment of the molecule that you would like to modify (with the -queryFrag flag).
BROOD will search the default database of fragments for fragments with similar shape and electrostatics to your query fragment, substitute them for that fragment in your query molecule, and generate a hitlist of analog molecules. To carry out a search like this, use the command line:
prompt> brood -queryMol myMol.sdf -queryFrag myFrag.sdf
will search the default database for fragments similar to myFrag.sdf, substitute them into the molecule found in myMol.sdf, and write a list of analogs into the default output file brood.oeb. The input molecule can be in any standard file format; however, it is easiest to specify the fragment in SDF or SMILES format. For information on query files, see the section Query Files. Example SDF and SMILES files can be found in Example Fragment Files.
The output file brood.oeb can be viewed in VIDA. We recommend that you use the VIDA extension that ships with the BROOD application. This extension allows you to view the 3D structures of the analogs generated by BROOD alongside several of their physical properties. It also aids in exploring the clusters in the BROOD hitlist. If that is not possible, you should examine the SD data attached to each molecule in whatever viewer you deem best. In addition, a .txt file of all the molecular data is written; it can be imported into most spreadsheet programs for detailed examination.
Executing BROOD with no arguments will result in:
prompt> brood
Brood 2.0.0, 20091231
OEChem version 1.8.0, 20091231
Platform: centos-5.4-i586-x64
OpenEye Scientific Software, Inc.
Supported Run Modes:
Single processor
MPI Multiprocessor
No argument specified on the command line
Required parameters:
-queryFrag : Input query filename
For more help type:
brood --help
A description of the command line interface can be obtained by executing BROOD with the --help option.
prompt> brood --help
will generate the following output:
Help functions:
brood --help simple : Get a list of simple parameters
brood --help all : Get a complete list of parameters
brood --help defaults : List the defaults for all parameters
brood --help <parameter> : Get detailed help on a parameter
brood --help html : Create an html help file for this program
The defaults for each command-line parameter can be examined with --help defaults.
If you desire to see the most important command-line options use --help simple.
prompt> brood --help simple
will generate the following output:
Brood 2.0.0, 20100105
OEChem version 1.8.0, 20091211
Platform: microsoft-win32-msvc9-MD-x86
OpenEye Scientific Software, Inc.
Supported Run Modes:
Single processor
MPI Multiprocessor
Simple parameter list
Execute Options
-param : A parameter file
Brood
Input
-queryMol : Query molecule for building analogs
-queryFrag : Query fragment to use as search template
-db : Fragment database to search
Control parameters
-quickLook : Do a brief search and return a quick set of results
-ringOnly : Only select fragments with a ring in the attachment path
Property Filters
-property : Filter fragments by property
If you desire to see all of the command-line options use --help all.
prompt> brood --help all
will generate the following output:
Brood 2.0.0, 20100116
OEChem version 1.8.0, 20091229
Platform: microsoft-win32-msvc9-MD-x86
OpenEye Scientific Software, Inc.
Supported Run Modes:
Single processor
MPI Multiprocessor
This executable supports single processor execution
Complete parameter list
Execute Options
-param : A parameter file
-chunk : Number of input chunks to be created
Brood
Input
-queryFrag : Query fragment to use as search template (required)
-queryMol : Query molecule for building analogs (not required)
-db : Fragment database to search
-prot : Macro molecule for bump-check of fragments and build analogs
-cpddb : Database of known compounds to use for synthetic reference
-param : Control parameter file
Output
-prefix : Prefix for generic output files
-dots : Write a dot to the terminal for every 500 cpds processed
-log : Write to specified log file (override -prefix)
-info : Write to specified info file (override -prefix)
-report : Write complete output in table form (override -prefix)
-format : Molecular output format
-txt : Generate tab separated hitlist for reading into spreadsheets
-idea : Generate cluster information for hitlists
Control parameters
-quickLook : Do a brief search and return a quick set of results
-ringOnly : Only select fragments with ring in attachment path
-ET : Generate electrostatic Tanimoto hitlist.
-linkOnly : Identify linkers that mimic geometry attachment ONLY
(Caveat-like).
-sdtag : Add bioisostere scores as SD Tags (SDF and OEB only)
-checkBond : Check for medicinally acceptable attachment bonds
-maxHit : Size of hitlists (1-5000)
-title : Add scores to molecule title with this delimeter
Advanced parameters
-bondOrder : Require same attachment bond order
-attachmentCutoff : Minimum acceptable attachment point tanimoto
-shapeCutoff : Minimum acceptable shape tanimoto
-attachmentScale : Scale factor weighting the importance of attachment points
-fromCT : Generate query conformer from the connection table.
-fileChrg : Take partial charges from the input molecule.
-interval : Update info file every N molecules
-hitinterval : Write intermediate hitlist files every N molecules
-maxfrag : Maximum number of fragments to search
Property Selection
-property : Filter fragments by property
-molWt : Molecular weight less than current +/- value
-logp : LogP less than current +/- value
-psa : PSA less than current +/- value
-rotbond : Rotatable bond less than current +/- value
-hvyAtom : Heavy atom less than current +/- value
-LipinskiDon : Heavy atom less than current +/- value
-LipinskiAcc : Heavy atom less than current +/- value
This section has a series of example BROOD command-line executions. Each example is followed by a brief description of its behavior.
If you would like to execute the following examples as written, the appropriate paths to the executable file and the database file must be included. In addition, the file amide.smi will need to be in the working directory. This can be accomplished with the following command:
prompt> echo "*C(=O)NC*" >> amide.smi
This file can now be used as the query for each case below.
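Note that `>>` appends, so running the echo command above a second time would leave two fragment lines in amide.smi. The following is a minimal sketch of a repeatable alternative, assuming a POSIX shell; it is not part of BROOD itself.

```shell
# Write the amide query fragment as a one-line SMILES file.
# Single quotes keep the shell from interpreting '*', '(' and ')'.
# '>' overwrites, so the file holds exactly one fragment even if re-run.
printf '%s\n' '*C(=O)NC*' > amide.smi

# Sanity check: show the file contents before using it as a query.
cat amide.smi
```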
prompt> brood amide.smi
prompt> brood -queryFrag amide.smi -db brood.v200.db
These two commands will yield identical results. They execute BROOD with the default parameters. The file amide.smi is opened in SMILES format as the query, and the database brood.v200.db is read in database format. The default hitlist will be written to brood.hitlist.oeb.gz, using the default -prefix argument “brood”. Similarly, the informational output files brood.info, brood.log, brood.param and brood.rpt will also be written.
prompt> brood -queryFrag amide.smi -db brood.v200.db -prefix 4dfr
This command is the same as the previous one, except that the prefix of all of the output files has been changed from “brood” to “4dfr” (for example, the log file will be written to 4dfr.log rather than brood.log).
prompt> brood -queryFrag amide.smi -db brood.v200.db -prefix 4dfr -report myRpt
This executes BROOD as above; however, the -report argument overrides the -prefix argument, and the report file is written to the file “myRpt” rather than the file “4dfr.rpt”.
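The naming rules described in these examples can be sketched as a small shell helper. This is only an illustration: `brood_outputs` is a hypothetical function, not a BROOD command, and it assumes that the file suffixes listed above for the “brood” prefix carry over unchanged to any other prefix and that only -report is overridden.

```shell
# Print the output files a default-format run is expected to write.
# Usage: brood_outputs [PREFIX] [REPORT_FILE]
#   PREFIX       value given to -prefix (default: brood)
#   REPORT_FILE  value given to -report, which overrides the prefix
brood_outputs() {
    prefix=${1:-brood}
    report=${2:-$prefix.rpt}
    printf '%s\n' \
        "$prefix.hitlist.oeb.gz" \
        "$prefix.info" \
        "$prefix.log" \
        "$prefix.param" \
        "$report"
}

# Matches the example above: -prefix 4dfr -report myRpt
brood_outputs 4dfr myRpt
```

The -log and -info flags override the prefix in the same way according to the help output; they are left out here to keep the sketch short.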
prompt> brood -param 4dfr.param
This execution of BROOD will read all the command-line arguments from the file “4dfr.param”. Every time BROOD is executed, a param file is generated that can be used to exactly reproduce the run (vide infra).
prompt> brood -param 4dfr.param -maxHit 2500
This command line will execute BROOD with the parameters from “4dfr.param”, but the -maxHit parameter will be overridden with a value of 2500. This indicates that 2500 compounds will be stored in each of the four hitlists.
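When re-running from a param file with a different hitlist size, it can help to check the value before launching a long search. The sketch below assumes the 1-5000 range shown for -maxHit in the help output; `brood_rerun_cmd` is a hypothetical helper that only prints the command line and does not invoke BROOD.

```shell
# Build a re-run command from a param file with -maxHit overridden.
# Usage: brood_rerun_cmd PARAM_FILE MAXHIT
# Prints the command line, or fails if MAXHIT is not an integer in 1..5000.
brood_rerun_cmd() {
    param=$1
    maxhit=$2
    case "$maxhit" in
        ''|*[!0-9]*)
            echo "maxHit must be a positive integer" >&2
            return 1 ;;
    esac
    if [ "$maxhit" -lt 1 ] || [ "$maxhit" -gt 5000 ]; then
        echo "maxHit must be between 1 and 5000" >&2
        return 1
    fi
    echo "brood -param $param -maxHit $maxhit"
}

# Reproduces the command shown above.
brood_rerun_cmd 4dfr.param 2500
```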
prompt> brood -queryFrag amide.smi -db brood.v200.db -format sdf -sdtag
This execution will be as before, except now the hitlist file will be written in .sdf format. Further, the score of each of the ligands in the file will be attached to the molecule as SD Tag data. Note: the default for the -sdtag is verbose, so writing this information is on by default. In addition, the default format is .oeb, which also handles SD Tag data properly.
prompt> brood -queryFrag amide.smi -db brood.v200.db -dots
The execution will be as before except a single ‘.’ will be written to the screen periodically as database fragments are processed. This gives an easy visual measure of the progress of the execution.
prompt> brood -queryFrag amide.smi -db brood.v200.db -queryMol mol.smi
The -queryMol flag indicates that the query fragment in amide.smi will be located in the molecule specified by mol.smi. Each of the similar fragments in the hitlist will be used to replace that fragment in the whole molecule to generate new analog molecules. Since the molecule specified with the -queryMol flag in this instance is 2D, all of the final molecules will be generated with 2D coordinates.
prompt> brood -queryFrag amide.smi -db brood.v200.db -queryMol mol.mol2
This command line acts similarly to the one above. In this case the molecule specified by the -queryMol parameter has 3D coordinates. This causes the query’s 3D coordinates to be copied out of the -queryMol molecule (rather than being generated as a minimized MMFF structure). Further, the constructed analog molecules will be generated with minimal perturbation to the 3D geometry of the -queryMol molecule.
prompt> brood -queryFrag amide.sdf -db brood.v200.db -queryMol mol.smi
Again, this execution is similar to the one above. In this case, the query fragment has 3D structure but the -queryMol molecule does not. The 3D query will be carried out with the coordinates specified in the amide.sdf file, but the built molecules will all be generated with only 2D coordinates.
prompt> brood -queryFrag amide.sdf -db brood.v200.db -queryMol mol.mol2
In this example of a -queryMol execution, both the query fragment and the -queryMol molecule have 3D coordinates. Here, the input 3D coordinates of the query fragment are discarded and the 3D coordinates that the query fragment has inside the -queryMol molecule are used to both carry out the search and to build the analog molecules.
prompt> brood -queryFrag amide.sdf -db brood.v200.db -queryMol mol.mol2 -prot spf3.pdb
In this final example of a -queryMol execution, both the query fragment and the -queryMol molecule have 3D coordinates, as above. BROOD reads the protein from spf3.pdb. Each analog of the query molecule is built and then tested for clashes with the protein. Analogs that clash are eliminated from the hitlist.
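The four -queryMol cases above differ only in whether each input carries 3D coordinates. As a reading aid, they can be summarized in a small lookup. `brood_coords` is a hypothetical helper, not part of BROOD, and the caller must state the dimensionality explicitly because a file extension alone does not guarantee 2D or 3D content.

```shell
# Summarize which coordinates drive the search and the built analogs.
# Usage: brood_coords FRAG_DIM MOL_DIM   (each argument is 2d or 3d)
brood_coords() {
    case "$1/$2" in
        2d/2d) echo "search: generated query conformer; analogs: 2D" ;;
        2d/3d) echo "search: coordinates copied from queryMol; analogs: 3D" ;;
        3d/2d) echo "search: queryFrag coordinates; analogs: 2D" ;;
        3d/3d) echo "search: coordinates copied from queryMol (queryFrag coordinates discarded); analogs: 3D" ;;
        *)
            echo "usage: brood_coords 2d|3d 2d|3d" >&2
            return 1 ;;
    esac
}

# amide.sdf (3D fragment) with mol.smi (2D molecule)
brood_coords 3d 2d
```

In short, a 3D -queryMol molecule always takes precedence: its coordinates are used for the search and for building the analogs, whatever the fragment file contains.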
BROOD can read and write a variety of molecular file formats. The file format is automatically interpreted from the filename suffix.
| File Type | Extension |
|---|---|
| SMILES | .smi .ism .can .smi.gz .ism.gz .can.gz |
| SDF | .sdf .mol .sdf.gz .mol.gz |
| SKC | .skc .skc.gz |
| CDK | .cdk .cdk.gz |
| MOL2 | .mol2 .mol2.gz |
| PDB | .pdb .ent .pdb.gz .ent.gz |
| MacroModel | .mmod .mmod.gz |
| OEBinary v2 | .oeb .oeb.gz |
| Old OEBinary | .bin |
Old OEBinary format can be read but not written by BROOD. Gzipped OEBinary version 2 (oeb.gz) is the recommended output format.
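The suffix rules in the table can be mirrored in a short shell function, which may be handy for checking filenames in a wrapper script before a run. This is only a sketch of the table above: `brood_format` is a hypothetical name, and BROOD's own suffix handling may differ in details such as case sensitivity.

```shell
# Report the file type implied by a filename suffix, per the table above.
# Usage: brood_format FILENAME
brood_format() {
    # Old OEBinary has no gzipped variant in the table, so test it first.
    case "$1" in
        *.bin) echo "Old OEBinary"; return 0 ;;
    esac
    # For every other type, a trailing .gz does not change the format.
    name=${1%.gz}
    case "$name" in
        *.smi|*.ism|*.can) echo "SMILES" ;;
        *.sdf|*.mol)       echo "SDF" ;;
        *.skc)             echo "SKC" ;;
        *.cdk)             echo "CDK" ;;
        *.mol2)            echo "MOL2" ;;
        *.pdb|*.ent)       echo "PDB" ;;
        *.mmod)            echo "MacroModel" ;;
        *.oeb)             echo "OEBinary v2" ;;
        *)                 echo "unknown"; return 1 ;;
    esac
}

brood_format brood.hitlist.oeb.gz
brood_format mol.mol2
```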
BROOD is capable of piping formatted input and output. The simple “-” can be used in place of a filename to indicate std::cin or std::cout with the default SMILES format.
prompt> BROOD -in - -out -
This execution will run BROOD with std::cin as the input with SMILES format. It will also open std::cout with SMILES format as output. However, the use of “-” does not allow control of the file format.
To control the file format of std::cin and std::cout, use a file extension without a preceding filename.
prompt> BROOD -in .ism -out .oeb.gz
This executes BROOD with the input from std::cin formatted in isomeric SMILES and the output sent to std::cout in gzipped OEBinary version 2 format.