User:VPanchal87/sandbox

Check out all necessary scripts from GitHub; for instructions see https://wiki.uib.no/brenklab/index.php/GitHub

Scripts
Activate the proper conda environment first (all scripts now use Python 3.x); see also Install python package with Anaconda.

All required scripts are in dockable_db/compound_upload_rdkit, dockable_db/make_dockable_db_rdkit, and dockable_modules.

The environment variable MakeDbl must contain the full path to the directory make_dockable_db_rdkit.

The environment variable PYTHONPATH must contain the path to the modules directory.
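The two environment requirements above can be checked before running any script. The following is only a minimal sketch: the function name check_environment is hypothetical, and it assumes that "the modules directory" refers to the dockable_modules directory mentioned above.

```python
import os


def check_environment(env=None):
    """Return a list of problems found; an empty list means the setup looks fine.

    Assumption: the modules directory is the one called dockable_modules.
    """
    env = os.environ if env is None else env
    problems = []

    make_dbl = env.get("MakeDbl", "")
    if not make_dbl:
        problems.append("MakeDbl is not set")
    elif not os.path.isabs(make_dbl):
        problems.append("MakeDbl should be a full (absolute) path")
    elif os.path.basename(make_dbl.rstrip(os.sep)) != "make_dockable_db_rdkit":
        problems.append("MakeDbl should point to make_dockable_db_rdkit")

    # PYTHONPATH may hold several entries separated by os.pathsep
    entries = [p for p in env.get("PYTHONPATH", "").split(os.pathsep) if p]
    if not any(os.path.basename(p.rstrip(os.sep)) == "dockable_modules" for p in entries):
        problems.append("PYTHONPATH does not contain the dockable_modules directory")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Running the file directly prints one warning per problem and nothing if both variables look correct.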

Converting a few molecules without uploading them in the MySQL db

 * 1) You need the SMILES strings of the molecules you want to dock. Each line in the file should start with the SMILES, followed by a tab, followed by a name (e.g. "c1ccccc1 [tab] test"; do not spell out [tab], press the tab key instead).
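If the input file is generated programmatically, a small helper avoids accidentally writing spaces instead of a tab. This is a sketch only; the helper names format_smiles_line and write_smiles_file are hypothetical and not part of the repository.

```python
def format_smiles_line(smiles, name):
    """Return one input line: the SMILES, a real tab character, then the name."""
    if "\t" in smiles or "\t" in name:
        raise ValueError("SMILES and name must not contain tab characters")
    if not smiles.strip() or not name.strip():
        raise ValueError("both a SMILES and a name are required")
    return f"{smiles.strip()}\t{name.strip()}"


def write_smiles_file(path, molecules):
    """Write (smiles, name) pairs to path, one molecule per line."""
    with open(path, "w") as handle:
        for smiles, name in molecules:
            handle.write(format_smiles_line(smiles, name) + "\n")
```

For example, write_smiles_file("input.smi", [("c1ccccc1", "test")]) writes the line from the example above with a real tab between the two fields (input.smi is just an illustrative file name).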

Protonate compounds
Use

charger.py

You can find out which parameters are needed by running

charger.py -h

(-h for help works for most of the following scripts).

pH: pH value at which you want to use your compounds

pH range: tolerance range. For example, if you use a pH of 7 and a pH range of 2, both the neutral and the charged version will be stored for all compounds that have an estimated pKa of 7 +/- 2.

By default, you should run charger with -p 7 -r 2.
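The effect of the pH and pH range parameters can be illustrated with a short sketch. It is not the code of charger.py: the function name states_to_store is hypothetical, the boundaries of the range are assumed to be inclusive, and the behaviour outside the range is assumed to follow the usual rule (an acid with a pKa well below the pH is stored charged, a base whose conjugate acid has a pKa well above the pH is stored charged).

```python
def states_to_store(pka, ph=7.0, ph_range=2.0, acid=True):
    """Return which protonation states would be kept for one ionisable group.

    Assumptions (not taken from charger.py): inclusive range boundaries,
    and 'acid=False' means pka is the pKa of the conjugate acid of a base.
    """
    if abs(pka - ph) <= ph_range:
        # pKa within pH +/- range: keep both versions
        return ["neutral", "charged"]
    if acid:
        # acids are deprotonated (charged) when the pH is above their pKa
        return ["charged"] if pka < ph else ["neutral"]
    # bases are protonated (charged) when the pH is below their pKa
    return ["neutral"] if pka < ph else ["charged"]
```

With the default settings (-p 7 -r 2), a group with an estimated pKa of 8.5 would be stored in both forms, whereas a carboxylic acid with a pKa of about 4 would only be stored in its charged form.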

Tautomerize compounds
Use

tautomerizer.py

Generate stereoisomers
Use

stereoisomerizer.py

Default is to only generate stereoisomers for unassigned chiral centres. If a molecule contains more stereo centres than max_centres (default 4), a random subset will be generated.
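The max_centres behaviour can be sketched in plain Python. This is not the implementation of stereoisomerizer.py: it only enumerates R/S label combinations for the unassigned centres, the function name is hypothetical, and the size of the random subset (2 to the power of max_centres) is an assumption made for illustration.

```python
import itertools
import random


def enumerate_centre_labels(n_unassigned, max_centres=4, seed=None):
    """Return R/S label tuples for the unassigned stereo centres.

    All 2**n combinations are returned if n <= max_centres; otherwise a
    random subset of 2**max_centres combinations (assumed subset size).
    """
    all_combinations = list(itertools.product("RS", repeat=n_unassigned))
    if n_unassigned <= max_centres:
        return all_combinations
    rng = random.Random(seed)  # a fixed seed makes the subset reproducible
    return rng.sample(all_combinations, 2 ** max_centres)
```

A molecule with two unassigned centres yields all four combinations, while one with six unassigned centres yields a random 16 of the 64 possible combinations under the assumption above.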

By default, the script writes the type of each stereoisomer (random selection, specified, guessed, ...) to the output file. Files of this type cannot be read by the script needed for the next step. To omit this information, run stereoisomerizer.py with the flag -r.

Make dockable db format
Use a SMILES file as input. The molecules in it must already have the correct protonation states, tautomers and stereoisomers. The file must not contain the information about the type of stereoisomer (see above).

Use

make_and_load_db_3.7.py

A value of 0.05 appears to work well for the RMSD cut-off (-r).

The script offers the option to update the database with information on whether the conversion of the molecules has worked. Use this option only if you are converting molecules that were previously downloaded from the database. Otherwise you will probably corrupt the database, because the numbering of your molecules and the numbering of the molecules in the database will differ.

For each molecule in the input file, you will get one directory with the output. The dockable files are in the omega_mult subdirectory. You can combine all dockable files into one file for docking with: zcat */omega_mult/*gz > all.db
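Where zcat is not available, the same concatenation can be done in Python. The following is a sketch under the directory layout described above; the function name combine_dockable_files and its parameters are hypothetical.

```python
import glob
import gzip
import shutil


def combine_dockable_files(root=".", output="all.db"):
    """Decompress every */omega_mult/*gz file below root into one output file.

    Files are processed in sorted order, like the shell glob used with zcat.
    Returns the list of input files that were combined.
    """
    paths = sorted(glob.glob(f"{root}/*/omega_mult/*gz"))
    with open(output, "wb") as out:
        for path in paths:
            with gzip.open(path, "rb") as src:
                shutil.copyfileobj(src, out)
    return paths
```

Calling combine_dockable_files() in the directory that contains the per-molecule output directories writes all.db there.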

Your dockable database files will be in new_dbs, or in the MySQL table if update mysql db is set to True.

Note: If you have several molecules with the same name, make_and_load_db_3.7.py will only prepare the docking input for the first of these molecules! To avoid this, every molecule in your smiles files needs to have a unique name.
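Duplicate names can be detected before starting the conversion. This is a minimal sketch that assumes the tab-separated input format described earlier (SMILES, tab, name); the function name find_duplicate_names is hypothetical.

```python
from collections import Counter


def find_duplicate_names(lines):
    """Return the sorted list of molecule names that occur more than once."""
    names = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip empty lines
        fields = line.split("\t")
        if len(fields) < 2:
            raise ValueError(f"no tab-separated name in line: {line!r}")
        names.append(fields[1].strip())
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)
```

It can be applied directly to an open file, e.g. find_duplicate_names(open("input.smi")), where input.smi stands for your own SMILES file; an empty result means all names are unique.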

Make dockable db from xtal ligand (or any other non-SMILES file) [this part needs to be updated; working with SD files will probably be better for RDKit]
Attention: the ligand name must have the format MFCxxxxxxxxx!

Convert the ligand PDB file to mol2 (e.g. with omega2) and check the result with a viewer.

Make sure that the total charge of the ligand is correct in xtal-lig.mol2 (last column in the mol2 file). If your ligand has a total charge of 0, all values should be 0 (provided you are not using partial charges). If the total charge is X, set the charge of one atom to X; it does not matter which one.

copy omega parameter files: cp omega.*.

Generate a single ligand conformation:

mkdir amsol2
omega2 -in xtal-lig.mol2 -out amsol2/xtal-lig_os.mol2 -param omega.single

Generate multiple ligand conformations:

mkdir omega_mult
omega2 -in xtal-lig.mol2 -out omega_mult/xtal-lig_om.mol2 -param omega.mult

desolvation energy: amsol.py CYC cav

change directory: cd omega_mult/

execute script superpos_conf_ensemble.py. .. False False

get INHIER file (should be in make_dockable_db directory)

Link the solv file from the amsol.py output in the *-cyc-CAV directory to temp.solv:

ln -s *cyc_CAV/*_mult_rings_os_CYC.solv temp.solv

Link the output mol2 file from superpos_conf_ensemble.py to temp.mol2:

ln -s *_om_mult_rings.mol2 temp.mol2

Adjust the INHIER file if necessary (you might have to change the names of the input mol2 file and the solvation table file), then execute mol2db inhier
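The total-charge check on xtal-lig.mol2 from the procedure above can be automated. The sketch below sums the charge column of the @<TRIPOS>ATOM section. It assumes the standard Tripos atom record layout (atom id, name, x, y, z, type, substructure id, substructure name, charge), so the charge is read from the ninth field; the function name is hypothetical.

```python
def total_charge_from_mol2(text):
    """Sum the per-atom charges in the @<TRIPOS>ATOM section of a mol2 file.

    Assumes every atom record has at least nine whitespace-separated fields,
    with the charge in the ninth one.
    """
    total = 0.0
    in_atoms = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("@<TRIPOS>"):
            # a new section starts; only ATOM records carry charges
            in_atoms = stripped == "@<TRIPOS>ATOM"
            continue
        if in_atoms and stripped:
            total += float(stripped.split()[8])
    return total
```

For a neutral ligand without partial charges the result should be 0; for a ligand with total charge X it should be X.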

Charge molecules
Get the molecules that should be charged out of the database using

mysql_get_smiles_chunks.py

A possible query to get all molecules which have not yet been charged is (see also query_get_prot_smiles.txt)

SELECT u.id, unique_smiles FROM purchasable.unique_compounds u left join purchasable.protonated_smiles p on p.id = u.id where p.id is null
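The query is a LEFT JOIN anti-join: it keeps the rows of unique_compounds for which no matching row exists in protonated_smiles. The pattern can be tried without access to the MySQL server by using Python's built-in sqlite3 module. This is only an illustration: the demo tables are a reduced, assumed schema holding just the columns that appear in the queries on this page, and the example rows are made up.

```python
import sqlite3

QUERY = """
SELECT u.id, unique_smiles
FROM purchasable.unique_compounds u
LEFT JOIN purchasable.protonated_smiles p ON p.id = u.id
WHERE p.id IS NULL
"""


def demo_connection():
    """Build an in-memory stand-in for the purchasable schema (assumed columns)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS purchasable")
    conn.execute(
        "CREATE TABLE purchasable.unique_compounds "
        "(id INTEGER PRIMARY KEY, unique_smiles TEXT)"
    )
    conn.execute(
        "CREATE TABLE purchasable.protonated_smiles "
        "(prot_id INTEGER PRIMARY KEY, id INTEGER, prot_smiles TEXT)"
    )
    conn.executemany(
        "INSERT INTO purchasable.unique_compounds VALUES (?, ?)",
        [(1, "c1ccccc1"), (2, "CC(=O)O"), (3, "CCN")],
    )
    # only compound 2 has been charged so far
    conn.execute("INSERT INTO purchasable.protonated_smiles VALUES (10, 2, 'CC(=O)[O-]')")
    return conn


def uncharged_compounds(conn):
    """Return (id, unique_smiles) for compounds without a protonated entry."""
    return conn.execute(QUERY).fetchall()
```

In the demo data, compounds 1 and 3 have no entry in protonated_smiles, so only these two are returned.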

Charge and load molecules to db using

cl_charg_taut_stereo_load.py -j charger

On the cluster, loading to the db might be slow. In this case, cat the files and load them manually using mysql_load_prot_states.py

Generate tautomers
Get the molecules that should be tautomerized out of the database using mysql_get_smiles_chunks.py

A possible query to get all molecules which have not yet been tautomerized is (see also query_get_prot_smiles.txt)

SELECT p.prot_id, prot_smiles FROM purchasable.protonated_smiles p left join purchasable.tautomerized_smiles t on p.prot_id = t.prot_id where t.prot_id is null

Tautomerize and load molecules to db using

cl_charg_taut_stereo_load.py -j tautomerizer

On the cluster, loading to the db might be slow. In this case, cat the files and load them manually using mysql_load_taut_states.py

Generate stereoisomers
Get the molecules for which stereoisomers should be generated out of the database using mysql_get_smiles_chunks.py

A possible query to get all molecules which have not yet been stereoisomerized is (see also query_get_taut_smiles.txt)

SELECT t.taut_id, taut_smiles FROM purchasable.tautomerized_smiles t left join stereoisomer_smiles s on t.taut_id = s.taut_id where s.taut_id is null

Generate stereoisomers and load molecules to db using

cl_charg_taut_stereo_load.py -j stereoisomerize

On the cluster, loading to the db might be slow. In this case, cat the files and load them manually using mysql_load_stereoisomers.py

Get chunks to make dockable database
Get the SMILES of interest out of the stereoisomer table with mysql_get_smiles_chunks.py

cl_make_and_load_db.py [this script is currently not working]