Core comp chem functions implemented with RDKit

RDKit i/o

to_mol and to_smile make it easy to work with both SMILES strings and RDKit Mol objects. For example, if a function requires a Mol input, adding mol = to_mol(mol) at the top lets the function accept either a SMILES string or an RDKit Mol.

to_mol[source]

to_mol(smile_or_mol)

smart_to_mol[source]

smart_to_mol(smile_or_mol)

to_smile[source]

to_smile(smile_or_mol)

to_kekule[source]

to_kekule(smile_or_mol)

to_smart[source]

to_smart(smart_or_mol)

to_mols[source]

to_mols(list_of_inputs)

to_smiles[source]

to_smiles(list_of_inputs)

to_smarts[source]

to_smarts(list_of_inputs)

smart_to_rxn[source]

smart_to_rxn(smart)

canon_smile[source]

canon_smile(smile)

remove_stereo[source]

remove_stereo(smile)

assert type(to_smile('CCC')) == str
assert type(to_mol('CCC')) == Chem.Mol
assert type(to_smile(Chem.MolFromSmiles('CCC'))) == str
assert type(to_mol(Chem.MolFromSmiles('CCC'))) == Chem.Mol

smile_to_selfie[source]

smile_to_selfie(smile)

selfie_to_smile[source]

selfie_to_smile(selfie)

split_selfie[source]

split_selfie(selfie)

s = 'O=C(NCc1ccc(Br)cc1F)C(=O)NCC1(Cc2ccccc2)CC1'
assert selfie_to_smile(smile_to_selfie(s)) == s

Misc Functions

Miscellaneous RDKit related functions

neutralize_atoms[source]

neutralize_atoms(mol)

Neutralize charges, from rdkit.org/docs/Cookbook.html#neutralizing-molecules

initialize_neutralisation_reactions[source]

initialize_neutralisation_reactions()

github.com/kaiwang0112006/clusfps/blob/master/third_party_package/RDKit_2015_03_1/Docs/Book/Cookbook.rst

neutralize_charges[source]

neutralize_charges(mol, reactions=None)

github.com/kaiwang0112006/clusfps/blob/master/third_party_package/RDKit_2015_03_1/Docs/Book/Cookbook.rst

find_bond_groups[source]

find_bond_groups(mol)

Find groups of contiguous rotatable bonds and return them sorted by decreasing size

https://www.rdkit.org/docs/Cookbook.html

draw_mols[source]

draw_mols(mols, legends=None, mols_per_row=3, sub_img_size=(300, 300))

Mol Descriptors

These functions wrap standard RDKit functions. The wrappers exist because RDKit functions can't be pickled, which causes problems for multiprocessing. Wrapping each RDKit function in a plain Python function fixes this.

For example:

# Attempt 1: pass the RDKit function directly
try:
    output = maybe_parallel(rdMolDescriptors.CalcExactMolWt, [to_mol('CCC')])
    print('Parallel execution succeeded')
except Exception:
    print('parallel execution failed')

# Attempt 2: pass a plain Python function that calls the RDKit function
def wrapper(mol):
    return rdMolDescriptors.CalcExactMolWt(mol)

try:
    output = maybe_parallel(wrapper, [to_mol('CCC')])
    print('Parallel execution succeeded')
except Exception:
    print('parallel execution failed')
Parallel execution succeeded
Parallel execution succeeded

Unfortunately, a generic wrapper constructor also fails to pickle: constructing the wrapper requires an RDKit function as input, which brings back the pickling problem (see the code example below). This leaves manually defining a wrapper function for each RDKit function.
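The following is a minimal sketch of a related limitation, using only the standard library. It assumes math.sqrt as a stand-in for an RDKit function, and make_wrapper and can_pickle are hypothetical names used for illustration. It shows that the standard pickle module cannot serialize a function created inside another function, which is one reason a generic constructor is awkward; it does not reproduce the behaviour of maybe_parallel or the exact failure seen with RDKit.

```python
import math
import pickle


def make_wrapper(func):
    """Generic wrapper constructor: returns a closure around func."""
    def wrapper(x):
        return func(x)
    return wrapper


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# math.sqrt can be found by module and name, so pickle stores a reference to it.
print(can_pickle(math.sqrt))

# The closure only exists inside make_wrapper, so pickle cannot look it up by name.
print(can_pickle(make_wrapper(math.sqrt)))
```

Functions defined at module level are pickled by name, which is why one hand-written wrapper per descriptor avoids this particular problem.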

# Generic constructor: builds a wrapper around any RDKit function
def rdkit_wrapper(rdkit_func):
    def wrapper(mol):
        return rdkit_func(mol)

    return wrapper

try:
    output = maybe_parallel(rdkit_wrapper(rdMolDescriptors.CalcExactMolWt), [to_mol('CCC')])
    print('Parallel execution succeeded')
except Exception:
    print('parallel execution failed')
Parallel execution succeeded

add_hs[source]

add_hs(mol)

remove_hs[source]

remove_hs(mol)

molwt[source]

molwt(mol)

molecular weight

hbd[source]

hbd(mol)

number of hydrogen bond donors

hba[source]

hba(mol)

number of hydrogen bond acceptors

tpsa[source]

tpsa(mol)

topological polar surface area

rotbond[source]

rotbond(mol)

number of rotatable bonds

loose_rotbond[source]

loose_rotbond(mol)

number of rotatable bonds, includes things like amides and esters

rot_chain_length[source]

rot_chain_length(mol)

Length of longest contiguous rotatable bond chain

fsp3[source]

fsp3(mol)

fraction of sp3 hybridized carbons

logp[source]

logp(mol)

logp

rings[source]

rings(mol)

number of rings

max_ring_size[source]

max_ring_size(mol)

size of largest ring

min_ring_size[source]

min_ring_size(mol)

size of smallest ring

heteroatoms[source]

heteroatoms(mol)

number of heteroatoms

all_atoms[source]

all_atoms(mol)

total number of atoms

heavy_atoms[source]

heavy_atoms(mol)

number of heavy atoms

formal_charge[source]

formal_charge(mol)

formal charge

molar_refractivity[source]

molar_refractivity(mol)

molar refractivity

aromaticrings[source]

aromaticrings(mol)

number of aromatic rings

qed[source]

qed(mol)

QED Score doi:10.1038/nchem.1243 

sa_score[source]

sa_score(mol)

Synthetic Accessibility score doi.org/10.1186/1758-2946-1-8

num_bridgeheads[source]

num_bridgeheads(mol)

Number of bridgehead atoms

num_spiro[source]

num_spiro(mol)

Number of spiro atoms

chiral_centers[source]

chiral_centers(mol)

Number of chiral centers

num_radicals[source]

num_radicals(mol)

Number of radical electrons

penalized_logp[source]

penalized_logp(mol)

Reward that consists of logP penalized by SA score and number of long cycles, as described in (Kusner et al. 2017). Scores are normalized based on the statistics of the 250k_rndm_zinc_drugs_clean.smi dataset.

Inputs:

  • mol Chem.Mol: RDKit Mol object

Returns:

  • float

# Check that a wrapped descriptor can be passed to maybe_parallel
try:
    _ = maybe_parallel(hbd, [to_mol('CCC')])
    output = 'success'
except Exception:
    output = 'fail'

assert output == 'success'

Conformer Generation

conformer_generation[source]

conformer_generation(mol, num_confs, max_iter=200, rms_thresh=0.2, use_torsion=True, use_basic=True, enforce_chirality=True, nthreads=12, minimize=True, align=True, seed=None, mmffVariant='MMFF94s')

conformer_generation - generates conformers for input mol

Inputs:

  • mol Chem.Mol: Input mol

  • max_iter int: maximum iterations for conformer generation and minimization

  • rms_thresh float: Retain only the conformers that are at least this far apart from each other.

  • use_torsion bool: if True, impose experimental torsion angle preferences

  • use_basic bool: if True, imposes basic knowledge (ie flat rings)

  • enforce_chirality bool: enforce chirality if chiral centers are present

  • nthreads int: number of threads to use. If zero, all threads are used

  • minimize bool: if True, MMFF force field is used to minimize conformers

  • align bool: if True, conformers are aligned

  • seed Optional[int]: seed to use

  • mmffVariant str[MMFF94s, MMFF94]: force field to use if minimize=True

Substructure Matching

This class is used for substructure matching an input Mol against a list of SMARTS.

Note: Substructure matching is tricky. Be sure to verify your SMARTS before putting a large number of them into a filter.

CatalogMatch functions as a base class to match Mol objects against any generic catalog. has_match returns a single boolean indicating whether the Mol matches one of the filters in the catalog. get_matches returns a list of bools, one for each element in the catalog. percent_matches returns a list of floats for what percentage of filters match.

SMARTSMatch will generate a catalog from a list of SMARTS

PAINSMatch, PAINSAMatch, PAINSBMatch and PAINSCMatch specify different PAINS catalogs present in RDKit (see here)

class Catalog[source]

Catalog(catalog)

Base Class for SMARTS matching

Inputs:

  • catalog FilterCatalog: RDKit FilterCatalog

Catalog.__call__[source]

Catalog.__call__(mol, criteria=None)

call - run mol against self.catalog

Inputs:

  • mol [Chem.Mol, list[Chem.Mol]]: input mols

  • criteria ['any', 'all', float, int]: match criteria. (match any filter, match all filters, match float percent of filters, match int number of filters)

If any, returns True if any smarts match.

If all, returns True if all smarts match.

If float, returns True if at least that fraction of smarts match (inclusive), ie 0.5 requires half of the smarts to match.

If int, returns True if at least that many smarts match (inclusive).
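The criteria rules above can be sketched as a small reduction over per-filter booleans. This is an illustration only: meets_criteria is a hypothetical helper, not part of the library, and it assumes the inclusive thresholds described in the text.

```python
def meets_criteria(matches, criteria):
    """Reduce a list of per-filter booleans to a single boolean."""
    n_hits = sum(matches)
    if criteria == 'any':
        return n_hits > 0
    if criteria == 'all':
        return n_hits == len(matches)
    if isinstance(criteria, float):
        # Fraction of filters that must match (inclusive).
        return len(matches) > 0 and n_hits / len(matches) >= criteria
    if isinstance(criteria, int):
        # Absolute number of filters that must match (inclusive).
        return n_hits >= criteria
    raise ValueError(f'unsupported criteria: {criteria!r}')


# Three of five hypothetical filters match.
hits = [True, True, True, False, False]
print(meets_criteria(hits, 'any'), meets_criteria(hits, 0.5), meets_criteria(hits, 4))
```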

class SmartsCatalog[source]

SmartsCatalog(smarts) :: Catalog

Class for SMARTS matching

Inputs:

  • smarts list[str]: list of SMARTS strings

class ParamsCatalog[source]

ParamsCatalog(params) :: Catalog

Generates CatalogMatch object from params, a FilterCatalogParams object

class PAINSCatalog[source]

PAINSCatalog() :: ParamsCatalog

Full PAINS filter matching

class PAINSACatalog[source]

PAINSACatalog() :: ParamsCatalog

PAINS A filter matching

class PAINSBCatalog[source]

PAINSBCatalog() :: ParamsCatalog

PAINS B filter matching

class PAINSCCatalog[source]

PAINSCCatalog() :: ParamsCatalog

PAINS C filter matching

class ZINCCatalog[source]

ZINCCatalog() :: ParamsCatalog

ZINC filter matching

class BRENKCatalog[source]

BRENKCatalog() :: ParamsCatalog

BRENK filter matching

class NIHCatalog[source]

NIHCatalog() :: ParamsCatalog

NIH filter matching

smarts = [
    '[*]-[#6]1:[#6]:[#6](-[#0]):[#6]:[#6](-[*]):[#6]:1',
    '[*]-[#6]1:[#6]:[#6](-[*]):[#6]:[#6]:[#6]:1',
    '[*]-[#6]1:[#6]:[#6]:[#6]:[#6]:[#6]:1',
    '[*]-[#6]1:[#6]:[#6](-[#7]-[*]):[#6]:[#6]:[#6]:1',
    '[#6]1:[#6]:[#7]:[#6]:[#6]:[#6]:1'
]

sm = SmartsCatalog(smarts)

smiles = [
    'c1ccccc1',
    'Cc1cc(NC)ccc1',
    'Cc1cc(NC)cnc1',
    'Cc1cccc(NCc2ccccc2)c1'
]

mols = [to_mol(i) for i in smiles]

assert sm(mols, criteria='any') == [False, True, True, True]
assert sm(mols, criteria=0.5) == [False, True, False, True]
assert sm(mols[1], criteria=3)==True

Fingerprints

This section deals with creating and manipulating molecular fingerprints. Below are functions for generating different forms of Morgan fingerprints (ECFP4, ECFP6, FCFP4, FCFP6). Fingerprints by default are generated as RDKit ExplicitBitVect objects, but can be converted to numpy arrays using the fp_to_array function.

Fingerprint similarity functions using Tanimoto, Dice and Cosine metrics are implemented for both ExplicitBitVect and ndarray objects.

Note: following cheminformatics convention, fingerprint metrics are implemented as similarities rather than distances. The metrics used have the relationship similarity = 1 - distance. When using fingerprint metrics in machine learning applications, be sure you are using the correct quantity (similarity vs distance) for your task.
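To make the similarity/distance relationship concrete, here is a minimal pure-Python sketch that treats a fingerprint as the set of its on-bit indices. The function names (tanimoto_sim, dice_sim, cosine_sim) are hypothetical and chosen so they do not clash with the library's own tanimoto, dice and cosine functions, which operate on ExplicitBitVect and ndarray inputs.

```python
import math


def tanimoto_sim(a, b):
    """Tanimoto similarity: shared on-bits over all on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def dice_sim(a, b):
    """Dice similarity: twice the shared on-bits over the summed bit counts."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def cosine_sim(a, b):
    """Cosine similarity for binary vectors."""
    denom = math.sqrt(len(a) * len(b))
    return len(a & b) / denom if denom else 0.0


# Two toy fingerprints given as sets of on-bit indices.
fp1 = {1, 5, 9, 12}
fp2 = {1, 5, 20}

similarity = tanimoto_sim(fp1, fp2)  # 2 shared bits out of 5 distinct bits
distance = 1 - similarity            # the corresponding distance
print(similarity, distance)
```

Identical fingerprints give a similarity of 1 and a distance of 0, so a loss that expects "smaller is closer" needs the distance form.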

morgan_fp[source]

morgan_fp(mol, radius=3, nbits=2048, use_features=False)

morgan fingerprint

ECFP4[source]

ECFP4(mol)

ECFP4 Fingerprint

ECFP6[source]

ECFP6(mol)

ECFP6 Fingerprint

FCFP4[source]

FCFP4(mol)

FCFP4 Fingerprint

FCFP6[source]

FCFP6(mol)

FCFP6 Fingerprint

failsafe_fp[source]

failsafe_fp(mol, fp_function)

Returns vector of zeros if failure

fp_to_array[source]

fp_to_array(fp)

Converts RDKit ExplicitBitVect to numpy array

tanimoto[source]

tanimoto(fps1, fps2)

Tanimoto similarity

tanimoto_rd[source]

tanimoto_rd(fp, fps)

dice[source]

dice(fps1, fps2)

Dice similarity

dice_rd[source]

dice_rd(fp, fps)

cosine[source]

cosine(fps1, fps2)

Cosine similarity

cosine_rd[source]

cosine_rd(fp, fps)

When computing similarities between fingerprints, several things need to be lined up. Different methods are needed for different fingerprint formats (ndarray vs ExplicitBitVect) and different similarity metrics.

The FP class holds logic to make this easy.

The FP.get_fingerprint function allows for parallel processing of fingerprint generation.

The FP.fingerprint_similarity function routes fingerprints to the correct similarity function based on the fingerprint's array type and the similarity metric used.

For cases where instantiating a class isn't helpful, get_fingerprint and fingerprint_similarities work as functional wrappers around FP.

class FP[source]

FP()

FP - class for manipulating molecular fingerprints

fp = FP()

mol_fp = fp.get_fingerprint(mol, 'ECFP6')

FP.get_fingerprint[source]

FP.get_fingerprint(mol, fp_type='ECFP6', output_type='rdkit')

get_fingerprint - Generates fingerprint for mol.

Inputs:

  • mol [Chem.Mol, list[Chem.Mol]] - input mols

  • fp_type str: Fingerprint type, must be a key of FP.fps

  • output_type str['rdkit', 'numpy']: Output datatype. Numpy ndarray or RDKit ExplicitBitVect

FP.fingerprint_similarity[source]

FP.fingerprint_similarity(fps1, fps2, metric)

fingerprint_similarity - Computes the similarity between fps1 and fps2 using metric

Inputs:

  • fps1 [ndarray, ExplicitBitVect]: first fingerprint set

  • fps2 [ndarray, ExplicitBitVect]: second fingerprint set

Returns:

  • similarities [ndarray]: matrix of similarities between all fingerprints in fps1 and all fingerprints in fps2

Fingerprints can either be a Numpy ndarray or an RDKit ExplicitBitVect. Both fingerprint inputs must be the same datatype.

Numpy fingerprints can either be a 1D vector or a 2D matrix of stacked fingerprints.

RDKit fingerprints can either be an ExplicitBitVect or a list of ExplicitBitVect objects.

get_fingerprint[source]

get_fingerprint(mol, fp_type, output_type='rdkit')

fingerprint_similarities[source]

fingerprint_similarities(fps1, fps2, metric)

bulk_smiles_similarity[source]

bulk_smiles_similarity(smiles1, smiles2=None, fp_type='ECFP6', metric='tanimoto')

fp = FP()
fps = fp.get_fingerprint(mols, fp_type='ECFP4', output_type='rdkit')
fps_np = fp_to_array(fps)

assert np.allclose(fp.fingerprint_similarity(fps, fps, 'tanimoto'), 
                   fp.fingerprint_similarity(fps_np, fps_np, 'tanimoto'))

Custom Fingerprint Functions

Here is an example of how to add new fingerprint functions and similarity functions

def my_fp(mol):
    mol = to_mol(mol)
    fp =  AllChem.RDKFingerprint(mol)
    return fp

class MyFP(FP):
    def __init__(self):
        super().__init__()
        self.fps['my_fp'] = my_fp
        
fp = MyFP()
fps = fp.get_fingerprint(mols, fp_type='my_fp', output_type='rdkit')

def my_dist(fps1, fps2):
    # make sure your distance function works on binary/boolean arrays!!
    return 1-distance.cdist(fps1, fps2, metric='russellrao')

def my_dist_rd(fp, fps):
    # make sure the RDKit method gives the same result as scipy, not always the case
    return DataStructs.BulkRusselSimilarity(fp, fps)

class MyFP(FP):
    def __init__(self):
        super().__init__()
        self.similarities['my_metric'] = {'rdkit' : my_dist_rd,
                                          'numpy' : my_dist}
                
fp = MyFP()
fps = fp.get_fingerprint(mols, fp_type='ECFP6', output_type='numpy')
fp.fingerprint_similarity(fps, fps, 'my_metric')
array([[0.00195312, 0.00097656, 0.00048828, 0.00146484],
       [0.00097656, 0.01123047, 0.00585938, 0.00830078],
       [0.00048828, 0.00585938, 0.01171875, 0.00439453],
       [0.00146484, 0.00830078, 0.00439453, 0.01806641]])

Mol Operations

Functions for editing or manipulating Mol objects.

Fragmenting functions like fragment_smile break molecules into fragments by cutting single bonds.

fuse_on_atom_mapping fuses fragments following RDKit's atom mapping conventions.

[*:1]-R1-[*:2] + [*:1]-R2 >> [*:2]-R1-R2

fuse_on_link relies on user-defined linkages such as heavy atoms.

[Rb]-R1-[Pb] + [Rb]-R2 >> [Pb]-R1-R2

fragment_mol[source]

fragment_mol(mol, max_cuts, return_mols=False)

Wrapper for RDKit mol fragmentation

fragment_smile[source]

fragment_smile(smile, cuts)

fragment_smile - fragment smile based on cuts

Inputs:

  • smile str: smiles string to fragment

  • cuts [int, list[int]]: Number of cuts to make. If a list, the code iterates over the items in cuts and generates fragments from each value.

Note that the RDKit fragmentation uses only a single cut value. So fragmenting with 5 cuts will not include the result of fragmenting with 4 cuts

fragment_smiles[source]

fragment_smiles(smiles, cuts)

fragment and deduplicate a list of smiles

fragment_smile('CCCCCCCC', [1])
['*C', '*CCCCCCC', '*CCCCCC', '*CC', '*CCCC', '*CCCCC', '*CCC']

fuse_on_atom_mapping[source]

fuse_on_atom_mapping(fragment_string)

fuse_on_atom_mapping - Merges a series of molecular fragments into a single compound by atom mapping

ie R1-[*:1].R2-[*:1] >> R1-R2

Fragments with paired atom mappings will be fused on the atoms connected to the mapped dummies. Mappings that occur once or more than 2 times are ignored

Inputs:

  • fragment_string [str, Chem.Mol]: input molecule fragments

Outputs:

  • new_smile str: fused molecule

assert fuse_on_atom_mapping('[*:1]CC.[*:1]CC') == 'CCCC'
assert fuse_on_atom_mapping(to_mol('[*:1]CC.[*:1]CC')) == 'CCCC'
assert fuse_on_atom_mapping('[*:1]CC.[*:2]CC') == 'CC[*:1].CC[*:2]'
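The pairing rule (only mappings that occur exactly twice are fused) can be sketched on the string level with a regular expression. This is an illustration under assumptions: paired_map_numbers is a hypothetical helper, it only recognises dummy atoms written literally as [*:n], and the library itself may inspect Mol objects rather than strings.

```python
import re
from collections import Counter


def paired_map_numbers(fragment_string):
    """Return the sorted map numbers that occur on exactly two dummy atoms."""
    found = re.findall(r'\[\*:(\d+)\]', fragment_string)
    counts = Counter(int(n) for n in found)
    return sorted(n for n, c in counts.items() if c == 2)


# Map number 1 appears twice, so these two fragments would be fused.
print(paired_map_numbers('[*:1]CC.[*:1]CC'))

# Map numbers 1 and 2 each appear once, so nothing is fused.
print(paired_map_numbers('[*:1]CC.[*:2]CC'))
```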

fuse_on_link[source]

fuse_on_link(fragment_string, links)

fuse_on_link - Merges a series of molecular fragments into a single compound by links

ie fuse_on_link('R1-[Rb].R2-[Rb]', ['[Rb]']) >> 'R1-R2'

Fragments matching a given link are defined by substring searching. Links that occur once or more than 2 times are ignored

Note: inputs with RDKit atom mapping (ie [*:1]CC) will fail

Inputs:

  • fragment_string [str, Chem.Mol]: input molecule

  • links list: list of defined linkages

Outputs:

  • str: fused molecule

assert fuse_on_link('[Rb]CC.[Rb]CC', ['[Rb]']) == 'CCCC'
assert fuse_on_link('[Rb]CC.[Rb]CC', ['[Pb]']) == 'CC[Rb].CC[Rb]'
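Because links are located by substring search, deciding which links are eligible for fusing reduces to counting occurrences. A minimal sketch, where fusable_links is a hypothetical helper and not part of the library:

```python
def fusable_links(fragment_string, links):
    """Return the links that occur exactly twice in fragment_string."""
    return [link for link in links if fragment_string.count(link) == 2]


# '[Rb]' occurs twice and is eligible; '[Pb]' does not occur at all.
print(fusable_links('[Rb]CC.[Rb]CC', ['[Rb]', '[Pb]']))
```

A plain substring count also matches a link that appears inside a longer token, which is one reason to pick link atoms that do not otherwise occur in the fragments.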
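
# four fragments joined by three paired atom mappings
fragment_string = 'C1CCC([*:1])CC1.C([*:3])CC.c1cncc([*:2])c1.c1nc([*:1])c2c([*:3])nc([*:2])cc2n1'
mol = to_mol(fragment_string)
mol  # display the unfused fragments
fused_smile = fuse_on_atom_mapping(fragment_string)
new_mol = to_mol(fused_smile)
new_mol  # display the fused molecule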

murcko_scaffold[source]

murcko_scaffold(smile, generic=False, remove_stereochem=False)

murcko_scaffold - convert smile to murcko scaffold.

If generic, all atoms are converted to carbon

smile = 'Cc1cc(Oc2nccc(CCC)c2)ccc1'
scaffold = murcko_scaffold(smile)
scaffold_generic = murcko_scaffold(smile, generic=True)
draw_mols(to_mols([smile, scaffold, scaffold_generic]))

scaffold_split[source]

scaffold_split(smiles, percent_valid, percent_test=0.0, remove_no_scaffold=True, generic=False, remove_stereochem=False)

scaffold_split - split smiles into train, valid, test sets by scaffold

Inputs:

  • smiles list[str] - input smiles strings

  • percent_valid float: percent for validation set

  • percent_test Optional[float]: percent for test set

  • remove_no_scaffold Bool: if True, compounds with no murcko scaffold are removed

  • generic Bool: if True, scaffolds are converted to all saturated carbons with single bonds

  • remove_stereochem Bool: if True, scaffolds have stereochemistry removed
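The idea behind a scaffold split is that all compounds sharing a scaffold land in the same subset. The sketch below shows one common way to do that, filling train first with the largest groups. It is an assumption-laden illustration: group_split is a hypothetical helper, the scaffold keys are supplied by hand rather than computed with murcko_scaffold, and scaffold_split itself may assign groups differently.

```python
from collections import defaultdict


def group_split(items, keys, percent_valid, percent_test=0.0):
    """Split items into train/valid/test so items sharing a key stay together."""
    groups = defaultdict(list)
    for item, key in zip(items, keys):
        groups[key].append(item)

    n = len(items)
    train_cutoff = (1.0 - percent_valid - percent_test) * n
    valid_cutoff = (1.0 - percent_test) * n

    train, valid, test = [], [], []
    # Largest groups go first so that rare scaffolds end up in valid/test.
    for group in sorted(groups.values(), key=len, reverse=True):
        if len(train) + len(group) <= train_cutoff:
            train.extend(group)
        elif len(train) + len(valid) + len(group) <= valid_cutoff:
            valid.extend(group)
        else:
            test.extend(group)
    return train, valid, test


# Placeholder compound ids and hand-assigned scaffold labels.
items = ['a1', 'a2', 'a3', 'a4', 'b1', 'b2', 'c1', 'd1']
keys = ['A', 'A', 'A', 'A', 'B', 'B', 'C', 'D']
train, valid, test = group_split(items, keys, percent_valid=0.25, percent_test=0.25)
print(train, valid, test)
```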

Structure Enumeration

Often it can be useful to enumerate variants of the same core structure, for example generating every 6-membered ring variant with 2 nitrogens. The StructureEnumerator class provides a way of enumerating over a core structure defined by a smarts string and a set of user inputs. The structure enumerator can also add wildcard atoms.

For examples on using the StructureEnumerator class, see the Structure Enumeration tutorial page

add_map_nums[source]

add_map_nums(mol)

Adds map numbers to all atoms in mol

check_ring_bonds[source]

check_ring_bonds(smile)

Looks for SP hybridized atoms in rings

decorate_smile[source]

decorate_smile(smile, num_attachments)

decorate_smile - adds wildcard atoms to smile based on num_attachments. If there are N atoms with at least one implicit hydrogen, N choose num_attachments combinations are generated
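The "N choose num_attachments" count can be illustrated with itertools. In this sketch, attachment_sites is a hypothetical helper and the candidate atoms are plain indices; how the library selects candidate atoms and builds the decorated SMILES is not shown.

```python
from itertools import combinations


def attachment_sites(candidate_atoms, num_attachments):
    """Enumerate every way to pick num_attachments atoms from the candidates."""
    return list(combinations(candidate_atoms, num_attachments))


# Six candidate atoms (ie the six CH positions of benzene), two attachments.
sites = attachment_sites(range(6), 2)
print(len(sites), sites[:3])
```

The combinations are index-based, so a symmetric molecule yields several index pairs that describe the same structure, and a deduplication step is needed afterwards.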

decorate_smiles[source]

decorate_smiles(smiles, num_attachments)

Decorates all items in smiles and cleans the results

remove_atom[source]

remove_atom(rwmol, atom_idx, add_bond=True)

remove_atom - removes atom based on atom_idx. If add_bond and the removed atom is connected to two other heavy atoms, a single bond is formed between neighbors

generate_spec_template[source]

generate_spec_template(mol)

generates blank atom_spec and bond_spec for mol. returns blank specs and matching smarts

class StructureEnumerator[source]

StructureEnumerator(smarts, atom_spec, bond_spec, max_num=1000000, substitute_bonds=None)

StructureEnumerator - class for enumerating molecular structures

Inputs:

  • smarts str: base smarts string to enumerate

  • atom_spec dict: dict of the form {atom_map_num:[allowed_atom_types]} where elements in allowed_atom_types match keys in self.atom_types

  • bond_spec dict: dict of the form {(bond_start_map_num, bond_end_map_num) : [possible_bond_types]} where elements in possible_bond_types match keys in self.bond_types

  • max_num int: max number of combinations to iterate

  • substitute_bonds Optional[List]: List of bond types for new bonds formed from removing atoms. Bond types should match keys in self.bond_types. If None, all keys in self.bond_types are used
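A spec dict of the form {key: [allowed values]} defines a Cartesian product of choices, capped by max_num. The sketch below shows that enumeration pattern in isolation. enumerate_assignments is a hypothetical helper, the atom types are plain strings rather than keys of self.atom_types, and the way StructureEnumerator turns assignments into molecules is not reproduced.

```python
from itertools import islice, product


def enumerate_assignments(spec, max_num=1000000):
    """Return up to max_num dicts, each choosing one allowed value per key."""
    keys = sorted(spec)
    combos = product(*(spec[k] for k in keys))
    return [dict(zip(keys, combo)) for combo in islice(combos, max_num)]


# Atom map number 1 may be C or N; atom map number 2 may be C, N or O.
atom_spec = {1: ['C', 'N'], 2: ['C', 'N', 'O']}
assignments = enumerate_assignments(atom_spec)
print(len(assignments), assignments[0])
```

The same function works for a bond_spec-style dict, since tuple keys such as (1, 2) sort and zip in the same way.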

add_one_atom[source]

add_one_atom(inputs)

helper function for add_atom_combi

add_atom_combi[source]

add_atom_combi(smile, atom_types, cpus=0)

add_atom_combi - creates variants of smile with one atom added or removed, defined by atom_types

Inputs:

  • smile str: smiles string to modify

  • atom_types list[str, int]: list of allowed atom types to add. If -1 is in the list, variants of smile with one atom removed will be generated. If -2 is in the list, the code will look for atoms with two neighbors, remove the center atom and bond the neighbors (ie ring contraction)

  • cpus Optional[int]: number of cpus to use for multiprocessing. If None, serial processing is used

out = add_atom_combi('C1CN=CCC1', ['C', 'N', 'O', 'F', -1, -2])

add_bond_combi[source]

add_bond_combi(smile, max_ring_size=8, cpus=0)

add_bond_combi - creates variants of smile with a single bond added or removed

Inputs:

  • smile str: smiles string to modify

  • max_ring_size: maximum allowed ring size

  • cpus Optional[int]: number of cpus to use for multiprocessing. If None, serial processing is used

add_one_bond[source]

add_one_bond(inputs)

out = add_bond_combi('C1CN=CCC1')

Proteins

Functions designed for manipulating proteins as amino acid sequences.

Current Limitations

The underlying RDKit utils for amino acids are somewhat more restricted than those for SMILES strings. Only standard amino acid characters can be used (ie no wildcards).

Proteins are represented as FASTA sequences, ie MKDCSNGCSAECTGEGG
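Since only standard amino acid characters are supported, it can help to screen sequences before conversion. The sketch below assumes the 20 canonical uppercase one-letter codes; invalid_residues and STANDARD_AMINO_ACIDS are hypothetical names, and which additional characters RDKit actually accepts is not verified here.

```python
# Assumed alphabet: the 20 canonical amino acids, uppercase one-letter codes.
STANDARD_AMINO_ACIDS = set('ACDEFGHIKLMNPQRSTVWY')


def invalid_residues(sequence):
    """Return the sorted characters in sequence that are outside the alphabet."""
    return sorted(set(sequence) - STANDARD_AMINO_ACIDS)


print(invalid_residues('MKDCSNGCSAECTGEGG'))  # every character is standard
print(invalid_residues('MKX*'))               # X and * are flagged
```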

to_protein[source]

to_protein(sequence_or_mol)

Convert amino acid sequence to Chem.Mol

to_sequence[source]

to_sequence(sequence_or_mol)

Converts sequence Mol into string

to_proteins[source]

to_proteins(list_of_inputs)

to_sequences[source]

to_sequences(list_of_inputs)

assert type(to_protein('MKDCSNGCSAECTGEGG'))==Chem.Mol
assert to_sequence(to_protein('MKDCSNGCSAECTGEGG')) == 'MKDCSNGCSAECTGEGG'

Nucleic Acids

Functions designed for manipulating DNA/RNA as nucleic acid sequences.

Current Limitations

The underlying RDKit utils for nucleic acids are somewhat more restricted than those for SMILES strings. Only standard nucleic acid characters can be used. This means no wildcards (*) or hybrid nucleic acids (N).

Polynucleotides are represented as FASTA sequences, ie ATGCATGC. FASTA sequences are resolved into uncapped polynucleotides.

to_dna[source]

to_dna(sequence_or_mol)

Convert DNA nucleic acid sequence to Chem.Mol

to_dnas[source]

to_dnas(list_of_inputs)

to_rna[source]

to_rna(sequence_or_mol)

Convert RNA nucleic acid sequence to Chem.Mol

to_rnas[source]

to_rnas(list_of_inputs)

assert type(to_dna('ATGC'))==Chem.Mol
assert to_sequence(to_dna('ATGC')) == 'ATGC'

assert type(to_rna('AUGC'))==Chem.Mol
assert to_sequence(to_rna('AUGC')) == 'AUGC'