fragmenstein.monster package

class fragmenstein.Monster(hits: List[Mol], average_position: bool = False, joining_cutoff: float = 5, random_seed: int | None = None)[source]

Bases: _MonsterFF, _MonsterPlace, _MonsterCombine

This creates a stitched-together monster. For initialisation, whether for placing or combining, it needs a list of hits (rdkit.Chem.Mol).

Note that the hits have to be 3D embedded as present in the protein; it would defeat the point otherwise! For a helper method to extract them from crystal structures see Victor.extract_mol.

The calculations are done by either place or combine.

## Place

>>> monster.place(mol)

Given an RDKit molecule and a series of hits, it makes a spatially stitched-together version of the initial molecule based on the hits. The aim is to place the followup compound onto the hits as faithfully as possible, regardless of the screaming forcefields.

.mol_options are the possible equiprobable alternatives.
.positioned_mol is the desired output (rdkit.Chem.Mol object)
.initial_mol is the input (rdkit.Chem.Mol object), this is None in a .combine call.
.modifications['scaffold'] is the combined version of the hits (rdkit.Chem.Mol object).
.modifications['chimera'] is the combined version of the hits, but with differing atoms made to match the followup (rdkit.Chem.Mol object).

.get_positional_mapping, which works also as a class method, creates a dictionary of mol_A atom index to mol_B atom index based on distance (cutoff 2Å) and not MCS.

The code works in two broad steps: first a scaffold is made, which is the combination of the hits (by position); then the followup is placed. The followup is not embedded with the constrained embedding functionality of RDKit, as this requires the reference molecule to have a valid geometry, which these absolutely do not have. Novel side chains are added by aligning an optimised conformer against the closest 3-4 reference atoms. Note that .initial_mol is not touched. .positioned_mol may have lost some custom properties, but the atom indices are the same.
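
A minimal usage sketch of a placement, assuming `fragmenstein` and RDKit are installed and that the hits are available as mol files with protein-frame coordinates. It relies only on the constructor, the `place` call and the `positioned_mol` attribute documented here. The function name `place_followup`, the file paths and the SMILES are placeholders, and the imports are deferred into the function body so the snippet can be defined even where the packages are missing.

```python
from typing import List


def place_followup(hit_paths: List[str], followup_smiles: str):
    """Place a followup onto 3D hits and return the positioned molecule.

    Sketch only: relies on the documented Monster(hits), .place(mol)
    and .positioned_mol; everything else here is illustrative.
    """
    # Deferred imports: both packages are assumed to be installed.
    from rdkit import Chem
    from fragmenstein import Monster

    # Hits must carry the 3D coordinates they have in the protein.
    hits = [Chem.MolFromMolFile(path) for path in hit_paths]
    followup = Chem.MolFromSmiles(followup_smiles)

    monster = Monster(hits=hits)
    monster.place(followup)
    # .initial_mol is left untouched; .positioned_mol is the output.
    return monster.positioned_mol


# Hypothetical usage (paths and SMILES are placeholders):
# positioned = place_followup(["hit1.mol", "hit2.mol"], "CCOc1ccccc1")
```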

If an atom in a Chem.Mol object is provided via attachment argument and the molecule contains a dummy atom. Namely element R in mol file or * in string.

## Combine

>>> monster.combine(keep_all=True, collapse_rings=True, joining_cutoff=5)

Combines the hits by merging and linking. The collapse_rings argument results in rings being collapsed before merging to avoid oddities. The last step within the call fixes any oddities of impossible chemistry via the call rectify, which uses the separate class Rectifier.

## Attributes

Common input derived

Variables:

hits (list)
throw_on_discard (bool) – filled by keep_all

Common derived

Variables:

matched (List[str]) – (dynamic) accepted hit names
unmatched (List[str]) – discarded hit names
journal (Logger) – The “journal” is the log of Dr Victor Frankenstein (see Victor for more)
modifications (dict) – copies of the mols along the way
mol_options (list) – equally valid alternatives to self.positioned_mol

place specific:

Variables:

positioned_mol (Mol)
attachment (NoneType)
initial_mol (NoneType)
average_position (bool)
num_common (int) – (dynamic) number of atoms in common between follow-up and hits
percent_common (float) – (dynamic) percentage of atoms of follow-up that are present in the hits

combine specific:

Variables:

joining_cutoff (int) – how distant (in Å) is too much?
atoms_in_bridge_cutoff (int) – how many bridge atoms can be deleted? (0 = preserves norbornane, 1 = preserves adamantane)

Class attributes best ignored:

Variables:

closeness_weights (list) – list of functions to penalise closeness (ignore for most applications)
dummy (Mol) – The virtual atom where the target attaches, by default *. Best not overridden.
dummy_symbol (str) – The symbol of the virtual atom where the target attaches, by default *. Best not overridden.
matching_modes (list)

MMFF_score(mol: Mol | None = None, delta: bool = False, mode: str = 'MMFF') → float

Merck force field. Chosen over Universal for no reason at all.

Parameters:

mol (Chem.Mol optional. If absent extracts from pose.) – ligand
delta (bool) – report difference from unbound (minimized)
mode (str) – ‘MMFF’ or ‘UFF’

Returns:

kcal/mol

Return type:

float

Warning:

This was moved out of Igor. Victor has the method for calling it with igor.mol_from_pose

__init__(hits: List[Mol], average_position: bool = False, joining_cutoff: float = 5, random_seed: int | None = None)

Initialisation starts Monster, but it does not do any mergers or placements. This changed in revision 0.6 (previously mol was specified for the latter).

Parameters:

hits
average_position
joining_cutoff – joining cutoff used in “full” mode
random_seed – A random seed for rdkit embedding calculations during placement

atoms_in_bridge_cutoff = 2

by_expansion(primary_name: str | None = None, min_mode_index: int = 0) → Mol: Get the maps. Find the map with the most atoms covered. Use that map as the base map for the other maps.

closeness_weights = [(<function _MonsterCommunal._closest__is_warhead_marked>, nan), (<function _MonsterCommunal._closest__is_fullbonded>, 1.0), (<function _MonsterCommunal._closest__is_ring_atom>, 0.5)]

collapse_mols(mols: List[Mol])

collapse_ring(mol: Mol) → Mol

Collapses a ring(s) into a single dummy atom(s). Stores data as JSON in the atom.

Parameters:

mol

Returns:

combine(keep_all: bool = True, collapse_rings: bool = True, joining_cutoff: int = 5)

Merge/links the hits. (Main entrypoint)

Parameters:

keep_all
collapse_rings
joining_cutoff

Returns:

convert_origins_to_custom_map(mol: Mol | None = None, forbiddance=True) → Dict[str, Dict[int, int]]

The origins stored in the followup differ in format from the custom_map. The former is a list of lists of hit_name+atom_index, while the latter is a dictionary of hit_name to dictionary of hit atom indices to _intended_ followup index. This method converts the former to the latter. If forbiddance is True, non mapping atoms are marked with negatives.

If mol is None, then self.positioned_mol is used.
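
The conversion described above can be sketched in plain Python. This is an illustration, not the library's code: it assumes each origin is encoded as `<hit_name>.<hit_atom_index>` and that the position in the outer list is the followup atom index. Both are assumptions made for this sketch, and the forbiddance marking is left out. The name `origins_to_custom_map` is hypothetical.

```python
from typing import Dict, List


def origins_to_custom_map(origins: List[List[str]],
                          separator: str = ".") -> Dict[str, Dict[int, int]]:
    """Convert per-followup-atom origins into a hit-keyed custom map.

    origins[i] lists the hit atoms that followup atom i derives from,
    each encoded as '<hit_name><separator><hit_atom_index>'
    (the encoding is an assumption made for this sketch).
    """
    custom_map: Dict[str, Dict[int, int]] = {}
    for followup_idx, sources in enumerate(origins):
        for source in sources:
            # rpartition tolerates separators inside the hit name.
            hit_name, _, hit_idx = source.rpartition(separator)
            custom_map.setdefault(hit_name, {})[int(hit_idx)] = followup_idx
    return custom_map


origins = [["hitA.0"], ["hitA.1", "hitB.4"], [], ["hitB.5"]]
custom_map = origins_to_custom_map(origins)
print(custom_map)
```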

Returns:

cutoff = 2

draw_nicely(mol, show=True, **kwargs) → MolDraw2DSVG

Draw with atom indices for Jupyter notebooks.

Parameters:

mol
kwargs – Key value pairs get fed into PrepareAndDrawMolecule.

Returns:

dummy = <rdkit.Chem.rdchem.Mol object>: The virtual atom where the target attaches

dummy_symbol = '*'

expand_custom_map(custom_map: Dict[str, Dict[int, int]], addend: Dict[str, Dict[int, int]]) → Dict[str, Dict[int, int]]

expand_ring(mol: Mol) → Mol

Undoes collapse ring

Parameters:

mol – untouched.

Returns:

classmethod extract_atoms(protein: Mol, keepers: List[int], expand_aromatics: bool = True) → Mol: Extract the given atom indices (keepers) from protein. Expanding to full aromatic ring and copying conformers

extract_from_neighborhood(system: Mol) → Mol: Given a system of a neighbourhood + ligand extract everything that is not marked IsNeighborhood.

fix_custom_map(custom_map: Dict[str, Sequence[Tuple[int, int]] | Dict[int, int]]) → Dict[str, Dict[int, int]]

This is duplicated in SpecialCompareAtoms, but will be deprecated in favour of this one.

Make sure it's Dict[str, Dict[int, int]]

There is a bit of confusion about the custom map. Converts the custom map from dict of lists of 2-element tuples to dict of dicts.
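
A minimal sketch of that normalisation, assuming only the two input shapes named in the signature. The function name `normalise_custom_map` is hypothetical, and the real method may validate more than this.

```python
from typing import Dict, Sequence, Tuple, Union

LooseMap = Dict[str, Union[Sequence[Tuple[int, int]], Dict[int, int]]]


def normalise_custom_map(custom_map: LooseMap) -> Dict[str, Dict[int, int]]:
    """Return a dict of dicts whichever of the two shapes each value has."""
    # dict() accepts both a mapping and a sequence of 2-element tuples.
    return {name: dict(pairs) for name, pairs in custom_map.items()}


loose = {"hit1": [(1, 1), (2, 5)], "hit2": {3: 3, 4: 4}}
strict = normalise_custom_map(loose)
print(strict)
```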

fix_hits(hits: List[Mol]) → List[Mol]: Adds the _Name prop if needed, asserts everything is a Chem.Mol and calls store_positions.

full_blending() → None: a single scaffold is made (except for .unmatched)

get_atom_map_fromProp(mol)

classmethod get_best_scoring(mols: List[RWMol]) → Mol: Sorts molecules by how well they score with the Merck FF

classmethod get_close_indices(query: Mol, target: Mol, cutoff: float = 5.0) → List[int]: Given an RDKit Chem.Mol query, get the atom indices of target that are within cutoff Å.

get_color_origins() → Dict[str, Dict[int, str]]: Get for the hits and followup the color of the origin as seen in show_comparison.

classmethod get_combined_rmsd(followup_moved: Mol, followup_placed: Mol | None = None, hits: List[Mol] | None = None) → float

Deprecated. The inbuilt RMSD calculations in RDKit align the two molecules; this does not align them. This deals with the case of multiple hits. For a Euclidean distance, the square root of the sum of the squared differences in each coordinate is taken. For a regular RMSD, the still-squared distance is averaged before taking the root. Here the average is taken across all the atom pairs between each hit and the followup. Therefore, atoms in the followup that derive from multiple atoms in the blended molecule are scored multiple times.

As a classmethod followup_placed and hits must be provided. But as an instance method they don’t.

Parameters:

followup_moved – followup compound moved by Igor or similar
followup_placed – followup compound as placed by Monster
hits – list of hits.

Returns:

combined RMSD
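
The averaging described above can be sketched with plain coordinate tuples instead of RDKit molecules. This is an illustration under stated assumptions, not the deprecated method itself: `combined_rmsd` and its arguments are hypothetical, and each mapping is taken to go from hit atom index to followup atom index. No alignment is performed.

```python
import math
from typing import Dict, List, Sequence, Tuple

Coord = Tuple[float, float, float]


def combined_rmsd(followup: Sequence[Coord],
                  hits: List[Sequence[Coord]],
                  mappings: List[Dict[int, int]]) -> float:
    """RMSD over every mapped hit-atom/followup-atom pair, unaligned.

    mappings[k] maps an atom index of hits[k] to a followup atom index.
    A followup atom mapped by several hits contributes once per hit.
    """
    squared_distances = []
    for hit, mapping in zip(hits, mappings):
        for hit_idx, followup_idx in mapping.items():
            pair = zip(hit[hit_idx], followup[followup_idx])
            squared_distances.append(sum((a - b) ** 2 for a, b in pair))
    if not squared_distances:
        raise ValueError("No mapped atom pairs to score")
    # Average the squared distances first, then take the root.
    return math.sqrt(sum(squared_distances) / len(squared_distances))


followup = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
hit_1 = [(0.0, 0.0, 0.0)]
hit_2 = [(1.0, 0.0, 2.0)]
rmsd = combined_rmsd(followup, [hit_1, hit_2], [{0: 0}, {0: 1}])
print(round(rmsd, 4))
```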

get_hit_by_name(name: str) → Mol: Given the name of a hit (as defined in its _Name property), return the hit. Do note that fix_hits will have been called, so the name may have been assigned.

get_largest_fragment(mol)

get_legend(show_positioned_mol: bool = True) → str

get_mcs_mapping(hit, followup, min_mode_index: int = 0) → Tuple[Dict[int, int], dict]

This is a weird method. It does a strict MCS match. And then it uses laxer searches and finds the case where a lax search includes the strict search.

Parameters:

hit – query molecule
followup – target/ref molecule
min_mode_index – the lowest index to try (opt. speed reasons)

Returns:

mapping and mode

get_mcs_mappings(hit: Mol, followup: Mol, min_mode_index: int = 0, custom_map: Dict[str, Dict[int, int]] | None = None) → Tuple[List[Dict[int, int]], ExtendedFMCSMode]

This is a curious method. It does a strict MCS match. And then it uses laxer searches and finds the case where a lax search includes the strict search.

Parameters:

hit – query molecule
followup – target/ref molecule
min_mode_index – the lowest index to try (opt. speed reasons)
custom_map – is the user defined hit name to list of tuples of hit and followup index pairs

Returns:

mappings and mode

get_neighborhood(apo_block: str, cutoff: float, mol: Mol | None = None, addHs=True) → Mol: Get the neighborhood of the protein from the apo_block around the cutoff of the mol. Note: The atoms will have a prop IsNeighborhood which is used after it is combined.

classmethod get_pair_rmsd(molA, molB, mapping: List[Tuple[int, int]]) → float

classmethod get_positional_mapping(mol_A: Mol, mol_B: Mol, dummy_w_dummy=True) → Dict[int, int]

Returns a map to convert overlapping atoms of A onto B. Cutoff 2 Å (see class attr.)

Parameters:

mol_A – first molecule (Chem.Mol) will form keys
mol_B – second molecule (Chem.Mol) will form values
dummy_w_dummy – match */R with */R.

Returns:

dictionary mol A atom idx -> mol B atom idx.
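
A distance-based mapping of this kind can be sketched with plain coordinates. The greedy nearest-first, one-to-one assignment below is a simplification chosen for the sketch and may differ from how the class resolves ties or dummy atoms. The function name `positional_mapping` is hypothetical.

```python
import math
from typing import Dict, List, Tuple

Coord = Tuple[float, float, float]


def positional_mapping(coords_a: List[Coord], coords_b: List[Coord],
                       cutoff: float = 2.0) -> Dict[int, int]:
    """Map atom indices of A onto overlapping atom indices of B."""
    # All A/B pairs closer than the cutoff, nearest first.
    pairs = sorted(
        (math.dist(a, b), i, j)
        for i, a in enumerate(coords_a)
        for j, b in enumerate(coords_b)
        if math.dist(a, b) < cutoff
    )
    mapping: Dict[int, int] = {}
    used_b = set()
    for _, i, j in pairs:
        # Greedy one-to-one assignment (a simplification for this sketch).
        if i not in mapping and j not in used_b:
            mapping[i] = j
            used_b.add(j)
    return mapping


mol_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (9.0, 0.0, 0.0)]
mol_b = [(1.4, 0.1, 0.0), (0.1, 0.0, 0.0)]
mapping = positional_mapping(mol_a, mol_b)
print(mapping)
```

The third atom of `mol_a` is further than the cutoff from everything in `mol_b`, so it is absent from the result.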

guess_origins(mol: Mol = None, hits: List[Mol] | None = None)

Given a positioned mol guess its origins…

Parameters:

mol

Returns:

static inspect_amide_torsions(mol): The most noticeable torsions are the amide ones. This is to describe what is happening.

join_neighboring_mols(mol_A: Mol, mol_B: Mol)

Joins two molecules, first by calling _find_closest, which does all the thinking, then by calling _join_atoms.

Parameters:

mol_A
mol_B

journal = <Logger Fragmenstein (DEBUG)>

keep_copies(mols: List[Mol], label=None)

keep_copy(mol: Mol, label=None)

property linker_atom_zahl: Getter for linker_atom_zahl. To change it, set the linker_element class property.

linker_element = 'O'

make_chimera(template: Mol, min_mode_index=0) → Mol

This is to avoid extreme corner cases. E.g. here the MCS is ringMatchesRingOnly=True and AtomCompare.CompareAny, while for the positioning this is not the case.

Called by full and partial blending modes.

Returns:

make_ideal_mol(mol: Mol | None = None, ff_minimise: bool = False) → Mol

make_pse(filename='test.pse', extra_mols: Iterable[Mol] | None = None)

This is specifically for debugging the full fragment merging mode. For general use, please use the Victor method make_pse.

Parameters:

filename

Returns:

property matched: List[str]

This is the counter to unmatched. It’s dynamic as you never know…

Returns:

matching_modes = [{'atomCompare': rdkit.Chem.rdFMCS.AtomCompare.CompareAny, 'bondCompare': rdkit.Chem.rdFMCS.BondCompare.CompareAny, 'ringCompare': rdkit.Chem.rdFMCS.RingCompare.PermissiveRingFusion, 'ringMatchesRingOnly': False}, {'atomCompare': rdkit.Chem.rdFMCS.AtomCompare.CompareAny, 'bondCompare': rdkit.Chem.rdFMCS.BondCompare.CompareOrder, 'ringCompare': rdkit.Chem.rdFMCS.RingCompare.PermissiveRingFusion, 'ringMatchesRingOnly': False}, {'atomCompare': rdkit.Chem.rdFMCS.AtomCompare.CompareElements, 'bondCompare': rdkit.Chem.rdFMCS.BondCompare.CompareOrder, 'ringCompare': rdkit.Chem.rdFMCS.RingCompare.PermissiveRingFusion, 'ringMatchesRingOnly': False}, {'atomCompare': rdkit.Chem.rdFMCS.AtomCompare.CompareAny, 'bondCompare': rdkit.Chem.rdFMCS.BondCompare.CompareAny, 'ringCompare': rdkit.Chem.rdFMCS.RingCompare.PermissiveRingFusion, 'ringMatchesRingOnly': True}, {'atomCompare': rdkit.Chem.rdFMCS.AtomCompare.CompareAny, 'bondCompare': rdkit.Chem.rdFMCS.BondCompare.CompareOrder, 'ringCompare': rdkit.Chem.rdFMCS.RingCompare.PermissiveRingFusion, 'ringMatchesRingOnly': True}, {'atomCompare': rdkit.Chem.rdFMCS.AtomCompare.CompareElements, 'bondCompare': rdkit.Chem.rdFMCS.BondCompare.CompareOrder, 'ringCompare': rdkit.Chem.rdFMCS.RingCompare.PermissiveRingFusion, 'ringMatchesRingOnly': True}]

max_from_mol(mol: Mol = None)

merge_pair(scaffold: Mol, fragmentanda: Mol, mapping: Optional = None) → Mol

To specify attachments use .merge. To understand what is going on see .categorize

Parameters:

scaffold – mol to be added to.
fragmentanda – mol to be fragmented
mapping – see get_positional_mapping. Optional in _pre_fragment_pairs

Returns:

merge_pairing_lists(nonunique_pairs: List[Tuple[Atom, Atom]], ringcore_first=True) → List[Tuple[Atom, Atom]]

mmff_minimize(mol: Mol | None = None, neighborhood: Mol | None = None, ff_max_displacement: float = 0.0, ff_constraint: int = 10, ff_max_iterations: int = 200, ff_cutoff: float = 100.0, allow_lax: bool = True, prevent_cis: bool = True) → MinizationOutcome

Minimises a mol, or self.positioned_mol if not provided, with MMFF constrained to ff_max_displacement Å. Gets called by Victor if the flag .monster_mmff_minimisation is true during PDB template construction.

Parameters:

mol – Molecule to minimise. If None, self.positioned_mol is used.
neighborhood – Protein neighbourhood (ignored if None)
ff_max_displacement – Distance threshold (Å) for atomic positions mapped to hits for MMFF constraints. If NaN, then fixed point constraints (no movement) are used. This is passed as maxDispl to MMFFAddPositionConstraint.
ff_constraint – Force constant for MMFF constraints.
ff_cutoff – kcal/mol diff value to consider a failed minimisation.
allow_lax – If True and the minimisation fails, the constraints are halved and the minimisation is rerun.

Returns:

None

Note that most methods calling this via Victor now use its .settings['ff_max_displacement'] and .settings['ff_constraint'] and do not use the defaults.

no_blending(broad=False) → None: no merging is done. The hits are mapped individually. Not great for small fragments.

property num_common: int

offset(mol: Mol)

This is to prevent clashes. The numbers of the ori indices stored in collapsed rings are offset by the class variable (_collapsed_ring_offset) multiples of 100. (autoincrements to avoid dramas)

Parameters:

mol

Returns:

origin_from_mol(mol: Mol = None)

These values are stored by Monster for scaffold, chimera and positioned_mol. See make_chimera or place_from_map for more info on _Origin.

Parameters:

mol – Chem.Mol

Returns:

stdev list for each atom

partial_blending() → None: multiple possible scaffolds for placement and best is chosen

partially_blend_hits(hits: List[Mol] | None = None) → List[Mol]

This is the partial merge algorithm, wherein the hits are attempted to be combined. If a combination is bad, it will not be combined, and a list of possible options is returned. These will have the atoms changed too.

Parameters:

hits
distance

Returns:

property percent_common: int

pick_best() → Tuple[Mol, int]

Method for partial merging for placement

Returns:

unrefined_scaffold, mode_index

place(mol: Mol, attachment: Mol | None = None, custom_map: Dict[str, Dict[int, int]] | None = None, merging_mode: str = 'expansion', enforce_warhead_mapping: bool = True, primary_name=None)

Positions a given mol based on the hits (main entrypoint). It accepts the argument merging_mode, which is “expansion” by default (formerly “permissive_none”); these call .by_expansion and .no_blending(broad=True) respectively. Also accepted are “off” (does nothing except fill the attribute initial_mol), “full” (.full_blending()), “partial” (.partial_blending()) and “none” (.no_blending(), but less thorough).

Parameters:

mol
attachment – This is the SG of the cysteine if covalent
custom_map – Dict of hit_name to Dict of hit_idx to followup_idx
merging_mode
primary_name – override the name of the primary hit if merging_mode is ‘expansion’

Returns:
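
The accepted merging_mode values can be summarised as a lookup table. The sketch below only restates the mapping given in the description of place; the table `MERGING_MODES` and the helper `resolve_merging_mode` are hypothetical and do not reflect how the method dispatches internally.

```python
from typing import Any, Dict, Optional, Tuple

# merging_mode -> (method name, keyword arguments); None means "do nothing".
MERGING_MODES: Dict[str, Optional[Tuple[str, Dict[str, Any]]]] = {
    "expansion": ("by_expansion", {}),
    "permissive_none": ("no_blending", {"broad": True}),
    "none": ("no_blending", {}),
    "full": ("full_blending", {}),
    "partial": ("partial_blending", {}),
    "off": None,
}


def resolve_merging_mode(merging_mode: str = "expansion"):
    """Return the (method name, kwargs) pair for a mode, or None for 'off'."""
    try:
        return MERGING_MODES[merging_mode]
    except KeyError:
        raise ValueError(f"Unknown merging_mode: {merging_mode!r}") from None


print(resolve_merging_mode("permissive_none"))
```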

place_from_map(target_mol: Mol, template_mol: Mol, atom_map: Dict | None = None, random_seed=None) → Mol

This method places the atoms with known mapping and places the ‘uniques’ (novel) via an aligned mol (the ‘sextant’) This sextant business is a workaround for the fact that only minimised molecules can use the partial embedding function of RDKit.

The template molecule may be actually two or more fragments, as happens for the no blending mode. In RDKit, the fragments within a “molecule” are not connected, but have sequential atom indices.

Parameters:

target_mol – target mol
template_mol – the template/scaffold to place the mol
atom_map – something that get_mcs_mapping would return.

Returns:

place_smiles(smiles: str, long_name: str | None = None, **kwargs)

post_ff_addition_step(mol: Mol, ff: rdkit.ForceField): This is an empty method for user-created subclasses to add their own constraints to the MMFF minimisation.

posthoc_refine(scaffold: Mol, indices: List[int] | None = None) → Mol

Given a scaffold and a list of indices, refine the scaffold.

Parameters:

scaffold
indices – if absent, use all atoms

Returns:

pretweak() → None

What if the fragments were prealigned slightly? Really bad things happen. Nothing currently uses this without user intervention.

Returns:

propagate_alternatives(fewer: List[Mol]) → int: Given the alt atoms stored in the Chem.Atom property _AltSymbol, try those.

rectify()

static renumber_followup_custom_map(original_mol: Mol, new_mol: Mol, custom_map: Dict[str, Dict[int, int]]) → Dict[str, Dict[int, int]]: Given a followup original_mol and a copy with its atom indices changed, return a new map for the copy.

sample_new_conformation(random_seed=None): This method is intended for Multivictor. It generates a new conformation based on different random seeds.

save_commonality(filename: str | None = None)

Saves an SVG of the followup fragmenstein monster with the common atoms with the chimeric scaffold highlighted.

Parameters:

filename – optional filename to save it as. Otherwise returns a Draw.MolDraw2DSVG object.

Returns:

save_temp(mol): This is a silly debug-by-print debug method. drop it in where you want to spy on stuff.

static score_mol(mol: Mol) → float: Scores a mol without minimising

show(to_display: bool = False, show_positioned_mol: bool = True, viewer_mode='rdkit')

Generates an NGLWidget or a py3Dmol.view depending on viewer_mode, which is set by default to what’s installed, with the compounds and the merger if present. To override the colours: the colours will be those in the Chem.Mol’s property _color if present.

to_display if True will display the legend and the viewer. (IPython.display.display will show them too)

Returns -> nv.NGLWidget

show_comparison(*args, **kwargs)

Show the atom provenance of the followup molecule.

Parameters:

nothing -> uses self.origin_from_mol
hit2followup -> uses the hit2followup dict

simply_merge_hits(hits: List[Mol] | None = None, linked: bool = True) → Mol

Recursively stick the hits together and average the positions. This is the monster of automerging, full-merging mapping and partial merging mapping. The latter however uses partially_blend_hits first. The hits are not ring-collapsed and -expanded herein.

Parameters:

hits – optionally give a hit list, else uses the attribute .hits.
linked – if true the molecules are joined, else they are placed in the same molecule as disconnected fragments.

Returns:

the rdkit.Chem.Mol object that will fill .scaffold

stdev_from_mol(mol: Mol = None)

These values are stored by Monster for scaffold, chimera and positioned_mol.

Parameters:

mol – Chem.Mol

Returns:

stdev list for each atom

store_origin_colors_atomically(): Store the color of the origin in the mol as the private property _color.

store_positions(mol: Mol) → Mol

Saves positional data as _x, _y, _z and majorly _ori_i, the original index. The latter gets used by _get_new_index.

Parameters:

mol

Returns:

strict_matching_mode = {'atomCompare': rdkit.Chem.rdFMCS.AtomCompare.CompareElements, 'bondCompare': rdkit.Chem.rdFMCS.BondCompare.CompareOrder, 'matchChiralTag': True, 'ringCompare': rdkit.Chem.rdFMCS.RingCompare.StrictRingFusion, 'ringMatchesRingOnly': True}

throw_on_discard = False

to_3Dmol(show_positioned_mol: bool = True, *args, **kwargs) → py3Dmol.view

User: you probably want to use show() instead… unless you have both ngl and py3Dmol installed.

This is called by both Monster.to_3Dmol and Victor.to_3Dmol

The args and kwargs do nothing

to_nglview(show_positioned_mol: False, *args, **kwargs) → MolNGLWidget

User: you probably want to use show() instead… unless you have both ngl and py3Dmol installed.

This is called by both Monster.to_nglview and Victor.to_nglview. The color can be dictated by the optional private property _color, which may have been assigned by walton.color_in() or the user.

The args and kwargs do nothing

Returns:

transfer_ring_data(donor: Atom, acceptor: Atom)

Transfer the info if a ringcore atom.

Parameters:

donor
acceptor

Returns:

fragmenstein.monster.bond_provenance module

class fragmenstein.monster.BondProvenance(value)[source]

Bases: Enum

Where does the bond come from. This is used to keep names consistent… For now original is used in places. The others are interchangeable TBH.

LINKER = 4

MAIN_NOVEL = 2

ORIGINAL = 1

OTHER_NOVEL = 3

UNASSIGNED = 5

classmethod copy_bond(donor: Bond, acceptor: Bond) → None[source]

classmethod get_bond(bond: Bond) → BondProvenance[source]

classmethod get_bonds(bonds: List[Bond]) → List[BondProvenance][source]

classmethod has_bond(bond: Bond) → bool[source]

classmethod set_all_bonds(mol: Mol, provenance_name: str) → None[source]

Sets the provenance of all bonds in mol to a category, which is a string from the provenance

Parameters:

mol
provenance_name – A string: original | main_novel | other_novel | linker

Returns:

classmethod set_bond(bond: Bond, provenance_name: str) → None[source]

classmethod set_bonds(bonds: List[Bond], provenance_name: str) → None[source]
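
The provenance names accepted by set_bond, set_bonds and set_all_bonds appear to be lowercase forms of the enum member names. A minimal sketch of that lookup follows, using a stand-in enum with the member values listed above. `BondProvenanceSketch` and `provenance_from_name` are hypothetical names, and how the value is stored on an RDKit bond is not shown.

```python
from enum import Enum


class BondProvenanceSketch(Enum):
    """Stand-in mirroring the documented member names and values."""
    ORIGINAL = 1
    MAIN_NOVEL = 2
    OTHER_NOVEL = 3
    LINKER = 4
    UNASSIGNED = 5


def provenance_from_name(provenance_name: str) -> BondProvenanceSketch:
    """Look a member up by its lowercase name, e.g. 'main_novel'."""
    # Enum classes support lookup by member name with square brackets.
    return BondProvenanceSketch[provenance_name.upper()]


provenance = provenance_from_name("main_novel")
print(provenance.name, provenance.value)
```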

fragmenstein.monster.positional_mapping module

class fragmenstein.monster.GPM[source]

Bases: object

This class simply contains get_positional_mapping and is inherited by both Monster and Unmerge. get_positional_mapping returns a map to convert overlapping atoms of A onto B.

__init__()

cutoff = 2

classmethod get_positional_mapping(mol_A: Mol, mol_B: Mol, dummy_w_dummy=True) → Dict[int, int][source]

Returns a map to convert overlapping atoms of A onto B. Cutoff 2 Å (see class attr.)

Parameters:

mol_A – first molecule (Chem.Mol) will form keys
mol_B – second molecule (Chem.Mol) will form values
dummy_w_dummy – match */R with */R.

Returns:

dictionary mol A atom idx -> mol B atom idx.

fragmenstein.monster.unmerge_mapper module

class fragmenstein.monster.Unmerge(followup: Mol, mols: List[Mol], maps: Dict[str, List[Dict[int, int]]], no_discard: bool = False)[source]

Bases: GPM

This class tries to solve the mapping problem by trying all possible mappings of the target to the ligand. It is one of three approaches in Monster (full merge, partial merge, unmerge).

It is great with fragments that do not connect, but is bad when a hit has a typo.

the positions must overlap if any atom is mapped in two maps
no bond can be over 3 Å

The chosen map combined_map is a dict that goes from followup mol to combined mol which is the hits in a single molecule.

Note that some molecules are discarded entirely.

__init__(followup: Mol, mols: List[Mol], maps: Dict[str, List[Dict[int, int]]], no_discard: bool = False)[source]

At the minute maps is a dict of hit name to a list of possible maps, wherein each map is a dict of atom index in the followup molecule to atom index in the target molecule, not the reverse as it will be soon… (see mcs_mapping)

Parameters:

followup (Chem.Mol) – the molecule to place
mols (List[Chem.Mol]) – 3D molecules
maps (Dict[str, List[Dict[int, int]]]) – can be generated outside of Monster by .make_maps
no_discard – do not allow any to be discarded

bond(idx: int | None = None) → Mol[source]: Add bonds. As in the verb ‘to bond’ …

calculate(accounted_for: set)[source]: perform the calculations

check_possible_distances(other, possible_map, combined, combined_map, cutoff=2.5)[source]

cutoff = 2

distance_cutoff = 3: how distant is too distant, in Å

get_inter_distance(molA: Mol, molB: Mol, idxA: int, idxB: int) → float[source]

get_key(d: dict, v: Any)[source]: Given a dict and a value, get the key.

classmethod get_positional_mapping(mol_A: Mol, mol_B: Mol, dummy_w_dummy=True) → Dict[int, int]

Returns a map to convert overlapping atoms of A onto B. Cutoff 2 Å (see class attr.)

Parameters:

mol_A – first molecule (Chem.Mol) will form keys
mol_B – second molecule (Chem.Mol) will form values
dummy_w_dummy – match */R with */R.

Returns:

dictionary mol A atom idx -> mol B atom idx.

get_possible_map(other: Mol, label: str, o_map: Dict[int, int], inter_map: Dict[int, int], combined: Mol, combined_map: Dict[int, int]) → Dict[int, int][source]

This analyses a single map (o_map) and returns a possible map

Parameters:

other
label
o_map – followup -> other
inter_map
combined
combined_map – followup -> combined

Returns:

followup -> other

goodness_sorter_factory(offness_weight: int = 3) → Callable[source]: This is a factory for symmetry with template sorter… there is zero other reason for it to be so.

judge_n_move_on(combined, combined_map, other, possible_map, others, disregarded)[source]

The mutables need to be within their own scope

Parameters:

combined
combined_map
other
possible_map
others
disregarded

Returns:

classmethod make_maps(target: Mol, mols: List[Mol], mode: Dict[str, Any] | None = None) → Dict[str, List[Dict[int, int]]][source]

This is basically if someone is using this class outside of Monster

Returns a dictionary with the mol name as key and, as value, a list of possible dictionaries mapping the index of an atom in the target mol to the index in the given mol. Note that a bunch of mapping modes can be found in the Monster init mixin class.

Parameters:

target – the molecule to be mapped
mols – the list of molecules with positional data to be mapped to
mode – dict of setting for MCS step

Returns:

max_strikes = 3: number of discrepancies tolerated.

measure_map(mol: Mol, mapping: Dict[int, int]) → array[source]

Returns a vector with the distances, but not of length len(mapping). This is used by offness to score how bad the mapping is.

Parameters:

mol
mapping – followup to combined

Returns:

offness(mol: Mol, mapping: Dict[int, int], cutoff_distance: float = 2.5) → int[source]

How many bonds are too long?

Parameters:

mol
mapping

Returns:
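
One plausible reading of "how many bonds are too long" can be sketched without RDKit: take the bonds of the followup, look up where their atoms land in the combined molecule through the mapping, and count those stretched beyond the cutoff. The function `count_overlong_bonds` and its plain-tuple inputs are hypothetical. The actual method works on Chem.Mol objects and may treat unmapped atoms differently.

```python
import math
from typing import Dict, List, Tuple

Coord = Tuple[float, float, float]


def count_overlong_bonds(bonds: List[Tuple[int, int]],
                         combined_coords: List[Coord],
                         mapping: Dict[int, int],
                         cutoff_distance: float = 2.5) -> int:
    """Count followup bonds whose mapped atoms are too far apart.

    bonds holds followup atom index pairs; mapping goes from followup
    atom index to combined atom index.
    """
    too_long = 0
    for begin, end in bonds:
        if begin not in mapping or end not in mapping:
            continue  # a bond with an unmapped atom cannot be measured
        start = combined_coords[mapping[begin]]
        stop = combined_coords[mapping[end]]
        if math.dist(start, stop) > cutoff_distance:
            too_long += 1
    return too_long


bonds = [(0, 1), (1, 2), (2, 3)]
combined_coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (6.0, 0.0, 0.0)]
mapping = {0: 0, 1: 1, 2: 2}
n_too_long = count_overlong_bonds(bonds, combined_coords, mapping)
print(n_too_long)
```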

pick = -1

rotational_approach = True

store(combined: Mol, combined_map: Dict[int, int], disregarded: List[Mol])[source]: Stores combined molecule and its map into the instance.

template_sorter_factory(accounted_for) → Callable[source]: returns the number of atoms that have not already been accounted for.

unmerge_inner(combined: Mol, combined_map: Dict[int, int], others: List[Mol], disregarded: List[Mol]) → None[source]

Assesses a combination of maps rejections: unmapped (nothing maps) / unnovel (adds nothing)

Do note that this method uses a lot of instance attributes. self.maps has the mapping data. This method combines to make self.combined_map.

Parameters:

combined
combined_map – This is passed empty the first time.
others – It’s a list of a deque that is rotated (if rotational_approach is True) of self.mols
disregarded

Returns:
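
The rotation of the hit list mentioned for others can be illustrated with a deque, so that each hit gets a turn as the starting template. This is a sketch of the idea only, assuming a left rotation by one per attempt. The generator name `rotated_orders` is hypothetical and strings stand in for molecules.

```python
from collections import deque
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def rotated_orders(mols: Sequence[T]) -> Iterator[List[T]]:
    """Yield the list once per rotation, each item leading exactly once."""
    queue = deque(mols)
    for _ in range(len(queue)):
        yield list(queue)
        # Move the current first item to the end for the next attempt.
        queue.rotate(-1)


orders = list(rotated_orders(["hitA", "hitB", "hitC"]))
print(orders)
```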

fragmenstein.monster.mcs_mapping module

class fragmenstein.monster.SpecialCompareAtoms(comparison: AtomCompare = rdkit.Chem.rdFMCS.AtomCompare.CompareAnyHeavyAtom, custom_map: Dict[str, Dict[int, int]] | None = None, exclusive_mapping: bool = True)[source]

Bases: VanillaCompareAtoms

This works like the _get_atom_maps did prior to Fragmenstein version 0.9. The mapping, as discussed in GitHub issue #23, is in the format

mapping = {'hit1': {1:1,2:5}, 'hit2': {3:3,4:4,4:6}}

The hit index is first, followup index is the second. The index -1 for a followup index is the same as not providing the hit index, it is described here solely for clarity not for use.

mapping = {'hit1': {1:1,2:5, 3:-1}, 'hit2': {3:3,4:4,4:6}}

The index -2 for a followup index will result in the hit atom index not matching any followup index.

mapping = {'hit1': {1:1,2:5, 3:-2}, 'hit2': {3:3,4:4,4:6}}

If exclusive_mapping argument of __init__ is True, then if a followup index is present in one hit, but not in a second hit, then no atom of the second hit will match that followup atom. A negative index for a hit atom index means that no atom in that hit will match the corresponding followup index.

mapping = {'hit1': {1:1,2:5,-1:3, -2: 7}, 'hit2': {3:3,4:4,4:6}}

However, a positive integer on a different hit overrides it, therefore, in the above followup atom 3 cannot be matched to any atom in hit1, but will match atom 3 in hit2. Followup atom 7 will not match either.

SpecialCompareAtoms(custom_map=mapping, exclusive_mapping=True)

CheckAtomCharge((MCSAtomCompare)self, (MCSAtomCompareParameters)parameters, (Mol)mol1, (int)atom1, (Mol)mol2, (int)atom2) → bool :

Return True if both atoms have the same formal charge

C++ signature :: bool CheckAtomCharge(RDKit::PyMCSAtomCompare {lvalue},RDKit::MCSAtomCompareParameters,RDKit::ROMol,unsigned int,RDKit::ROMol,unsigned int)

CheckAtomChirality((MCSAtomCompare)self, (MCSAtomCompareParameters)parameters, (Mol)mol1, (int)atom1, (Mol)mol2, (int)atom2) → bool :

Return True if both atoms have, or have not, a chiral tag

C++ signature :: bool CheckAtomChirality(RDKit::PyMCSAtomCompare {lvalue},RDKit::MCSAtomCompareParameters,RDKit::ROMol,unsigned int,RDKit::ROMol,unsigned int)

CheckAtomRingMatch((MCSAtomCompare)self, (MCSAtomCompareParameters)parameters, (Mol)mol1, (int)atom1, (Mol)mol2, (int)atom2) → bool :

Return True if both atoms are, or are not, in a ring

C++ signature :: bool CheckAtomRingMatch(RDKit::PyMCSAtomCompare {lvalue},RDKit::MCSAtomCompareParameters,RDKit::ROMol,unsigned int,RDKit::ROMol,unsigned int)

__init__(comparison: AtomCompare = rdkit.Chem.rdFMCS.AtomCompare.CompareAnyHeavyAtom, custom_map: Dict[str, Dict[int, int]] | None = None, exclusive_mapping: bool = True)[source]: Whereas the atomCompare is an enum, this is a callable class. But in parameters there is no compareElement booleans etc. only Isotope… In https://github.com/rdkit/rdkit/blob/master/Code/GraphMol/FMCS/Wrap/testFMCS.py it is clear one needs to make one’s own.

fix_custom_map(custom_map: Dict[str, Sequence[Tuple[int, int]] | Dict[int, int]]) → Dict[str, Dict[int, int]][source]

This will be deprecated in the future. As Monster.fix_custom_map is better.

Make sure it's Dict[str, Dict[int, int]]

There is a bit of confusion about the custom map. Converts the custom map from dict of lists of 2-element tuples to dict of dicts.

get_custom(hit_mol: Mol, hit_atom_idx: int) → int[source]: What idx of followup corresponds to the hit_atom_idx index of hit_mol? If nothing, -1 is returned

get_valid_matches(parameters: MCSAtomCompareParameters, common: Mol, hit: Mol, followup: Mol) → List[Dict[int, int]][source]

get_valid_matches(parameters: MCSParameters, common: Mol, hit: Mol, followup: Mol) → List[IndexMap]

Returns a list of possible matches, each a dictionary of hit to follow indices, that obey the criteria of the atomic comparison. (Formerly this was a IndexAtom)

This however does not check if all the atoms in custom are present. For that, Monster._validate_vs_custom is called in the method Monster.get_mcs_mappings, after calling Monster._get_atom_maps, which calls this method. (Monster.get_mcs_mappings tries different matching schema, while Monster._get_atom_maps is for one single scheme). The primary reason why this is so, is that there are two tiers of requirements:

The custom map must be present vs
The custom map may be present, but has to be obeyed.

parameters can be rdFMCS.MCSAtomCompareParameters or rdFMCS.MCSParameters

class fragmenstein.monster.IndexMap

Sequence (Tuple or List) of tuples of two indices, the first is the hit index, the second is the followup

alias of TypeVar(‘IndexMap’, bound=Sequence[Tuple[int, int]])

__init__(name, *constraints, bound=None, covariant=False, contravariant=False)

class fragmenstein.monster.ExtendedFMCSMode[source]

These are the acceptable types for the modes of FindMCS, plus three that can be configured in rdFMCS.MCSParameters

__init__(*args, **kwargs)

atomCompare: AtomCompare

bondCompare: BondCompare

clear() → None. Remove all items from D.

completeRingsOnly: bool

copy() → a shallow copy of D

classmethod fromkeys(iterable, value=None, /): Create a new dictionary with keys from iterable and values set to value.

get(key, default=None, /): Return the value for key if key is in the dictionary, else default.

items() → a set-like object providing a view on D's items

keys() → a set-like object providing a view on D's keys

matchChiralTag: bool

matchFormalCharge: bool

matchStereo: bool

matchValences: bool

maxDistance: float

maximizeBonds: bool

pop(k[, d]) → v, remove specified key and return the corresponding value.: If the key is not found, return the default if given; otherwise, raise a KeyError.

popitem()

Remove and return a (key, value) pair as a 2-tuple.

Pairs are returned in LIFO (last-in, first-out) order. Raises KeyError if the dict is empty.

ringCompare: RingCompare

ringMatchesRingOnly: bool

seedSmarts: str

setdefault(key, default=None, /)

Insert key with a value of default if key is not in the dictionary.

Return the value for key if key is in the dictionary, else default.

threshold: float

timeout: int

update([E, ]**F) → None. Update D from dict/iterable E and F.: If E is present and has a .keys() method, then does: for k in E: D[k] = E[k] If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v In either case, this is followed by: for k in F: D[k] = F[k]

values() → an object providing a view on D's values

verbose: bool