API documentation#
Core API#
Most users will only need to deal with the MOF class.
Defining the main representation of a MOF.
- class moffragmentor.mof.MOF(structure, structure_graph)[source]#
Main representation for a MOF structure.
This container holds a structure and its associated graph. It also provides some convenience methods for getting neighbors or results of the fragmentation.
Internally, this code typically uses IStructure objects to avoid bugs due to the mutability of Structure objects (e.g. the fragmentation code performs operations on the structure and we want to be sure that there is no impact on the input).
Examples
>>> from moffragmentor import MOF
>>> mof = MOF(structure, structure_graph)
>>> # equivalently, read from a cif file
>>> mof = MOF.from_cif(cif_file)
>>> # visualize the structure
>>> mof.show_structure()
>>> # get the neighbors of a site
>>> mof.get_neighbor_indices(0)
>>> # perform fragmentation
>>> fragments = mof.fragment()
- property bridges: Dict[int, int]#
Get a dictionary of bridges.
Bridges are edges in a graph that, if deleted, increase the number of connected components.
- Returns:
dictionary of bridges
- Return type:
Dict[int, int]
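The bridge definition above can be checked by brute force on a small graph. The following is a minimal sketch in plain Python, not the library's implementation: `count_components` and `find_bridges` are hypothetical helper names, and the toy edge list merely stands in for a structure graph.

```python
from collections import defaultdict


def count_components(nodes, edges):
    """Count connected components with an iterative depth-first search."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen = set()
    components = 0
    for node in nodes:
        if node in seen:
            continue
        components += 1
        stack = [node]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            stack.extend(adjacency[current] - seen)
    return components


def find_bridges(nodes, edges):
    """Return the edges whose removal increases the number of components."""
    baseline = count_components(nodes, edges)
    bridges = []
    for i, edge in enumerate(edges):
        remaining = edges[:i] + edges[i + 1:]
        if count_components(nodes, remaining) > baseline:
            bridges.append(edge)
    return bridges


# Toy graph: a triangle (0, 1, 2) with a tail 2-3-4 attached to it.
nodes = [0, 1, 2, 3, 4]
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4)]
bridges = find_bridges(nodes, edges)
```

In this toy graph only the two tail edges are bridges; the ring edges are not, because the ring stays connected when any single ring edge is removed.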
- property frac_coords: ndarray#
Return fractional coordinates of the structure.
We cache this call as pymatgen seems to re-compute this.
- Returns:
fractional coordinates of the structure in an array of shape (n_sites, 3)
- Return type:
np.ndarray
- fragment(check_dimensionality=True, create_single_metal_bus=False, break_organic_nodes_at_metal=True)[source]#
Split the MOF into building blocks.
The building blocks are linkers, nodes, bound and unbound solvent, and the net embedding of those building blocks.
- Parameters:
check_dimensionality (bool) – Check if the node is 0D. If not, split into isolated metals. Defaults to True.
create_single_metal_bus (bool) – Create single-metal BUs. Defaults to False.
break_organic_nodes_at_metal (bool) – Break nodes into single-metal BUs if they appear “too organic”. Defaults to True.
- Returns:
FragmentationResult object.
- Return type:
FragmentationResult
- classmethod from_cif(cif, symprec=None, angle_tolerance=None, get_primitive=True)[source]#
Initialize a MOF object from a cif file.
Note that this method, by default, symmetrizes the structure.
- Parameters:
cif (str) – path to the cif file
symprec (float, optional) – Symmetry precision
angle_tolerance (float, optional) – Angle tolerance
get_primitive (bool) – Whether to get the primitive cell
- Returns:
MOF object
- Return type:
MOF
- get_neighbor_indices(site)[source]#
Get list of indices of neighboring sites.
- Return type:
List[int]
- property nx_graph#
Structure graph as networkx graph object
- property terminal_indices: List[int]#
Return the indices of the terminal sites.
A terminal site is a site that has only one neighbor and is connected via a bridge to the rest of the structure. That means that splitting the bond between the terminal site and the rest of the structure will increase the number of connected components.
Typical examples of terminal sites are hydrogen atoms or halogen functional groups.
- Returns:
indices of the terminal sites
- Return type:
List[int]
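The terminal-site definition can be illustrated on a toy bond list. This is a sketch in plain Python rather than the library's code, and `terminal_sites` is a hypothetical helper name. It relies on the fact that a site with exactly one neighbor is necessarily attached by a bridge, since removing its only bond isolates it, so a degree test is enough here.

```python
from collections import defaultdict


def terminal_sites(edges):
    """Return the sorted indices of sites that have exactly one neighbor."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return sorted(site for site, nbrs in neighbors.items() if len(nbrs) == 1)


# Toy graph: a ring (0, 1, 2), a substituent atom 3 bonded to ring atom 2,
# and two hydrogen-like atoms (4, 5) bonded to atom 3.
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (3, 5)]
terminal = terminal_sites(edges)
```

Site 3 is attached to the ring by a bridge as well, but it has three neighbors and is therefore not terminal.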
Command line interface#
Command line interfaces
SBU subpackage#
Defines data structures for the building blocks as well as collections of building blocks.
Representation for a secondary building block.
- class moffragmentor.sbu.sbu.SBU(molecule, molecule_graph, graph_branching_indices, binding_indices, molecule_original_indices_mapping=None, dummy_molecule=None, dummy_molecule_graph=None, dummy_molecule_indices_mapping=None, dummy_branching_indices=None)[source]#
Representation for a secondary building block.
It also acts as container for site indices:
- graph_branching_indices: the branching indices according to the graph-based definition. They might not be part of the molecule.
- binding_indices: the indices of the sites between the branching index and the metal
- original_indices: the complete original set of indices that has been selected for this building block
Note
The coordinates in the molecule object are not the ones directly extracted from the MOF. They are the coordinates of sites unwrapped to ensure that there are no “broken molecules”.
To obtain the “original” coordinates, use the _coordinates attribute.
Note
Dummy molecules
In dummy molecules, the binding and branching sites are replaced by dummy atoms (noble gases). They also have special properties that indicate the original species.
Examples
>>> # visualize the molecule
>>> sbu_object.show_molecule()
>>> # search pubchem for the molecule
>>> sbu_object.search_pubchem()
- property hash: str#
Return hash.
The hash is a combination of Weisfeiler-Lehman graph hash and center.
- Returns:
Hash.
- Return type:
str
- search_pubchem(listkey_counts=10, **kwargs)[source]#
Search for a molecule in PubChem.
The second element of the returned tuple is True if there was an identity match.
- Parameters:
listkey_counts (int) – Number of list keys to return (relevant for substructure search). Defaults to 10.
kwargs – Additional arguments to pass to PubChem.search
- Returns:
List of pubchem ids and whether there was an identity match
- Return type:
Tuple[List[str], bool]
- property smiles: str#
Return canonical SMILES.
Use openbabel to compute the SMILES, but then get the canonical version with RDKit as we observed sometimes the same molecule ends up as different canonical SMILES for openbabel. If RDKit cannot make a canonical SMILES (can happen with organometallics) we simply use the openbabel version.
- Returns:
Canonical SMILES
- Return type:
str
Collection for MOF building blocks
- class moffragmentor.sbu.sbucollection.SBUCollection(sbus)[source]#
Container for a collection of SBUs
- property smiles#
Return a list of the SMILES strings of the SBUs.
- Returns:
A list of smiles strings.
- Return type:
List[str]
Create Python containers for node building blocks.
Here we understand metal clusters as nodes.
- class moffragmentor.sbu.node.Node(molecule, molecule_graph, graph_branching_indices, binding_indices, molecule_original_indices_mapping=None, dummy_molecule=None, dummy_molecule_graph=None, dummy_molecule_indices_mapping=None, dummy_branching_indices=None)[source]#
Container for metal cluster building blocks.
Will typically be constructed automatically by the fragmentor.
- classmethod from_mof_and_indices(mof, node_indices, branching_indices, binding_indices)[source]#
Build a node object from a MOF and some intermediate outputs of the fragmentation.
- Parameters:
mof (MOF) – The MOF to build the node from.
node_indices (Set[int]) – The indices of the nodes in the MOF.
branching_indices (Set[int]) – The indices of the branching points in the MOF that belong to this node.
binding_indices (Set[int]) – The indices of the binding points in the MOF that belong to this node.
- Returns:
A node object.
Describing a collection of nodes
- class moffragmentor.sbu.nodecollection.NodeCollection(sbus)[source]#
Collection of node building blocks
Describing the organic building blocks, i.e., linkers.
- class moffragmentor.sbu.linker.Linker(molecule, molecule_graph, graph_branching_indices, binding_indices, molecule_original_indices_mapping=None, dummy_molecule=None, dummy_molecule_graph=None, dummy_molecule_indices_mapping=None, dummy_branching_indices=None)[source]#
Describe a linker in a MOF
Describing a collection of linkers
- class moffragmentor.sbu.linkercollection.LinkerCollection(sbus)[source]#
Collection of linker building blocks
- property building_block_composition: List[str]#
Return a list of strings of building blocks.
Strings are of the form of L{i} where i is an integer.
- Returns:
List of strings of building blocks.
- Return type:
List[str]
molecule subpackage#
Defines data structures for the non-building-block molecules (e.g. solvent) as well as collections of such molecules.
Dealing with molecules that are not part of a secondary building unit.
- class moffragmentor.molecule.nonsbumolecule.NonSbuMolecule(molecule, molecule_graph, indices, connecting_index=None)[source]#
Class to handle solvent or other non-SBU molecules.
- classmethod from_structure_graph_and_indices(structure_graph, indices)[source]#
Create a new NonSbuMolecule from a part of a structure graph.
- Parameters:
structure_graph (StructureGraph) – Structure graph with structure attribute
indices (List[int]) – Indices that label nodes in the structure graph, indexing the molecule of interest
- Returns:
Instance of NonSbuMolecule
- Return type:
NonSbuMolecule
Collections of molecules, e.g. bound solvents and non-bound solvents.
Fragmentor subpackage#
This subpackage is not optimized for end-users. It is intended for developers who wish to customize the behavior of the fragmentor.
Routines for finding branching points in a structure graph of a MOF.
Note that those routines do not work for other reticular materials as they assume the presence of a metal.
- moffragmentor.fragmentor.branching_points.get_branch_points(mof)[source]#
Get all branching points in the MOF.
- Parameters:
mof (MOF) – MOF object.
- Returns:
List of indices of branching points.
- Return type:
List[int]
Some pure functions that are used to perform the node identification.
Node classification techniques described in https://pubs.acs.org/doi/pdf/10.1021/acs.cgd.8b00126.
- class moffragmentor.fragmentor.nodelocator.NodelocationResult(nodes, branching_indices, connecting_paths, binding_indices, to_terminal_from_branching)#
- binding_indices#
Alias for field number 3
- branching_indices#
Alias for field number 1
- connecting_paths#
Alias for field number 2
- nodes#
Alias for field number 0
- to_terminal_from_branching#
Alias for field number 4
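The "Alias for field number" entries indicate that this result type behaves like a named tuple, so each field can be read by name or by position. The sketch below rebuilds a stand-in with the same field names using `collections.namedtuple`; it is not imported from moffragmentor, and the values are placeholders chosen only for illustration.

```python
from collections import namedtuple

# Stand-in that mirrors the documented field order; not the library's class.
NodelocationResult = namedtuple(
    "NodelocationResult",
    [
        "nodes",
        "branching_indices",
        "connecting_paths",
        "binding_indices",
        "to_terminal_from_branching",
    ],
)

# Placeholder values; the real types are whatever the fragmentor produces.
result = NodelocationResult(
    nodes=[{0, 1}],
    branching_indices={2},
    connecting_paths={3},
    binding_indices={4},
    to_terminal_from_branching={},
)

# Access by name and by position refers to the same object.
nodes_by_name = result.nodes
nodes_by_position = result[0]

# Tuple unpacking follows the documented field numbers 0 to 4.
nodes, branching, paths, binding, to_terminal = result
```

Access by name is usually more robust than positional access, since it does not depend on remembering the field order.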
- moffragmentor.fragmentor.nodelocator.find_node_clusters(mof, unbound_solvent_indices=None, forbidden_indices=None)[source]#
Locate the branching indices, and node clusters in MOFs.
Starting from the metal indices it performs depth first search on the structure graph up to branching points.
- Parameters:
mof (MOF) – moffragmentor MOF instance
unbound_solvent_indices (List[int], optional) – indices of unbound solvent atoms. Defaults to None.
forbidden_indices (List[int], optional) – indices not considered as metals, for instance, because they are part of a linker. Defaults to None.
- Returns:
namedtuple with the slots “nodes”, “branching_indices” and “connecting_paths”
- Return type:
Based on the node location, locate the linkers
- moffragmentor.fragmentor.linkerlocator.create_linker_collection(mof, node_location_result, node_collection, unbound_solvents, bound_solvents)[source]#
Based on the MOF, the node location and the unbound solvent location, locate the linkers.
- Return type:
Tuple[LinkerCollection, dict]
Functions that can be used to locate bound and unbound solvent
- moffragmentor.fragmentor.solventlocator.get_all_bound_solvent_molecules(mof, node_atom_sets)[source]#
Identify all bound solvent molecules.
Bound solvent is defined as being connected via one bridge to one metal center.
- Parameters:
mof (MOF) – instance of a MOF object
node_atom_sets (List[Set[int]]) – List of indices for the MOF nodes
- Returns:
Collection of NonSbuMolecule objects containing the bound solvent molecules
- Return type:
- moffragmentor.fragmentor.solventlocator.get_floating_solvent_molecules(mof)[source]#
Create a collection of NonSbuMolecules from a MOF.
- Parameters:
mof (MOF) – instance of MOF
- Returns:
collection of NonSbuMolecules
- Return type:
Extraction of pymatgen Molecules from a structure for which we know the branching points.
This module contains functions that perform filtering on indices or fragments.
Those fragments are typically obtained from the other fragmentation modules.
- moffragmentor.fragmentor.filter.bridges_across_cell(mof, indices)[source]#
Check if a molecule of indices bridges across the cell
- Return type:
bool
- moffragmentor.fragmentor.filter.in_hull(pointcloud, hull)[source]#
Test if the points in pointcloud are in hull.
Taken from https://stackoverflow.com/a/16898636
- Parameters:
pointcloud (np.array) – points to test (NxK coordinates of N points in K dimensions)
hull (np.array) – Is either a scipy.spatial.Delaunay object or the MxK array of the coordinates of M points in K dimensions for which Delaunay triangulation will be computed
- Returns:
True if all points are in the hull, False otherwise
- Return type:
bool
Generate molecules as the subgraphs from graphs
- moffragmentor.fragmentor.molfromgraph.wrap_molecule(mol_idxs, mof, starting_index=None, add_additional_site=True)[source]#
Wrap a molecule in the cell of the MOF by walking along the structure graph.
For this we perform BFS from the starting index. That is, we use a queue to keep track of the indices of the atoms that we still need to visit (the neighbors of the current index). We then compute new coordinates by computing the Cartesian coordinates of the neighbor image closest to the new coordinates of the current atom.
To then create a Molecule with the correct ordering of sites, we walk through the hash table in the order of the original indices.
- Parameters:
mol_idxs (Iterable[int]) – The indices of the atoms in the molecule in the MOF.
mof (MOF) – MOF object that contains the mol_idxs.
starting_index (int, optional) – Starting index for the walk. Defaults to 0.
add_additional_site (bool) – Whether to add an additional site
- Returns:
wrapped molecule
- Return type:
Molecule
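The breadth-first walk described above can be sketched in plain Python. This is a simplified illustration, not the library's implementation: it works in fractional coordinates and picks the nearest periodic image per axis, which matches the closest Cartesian image only under the assumption of a cubic (or at least orthogonal) cell. The name `unwrap_fractional` and the toy inputs are hypothetical.

```python
from collections import deque


def unwrap_fractional(frac_coords, neighbors, start):
    """Unwrap a bonded fragment by walking its bond graph breadth-first.

    frac_coords: list of fractional coordinate tuples, one per atom.
    neighbors: dict mapping an atom index to the indices bonded to it.
    start: index of the atom whose coordinates are kept unchanged.
    """
    new_coords = {start: tuple(frac_coords[start])}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nb in neighbors[current]:
            if nb in new_coords:
                continue
            shifted = []
            for cur_new, nb_old in zip(new_coords[current], frac_coords[nb]):
                # Pick the periodic image of the neighbor that lies closest
                # to the already unwrapped position of the current atom.
                delta = nb_old - cur_new
                delta -= round(delta)
                shifted.append(cur_new + delta)
            new_coords[nb] = tuple(shifted)
            queue.append(nb)
    # Emit the coordinates in the order of the original indices.
    return [new_coords[i] for i in sorted(new_coords)]


# Toy chain of three atoms that crosses the cell boundary along x.
frac_coords = [(0.95, 0.5, 0.5), (0.05, 0.5, 0.5), (0.15, 0.5, 0.5)]
neighbors = {0: [1], 1: [0, 2], 2: [1]}
unwrapped = unwrap_fractional(frac_coords, neighbors, start=0)
```

After unwrapping, the second and third atoms sit just beyond the cell boundary, next to the first atom, instead of at the opposite side of the cell.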
Utils subpackage#
The utils subpackage is also not optimized for end-users.
Helper functions.
- class moffragmentor.utils.IStructure(lattice, species, coords, charge=None, validate_proximity=False, to_unit_cell=False, coords_are_cartesian=False, site_properties=None)[source]#
pymatgen IStructure with faster equality comparison.
This dramatically speeds up lookups in the LRU cache when an object with the same __hash__ is already in the cache.
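Why equality speed matters for cache lookups can be seen with `functools.lru_cache`: a lookup first matches the hash and then confirms the match with `__eq__`, so an expensive comparison is paid on every cache hit with an equal but distinct object. The sketch below uses a hypothetical `Key` class that compares only a cheap label; it illustrates the mechanism and does not reproduce how IStructure implements its comparison.

```python
from functools import lru_cache


class Key:
    """Hashable toy object with a deliberately cheap equality check."""

    def __init__(self, label, payload):
        self.label = label
        self.payload = payload  # stands in for data that is costly to compare

    def __hash__(self):
        return hash(self.label)

    def __eq__(self, other):
        # Compare only the cheap label instead of the full payload.
        return isinstance(other, Key) and self.label == other.label


calls = []


@lru_cache(maxsize=None)
def expensive(key):
    """Pretend to do costly work; record each real invocation."""
    calls.append(key.label)
    return len(key.payload)


# Two distinct but equal objects: the second call is served from the cache.
first = expensive(Key("mof-1", list(range(1000))))
second = expensive(Key("mof-1", list(range(1000))))
```

The function body runs only once, and the second call is answered from the cache after a hash match and a single cheap equality check.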
- moffragmentor.utils.enable_logging()[source]#
Set up the moffragmentor logging with sane defaults.
- Return type:
List[int]
- moffragmentor.utils.get_sub_structure(mof, indices)[source]#
Return a sub-structure of the structure with only the sites with the given indices.
- Parameters:
mof (MOF) – MOF object
indices (Collection[int]) – Collection of integers
- Returns:
sub-structure of the structure with only the sites with the given indices
- Return type:
Structure
- moffragmentor.utils.is_tool(name)[source]#
Check whether name is on PATH and marked as executable.
https://stackoverflow.com/questions/11210104/check-if-a-program-exists-from-a-python-script
- Parameters:
name (str) – The name of the tool to check for.
- Returns:
True if the tool is on PATH and marked as executable.
- Return type:
bool
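A check of this kind can be written with the standard library's `shutil.which`, which is the approach suggested in the linked Stack Overflow thread. The following is a sketch under that assumption and may differ from the library's actual implementation; the tool name in the usage line is made up.

```python
from shutil import which


def is_tool(name):
    """Return True if `name` resolves to an executable on PATH."""
    return which(name) is not None


# A made-up name that should not exist on any system.
missing = is_tool("surely-not-an-installed-tool-12345")
```

A check like this is useful before calling external programs, for example to fail early with a clear error instead of a cryptic subprocess failure.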
Errors reused across the moffragmentor package
- exception moffragmentor.utils.errors.JavaNotFoundError[source]#
Raised if Java executable could not be found
Methods to rank molecules according to some measure of similarity.
- moffragmentor.utils.mol_compare.mcs_rank(smiles_reference, smiles, additional_attributes=None)[source]#
Rank SMILES based on the maximum common substructure to the reference smiles.
- moffragmentor.utils.mol_compare.tanimoto_rank(smiles_reference, smiles, additional_attributes=None)[source]#
Rank SMILES based on the Tanimoto similarity to the reference smiles.
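The idea of ranking by Tanimoto similarity can be shown with plain sets. For two feature sets A and B, the Tanimoto similarity is |A ∩ B| / |A ∪ B|. The sketch below is not the library's code: it assumes each molecule has already been reduced to a set of integer feature ids (a real workflow would typically derive fingerprints from the SMILES), and `tanimoto` and `rank_by_tanimoto` are hypothetical names.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two feature sets: |A & B| / |A | B|."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def rank_by_tanimoto(reference, candidates):
    """Sort candidate names by similarity to the reference, best first."""
    scored = [
        (name, tanimoto(reference, features))
        for name, features in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hand-made feature sets standing in for molecular fingerprints.
reference = {1, 2, 3, 4}
candidates = {"a": {1, 2, 3, 4}, "b": {1, 2, 5}, "c": {7, 8}}
ranking = rank_by_tanimoto(reference, candidates)
```

Candidate "a" is identical to the reference and ranks first, "b" shares two of five distinct features, and "c" shares none.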
Methods on structure graphs
Methods for running systre