Enumerations
Enums used across atomworks.
- class atomworks.enums.ChainType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)
Bases: IntEnum
IntEnum representing the type of chain in an RCSB mmCIF file from the Protein Data Bank (PDB).
Useful constants relating to ChainType are defined in ChainTypeInfo.
Note
The chain type fields in the PDB are not stable, so note the specific versions of the dictionaries used (updated November 2024).
References
RCSB mmCIF Dictionary - entity.type
RCSB mmCIF Dictionary - entity_poly.type
- BRANCHED = 10
- CYCLIC_PSEUDO_PEPTIDE = 0
- DNA = 3
- DNA_RNA_HYBRID = 4
- MACROLIDE = 11
- NON_POLYMER = 8
- OTHER_POLYMER = 1
- PEPTIDE_NUCLEIC_ACID = 2
- POLYPEPTIDE_D = 5
- POLYPEPTIDE_L = 6
- RNA = 7
- WATER = 9
- static as_enum(value: str | int | ChainType) -> ChainType
Convert a string, int, or ChainType to a ChainType enum.
- Parameters:
value – The value to convert to a ChainType enum.
- Returns:
The corresponding ChainType enum.
- Raises:
ValueError – If the value cannot be converted to a ChainType.
- classmethod from_string(str_value: str) -> ChainType
Convert a string to a ChainType enum.
- Parameters:
str_value – The string value to convert.
- Returns:
The corresponding ChainType enum.
- Raises:
ValueError – If the string value is not a valid chain type.
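The conversion behaviour described for as_enum and from_string can be sketched as follows. This is a minimal illustration, not the atomworks implementation: `ChainTypeSketch`, `from_string_sketch`, and `as_enum_sketch` are hypothetical stand-ins, only four of the documented members are included, and exact-match string lookup is an assumption.

```python
from enum import IntEnum


class ChainTypeSketch(IntEnum):
    """Hypothetical stand-in for ChainType with a few documented values."""

    DNA = 3
    POLYPEPTIDE_L = 6
    RNA = 7
    NON_POLYMER = 8


# Subset of the documented chain_type strings (see ChainTypeInfo.STRING_TO_ENUM).
STRING_TO_ENUM = {
    "POLYDEOXYRIBONUCLEOTIDE": ChainTypeSketch.DNA,
    "POLYPEPTIDE(L)": ChainTypeSketch.POLYPEPTIDE_L,
    "POLYRIBONUCLEOTIDE": ChainTypeSketch.RNA,
    "NON-POLYMER": ChainTypeSketch.NON_POLYMER,
}


def from_string_sketch(str_value):
    """Look up a chain_type string; raise ValueError if it is unknown."""
    try:
        # Assumption: exact-match lookup, no case folding.
        return STRING_TO_ENUM[str_value]
    except KeyError:
        raise ValueError(f"Invalid chain type string: {str_value!r}") from None


def as_enum_sketch(value):
    """Coerce a str, int, or enum member to ChainTypeSketch."""
    if isinstance(value, ChainTypeSketch):
        return value  # already an enum member
    if isinstance(value, str):
        return from_string_sketch(value)
    if isinstance(value, int):
        return ChainTypeSketch(value)  # raises ValueError for unknown ints
    raise ValueError(f"Cannot convert {type(value).__name__} to a chain type")
```

Checking for an existing enum member first matters because IntEnum members are also ints; the order of the checks keeps members from being re-coerced.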
- static get_all_types() -> list[ChainType]
Get a list of all chain types.
- Returns:
List of all chain types.
- static get_chain_type_strings() -> list[str]
Get a list of all chain type strings.
- Returns:
List of all valid chain type strings.
- static get_non_polymers() -> list[ChainType]
Get a list of all non-polymer chain types.
- Returns:
List of non-polymer chain types.
- static get_nucleic_acids() -> list[ChainType]
Get a list of all nucleic acid chain types.
- Returns:
List of nucleic acid chain types.
- static get_polymers() -> list[ChainType]
Get a list of all polymer chain types.
- Returns:
List of polymer chain types.
- static get_proteins() -> list[ChainType]
Get a list of all protein chain types.
- Returns:
List of protein chain types.
- get_valid_chem_comp_types() -> set[str]
Get the set of valid chemical component types for a ChainType.
- Returns:
Set of valid chemical component types for this chain type.
- is_non_polymer() -> bool
Check if a ChainType is a non-polymer.
- Returns:
True if this chain type represents a non-polymer, False otherwise.
- is_nucleic_acid() -> bool
Check if a ChainType is a nucleic acid.
- Returns:
True if this chain type represents a nucleic acid, False otherwise.
- is_polymer() -> bool
Check if a ChainType is a polymer.
- Returns:
True if this chain type represents a polymer, False otherwise.
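The getters and predicates above can be read as membership tests against fixed groups of chain types. The sketch below illustrates that idea with a hypothetical `ChainTypeSketch` enum that copies the documented values and the POLYMERS, NON_POLYMERS, and NUCLEIC_ACIDS groupings from ChainTypeInfo. It assumes the predicates are plain membership checks, which may not match the library's internals.

```python
from enum import IntEnum


class ChainTypeSketch(IntEnum):
    """Hypothetical stand-in mirroring the documented ChainType values."""

    CYCLIC_PSEUDO_PEPTIDE = 0
    OTHER_POLYMER = 1
    PEPTIDE_NUCLEIC_ACID = 2
    DNA = 3
    DNA_RNA_HYBRID = 4
    POLYPEPTIDE_D = 5
    POLYPEPTIDE_L = 6
    RNA = 7
    NON_POLYMER = 8
    WATER = 9
    BRANCHED = 10
    MACROLIDE = 11

    # Assumption: each predicate is a membership test on a module-level tuple.
    def is_polymer(self) -> bool:
        return self in POLYMERS

    def is_non_polymer(self) -> bool:
        return self in NON_POLYMERS

    def is_nucleic_acid(self) -> bool:
        return self in NUCLEIC_ACIDS


C = ChainTypeSketch

# Groupings copied from the documented ChainTypeInfo constants.
POLYMERS = (
    C.CYCLIC_PSEUDO_PEPTIDE, C.OTHER_POLYMER, C.PEPTIDE_NUCLEIC_ACID, C.DNA,
    C.DNA_RNA_HYBRID, C.POLYPEPTIDE_D, C.POLYPEPTIDE_L, C.RNA,
)
NON_POLYMERS = (C.BRANCHED, C.MACROLIDE, C.NON_POLYMER, C.WATER)
NUCLEIC_ACIDS = (C.DNA, C.RNA, C.DNA_RNA_HYBRID)

# With these groupings, every member is either a polymer or a non-polymer.
every_type_classified = all(t.is_polymer() != t.is_non_polymer() for t in C)
```

With the documented tuples, the polymer and non-polymer groups are disjoint and together cover all twelve members, so exactly one of the two predicates holds for any chain type.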
- class atomworks.enums.ChainTypeInfo
Bases: object
Companion class containing metadata and helper methods for the ChainType enum.
This class should not be instantiated; it serves as a namespace for ChainType-related constants and utilities.
- ATOMS_AT_POLYMER_BOND: Final[mappingproxy[ChainType, tuple[str, str]]] = mappingproxy({<ChainType.POLYPEPTIDE_D: 5>: ('C', 'N'), <ChainType.POLYPEPTIDE_L: 6>: ('C', 'N'), <ChainType.CYCLIC_PSEUDO_PEPTIDE: 0>: ('C', 'N'), <ChainType.RNA: 7>: ("O3'", 'P'), <ChainType.DNA: 3>: ("O3'", 'P'), <ChainType.DNA_RNA_HYBRID: 4>: ("O3'", 'P')})
Mapping of chain types to the atoms that they link when part of a polymer.
- CHEM_COMP_TYPE_TO_ENUM: Final[mappingproxy[str, ChainType]] = mappingproxy({'PEPTIDE-LIKE': <ChainType.PEPTIDE_NUCLEIC_ACID: 2>, 'D-BETA-PEPTIDE, C-GAMMA LINKING': <ChainType.POLYPEPTIDE_D: 5>, 'L-PEPTIDE COOH CARBOXY TERMINUS': <ChainType.POLYPEPTIDE_L: 6>, 'L-PEPTIDE NH3 AMINO TERMINUS': <ChainType.POLYPEPTIDE_L: 6>, 'L-PEPTIDE LINKING': <ChainType.POLYPEPTIDE_L: 6>, 'L-GAMMA-PEPTIDE, C-DELTA LINKING': <ChainType.POLYPEPTIDE_L: 6>, 'L-BETA-PEPTIDE, C-GAMMA LINKING': <ChainType.POLYPEPTIDE_L: 6>, 'D-PEPTIDE LINKING': <ChainType.POLYPEPTIDE_D: 5>, 'D-GAMMA-PEPTIDE, C-DELTA LINKING': <ChainType.POLYPEPTIDE_D: 5>, 'D-PEPTIDE COOH CARBOXY TERMINUS': <ChainType.POLYPEPTIDE_D: 5>, 'D-PEPTIDE NH3 AMINO TERMINUS': <ChainType.POLYPEPTIDE_D: 5>, 'PEPTIDE LINKING': <ChainType.POLYPEPTIDE_L: 6>, 'DNA OH 3 PRIME TERMINUS': <ChainType.DNA_RNA_HYBRID: 4>, 'DNA OH 5 PRIME TERMINUS': <ChainType.DNA_RNA_HYBRID: 4>, 'DNA LINKING': <ChainType.DNA_RNA_HYBRID: 4>, 'RNA LINKING': <ChainType.RNA: 7>, 'L-RNA LINKING': <ChainType.RNA: 7>, 'RNA OH 5 PRIME TERMINUS': <ChainType.RNA: 7>, 'RNA OH 3 PRIME TERMINUS': <ChainType.RNA: 7>, 'L-DNA LINKING': <ChainType.DNA_RNA_HYBRID: 4>})
Mapping from chemical component types to ChainType enums.
- ENUM_TO_STRING: Final[mappingproxy[ChainType, str]] = mappingproxy({<ChainType.CYCLIC_PSEUDO_PEPTIDE: 0>: 'CYCLIC-PSEUDO-PEPTIDE', <ChainType.OTHER_POLYMER: 1>: 'OTHER', <ChainType.PEPTIDE_NUCLEIC_ACID: 2>: 'PEPTIDE NUCLEIC ACID', <ChainType.DNA: 3>: 'POLYDEOXYRIBONUCLEOTIDE', <ChainType.DNA_RNA_HYBRID: 4>: 'POLYDEOXYRIBONUCLEOTIDE/POLYRIBONUCLEOTIDE HYBRID', <ChainType.POLYPEPTIDE_D: 5>: 'POLYPEPTIDE(D)', <ChainType.POLYPEPTIDE_L: 6>: 'POLYPEPTIDE(L)', <ChainType.RNA: 7>: 'POLYRIBONUCLEOTIDE', <ChainType.BRANCHED: 10>: 'BRANCHED', <ChainType.MACROLIDE: 11>: 'MACROLIDE', <ChainType.NON_POLYMER: 8>: 'NON-POLYMER', <ChainType.WATER: 9>: 'WATER'})
Mapping from ChainType enums to chain_type strings.
- NON_POLYMERS: Final[tuple[ChainType, ...]] = (ChainType.BRANCHED, ChainType.MACROLIDE, ChainType.NON_POLYMER, ChainType.WATER)
- NUCLEIC_ACIDS: Final[tuple[ChainType, ...]] = (ChainType.DNA, ChainType.RNA, ChainType.DNA_RNA_HYBRID)
- POLYMERS: Final[tuple[ChainType, ...]] = (ChainType.CYCLIC_PSEUDO_PEPTIDE, ChainType.OTHER_POLYMER, ChainType.PEPTIDE_NUCLEIC_ACID, ChainType.DNA, ChainType.DNA_RNA_HYBRID, ChainType.POLYPEPTIDE_D, ChainType.POLYPEPTIDE_L, ChainType.RNA)
- PROTEINS: Final[tuple[ChainType, ...]] = (ChainType.POLYPEPTIDE_D, ChainType.POLYPEPTIDE_L, ChainType.CYCLIC_PSEUDO_PEPTIDE)
- STRING_TO_ENUM: Final[mappingproxy[str, ChainType]] = mappingproxy({'CYCLIC-PSEUDO-PEPTIDE': <ChainType.CYCLIC_PSEUDO_PEPTIDE: 0>, 'OTHER': <ChainType.OTHER_POLYMER: 1>, 'PEPTIDE NUCLEIC ACID': <ChainType.PEPTIDE_NUCLEIC_ACID: 2>, 'POLYDEOXYRIBONUCLEOTIDE': <ChainType.DNA: 3>, 'POLYDEOXYRIBONUCLEOTIDE/POLYRIBONUCLEOTIDE HYBRID': <ChainType.DNA_RNA_HYBRID: 4>, 'POLYPEPTIDE(D)': <ChainType.POLYPEPTIDE_D: 5>, 'POLYPEPTIDE(L)': <ChainType.POLYPEPTIDE_L: 6>, 'POLYRIBONUCLEOTIDE': <ChainType.RNA: 7>, 'BRANCHED': <ChainType.BRANCHED: 10>, 'MACROLIDE': <ChainType.MACROLIDE: 11>, 'NON-POLYMER': <ChainType.NON_POLYMER: 8>, 'WATER': <ChainType.WATER: 9>})
Mapping from chain_type strings to ChainType enums.
- VALID_CHEM_COMP_TYPES: Final[mappingproxy[ChainType, set[str]]] = mappingproxy({<ChainType.CYCLIC_PSEUDO_PEPTIDE: 0>: frozenset({'PEPTIDE-LIKE', 'D-BETA-PEPTIDE, C-GAMMA LINKING', 'L-PEPTIDE COOH CARBOXY TERMINUS', 'L-PEPTIDE NH3 AMINO TERMINUS', 'L-PEPTIDE LINKING', 'L-GAMMA-PEPTIDE, C-DELTA LINKING', 'L-BETA-PEPTIDE, C-GAMMA LINKING', 'D-PEPTIDE LINKING', 'D-GAMMA-PEPTIDE, C-DELTA LINKING', 'D-PEPTIDE COOH CARBOXY TERMINUS', 'D-PEPTIDE NH3 AMINO TERMINUS', 'PEPTIDE LINKING'}), <ChainType.PEPTIDE_NUCLEIC_ACID: 2>: frozenset({'D-BETA-PEPTIDE, C-GAMMA LINKING', 'L-PEPTIDE NH3 AMINO TERMINUS', 'L-PEPTIDE LINKING', 'DNA OH 3 PRIME TERMINUS', 'L-GAMMA-PEPTIDE, C-DELTA LINKING', 'DNA OH 5 PRIME TERMINUS', 'DNA LINKING', 'D-PEPTIDE NH3 AMINO TERMINUS', 'PEPTIDE LINKING', 'RNA LINKING', 'PEPTIDE-LIKE', 'L-PEPTIDE COOH CARBOXY TERMINUS', 'L-BETA-PEPTIDE, C-GAMMA LINKING', 'D-PEPTIDE LINKING', 'L-RNA LINKING', 'D-GAMMA-PEPTIDE, C-DELTA LINKING', 'RNA OH 5 PRIME TERMINUS', 'RNA OH 3 PRIME TERMINUS', 'D-PEPTIDE COOH CARBOXY TERMINUS', 'L-DNA LINKING'}), <ChainType.DNA: 3>: frozenset({'DNA OH 3 PRIME TERMINUS', 'L-DNA LINKING', 'DNA OH 5 PRIME TERMINUS', 'DNA LINKING'}), <ChainType.DNA_RNA_HYBRID: 4>: frozenset({'DNA OH 3 PRIME TERMINUS', 'L-RNA LINKING', 'DNA OH 5 PRIME TERMINUS', 'RNA LINKING', 'DNA LINKING', 'RNA OH 5 PRIME TERMINUS', 'RNA OH 3 PRIME TERMINUS', 'L-DNA LINKING'}), <ChainType.POLYPEPTIDE_D: 5>: frozenset({'D-BETA-PEPTIDE, C-GAMMA LINKING', 'D-GAMMA-PEPTIDE, C-DELTA LINKING', 'D-PEPTIDE COOH CARBOXY TERMINUS', 'D-PEPTIDE NH3 AMINO TERMINUS', 'PEPTIDE LINKING', 'D-PEPTIDE LINKING'}), <ChainType.POLYPEPTIDE_L: 6>: frozenset({'L-PEPTIDE COOH CARBOXY TERMINUS', 'L-PEPTIDE NH3 AMINO TERMINUS', 'L-PEPTIDE LINKING', 'L-GAMMA-PEPTIDE, C-DELTA LINKING', 'L-BETA-PEPTIDE, C-GAMMA LINKING', 'PEPTIDE LINKING'}), <ChainType.RNA: 7>: frozenset({'L-RNA LINKING', 'RNA OH 3 PRIME TERMINUS', 'RNA LINKING', 'RNA OH 5 PRIME TERMINUS'})})
Mapping from ChainType enums to valid chemical component types.
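The constants above are exposed as mappingproxy objects, which are read-only views over a dict. The sketch below shows what that means in practice, using a hypothetical three-member `ChainTypeSketch` enum and a few of the documented entries. Deriving the reverse mapping by swapping keys and values is a choice made for this sketch, not a statement about how atomworks builds STRING_TO_ENUM.

```python
from enum import IntEnum
from types import MappingProxyType


class ChainTypeSketch(IntEnum):
    """Hypothetical stand-in for ChainType, restricted to three members."""

    DNA = 3
    POLYPEPTIDE_L = 6
    WATER = 9


# Read-only view over a plain dict, like the documented constants.
ENUM_TO_STRING = MappingProxyType(
    {
        ChainTypeSketch.DNA: "POLYDEOXYRIBONUCLEOTIDE",
        ChainTypeSketch.POLYPEPTIDE_L: "POLYPEPTIDE(L)",
        ChainTypeSketch.WATER: "WATER",
    }
)

# In this sketch the reverse mapping is derived by swapping keys and values.
STRING_TO_ENUM = MappingProxyType({s: e for e, s in ENUM_TO_STRING.items()})

# Only chain types with a documented backbone link have an entry.
ATOMS_AT_POLYMER_BOND = MappingProxyType(
    {
        ChainTypeSketch.DNA: ("O3'", "P"),
        ChainTypeSketch.POLYPEPTIDE_L: ("C", "N"),
    }
)


def rejects_assignment(mapping) -> bool:
    """Return True if item assignment on the mapping raises TypeError."""
    try:
        mapping[ChainTypeSketch.WATER] = "H2O"
    except TypeError:
        return True
    return False


round_trip = STRING_TO_ENUM[ENUM_TO_STRING[ChainTypeSketch.DNA]]
water_bond = ATOMS_AT_POLYMER_BOND.get(ChainTypeSketch.WATER)  # no entry
is_read_only = rejects_assignment(ENUM_TO_STRING)
```

Because ATOMS_AT_POLYMER_BOND only covers chain types with a backbone link, callers that may see other chain types would likely use `.get` or check membership before indexing.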
- class atomworks.enums.GroundTruthConformerPolicy(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)
Bases: IntEnum
Enum for ground truth conformer policy.
- Possible values are:
REPLACE: Use the ground-truth coordinates as the reference conformer, replacing the RDKit-generated coordinates in place (and adding a flag to indicate that the coordinates were replaced)
ADD: Return an additional feature (with the same shape as ref_pos) containing the ground-truth coordinates
FALLBACK: Use the ground-truth coordinates only if our standard conformer generation pipeline fails (e.g., we cannot generate a conformer with RDKit, and the molecule is either not in the CCD or the CCD entry is invalid)
IGNORE: Do not use the ground-truth coordinates as the reference conformer under any circumstances
- ADD = 2
- FALLBACK = 3
- IGNORE = 4
- REPLACE = 1
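The four policies can be read as a small decision table over two inputs: the generated conformer (possibly missing) and the ground-truth coordinates. The sketch below is one hypothetical way to express that table. `GroundTruthConformerPolicySketch` and `choose_reference` are illustrative names, coordinates are plain lists of tuples, and setting the flag in the FALLBACK case is an assumption not stated in the description above.

```python
from enum import IntEnum


class GroundTruthConformerPolicySketch(IntEnum):
    """Hypothetical stand-in with the documented member values."""

    REPLACE = 1
    ADD = 2
    FALLBACK = 3
    IGNORE = 4


def choose_reference(policy, generated, ground_truth):
    """Return (ref_pos, extra_feature, used_ground_truth) for one molecule.

    `generated` is None when conformer generation failed.
    """
    P = GroundTruthConformerPolicySketch
    policy = P(policy)  # accept plain ints as well as enum members
    if policy is P.REPLACE:
        # Ground truth becomes the reference; flag the replacement.
        return ground_truth, None, True
    if policy is P.ADD:
        # Keep the generated reference; expose ground truth as an extra feature.
        return generated, ground_truth, False
    if policy is P.FALLBACK and generated is None:
        # Assumption: the flag is also set when the fallback is taken.
        return ground_truth, None, True
    # IGNORE, or FALLBACK when generation succeeded.
    return generated, None, False
```

For example, `choose_reference(3, None, coords)` falls back to the ground truth, while the same call with policy `4` leaves the reference empty.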
- class atomworks.enums.HydrogenPolicy(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)
Bases: StrEnum
Enum for hydrogen policy.
- Possible values are:
KEEP: Keep the hydrogens as they are
REMOVE: Remove the hydrogens
INFER: Infer the hydrogens from the atom array
- INFER = 'infer'
- KEEP = 'keep'
- REMOVE = 'remove'
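Because the members are strings, a policy can be built directly from a configuration value and compared against plain strings. The sketch below uses a hypothetical `HydrogenPolicySketch` based on `str` and `Enum` rather than `StrEnum`, so that it also runs on Python versions before 3.11. Lookup by value and equality behave the same way, although `str()` of a member differs between the two base classes.

```python
from enum import Enum


class HydrogenPolicySketch(str, Enum):
    """Hypothetical stand-in with the documented member values."""

    INFER = "infer"
    KEEP = "keep"
    REMOVE = "remove"


def parse_policy(raw):
    """Turn a config string into a policy; unknown strings raise ValueError."""
    return HydrogenPolicySketch(raw)


policy = parse_policy("remove")
matches_plain_string = policy == "remove"  # str mixin allows this comparison
```

Passing an unrecognised string such as `"strip"` raises ValueError, which is the standard behaviour of enum lookup by value.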