Enumerations

Enums used across atomworks.

class atomworks.enums.ChainType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: IntEnum

IntEnum representing the type of a chain in an RCSB mmCIF file from the Protein Data Bank (PDB).

Useful constants relating to ChainType are defined in ChainTypeInfo.

Note

The chain type fields in the PDB are not stable; take note of the specific dictionary versions used (last updated November 2024).

References

RCSB mmCIF Dictionary - entity.type
RCSB mmCIF Dictionary - entity_poly.type

BRANCHED = 10
CYCLIC_PSEUDO_PEPTIDE = 0
DNA = 3
DNA_RNA_HYBRID = 4
MACROLIDE = 11
NON_POLYMER = 8
OTHER_POLYMER = 1
PEPTIDE_NUCLEIC_ACID = 2
POLYPEPTIDE_D = 5
POLYPEPTIDE_L = 6
RNA = 7
WATER = 9

static as_enum(value: str | int | ChainType) → ChainType

Convert a string, int, or ChainType to a ChainType enum.

Parameters:

value – The value to convert to a ChainType enum.

Returns:

The corresponding ChainType enum.

Raises:

ValueError – If the value cannot be converted to a ChainType.
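The conversion described above can be illustrated with a small, self-contained stand-in. This is a sketch, not the atomworks implementation: the local `ChainType` copies only three members from the listing above, the `STRING_TO_ENUM` dictionary is a subset of the mapping documented under `ChainTypeInfo`, and upper-casing the input string before lookup is an assumption.

```python
from enum import IntEnum


class ChainType(IntEnum):
    """Local stand-in copying three members from the documented enum."""

    DNA = 3
    POLYPEPTIDE_L = 6
    RNA = 7


# Subset of the documented STRING_TO_ENUM mapping.
STRING_TO_ENUM = {
    "POLYDEOXYRIBONUCLEOTIDE": ChainType.DNA,
    "POLYPEPTIDE(L)": ChainType.POLYPEPTIDE_L,
    "POLYRIBONUCLEOTIDE": ChainType.RNA,
}


def as_enum(value):
    """Hypothetical re-implementation: accept a ChainType, an int, or a string."""
    # Check for ChainType first, because IntEnum members are also ints.
    if isinstance(value, ChainType):
        return value
    if isinstance(value, int):
        return ChainType(value)  # raises ValueError for unknown integers
    if isinstance(value, str):
        try:
            # Assumption: lookups are case-insensitive via upper-casing.
            return STRING_TO_ENUM[value.upper()]
        except KeyError:
            raise ValueError(f"Invalid chain type: {value!r}") from None
    raise ValueError(f"Cannot convert {type(value).__name__} to ChainType")
```

With the real package installed, the equivalent call would presumably be `ChainType.as_enum(...)`, as documented above.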

classmethod from_string(str_value: str) → ChainType

Convert a string to a ChainType enum.

Parameters:

str_value – The string value to convert.

Returns:

The corresponding ChainType enum.

Raises:

ValueError – If the string value is not a valid chain type.

static get_all_types() → list[ChainType]

Get a list of all chain types.

Returns:

List of all chain types.

static get_chain_type_strings() → list[str]

Get a list of all chain type strings.

Returns:

List of all valid chain type strings.

static get_non_polymers() → list[ChainType]

Get a list of all non-polymer chain types.

Returns:

List of non-polymer chain types.

static get_nucleic_acids() → list[ChainType]

Get a list of all nucleic acid chain types.

Returns:

List of nucleic acid chain types.

static get_polymers() → list[ChainType]

Get a list of all polymer chain types.

Returns:

List of polymer chain types.

static get_proteins() → list[ChainType]

Get a list of all protein chain types.

Returns:

List of protein chain types.

get_valid_chem_comp_types() → set[str]

Get the set of valid chemical component types for a ChainType.

Returns:

Set of valid chemical component types for this chain type.

is_non_polymer() → bool

Check if a ChainType is a non-polymer.

Returns:

True if this chain type represents a non-polymer, False otherwise.

is_nucleic_acid() → bool

Check if a ChainType is a nucleic acid.

Returns:

True if this chain type represents a nucleic acid, False otherwise.

is_polymer() → bool

Check if a ChainType is a polymer.

Returns:

True if this chain type represents a polymer, False otherwise.

is_protein() → bool

Check if a ChainType is a protein.

Returns:

True if this chain type represents a protein, False otherwise.
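The four predicates above line up with the category tuples listed under `ChainTypeInfo` (`POLYMERS`, `NON_POLYMERS`, `NUCLEIC_ACIDS`, `PROTEINS`). The sketch below assumes each predicate is a simple membership test against the matching tuple, which the documentation does not state explicitly. It uses a local stand-in enum with four members, and `classify` is a hypothetical helper.

```python
from enum import IntEnum


class ChainType(IntEnum):
    """Local stand-in with a few members copied from the documented enum."""

    CYCLIC_PSEUDO_PEPTIDE = 0
    DNA = 3
    POLYPEPTIDE_L = 6
    WATER = 9


# Subsets of the documented ChainTypeInfo tuples, restricted to the members above.
POLYMERS = (ChainType.CYCLIC_PSEUDO_PEPTIDE, ChainType.DNA, ChainType.POLYPEPTIDE_L)
NON_POLYMERS = (ChainType.WATER,)
NUCLEIC_ACIDS = (ChainType.DNA,)
PROTEINS = (ChainType.POLYPEPTIDE_L, ChainType.CYCLIC_PSEUDO_PEPTIDE)


def classify(chain_type):
    """Hypothetical helper: report which categories a chain type belongs to."""
    return {
        "polymer": chain_type in POLYMERS,
        "non_polymer": chain_type in NON_POLYMERS,
        "nucleic_acid": chain_type in NUCLEIC_ACIDS,
        "protein": chain_type in PROTEINS,
    }
```

Note that, per the documented tuples, a cyclic pseudo-peptide counts as both a polymer and a protein, while water is only a non-polymer.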

to_string() → str

Convert a ChainType enum to a string.

Note

Returns an uppercase string (e.g., “POLYPEPTIDE(D)” rather than “polypeptide(D)”).

Returns:

Uppercase string representation of the chain type.
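Because the string form is uppercase, a value read from an mmCIF file in its usual mixed case does not round-trip unchanged. The following sketch shows the effect with local stand-ins: the enum and `ENUM_TO_STRING` copy three entries from the documented mapping, and upper-casing the raw string before lookup is an assumption about how parsing works.

```python
from enum import IntEnum
from types import MappingProxyType


class ChainType(IntEnum):
    """Local stand-in with three members copied from the documented enum."""

    POLYPEPTIDE_D = 5
    POLYPEPTIDE_L = 6
    NON_POLYMER = 8


# Subset of the documented ENUM_TO_STRING mapping (read-only, like the original).
ENUM_TO_STRING = MappingProxyType({
    ChainType.POLYPEPTIDE_D: "POLYPEPTIDE(D)",
    ChainType.POLYPEPTIDE_L: "POLYPEPTIDE(L)",
    ChainType.NON_POLYMER: "NON-POLYMER",
})
# Inverse mapping, built the same way STRING_TO_ENUM could be derived.
STRING_TO_ENUM = MappingProxyType({s: e for e, s in ENUM_TO_STRING.items()})

raw = "polypeptide(D)"                    # mixed case, as it may appear in a file
chain_type = STRING_TO_ENUM[raw.upper()]  # assumption: normalize before lookup
round_tripped = ENUM_TO_STRING[chain_type]
```

Code that compares chain type strings should therefore normalize case first, or compare enum members instead of strings.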

class atomworks.enums.ChainTypeInfo

Bases: object

Companion class containing metadata and helper methods for ChainType enum.

This class should not be instantiated; it serves only as a namespace for ChainType-related constants and utilities.

ATOMS_AT_POLYMER_BOND: Final[mappingproxy[ChainType, tuple[str, str]]] = mappingproxy({<ChainType.POLYPEPTIDE_D: 5>: ('C', 'N'), <ChainType.POLYPEPTIDE_L: 6>: ('C', 'N'), <ChainType.CYCLIC_PSEUDO_PEPTIDE: 0>: ('C', 'N'), <ChainType.RNA: 7>: ("O3'", 'P'), <ChainType.DNA: 3>: ("O3'", 'P'), <ChainType.DNA_RNA_HYBRID: 4>: ("O3'", 'P')})

Mapping of chain types to the atoms that they link when part of a polymer.
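As an illustration of how such a mapping might be consumed, the sketch below looks up the bonding atom pair for a chain type. It uses a local stand-in enum and a three-entry copy of the mapping. Reading the tuple as (atom on residue i, atom on residue i+1) matches the chemistry of peptide bonds (C to N) and phosphodiester links (O3' to P), but that ordering is an interpretation rather than something the documentation states; `describe_polymer_bond` is a hypothetical helper.

```python
from enum import IntEnum


class ChainType(IntEnum):
    """Local stand-in with a few members copied from the documented enum."""

    DNA = 3
    POLYPEPTIDE_L = 6
    RNA = 7
    NON_POLYMER = 8


# Subset of the documented ATOMS_AT_POLYMER_BOND mapping.
ATOMS_AT_POLYMER_BOND = {
    ChainType.POLYPEPTIDE_L: ("C", "N"),
    ChainType.DNA: ("O3'", "P"),
    ChainType.RNA: ("O3'", "P"),
}


def describe_polymer_bond(chain_type):
    """Hypothetical helper: describe the inter-residue bond, or None if undefined."""
    atoms = ATOMS_AT_POLYMER_BOND.get(chain_type)
    if atoms is None:
        # Chain types without an entry (e.g. non-polymers) have no polymer bond.
        return None
    first, second = atoms
    # Assumption: first atom is on residue i, second on residue i+1.
    return f"{first} of residue i bonds to {second} of residue i+1"
```

Chain types that are absent from the mapping, such as `NON_POLYMER`, yield `None` here rather than raising.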

CHEM_COMP_TYPE_TO_ENUM: Final[mappingproxy[str, ChainType]] = mappingproxy({'PEPTIDE-LIKE': <ChainType.PEPTIDE_NUCLEIC_ACID: 2>, 'D-BETA-PEPTIDE, C-GAMMA LINKING': <ChainType.POLYPEPTIDE_D: 5>, 'L-PEPTIDE COOH CARBOXY TERMINUS': <ChainType.POLYPEPTIDE_L: 6>, 'L-PEPTIDE NH3 AMINO TERMINUS': <ChainType.POLYPEPTIDE_L: 6>, 'L-PEPTIDE LINKING': <ChainType.POLYPEPTIDE_L: 6>, 'L-GAMMA-PEPTIDE, C-DELTA LINKING': <ChainType.POLYPEPTIDE_L: 6>, 'L-BETA-PEPTIDE, C-GAMMA LINKING': <ChainType.POLYPEPTIDE_L: 6>, 'D-PEPTIDE LINKING': <ChainType.POLYPEPTIDE_D: 5>, 'D-GAMMA-PEPTIDE, C-DELTA LINKING': <ChainType.POLYPEPTIDE_D: 5>, 'D-PEPTIDE COOH CARBOXY TERMINUS': <ChainType.POLYPEPTIDE_D: 5>, 'D-PEPTIDE NH3 AMINO TERMINUS': <ChainType.POLYPEPTIDE_D: 5>, 'PEPTIDE LINKING': <ChainType.POLYPEPTIDE_L: 6>, 'DNA OH 3 PRIME TERMINUS': <ChainType.DNA_RNA_HYBRID: 4>, 'DNA OH 5 PRIME TERMINUS': <ChainType.DNA_RNA_HYBRID: 4>, 'DNA LINKING': <ChainType.DNA_RNA_HYBRID: 4>, 'RNA LINKING': <ChainType.RNA: 7>, 'L-RNA LINKING': <ChainType.RNA: 7>, 'RNA OH 5 PRIME TERMINUS': <ChainType.RNA: 7>, 'RNA OH 3 PRIME TERMINUS': <ChainType.RNA: 7>, 'L-DNA LINKING': <ChainType.DNA_RNA_HYBRID: 4>})

Mapping from chemical component types to ChainType enums.

ENUM_TO_STRING: Final[mappingproxy[ChainType, str]] = mappingproxy({<ChainType.CYCLIC_PSEUDO_PEPTIDE: 0>: 'CYCLIC-PSEUDO-PEPTIDE', <ChainType.OTHER_POLYMER: 1>: 'OTHER', <ChainType.PEPTIDE_NUCLEIC_ACID: 2>: 'PEPTIDE NUCLEIC ACID', <ChainType.DNA: 3>: 'POLYDEOXYRIBONUCLEOTIDE', <ChainType.DNA_RNA_HYBRID: 4>: 'POLYDEOXYRIBONUCLEOTIDE/POLYRIBONUCLEOTIDE HYBRID', <ChainType.POLYPEPTIDE_D: 5>: 'POLYPEPTIDE(D)', <ChainType.POLYPEPTIDE_L: 6>: 'POLYPEPTIDE(L)', <ChainType.RNA: 7>: 'POLYRIBONUCLEOTIDE', <ChainType.BRANCHED: 10>: 'BRANCHED', <ChainType.MACROLIDE: 11>: 'MACROLIDE', <ChainType.NON_POLYMER: 8>: 'NON-POLYMER', <ChainType.WATER: 9>: 'WATER'})

Mapping from ChainType enums to chain_type strings.

NON_POLYMERS: Final[tuple[ChainType, ...]] = (ChainType.BRANCHED, ChainType.MACROLIDE, ChainType.NON_POLYMER, ChainType.WATER)
NUCLEIC_ACIDS: Final[tuple[ChainType, ...]] = (ChainType.DNA, ChainType.RNA, ChainType.DNA_RNA_HYBRID)
POLYMERS: Final[tuple[ChainType, ...]] = (ChainType.CYCLIC_PSEUDO_PEPTIDE, ChainType.OTHER_POLYMER, ChainType.PEPTIDE_NUCLEIC_ACID, ChainType.DNA, ChainType.DNA_RNA_HYBRID, ChainType.POLYPEPTIDE_D, ChainType.POLYPEPTIDE_L, ChainType.RNA)
PROTEINS: Final[tuple[ChainType, ...]] = (ChainType.POLYPEPTIDE_D, ChainType.POLYPEPTIDE_L, ChainType.CYCLIC_PSEUDO_PEPTIDE)
STRING_TO_ENUM: Final[mappingproxy[str, ChainType]] = mappingproxy({'CYCLIC-PSEUDO-PEPTIDE': <ChainType.CYCLIC_PSEUDO_PEPTIDE: 0>, 'OTHER': <ChainType.OTHER_POLYMER: 1>, 'PEPTIDE NUCLEIC ACID': <ChainType.PEPTIDE_NUCLEIC_ACID: 2>, 'POLYDEOXYRIBONUCLEOTIDE': <ChainType.DNA: 3>, 'POLYDEOXYRIBONUCLEOTIDE/POLYRIBONUCLEOTIDE HYBRID': <ChainType.DNA_RNA_HYBRID: 4>, 'POLYPEPTIDE(D)': <ChainType.POLYPEPTIDE_D: 5>, 'POLYPEPTIDE(L)': <ChainType.POLYPEPTIDE_L: 6>, 'POLYRIBONUCLEOTIDE': <ChainType.RNA: 7>, 'BRANCHED': <ChainType.BRANCHED: 10>, 'MACROLIDE': <ChainType.MACROLIDE: 11>, 'NON-POLYMER': <ChainType.NON_POLYMER: 8>, 'WATER': <ChainType.WATER: 9>})

Mapping from chain_type strings to ChainType enums.

VALID_CHEM_COMP_TYPES: Final[mappingproxy[ChainType, set[str]]] = mappingproxy({<ChainType.CYCLIC_PSEUDO_PEPTIDE: 0>: frozenset({'PEPTIDE-LIKE', 'D-BETA-PEPTIDE, C-GAMMA LINKING', 'L-PEPTIDE COOH CARBOXY TERMINUS', 'L-PEPTIDE NH3 AMINO TERMINUS', 'L-PEPTIDE LINKING', 'L-GAMMA-PEPTIDE, C-DELTA LINKING', 'L-BETA-PEPTIDE, C-GAMMA LINKING', 'D-PEPTIDE LINKING', 'D-GAMMA-PEPTIDE, C-DELTA LINKING', 'D-PEPTIDE COOH CARBOXY TERMINUS', 'D-PEPTIDE NH3 AMINO TERMINUS', 'PEPTIDE LINKING'}), <ChainType.PEPTIDE_NUCLEIC_ACID: 2>: frozenset({'D-BETA-PEPTIDE, C-GAMMA LINKING', 'L-PEPTIDE NH3 AMINO TERMINUS', 'L-PEPTIDE LINKING', 'DNA OH 3 PRIME TERMINUS', 'L-GAMMA-PEPTIDE, C-DELTA LINKING', 'DNA OH 5 PRIME TERMINUS', 'DNA LINKING', 'D-PEPTIDE NH3 AMINO TERMINUS', 'PEPTIDE LINKING', 'RNA LINKING', 'PEPTIDE-LIKE', 'L-PEPTIDE COOH CARBOXY TERMINUS', 'L-BETA-PEPTIDE, C-GAMMA LINKING', 'D-PEPTIDE LINKING', 'L-RNA LINKING', 'D-GAMMA-PEPTIDE, C-DELTA LINKING', 'RNA OH 5 PRIME TERMINUS', 'RNA OH 3 PRIME TERMINUS', 'D-PEPTIDE COOH CARBOXY TERMINUS', 'L-DNA LINKING'}), <ChainType.DNA: 3>: frozenset({'DNA OH 3 PRIME TERMINUS', 'L-DNA LINKING', 'DNA OH 5 PRIME TERMINUS', 'DNA LINKING'}), <ChainType.DNA_RNA_HYBRID: 4>: frozenset({'DNA OH 3 PRIME TERMINUS', 'L-RNA LINKING', 'DNA OH 5 PRIME TERMINUS', 'RNA LINKING', 'DNA LINKING', 'RNA OH 5 PRIME TERMINUS', 'RNA OH 3 PRIME TERMINUS', 'L-DNA LINKING'}), <ChainType.POLYPEPTIDE_D: 5>: frozenset({'D-BETA-PEPTIDE, C-GAMMA LINKING', 'D-GAMMA-PEPTIDE, C-DELTA LINKING', 'D-PEPTIDE COOH CARBOXY TERMINUS', 'D-PEPTIDE NH3 AMINO TERMINUS', 'PEPTIDE LINKING', 'D-PEPTIDE LINKING'}), <ChainType.POLYPEPTIDE_L: 6>: frozenset({'L-PEPTIDE COOH CARBOXY TERMINUS', 'L-PEPTIDE NH3 AMINO TERMINUS', 'L-PEPTIDE LINKING', 'L-GAMMA-PEPTIDE, C-DELTA LINKING', 'L-BETA-PEPTIDE, C-GAMMA LINKING', 'PEPTIDE LINKING'}), <ChainType.RNA: 7>: frozenset({'L-RNA LINKING', 'RNA OH 3 PRIME TERMINUS', 'RNA LINKING', 'RNA OH 5 PRIME TERMINUS'})})

Mapping from ChainType enums to valid chemical component types.
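One plausible use of this mapping is to check whether a residue's chemical component type is compatible with the chain it sits in. The sketch below copies the documented DNA and RNA entries into a local stand-in. `is_compatible` is a hypothetical helper, and both the upper-casing of the input and the treatment of unlisted chain types as having no valid component types are assumptions.

```python
from enum import IntEnum


class ChainType(IntEnum):
    """Local stand-in with three members copied from the documented enum."""

    DNA = 3
    RNA = 7
    WATER = 9


# DNA and RNA entries copied from the documented VALID_CHEM_COMP_TYPES mapping.
VALID_CHEM_COMP_TYPES = {
    ChainType.DNA: frozenset({
        "DNA LINKING",
        "L-DNA LINKING",
        "DNA OH 3 PRIME TERMINUS",
        "DNA OH 5 PRIME TERMINUS",
    }),
    ChainType.RNA: frozenset({
        "RNA LINKING",
        "L-RNA LINKING",
        "RNA OH 3 PRIME TERMINUS",
        "RNA OH 5 PRIME TERMINUS",
    }),
}


def is_compatible(chain_type, chem_comp_type):
    """Hypothetical helper: is this chemical component type valid for the chain?"""
    # Assumption: chain types without an entry accept no component types.
    valid = VALID_CHEM_COMP_TYPES.get(chain_type, frozenset())
    # Assumption: component types are compared in uppercase.
    return chem_comp_type.upper() in valid
```

For example, a residue typed "RNA linking" would be flagged as incompatible with a `DNA` chain, although the documented mapping does allow it in a `DNA_RNA_HYBRID` chain.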

class atomworks.enums.GroundTruthConformerPolicy(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: IntEnum

Enum for ground truth conformer policy.

Possible values are:
  • REPLACE: Use the ground-truth coordinates as the reference conformer, replacing the coordinates generated by RDKit in-place (and add a flag to indicate that the coordinates were replaced)

  • ADD: Return an additional feature (with the same shape as ref_pos) containing the ground-truth coordinates

  • FALLBACK: Use the ground-truth coordinates only if our standard conformer generation pipeline fails (e.g., we cannot generate a conformer with RDKit, and the molecule is either not in the CCD or the CCD entry is invalid)

  • IGNORE: Do not use the ground-truth coordinates as the reference conformer under any circumstances

ADD = 2
FALLBACK = 3
IGNORE = 4
REPLACE = 1
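The four policies can be read as a small decision table over two inputs: the generated conformer (possibly missing) and the ground-truth coordinates. The sketch below encodes that reading with a local stand-in enum. `select_reference` and its return shape are hypothetical and are not part of atomworks. In particular, setting the "replaced" flag under `FALLBACK` is an assumption, since the description above only mentions the flag for `REPLACE`.

```python
from enum import IntEnum


class GroundTruthConformerPolicy(IntEnum):
    """Local stand-in mirroring the documented values."""

    REPLACE = 1
    ADD = 2
    FALLBACK = 3
    IGNORE = 4


def select_reference(policy, generated, ground_truth):
    """Hypothetical helper returning (ref_pos, extra_feature, replaced_flag).

    `generated` is None when conformer generation failed.
    """
    if policy is GroundTruthConformerPolicy.REPLACE:
        # Ground truth overwrites the generated coordinates; flag the swap.
        return ground_truth, None, True
    if policy is GroundTruthConformerPolicy.ADD:
        # Keep the generated conformer and expose ground truth separately.
        return generated, ground_truth, False
    if policy is GroundTruthConformerPolicy.FALLBACK:
        if generated is None:
            # Assumption: a fallback also counts as a replacement.
            return ground_truth, None, True
        return generated, None, False
    # IGNORE: never use the ground truth, even if generation failed.
    return generated, None, False
```

In this reading, `IGNORE` is the only policy under which a failed generation leaves the reference conformer empty.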

class atomworks.enums.HydrogenPolicy(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: StrEnum

Enum for hydrogen policy.

Possible values are:
  • KEEP: Keep the hydrogens as they are

  • REMOVE: Remove the hydrogens

  • INFER: Infer the hydrogens from the atom array

INFER = 'infer'
KEEP = 'keep'
REMOVE = 'remove'
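Because the members are strings, a policy can be parsed directly from a configuration value and compared against plain strings. The sketch below uses a local stand-in built on `str` and `Enum` so that it also runs on Python versions older than 3.11, where `StrEnum` is unavailable. `apply_policy` is a hypothetical helper that operates on a plain list of element symbols rather than on a real atom array, and it leaves `INFER` unimplemented because inference needs actual structural data.

```python
from enum import Enum


class HydrogenPolicy(str, Enum):
    """Local stand-in; the documented class derives from StrEnum (Python 3.11+)."""

    KEEP = "keep"
    REMOVE = "remove"
    INFER = "infer"


def apply_policy(elements, policy):
    """Hypothetical helper: apply a hydrogen policy to a list of element symbols."""
    policy = HydrogenPolicy(policy)  # accepts a member or its string value
    if policy is HydrogenPolicy.KEEP:
        return list(elements)
    if policy is HydrogenPolicy.REMOVE:
        return [element for element in elements if element != "H"]
    # INFER would require bonding information that this sketch does not model.
    raise NotImplementedError("hydrogen inference is outside this sketch")
```

Passing the string `"remove"` and passing `HydrogenPolicy.REMOVE` behave the same in this sketch, since the enum constructor accepts either form.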