Selection Utilities

Tools for atom and segment selection on AtomArray and AtomArrayStack.

Provides helpers to compute segment boundaries and apply expressive selection syntax to structures.

Key public objects:
  • AtomSelection
  • AtomSelectionStack
  • SegmentSlice

See individual docstrings for usage and examples.

class atomworks.io.utils.selection.AtomSelection(chain_id: str = '*', res_name: str = '*', res_id: int | str = '*', atom_name: str = '*', transformation_id: int | str = '*')

Bases: object

Represent a selection of atoms in a molecular structure.

classmethod from_pymol_str(pymol_string: str) -> AtomSelection

Create a selection from a PyMOL atom label string.

PyMOL strings have the form CHAIN/RES`RESID/ATOM and do not support transformation_id.

As an extension to the default PyMOL syntax, "*" may be used as a wildcard that selects everything at a given level of granularity (chain, residue, or atom).

Example

>>> # Selects the OD2 atom of the ASP residue at chain A, residue index 37
>>> AtomSelection.from_pymol_str("A/ASP`37/OD2")
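
To make the label format concrete, here is a minimal, library-free sketch of how a CHAIN/RES`RESID/ATOM string could be split into selection fields. The helper name parse_pymol_label and the returned dict layout are hypothetical and are not part of atomworks; the actual parsing in from_pymol_str() may differ, for example in how it validates input.

```python
def parse_pymol_label(label):
    """Split a CHAIN/RES`RESID/ATOM label into selection fields (sketch only)."""
    parts = label.split("/")
    if len(parts) != 3:
        raise ValueError(f"expected CHAIN/RES`RESID/ATOM, got {label!r}")
    chain_id, residue, atom_name = (p.strip() for p in parts)
    # The backtick separates the residue name from the residue index.
    # Assumption of this sketch: the backtick is always required.
    res_name, sep, res_id = residue.partition("`")
    if not sep:
        raise ValueError(f"missing backtick in residue field {residue!r}")
    return {
        "chain_id": chain_id,
        "res_name": res_name,
        # Keep "*" as a wildcard; otherwise store the index as an int.
        "res_id": res_id if res_id == "*" else int(res_id),
        "atom_name": atom_name,
        # PyMOL labels cannot express a transformation, so it stays a wildcard.
        "transformation_id": "*",
    }


fields = parse_pymol_label("A/ASP`37/OD2")
print(fields)
```

The transformation_id entry is always "*" here, mirroring the statement that PyMOL strings do not support that field.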

classmethod from_selection_str(selection_string: str) -> AtomSelection

Create a selection from CHAIN/RES/RESID/ATOM/TRANSFORM syntax.

"*" acts as a wildcard for any field. Trailing fields may be omitted and default to "*".

Examples

>>> # Selects the CA atom of the ALA residue at chain A, residue index 1
>>> AtomSelection.from_selection_str("A/ALA/1/CA")
>>> # Selects the CB atom of the ALA residue in any chain at any residue index
>>> AtomSelection.from_selection_str("*/ALA/*/CB")
>>> # Selects all atoms of every ALA residue in chain A
>>> AtomSelection.from_selection_str("A/ALA/")
>>> # Selects the CA atom of the ALA residue at chain A, residue index 1, transformation index 1
>>> AtomSelection.from_selection_str("A/ALA/1/CA/1")
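
The defaulting rule for omitted trailing fields can be sketched without the library. The function parse_selection and the FIELDS tuple below are hypothetical names used only for illustration. Treating an empty token (as in "A/ALA/") as a wildcard is an assumption that is consistent with the examples above but is not confirmed against the implementation.

```python
FIELDS = ("chain_id", "res_name", "res_id", "atom_name", "transformation_id")


def parse_selection(selection):
    """Map a CHAIN/RES/RESID/ATOM/TRANSFORM string to a field dict (sketch only)."""
    tokens = selection.split("/")
    if len(tokens) > len(FIELDS):
        raise ValueError(f"too many fields in {selection!r}")
    # Assumption: an empty token, such as the one after a trailing "/", means "*".
    tokens = [t.strip() or "*" for t in tokens]
    # Missing trailing fields default to the wildcard.
    tokens += ["*"] * (len(FIELDS) - len(tokens))
    return dict(zip(FIELDS, tokens))


print(parse_selection("A/ALA/1/CA"))
print(parse_selection("A/ALA/"))
```

All values are kept as strings in this sketch; the class signature above shows that the real object also accepts int for res_id and transformation_id.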

get_idxs(atom_array: AtomArray) -> ndarray

Get the indices of atoms selected by this AtomSelection.

get_mask(atom_array: AtomArray) -> ndarray

Create a boolean mask using this AtomSelection on an AtomArray.
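
The relationship between a boolean mask and the selected indices can be illustrated with plain Python lists standing in for an AtomArray. Everything in this sketch (the atom dicts and the helpers field_matches, selection_mask, selection_idxs) is hypothetical; the real methods operate on biotite structures and return NumPy arrays.

```python
def field_matches(pattern, value):
    """A field matches if the pattern is the wildcard or equals the value."""
    return pattern == "*" or str(pattern) == str(value)


def selection_mask(selection, atoms):
    """Return one boolean per atom: True where every field matches."""
    return [
        all(field_matches(pattern, atom.get(name)) for name, pattern in selection.items())
        for atom in atoms
    ]


def selection_idxs(selection, atoms):
    """Return the positions of the atoms where the mask is True."""
    return [i for i, hit in enumerate(selection_mask(selection, atoms)) if hit]


# Toy stand-in for an AtomArray: one dict per atom.
atoms = [
    {"chain_id": "A", "res_name": "ALA", "res_id": 1, "atom_name": "N"},
    {"chain_id": "A", "res_name": "ALA", "res_id": 1, "atom_name": "CA"},
    {"chain_id": "A", "res_name": "GLY", "res_id": 2, "atom_name": "CA"},
    {"chain_id": "B", "res_name": "ALA", "res_id": 1, "atom_name": "CA"},
]
selection = {"chain_id": "*", "res_name": "ALA", "res_id": "*", "atom_name": "CA"}

print(selection_mask(selection, atoms))  # [False, True, False, True]
print(selection_idxs(selection, atoms))  # [1, 3]
```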

class atomworks.io.utils.selection.AtomSelectionStack(selections: list[AtomSelection])

Bases: object

Manage multiple AtomSelection objects as a unioned query.

Supports ranges and comma-separated tokens via from_query() and contiguous ranges via from_contig().

classmethod from_contig(contig: str) -> AtomSelectionStack

Create a stack from contiguous residue ranges.

Contig strings specify inclusive residue index ranges, e.g. "A1-2" or "A1-2, B3-10".

Parameters:

contig – Contiguous residue selection string like "A1-2, B3-10".

Examples

>>> # Selects residues 1..2 in chain A
>>> AtomSelectionStack.from_contig("A1-2")
>>> # Selects residues 1..2 in chain A and 3..10 in chain B
>>> AtomSelectionStack.from_contig("A1-2, B3-10")
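
As a rough illustration of the contig format, the sketch below splits a contig string into (chain, start, stop) triples with inclusive bounds. The name parse_contig and the regular expression are assumptions made for this example: they accept only alphabetic chain IDs and non-negative residue indices, which may be narrower than what from_contig() supports.

```python
import re

# Assumption: chain IDs are letters and residue indices are non-negative integers.
_CONTIG_RE = re.compile(r"^([A-Za-z]+)(\d+)-(\d+)$")


def parse_contig(contig):
    """Parse 'A1-2, B3-10' into [(chain, start, stop), ...] (sketch only)."""
    segments = []
    for token in contig.split(","):
        token = token.strip()
        match = _CONTIG_RE.match(token)
        if match is None:
            raise ValueError(f"cannot parse contig token {token!r}")
        chain = match.group(1)
        start, stop = int(match.group(2)), int(match.group(3))
        if start > stop:
            raise ValueError(f"range start exceeds stop in {token!r}")
        segments.append((chain, start, stop))
    return segments


segments = parse_contig("A1-2, B3-10")
print(segments)  # [('A', 1, 2), ('B', 3, 10)]

# Ranges are inclusive, so the stop index itself is part of the selection.
chain, start, stop = segments[0]
print(list(range(start, stop + 1)))  # [1, 2]
```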

See also

from_query()

classmethod from_query(query: str | list[str]) -> AtomSelectionStack

Create a stack from extended query syntax with ranges.

Extended syntax overview:
  • Chains: A (all atoms in chain A), A/ALA (all ALA residues in chain A)
  • Ranges (res_id only): A/*/5-10 selects residues 5..10 in chain A

Grammar per field (CHAIN/RES/RESID/ATOM/TRANSFORM):
  • "*" wildcard
  • Exact value, e.g. "A", "ALA", "CA"
  • Range (res_id only): "5-10" (inclusive)

Notes:
  • Fields are in order: CHAIN_ID/RES_NAME/RES_ID/ATOM_NAME/TRANSFORMATION_ID
  • Wildcard is "*". Missing trailing fields default to "*".
  • Multiple tokens, given as a comma-separated string or a list[str], are combined by union.

Examples

>>> # Selects residues 5..10 in chain A
>>> AtomSelectionStack.from_query("A/*/5-10")
>>> # Selects residues 5..10 in chain A and 3..10 in chain B
>>> AtomSelectionStack.from_query("A/*/5-10, B/*/3-10")
>>> # Same selection as above, with the tokens passed as a list
>>> AtomSelectionStack.from_query(["A/*/5-10", "B/*/3-10"])
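
The query rules (wildcards, inclusive res_id ranges, union over tokens) can be sketched in plain Python. The helpers parse_query, field_matches, and union_mask, as well as the dict-based atoms, are hypothetical stand-ins and are not the atomworks implementation. The sketch assumes non-negative residue indices in ranges.

```python
import re

FIELDS = ("chain_id", "res_name", "res_id", "atom_name", "transformation_id")
_RANGE_RE = re.compile(r"^(\d+)-(\d+)$")  # assumption: non-negative indices only


def parse_query(query):
    """Turn a comma-separated string or a list of tokens into field dicts."""
    tokens = query.split(",") if isinstance(query, str) else list(query)
    selections = []
    for token in tokens:
        parts = [p.strip() or "*" for p in token.split("/")]
        if len(parts) > len(FIELDS):
            raise ValueError(f"too many fields in {token!r}")
        parts += ["*"] * (len(FIELDS) - len(parts))  # trailing fields default to "*"
        selections.append(dict(zip(FIELDS, parts)))
    return selections


def field_matches(name, pattern, value):
    """Wildcard, inclusive range (res_id only), or exact match."""
    if pattern == "*":
        return True
    if name == "res_id":
        match = _RANGE_RE.match(pattern)
        if match:
            return int(match.group(1)) <= int(value) <= int(match.group(2))
    return pattern == str(value)


def union_mask(selections, atoms):
    """An atom is selected if it matches at least one selection."""
    return [
        any(all(field_matches(k, p, atom.get(k)) for k, p in sel.items()) for sel in selections)
        for atom in atoms
    ]


atoms = [
    {"chain_id": "A", "res_name": "GLY", "res_id": 4, "atom_name": "CA"},
    {"chain_id": "A", "res_name": "ALA", "res_id": 5, "atom_name": "CA"},
    {"chain_id": "A", "res_name": "SER", "res_id": 10, "atom_name": "CA"},
    {"chain_id": "B", "res_name": "ALA", "res_id": 3, "atom_name": "CA"},
    {"chain_id": "B", "res_name": "GLY", "res_id": 11, "atom_name": "CA"},
]
mask = union_mask(parse_query("A/*/5-10, B/*/3-10"), atoms)
print(mask)  # [False, True, True, True, False]
```

Passing the list form ["A/*/5-10", "B/*/3-10"] to parse_query yields the same selections, which mirrors the equivalence of the two input styles described above.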

get_center_of_mass(atom_array: AtomArray | AtomArrayStack) -> ndarray

Return the center of mass of the selected atoms.

Returns:

For AtomArray: (3,) array. For AtomArrayStack: (n_models,) array of means.

Raises:

ValueError – If no atoms are selected.

get_mask(atom_array: AtomArray | AtomArrayStack) -> ndarray

Create a boolean mask by unioning all selections.

get_principal_components(atom_array: AtomArray | AtomArrayStack) -> ndarray

Return principal axes (eigenvectors) of the selected atoms via SVD.

Returns:

(3, 3) array for AtomArray. (n_models, 3, 3) array for AtomArrayStack.

Raises:

ValueError – If no atoms are selected.

atomworks.io.utils.selection.annot_start_stop_idxs(atom_array: AtomArray | AtomArrayStack, annots: str | list[str], add_exclusive_stop: bool = False) -> ndarray

Compute the start and stop indices of segments in an AtomArray, where a new segment begins wherever any of the specified annotations changes.

Parameters:
  • atom_array – The AtomArray to process.

  • annots – Annotation name or names to define segments.

  • add_exclusive_stop – Append an exclusive stop index at the end. Defaults to False.

Returns:

1D array of start/stop indices that bound segments.

Example

>>> atom_array = AtomArray(...)
>>> start_stop_idxs = annot_start_stop_idxs(atom_array, annots="chain_id", add_exclusive_stop=True)
>>> print(start_stop_idxs)
[0, 5, 10, 15]
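
The segment logic can be sketched on plain lists: a new segment starts at index 0 and at every position where at least one annotation differs from the previous atom. The function segment_boundaries is a hypothetical stand-in that takes annotation columns directly instead of an AtomArray and returns a list rather than an ndarray.

```python
def segment_boundaries(annotations, add_exclusive_stop=False):
    """Find segment start indices across one or more annotation columns.

    annotations: list of equal-length sequences, one per annotation.
    """
    n_atoms = len(annotations[0])
    if any(len(column) != n_atoms for column in annotations):
        raise ValueError("all annotation columns must have the same length")
    starts = [0] if n_atoms > 0 else []
    for i in range(1, n_atoms):
        # A boundary occurs if ANY annotation changes between atoms i-1 and i.
        if any(column[i] != column[i - 1] for column in annotations):
            starts.append(i)
    if add_exclusive_stop:
        starts.append(n_atoms)  # one past the last atom
    return starts


chain_id = ["A"] * 5 + ["B"] * 5 + ["C"] * 5
print(segment_boundaries([chain_id]))                           # [0, 5, 10]
print(segment_boundaries([chain_id], add_exclusive_stop=True))  # [0, 5, 10, 15]
```

With the exclusive stop appended, consecutive pairs of the result (0..5, 5..10, 10..15) can be used directly as slice bounds for each segment.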

atomworks.io.utils.selection.get_annotation(atom_array: AtomArray | AtomArrayStack, annot: str, n_body: int | None = None, default: Any = None) -> ndarray

Return an annotation array if present, otherwise default.

If n_body is None, the dimensionality is auto-detected by probing 1D then 2D annotation categories.

Parameters:
  • atom_array – Structure to query.

  • annot – Annotation category name.

  • n_body – 1 for 1D annotations, 2 for 2D annotations; auto-detected if None.

  • default – Value to return if the annotation is missing. Defaults to None.

Returns:

The requested annotation array or default if missing.

atomworks.io.utils.selection.get_residue_starts(atom_array: AtomArray | AtomArrayStack, add_exclusive_stop: bool = False) -> ndarray

Get the start (and optionally stop) indices of residues in an AtomArray.

This is a more robust version of biotite.structure.residues.get_residue_starts() that additionally differentiates residues across different transformation_id values when present. It is backwards compatible if the annotation is absent.

Parameters:
  • atom_array – Structure to analyze.

  • add_exclusive_stop – Append an exclusive stop index at the end. Defaults to False.

Returns:

1D array of residue boundary indices.
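
The effect of transformation_id on residue boundaries can be illustrated with a small sketch. The function residue_starts below is hypothetical and works on plain lists; the set of annotations it compares (chain_id, res_id, res_name, and optionally transformation_id) is an assumption made for illustration and may not match the exact set used by the library.

```python
def residue_starts(chain_id, res_id, res_name, transformation_id=None,
                   add_exclusive_stop=False):
    """Return indices where a new residue begins (sketch only)."""
    keys = list(zip(chain_id, res_id, res_name))
    if transformation_id is not None:
        # Copies of a residue under different transformations count as distinct.
        keys = [key + (t,) for key, t in zip(keys, transformation_id)]
    starts = [i for i, key in enumerate(keys) if i == 0 or key != keys[i - 1]]
    if add_exclusive_stop:
        starts.append(len(keys))
    return starts


# Two symmetry copies of the same two-atom glycine, stored back to back.
chain_id = ["A", "A", "A", "A"]
res_id = [1, 1, 1, 1]
res_name = ["GLY", "GLY", "GLY", "GLY"]
transformation_id = ["1", "1", "2", "2"]

print(residue_starts(chain_id, res_id, res_name))                     # [0]
print(residue_starts(chain_id, res_id, res_name, transformation_id))  # [0, 2]
```

Without the transformation annotation the two copies collapse into a single residue, which corresponds to the backwards-compatible behavior described above.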

References