Installation Guide

Creating Your Own Environment

You do not need to install RFdiffusion2 itself, but you do need to install several dependencies to use the Python scripts that run the inference calculations. This is what the Apptainer image above supplies: an environment where the dependencies required by RFdiffusion2 are already installed. If this container works on your computing system, we highly recommend using it.

However, if you need to set up your own environment, the instructions below should help you determine the dependency versions you need to get RFdiffusion2 running on your system.

Using Provided Environment Files

We have created a few environment files to automatically generate a conda environment that will allow RFdiffusion2 to run.

Note: Due to variations in GPU types and drivers, we are not able to guarantee that any of the provided environment files successfully install all the required dependencies. See the section below if none of the provided environment files are appropriate for your computing system.

You can find the prepared environment files in the envs directory (an example of using one follows the list below):

  • cuda121_env.yml - This is appropriate for systems able to run CUDA 12.1 and PyTorch 2.4.0

    • This uses requirements_cuda121.txt to install dependencies via pip

  • cuda124_env.yml - This is appropriate for systems able to run CUDA 12.4 and PyTorch 2.4.0

    • This uses requirements_cuda124.txt to install dependencies via pip
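
For example, creating and activating an environment from one of these files looks like the following (the envs/ path assumes you are running from the repository root; rfd2_env is the environment name used elsewhere in this guide):

    conda env create -f envs/cuda124_env.yml
    conda activate rfd2_env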

If you have trouble with these files even though they should work based on your system specifications, here are a few things to try:

  1. Separate the creation of the environment and the installation of dependencies via pip:

    Remove the last two lines from the above .yml files, then run:

    conda env create -f cuda121_env.yml
    conda activate rfd2_env
    pip install -r requirements_cuda121.txt
    

    This forces the dependencies you want installed by conda to be installed before pip is used.

  2. Check that the python being used is the one from your conda environment once it is activated. On clusters, other modules you have loaded might override the python in your conda environment. You can either give the path to your environment's python explicitly, or change your system settings or environment variables so that the environment's python installation takes precedence.
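
    For example, both of these checks (generic shell and Python commands, nothing specific to RFdiffusion2) should point into your activated conda environment:

    which python
    python -c "import sys; print(sys.executable)"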

  3. You can try installing any dependencies that pip hangs on with conda instead of pip. If you have created an environment file that gets RFdiffusion2 running for a different CUDA version or other dependency versions, create a PR to add it to the envs directory.
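
    For example, if pip were to hang on a package such as mdtraj (used here purely as an illustration), you could try installing the conda-forge build instead:

    conda install -c conda-forge mdtraj==1.10.0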

Creating the Environment Manually

Some of the dependencies listed below will vary based on your system, especially the version of CUDA available on your cluster. You will likely need to change some of the versions of the tools below to successfully install RFdiffusion2. The instructions below are for CUDA 12.4 and PyTorch 2.4. For some useful troubleshooting tips, see the Troubleshooting section below.

  1. Create a conda environment using miniforge and activate it
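
    A minimal sketch of this step (rfd2_env is simply the environment name used elsewhere in this guide; any name works):

    conda create --yes -n rfd2_env
    conda activate rfd2_env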

  2. Point to the correct NVIDIA-CUDA channel, and install PyTorch, Python 3.11, and pip based on what is available on your system:

    conda install --yes \
     -c nvidia/label/cuda-12.4.0 \
     -c https://conda.rosettacommons.org \
     -c pytorch \
     -c dglteam/label/th24_cu124 \
     python==3.11 \
     pip \
     numpy"<2" \
     matplotlib \
     jupyterlab \
     conda-forge::openbabel==3.1.1 \
     cuda \
     pytorch==2.4 \
     pytorch-cuda==12.4 \
     pyrosetta
    

    REMEMBER: You will need to change your CUDA version based on what is available on your system. This will need to be changed in the NVIDIA channel, the dglteam channel, the pytorch version, and the pytorch-cuda version.

  3. Use pip to install several Python libraries:

    pip install \
    hydra-core==1.3.1 \
    ml-collections==0.1.1 \
    addict==2.4.0 \
    assertpy==1.1.0 \
    biopython==1.83 \
    colorlog \
    compact-json \
    cython==3.0.0 \
    cytoolz==0.12.3 \
    debugpy==1.8.5 \
    deepdiff==6.3.0 \
    dm-tree==0.1.8 \
    e3nn==0.5.1 \
    einops==0.7.0 \
    executing==2.0.0 \
    fastparquet==2024.5.0 \
    fire==0.6.0 \
    GPUtil==1.4.0 \
    icecream==2.1.3 \
    ipdb==0.13.11 \
    ipykernel==6.29.5 \
    ipython==8.27.0 \
    ipywidgets \
    mdtraj==1.10.0 \
    numba \
    omegaconf==2.3.0 \
    opt_einsum==3.3.0 \
    pandas==1.5.0 \
    plotly==5.16.1 \
    pre-commit==3.7.1 \
    py3Dmol==2.2.1 \
    pyarrow==17.0.0 \
    pydantic \
    pyrsistent==0.19.3 \
    pytest-benchmark \
    pytest-cov==4.1.0 \
    pytest-dotenv==0.5.2 \
    pytest==8.2.0 \
    rdkit==2024.3.5 \
    RestrictedPython \
    ruff==0.6.2 \
    scipy==1.13.1 \
    seaborn==0.13.2 \
    submitit \
    sympy==1.13.2 \
    tmtools \
    tqdm==4.65.0 \
    typer==0.12.5 \
    wandb==0.13.10
    
  4. Install Biotite, several PyTorch-related libraries, and pylibcugraphops:

    pip install biotite
    pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-2.4.0+cu124.html
    pip install -U -i https://pypi.anaconda.org/rapidsai-wheels-nightly/simple "pylibcugraphops-cu12>=24.6.0a24"
    

    REMEMBER: You will need to change the link used to install the PyTorch-related libraries (the second line in the code block above) so that it matches your PyTorch and CUDA versions.

  5. Install a version of TorchData that still has DataPipes:

    pip install torchdata==0.9.0
    
  6. Install a version of the Deep Graph Library based on the version of PyTorch and CUDA you are using:

    conda install -c dglteam/label/th24_cu124 dgl
    

    REMEMBER: You will need to change the conda channel to the correct version of PyTorch (th24 in the line above) and CUDA (cu124 in the line above). Use the Deep Graph Library’s Installation guide to determine the correct conda or pip command.

  7. Set your PYTHONPATH environment variable:

    export PYTHONPATH=$PYTHONPATH:/path/to/RFdiffusion2
    

    You can add this to your environment via

    conda env config vars set PYTHONPATH=$PYTHONPATH:/path/to/RFdiffusion2
    

    so that you do not need to set it every time.
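
Once these steps are complete, a quick sanity check (a minimal sketch, not an exhaustive test of the installation) is to confirm that PyTorch sees your GPU and that DGL imports cleanly:

    python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
    python -c "import dgl; print(dgl.__version__)"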

Troubleshooting

Ran into an installation issue not covered here? [Create a new issue!](https://github.com/RosettaCommons/RFdiffusion2/issues)

How to determine the highest available CUDA version on your system

The nvidia-smi command will print out information about the GPUs you can access on your cluster. The top of the output will look something like:

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.230.02             Driver Version: 535.230.02   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+

Here, the system can only support up to CUDA 12.2. However, if you look at the available PyTorch and Deep Graph Library builds on their installation pages, you'll notice that neither provides builds for CUDA 12.2, so in this situation you would need to adapt the installation instructions to CUDA 12.1.
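
If you also want to confirm which CUDA version your installed PyTorch was built against (as opposed to what the driver supports), you can check it from Python; for the CUDA 12.4 instructions above this should print 12.4:

    python -c "import torch; print(torch.version.cuda)"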

Cannot find DGL C++ graphbolt library at...

Seeing this error likely means that the version of the Deep Graph Library (DGL) you have installed does not match the version of PyTorch your system is finding. Double-check that you installed matching versions of these tools and make sure your system is not picking up a different PyTorch installation.

It can also be useful to ls in the given directory to see which version of the DGL libraries you have installed. For example, if the error says it is looking for graphbolt/libgraphbolt_pytorch_2.4.0.so, your system is using PyTorch version 2.4.0. If an ls of that directory shows only libgraphbolt_pytorch_2.1.2.so, then the version of DGL you downloaded was only meant to work with PyTorch versions up to 2.1.2.
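
A quick way to make this comparison (the path below is where pip typically places DGL's graphbolt libraries; use the directory from your error message if it differs):

    python -c "import torch; print(torch.__version__)"
    ls "$(python -c 'import dgl, os; print(os.path.dirname(dgl.__file__))')/graphbolt"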

No module named 'torchdata.datapipes'

Newer versions of TorchData have stopped supporting their DataPipes tools. You will need to downgrade the version of TorchData you have installed to one at or below version 0.9.0. You can learn more about this change on TorchData’s PyPI page.
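
To verify which version you have installed and that the DataPipes module is importable (a minimal check, nothing specific to RFdiffusion2):

    pip show torchdata
    python -c "import torchdata.datapipes; print('DataPipes available')"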