PyTorch related

  1. How to improve GPU utilization

Beyond an optimal number (experiment!), throwing more DataLoader worker processes at the IOPS barrier WILL NOT HELP; it will make things worse. Check CPU utilization with htop or top before increasing num_workers.
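One way to run that experiment is to time a full pass over the dataset for a few num_workers values and pick the fastest. The sketch below is only an illustration: the helper names time_one_pass and sweep_num_workers, the candidate values, and the batch size are assumptions, not part of any PyTorch API.

```python
import time

import torch
from torch.utils.data import DataLoader, TensorDataset


def time_one_pass(dataset, num_workers, batch_size=64):
    """Return the seconds needed to iterate once over `dataset`."""
    loader = DataLoader(dataset, batch_size=batch_size, num_workers=num_workers)
    start = time.perf_counter()
    for _ in loader:
        pass  # loading only; no model work, so this isolates the input pipeline
    return time.perf_counter() - start


def sweep_num_workers(dataset, candidates=(0, 2, 4, 8), batch_size=64):
    """Map each candidate num_workers value to its measured pass time."""
    return {n: time_one_pass(dataset, n, batch_size) for n in candidates}
```

Call sweep_num_workers(your_dataset) on the real dataset and storage, since an in-memory toy dataset will not show the I/O bottleneck. Run it with htop open: if the time stops dropping while CPU or disk is saturated, more workers will not help.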

  2. DP vs DDP

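DistributedDataParallel (DDP): Suitable for single-machine multi-GPU and multi-machine multi-GPU training. Because it uses multiprocessing (one process per GPU), it is not limited by the GIL. With DDP, the model is replicated in each process and each replica is fed a different set of inputs. DDP keeps the replicas synchronized by communicating gradients during the backward pass. By contrast, DataParallel (DP) is single-process and multi-threaded, works only on a single machine, and is subject to GIL contention. For details, see the Internal Design section of the PyTorch DDP notes.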
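The replicate-and-synchronize flow described above can be sketched as a single training step. This is a minimal illustration, not a full training script. The function name ddp_train_step, the toy nn.Linear model, the default port, and the CPU-friendly "gloo" backend are all assumptions; on GPUs, "nccl" is the usual backend.

```python
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


def ddp_train_step(rank: int, world_size: int) -> float:
    """Run one DDP training step in this process and return its local loss."""
    # Rendezvous settings; the address and port are assumed defaults.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("gloo", rank=rank, world_size=world_size)
    try:
        # Wrapping broadcasts rank 0's parameters so all replicas start equal.
        model = DDP(nn.Linear(10, 1))
        optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

        torch.manual_seed(rank)  # each process sees a different input batch
        inputs, targets = torch.randn(8, 10), torch.randn(8, 1)

        loss = nn.functional.mse_loss(model(inputs), targets)
        optimizer.zero_grad()
        loss.backward()  # gradients are averaged across processes here
        optimizer.step()
        return loss.item()
    finally:
        dist.destroy_process_group()
```

To run several processes on one machine, torch.multiprocessing.spawn(ddp_train_step, args=(world_size,), nprocs=world_size) passes the process index as the first argument. Launching with torchrun is the other common option.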

  3. Switch CUDA version

To switch CUDA versions, make the environment variables in ~/.bashrc and ~/.profile point at /usr/local/cuda rather than at a versioned directory. For example, to switch from CUDA 10.2 to CUDA 11.1, add the following lines to ~/.bashrc and ~/.profile:

export CUDA_HOME=/usr/local/cuda
export CUDA_PATH=/usr/local/cuda
export LD_LIBRARY_PATH=${CUDA_HOME}/lib64:$LD_LIBRARY_PATH
export PATH=${CUDA_HOME}/bin:${PATH}

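Use a symbolic link to switch between installed CUDA versions. The -f and -n flags replace an existing /usr/local/cuda link in place instead of creating a new link inside the directory it points to. Open a new shell or re-source ~/.bashrc afterwards.

sudo ln -sfn /usr/local/cuda-{version} /usr/local/cuda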
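After switching, it helps to confirm what the shell and PyTorch actually see. The sketch below gathers that in one place; the function name cuda_toolkit_report and the dictionary keys are hypothetical. Note that torch.version.cuda reports the CUDA version the PyTorch binary was built with, which can differ from the toolkit that /usr/local/cuda points to.

```python
import os
import shutil


def cuda_toolkit_report(link="/usr/local/cuda"):
    """Collect where CUDA_HOME, the symlink, and nvcc currently point."""
    report = {
        "CUDA_HOME": os.environ.get("CUDA_HOME"),
        # Resolve the symlink to the versioned directory, if the link exists.
        "link_target": os.path.realpath(link) if os.path.exists(link) else None,
        # First nvcc found on PATH, or None if the toolkit is not on PATH.
        "nvcc": shutil.which("nvcc"),
    }
    try:
        import torch  # optional: only reported when PyTorch is installed

        # CUDA version PyTorch was built against (None for CPU-only builds).
        report["torch_built_with_cuda"] = torch.version.cuda
    except ImportError:
        report["torch_built_with_cuda"] = None
    return report
```

Printing cuda_toolkit_report() before and after re-pointing the link should show link_target change to the new /usr/local/cuda-{version} directory.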
Chengqi (William) Li

My research interests include 3D perception, computer vision, and machine learning.