Computing
=========

See also our course notes: https://iprogramming.bacpop.org/

(Session 2 has notes on setting up a nice workflow on the cluster.)

Software
--------

Package managers
^^^^^^^^^^^^^^^^

Prefer to use package managers to install software and keep it up to date. On OS X we have:

- App Store (needs your Apple ID)
- Managed Software Centre (managed by EBI)
- Homebrew: https://brew.sh/
- conda

conda
^^^^^

A lot of the software we use tends to be available via conda, so a quick guide:

- Install `micromamba <https://mamba.readthedocs.io/en/latest/user_guide/micromamba.html>`__. If you're doing this on codon, see :ref:`storage` for where to do this.
- Add channels in this order::

    conda config --add channels bioconda
    conda config --add channels conda-forge

- Ensure the ``defaults`` channel is not being used.
- If you've got an M1 Mac, follow `this guide <https://conda-forge.org/docs/user/tipsandtricks.html#installing-apple-intel-packages-on-apple-silicon>`__ to use Intel packages.
- Never install anything in your base environment.
- To create a new environment, use something like ``micromamba create -n poppunk python=3.9 poppunk``.
- Environments are used with ``micromamba activate poppunk`` and exited with ``micromamba deactivate``.
- Create a new environment for every project or piece of software.
- `Use mamba for CI tasks <https://www.johnlees.me/posts/mamba_ci/>`__.
- For help: https://conda-forge.org/docs/user/tipsandtricks.html
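
As a rough sketch of the guide above, an environment can also be described in a file so that it can be recreated later. This assumes micromamba is already installed; the file name ``environment.yml`` and the package pins are illustrative rather than a lab standard. The channels are listed highest priority first, with no ``defaults`` entry.

```shell
# Write a minimal environment file (file name and contents are illustrative).
env_file="environment.yml"
cat > "${env_file}" <<'EOF'
name: poppunk
channels:
  - conda-forge
  - bioconda
dependencies:
  - python=3.9
  - poppunk
EOF

# Create the environment only if micromamba is available on this machine.
if command -v micromamba >/dev/null 2>&1; then
    micromamba create -y -f "${env_file}" || echo "environment creation failed" >&2
else
    echo "micromamba not found; install it first" >&2
fi
```

Once created, the environment is used with ``micromamba activate poppunk`` as usual.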

Also note:

- anaconda: package repository and company.
- conda: tool to interact with the anaconda repository.
- miniconda: minimal version of the anaconda repository and conda.
- conda-forge: general infrastructure for creating open-source packages on anaconda.
- bioconda: biology-specific version of conda-forge.
- mamba: a different conda implementation, faster for resolving environments.

Subscriptions
^^^^^^^^^^^^^

Check the Managed Software Centre for:

- R and RStudio
- iTerm2
- JetBrains (CLion/PyCharm)
- Microsoft Office
- Xcode and developer tools
- Visual Studio Code
- XQuartz
- Slack
- Firefox/Chrome
- GitHub Desktop
- Spotify
- 1Password (password manager)

If you need any of the following:

- Paperpile (citation manager)
- Adobe CC

  - Photoshop (photo editing)
  - Illustrator (figure/logos)
  - InDesign (publishing/slides)
  - Premiere Pro (video editing)

- BioRender (biological figures)

Let me know and we can get you a license.

I also recommend the following free packages:

- Alacritty (terminal emulator)
- GitHub Pro
- GitKraken

The latter two should be free for academic use. Don't forget to sign up to GitHub Education too.

Zoom
^^^^

EMBL has a Zoom license, which tends to be the easiest way to make video calls (though Slack also supports calls). Click 'SSO' at the bottom when you open up Zoom then use ``embl-org.zoom.us`` as the domain, and use your normal username and password.

Hardware
--------

Equipment
^^^^^^^^^

You will be given a work laptop, a monitor, some connectors, a keyboard and a mouse, and a charger. You can plug a lot of this into your monitor.

Feel free to bring/use anything else you prefer, and let me know if there's anything you need.

Your work computer is the easiest way to access the VPN and EBI resources such as the cluster. Personal laptops won't work on the wired network or EBI WiFi. For personal computers and your phone, use eduroam (which is handy when you're travelling too).

WFH
^^^

We can't specifically provide equipment for working from home, but we do currently have a lot of spare monitors, so please ask for one if it would be helpful.

Backups
^^^^^^^

You will also get an external USB hard drive to take backups on. Please do so regularly (at least every couple of weeks). The easiest thing is just to use Time Machine. You can either partition your disk to stop it all being used, or set this option in the settings.

Leave the hard drive somewhere different from your laptop (either in the office or at home) in case of fire/flood/theft.

We've had some data losses in the past and they are extremely painful! On codon, the research area is backed up. On the workstation, hard drives are internally redundant but otherwise there is no backup.

Cluster (codon)
---------------

- Connect with ``ssh -X -Y codon-login``.
- Use ``tmux`` to keep your session active after logging out.

Computing
^^^^^^^^^

Codon uses SLURM now. This is helpful to generate job script templates: https://slurm-jsg.ebi.ac.uk/

- Use ``sbatch`` to submit jobs to workers.
- Use ``sacct`` with a formatting string to view job resource use.
- Use ``-o`` and ``-e`` to write output and errors to files rather than relying on the email default. With job arrays, use ``%A`` (array job ID) and ``%a`` (task index) in the file names.
- Use ``sinfo`` to see which queues are available, and their parameters.
- Small tasks, including ``micromamba install`` or compiling, can be done on the head node.
- Anything longer than about 30s, or using more than 1-2GB RAM, should be submitted using ``sbatch``.
- Default memory request is 20GB, which is large. Look at memory use afterwards and try to tune it closer to the actual size needed.
- If you requested multiple cores, check they were used effectively.
- Use job arrays whenever you have more than five similar tasks. You can set custom ranges, and also max jobs at a time if you don't want to hammer storage.
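
A minimal sketch of a job array script following the points above. The job name, resource requests, ``logs/`` directory and ``sample_<n>.fasta`` naming scheme are all hypothetical; ``%A`` and ``%a`` are SLURM's filename patterns for the array job ID and the task index, and ``%5`` caps the number of tasks running at once.

```shell
#!/bin/bash
# Tasks 1..20, with at most 5 running at the same time.
#SBATCH --job-name=example_array
#SBATCH --array=1-20%5
#SBATCH --cpus-per-task=1
# Request much less memory than the default if the task is small.
#SBATCH --mem=2G
#SBATCH --time=00:30:00
# One log file per task: %A = array job ID, %a = task index.
#SBATCH --output=logs/%A_%a.out
#SBATCH --error=logs/%A_%a.err

# SLURM sets SLURM_ARRAY_TASK_ID for each task; default to 1 so the
# script can also be tried out directly with bash.
task_id="${SLURM_ARRAY_TASK_ID:-1}"

# Hypothetical input naming scheme: one sample per task.
input_file="sample_${task_id}.fasta"
echo "Task ${task_id} would process ${input_file}"
```

Create the log directory first (``mkdir -p logs``), then submit with ``sbatch example_array.sh``. Afterwards, something like ``sacct -j <jobid> --format=JobID,JobName,State,Elapsed,TotalCPU,AllocCPUS,ReqMem,MaxRSS`` shows how much time, CPU and memory each task actually used.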

.. _storage:

Storage
^^^^^^^

- Don't store anything in your home directory, other than your shell profile (or similar small text documents).
- Try to use ``/hps/scratch/jlees/<username>`` or ``/hps/nobackup/jlees/<username>`` to run any large or data-intensive jobs (but note these cannot be written to from the head node).
- Data (genetic, modelling etc) you need should be stored in ``/nfs/research/jlees/<username>``.
- Shared datasets, such as standard genome collections, should be stored in ``/nfs/research/jlees/shared``. Consider copying them before use, and be careful not to edit or delete them. We will change this to read-only.
- Always think twice before running ``rm -rf``.
- Software and your conda environments go in ``/hps/software/users/jlees/<name>``.
- Use the datamovers queue to move and copy large amounts of data, or if you need to use the ftp site.
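
One way to follow the layout above is to set a few variables in your shell profile. This is only a sketch: it assumes your directory on each filesystem is named after your login (``$USER``), and the ``micromamba`` subdirectory name is made up for illustration. ``MAMBA_ROOT_PREFIX`` is the variable micromamba reads to decide where packages and environments live.

```shell
# Assumption: your directory on each filesystem is named after your login.
lab_user="${USER:-username}"

# Keep micromamba's packages and environments out of your home directory.
# The trailing "micromamba" directory name is illustrative.
export MAMBA_ROOT_PREFIX="/hps/software/users/jlees/${lab_user}/micromamba"

# Working space for data-intensive jobs, and the research area for data to keep.
scratch_dir="/hps/nobackup/jlees/${lab_user}"
research_dir="/nfs/research/jlees/${lab_user}"

echo "conda environments: ${MAMBA_ROOT_PREFIX}"
echo "job working space:  ${scratch_dir}"
echo "kept data:          ${research_dir}"
```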

References:

ssh keys
^^^^^^^^

Generate yourself some ssh keys; GitHub has `a nice guide <https://docs.github.com/en/authentication/connecting-to-github-with-ssh/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent>`__.

You'll then want to:

- Add them to GitHub, as `detailed here <https://docs.github.com/en/authentication/connecting-to-github-with-ssh/adding-a-new-ssh-key-to-your-github-account>`__.
- Switch GitHub clones from https to ssh.
- Add the public key to ``~/.ssh/authorized_keys`` on codon.
- Add them to your OS X keychain, so that you don't have to enter the password every time they're used.
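
Switching an existing clone from https to ssh amounts to changing the remote URL. The sketch below does this in a throwaway repository; ``example-user/example-repo`` is a placeholder, and in a real clone only the ``git remote set-url`` line is needed.

```shell
# Work in a throwaway repository so nothing real is modified.
# "example-user/example-repo" is a placeholder, not a real project.
demo_repo="$(mktemp -d)"
git init -q "${demo_repo}"
git -C "${demo_repo}" remote add origin https://github.com/example-user/example-repo.git

# Point the existing remote at the ssh URL instead of https.
git -C "${demo_repo}" remote set-url origin git@github.com:example-user/example-repo.git

# Confirm the change.
new_url="$(git -C "${demo_repo}" remote get-url origin)"
echo "${new_url}"
```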

Workstation guide (WPIA-DIDE295)
--------------------------------

This is a decent computer, equivalent to a single node on the cluster, which we have sole access to and admin rights over. It's particularly useful for GPU work and development, profiling, and long-running compute jobs.

TODO: We will reinstall the OS to something a bit easier to use in early 2025.

Connecting
^^^^^^^^^^

First, contact me to get an account set up.

Then, on the EBI network/VPN::

    ssh -X -Y <username>@WPIA-DIDE295

If you are outside the network, set up a `remote.it <https://www.remote.it>`__ account and ask me to add you to the list there; the workstation will then appear on your device list. After clicking connect, choose connect via command. You will then be able to paste a command into your terminal, which will look something like::

    ssh -l jlees jl-workstation-ssh-workstation.at.remote.it -p 33000

To avoid entering your password every time, add your public ssh key to ``~/.ssh/authorized_keys`` as usual.

I would recommend using ``tmux`` so that your session is retained between logins, and running jobs keep going if you log off.

Using tmux
^^^^^^^^^^

Type ``tmux new`` to start a new session. After this, when you log in, type ``tmux attach`` to reconnect to your previous session. To log out from a session (detach from it) the default is to use ``Ctrl-B`` followed by ``d``.

Some more commands can be found here: https://ostechnix.com/tmux-command-examples-to-manage-multiple-terminal-sessions/
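
If you would rather not remember whether a session already exists, a small shell function can handle both cases. This is a sketch; the function name and the default session name ``main`` are illustrative. The ``-A`` flag makes ``tmux new-session`` attach to the named session if it already exists and create it otherwise.

```shell
# Attach to a named tmux session if it exists, otherwise create it.
# The function name and default session name are illustrative.
work_session() {
    local name="${1:-main}"
    tmux new-session -A -s "${name}"
}
```

After adding this to your shell profile, ``work_session analysis`` opens (or returns to) a session called ``analysis``, and ``tmux ls`` lists the sessions that are currently running.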

Software
^^^^^^^^

If there's some software you'd like to use and are having trouble installing, let me know and I can try and help. Generally I'd recommend using micromamba where possible, as above.

Fast disk
^^^^^^^^^

.. warning::
   There is currently little space left here so please avoid using it where possible.

There is about 1.5TB of fast disk space shared between our home directories (``~``). Use this space to install software and run analysis. When you are finished please move data files (especially if over ~5GB) to the mirrored storage, as this space is limited. This space is striped, so hopefully resilient to local failure.

Backed up disk
^^^^^^^^^^^^^^

You will also have a folder on ``/media/mirrored-hdd/<username>`` for storing large datasets or finished results. There is 14TB of space here currently, with an additional 14TB possible if this gets full. This is backed up locally (resilient to hard drive failure, but not if e.g. the office gets flooded).

Running jobs
^^^^^^^^^^^^

There is no queuing system, so log in and run tasks directly from the command line. Please let me know if you are planning a massive analysis which would use a majority of the resources for a long time, or if you anticipate doing anything 'unusual'.

Use ``top`` and/or ``htop`` to look at current resource use (``nvtop`` or ``nvidia-smi`` for the GPUs).
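
If a job is not started inside tmux, it can be backgrounded with ``nohup`` so that it keeps running after you log off. A minimal sketch, in which the log file name and the placeholder command are assumptions to be replaced with your own:

```shell
# Stand-in for a long-running analysis; replace with your real command.
# Output goes to a log file because the terminal may have gone away.
log_file="long_job.log"
nohup sh -c 'echo "analysis finished"' > "${log_file}" 2>&1 &
job_pid=$!
echo "Started job with PID ${job_pid}"

# Only needed in this demonstration, so the log can be read straight away.
wait "${job_pid}"
cat "${log_file}"
```

In real use you would skip the ``wait`` and check on the job later with ``top`` or by reading the log file.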

Compute resources
^^^^^^^^^^^^^^^^^

- 2x 20-core Intel Xeon Gold CPUs = 40 cores. These have 'hyperthreading', so in theory you can use up to 80 cores if you are using memory/disk heavy jobs. Please generally keep jobs to max 20 cores and don't run more than one at a time as it might be hard for me to deal with issues without a manual restart. If you log off the job will get cancelled, unless you are running through tmux or have backgrounded it with ``nohup``!
- 768GB RAM. You don't need to request this. If your job goes over it will use swap space (on the disks and very slow) rather than fail, so keep an eye on anything massive. You can use ``ulimit`` to set a max (see below).
- Nvidia RTX 3090 Ti GPU (CUDA compute capability 8.6, 24GB RAM). This is device 0. Libraries containing GPU accelerated code should work out of the box. If you are compiling your own GPU code see below.
- Nvidia RTX 2080 Ti GPU (CUDA compute capability 7.5, 11GB RAM). This is device 1.
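
To run on one particular GPU, CUDA programs can be restricted with the ``CUDA_VISIBLE_DEVICES`` environment variable. The sketch below assumes the device numbers listed above match the order reported by ``nvidia-smi``; setting ``CUDA_DEVICE_ORDER=PCI_BUS_ID`` asks CUDA to number devices in that same order, but do check before relying on it.

```shell
# Number GPUs in the same order as nvidia-smi reports them.
export CUDA_DEVICE_ORDER=PCI_BUS_ID

# Restrict CUDA programs started from this shell to a single GPU.
# Assumption: device 1 is the 2080 Ti, as listed above.
export CUDA_VISIBLE_DEVICES=1
echo "CUDA programs started from this shell will see GPU(s): ${CUDA_VISIBLE_DEVICES}"
```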

Limiting memory
^^^^^^^^^^^^^^^

If you are going to be running a large memory job, it's sensible to limit the maximum memory size. For very large/exploding requests, not doing this can lead to thrashing and require a manual system reboot.

Check if you have limits set by running ``ulimit -a``. Set a memory limit with e.g. ``ulimit -m 250000000`` (which is 250GB). Note this only applies to jobs launched from this session after the limit is set, so check it's been set with ``ulimit -a`` before running your job.
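
A sketch of applying the limit to a single job only, by setting it inside a subshell so that the login shell keeps its original limits. The job command is a placeholder. Note that many Linux systems do not enforce the ``-m`` (resident set size) limit; if that turns out to be the case here, ``-v`` (virtual memory) is the usual alternative, which is an assumption worth testing on a small job first.

```shell
# Set the limit inside a subshell so it applies to one job only.
# The value is in kilobytes, as in the text above.
(
    ulimit -m 250000000
    # Confirm the limit before launching the job.
    echo "memory limit for this job: $(ulimit -m) kB"
    # Replace this line with the real command.
    echo "job would run here"
)

# Back in the login shell the limit is unchanged.
echo "memory limit for this shell: $(ulimit -m)"
```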

Using the GPUs/CUDA
^^^^^^^^^^^^^^^^^^^

Currently v12.0 of the CUDA toolkit is centrally installed. You may need to first run::

    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-12.0/lib64
    export PATH=$PATH:/usr/local/cuda-12.0/bin

You can add these lines to your ``~/.bashrc`` to avoid having to run these in each session.

Both profilers are available::

    /opt/nvidia/nsight-compute/2020.1.2/ncu
    /opt/nvidia/nsight-systems/2020.3.2/bin/nsys

Systems
-------

Dealing with the EBI email system
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I strongly recommend setting up filters on your EBI email, otherwise you'll get a lot of spam in your inbox throughout every weekday. To do this, log in to webmail.ebi.ac.uk, go to Settings->Filters->Create

.. image:: images/email_filters.png
   :alt: Creating email filters
   :align: center

Create a folder to put certain messages in, and create rules to filter to this folder. Many of the campus email lists have subjects in square brackets such as [Systems] to help you do this. For example, here is my filter for campus-wide emails:

.. image:: images/campus_filter.png
   :alt: Campus email filter example
   :align: center

You can also set out-of-office messages through this interface, which you might want to do if you expect people to contact you over long periods of holiday.

Raising tickets on ServiceNow
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If you have IT issues (logging in, codon issues, or requests) the way to report these is through `ServiceNow <https://embl.service-now.com/sp>`__. Have a quick look through the catalogue for the right category, but if you can't find it just choose 'Other' and it will get triaged for you.

Try and include as much information as possible in the issue. You can add attachments once the issue is open too.
