FAQ

 Introduction to HPC Clusters

  • What is the High Performance Computing cluster?
  • High Performance Computing (HPC) is the practice of aggregating computing power to deliver much higher performance than a typical desktop computer or workstation can provide, in order to solve large problems in science, engineering, or business. HPC is now a foundation for scientific, industrial, and societal advancement.

    As technologies like the Internet of Things (IoT), artificial intelligence (AI), and 3-D imaging evolve, the size and amount of data that organizations have to work with is growing exponentially. For many purposes, such as streaming a live sporting event, tracking a developing storm, testing new products, or analyzing stock trends, the ability to process data in real time is crucial.  

    To keep a step ahead of the competition, organizations need lightning-fast, highly reliable IT infrastructure to process, store, and analyze massive amounts of data.
    HPC solutions have three main components:

    1 - Compute
    2 - Network
    3 - Storage

    To build a high-performance computing architecture, compute servers are networked together into a cluster. Software programs and algorithms are run simultaneously on the servers in the cluster. The cluster is networked to the data storage to capture the output. Together, these components operate seamlessly to complete a diverse set of tasks.
    An HPC cluster consists of hundreds or thousands of compute servers that are networked together. Each server is called a node. The nodes in each cluster work in parallel with each other, boosting processing speed to deliver high-performance computing.
    A cluster, in short, is a group of interconnected computers that work together to perform computationally intensive tasks.
    You can read more about how clusters are set up in HPCCF at: hpc.ucdavis.edu/clusters

  • What are HPC clusters' Access models?
  • There are two main model architectures of HPC clusters available:
    1 - Condo Model
    The Farm cluster uses this model, with three priority queues (High, Medium, Low) plus a Free tier.
    - The High priority queue provides dedicated, guaranteed access to the purchased hardware for the PIs who procured it and want to leverage it. It scales as a PI increases the amount of storage.
    - The Medium priority queue provides shared use of idle resources above a user's permitted limits. Jobs are suspended (and resumed later) once the high-queue owner reclaims the resources; your code needs a checkpoint mechanism so jobs can survive being suspended and resumed.
    - The Low priority queue provides intermittent access to idle resources above a user's limits.
    - The Free tier provides access to reserved capacity of 192 CPUs with 500 GB RAM, or 96 CPUs with 250 GB RAM, with 2 nodes dedicated for special needs.

    2 - Fair Share Model
    The LSSC0 cluster uses a fair-share model with a single priority queue: shared compute with fair-share priority access to all resources right away.
    Dedicated compute: quick access to limited resources via a login node, plus the ability to request advance resource reservations. This requires an additional maintenance fee.

    [Diagram: HPC cluster access models]
  • What is LSSC0 Fair-Share Model?
  • [Diagram: LSSC0 fair-share model]
  • What is the Farm Condo Model?
  • [Diagram: Farm condo model]
  • Where do I get help with HPC clusters?
  • You can always directly contact us via email at hpc-help@ucdavis.edu or open a ServiceNow ticket.
    You can track your open cases/tickets in Servicehub >> My Stuff Section.
    Please note Servicehub is the internal catalogue of UCD services for internal users only - https://servicehub.ucdavis.edu/
    To request a user account on the HPC clusters, please refer to the "How Do I Request Access to HPC Clusters?" article on this page.
    You can also find our communication channel on UCD Slack: search for #hpc and connect with us.
    Please see Helpful Docs pages on our website (Home > Support > Helpful Documents) for additional helpful information.
  • How can I open a ticket calling out an issue with the HPC clusters?
  • Users can open a ticket for any issue or question through the ServiceNow ticketing system, and can check for case updates in Servicehub. You can track your open cases/tickets in Servicehub >> My Stuff section. Please note that Servicehub is the internal catalogue of UCD services, for internal users only - https://servicehub.ucdavis.edu/

    Here are some documents about using ServiceNow:  https://kb.ucdavis.edu/?id=3187 

    Please note that you can always email us directly at hpc-help@ucdavis.edu

Access to the Clusters

  • How do I request access to HPC clusters?
  • To qualify for an account, you must have a UC Davis affiliation and must be sponsored by a lab owner/PI or an equipment owner. To request an account on one of the clusters, fill in the appropriate form below with your sponsor and public key information:
    To request an account in Atomate, send us an email at: hpc-help@ucdavis.edu
    Click to request an account in Cardio
    Click to request an account in HPC1
    Click to request an account in Peloton.
    Click to request account in HPC2.
    Click to request an account in FARM.
    Click to request an account in Crick.

    For “Temporary Computing Account for Special UC Davis Affiliates” 
    If you don’t have a UC Davis affiliation, you need to have a faculty member sponsor your account. The easiest way to do that is for the sponsor/PI to fill out a temporary account request form.
  • How do I know which cluster I need access to?
  • You can find out about your cluster association when you know:
    - Which lab you and your sponsoring PI (Principal Investigator) are associated with.
    - Which resource you need access to or which resources your PI has sponsored in a particular cluster. 
    You can find information about the clusters on this page.  
    Collect this information and see the FAQ article above; "How do I request access to HPC clusters?" 
    If that doesn't help, please reach out to the HPC Core Facility: hpc-help@ucdavis.edu
  • How can I purchase storage in a cluster and whom should I contact?
  • - For purchasing storage in the Farm cluster, contact Adam Getchell, Director of Information Technology at the College of Agricultural and Environmental Sciences: acgetchell@ucdavis.edu

    - For storage purchase in HPC1 and HPC2, contact Steve Pigg, Executive Director of Information Technology at College of Engineering:
      sapigg@ucdavis.edu

    - For storage purchases in other clusters, such as LSSC0, please reach out to: hpc-help@ucdavis.edu
  • How can we access HPC services via commercial clouds and how is the cost calculated?
  • If you are considering UCD-managed research computing options in commercial clouds, these cost-estimate calculators can help you get started:

    GCP pricing info for GPUs and TPUs:

    https://cloud.google.com/compute/gpus-pricing

    https://cloud.google.com/tpu/pricing

    A 40GB A100 costs $2,142 per month; the 80GB A100 is not offered.

    Currently, a Google Cloud Platform free tier is available to UCD researchers; more information about this tier can be found on this webpage:

     https://cloud.google.com/free/docs/gcp-free-tier

    More general documentation of Google Cloud Platform can be found on - https://cloud.google.com/docs

    Please contact HPC-CF (hpc-help@ucdavis.edu) for further assistance and more information.

  • What are GCP services and how do I request Google Cloud account through HPC?

  • Google Cloud services are now available for the HPC users and researchers.

     It provides a reliable place to compute and store data and helps developers build, test, and deploy apps.

    The GCP services give our users and professors the autonomy to have their own project and to construct a framework within which that autonomy exists. They can add users and grant their students access to projects, in a way that prevents anyone from unintentionally harming others or exhausting shared resources.

    In order to enable a GCP account, IET must enable the Google Admin access page for the users.

    Send your list of usernames to your PI, who requests the GCP accounts.

    The PI then sends the request to ithelp@ucdavis.edu, asking the IET Cloud Services team to enable Google Developer accounts for the users listed in that ticket.

    This way there is a log of these requests, and the ticketing system can be used to communicate when things are done.

  • What authentication method is used to provide access to the clusters?
  • We don't use passwords. We use your UCD Kerberos account and for shell access we typically use SSH public key authentication.
    With public key authentication, you typically generate a key pair which is made up of a private key and matching public key.
    You keep the private portion, well, private. The public portion can be sent over email, shared with your friends, and even published on the web. 
    After you generate your SSH key pair, please submit it to us using the account request form.

    More information on how to generate SSH key can be found in the article below, "What is an SSH key and how can I create it?". 
    Access links to request forms for specific clusters can be found in article "How do I request access to HPC clusters?"
     
  • What is an SSH key and how can I create it?
  • SSH keys are an authentication method used to gain access to an encrypted connection between systems and then ultimately use that connection to access and manage the remote system. An SSH key pair is always required to log into HPC clusters.
    SSH keys are generated as a matched pair of a private key and a public key. Keep your private key safe and use a strong, memorable passphrase.
    We support one key per user. If you need to access the cluster from multiple computers, such as a desktop and a laptop, copy your private key to each of them.
    Note that if you forget your passphrase or lose your private key, we cannot reset it; you'll need to generate a new key pair, following the same directions as when you first created it.
    To avoid typing the passphrase for every login, you can use ssh-agent or a keychain.
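    As a sketch of the ssh-agent approach on Linux or macOS (assuming your key is at the default ~/.ssh/id_rsa location):

    ```shell
    # Start an ssh-agent for this shell session, then load the key.
    # You are asked for the passphrase once; later ssh logins in this
    # session will not prompt again.
    eval "$(ssh-agent -s)"
    ssh-add ~/.ssh/id_rsa

    # List the identities the agent currently holds.
    ssh-add -l
    ```

    The agent keeps the decrypted key in memory only; it is gone when the agent process exits or the machine reboots.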

    Overview
    Public key authentication requires the generation of a key pair which contains a private key and a public key.

    Your private key should be:

    - Kept confidential
    - Copied only to machines you sit in front of and want to log in from
    - Never sent by email
    - Never sent to anyone else, including a system administrator (who should never ask)
    - Copied to, or used on, trustworthy machines only
    - Never shared with another user
    - If it is compromised or lost, notify anyone trusting the matching public key as soon as possible

    The public key should be:

    - Installed on any machine you want to be able to log in to
    - Safe to email, share, and/or publish
    - Ideally emailed as an attachment (to avoid formatting issues)

    For a successful login, the client must have access to the private key and the server must be configured to trust the matching public key.

    Setup
    SSH is an encrypted protocol and since we require the use of an SSH key, there is an initial setup process. That process requires you to generate a public and private key pair.

    Passphrases
    Passphrases are required to use private SSH keys and access our systems. If you do not have the passphrase, you cannot use the private key. It should be noted that using an empty passphrase is only advised on private networks (like from a head node to compute nodes). If your key does not require a passphrase and you are using it on a public network then anyone who gains access to that account now has access to all other accounts that you have access to. In other words, please use a difficult passphrase!
    If you want to log into the cluster automatically, without any passphrase prompt, press Enter at each of the three ssh-keygen prompts when generating your key (accepting the default file location and an empty passphrase).

    Helpful external links regarding passphrases:

    http://www.useapassphrase.com
    https://www.eff.org/dice
    https://xkcd.com/936/
    Key Encryption Sizes
    For RSA we recommend using a key size of 2048 or 4096.
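    For example, to request a 4096-bit RSA key explicitly (the comment string after -C is just an illustrative label):

    ```shell
    # Generate a 4096-bit RSA key pair; you will be prompted for a
    # file location and a passphrase, as shown in the next section.
    ssh-keygen -t rsa -b 4096 -C "youruser@yourlaptop"
    ```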

  • How to generate a key pair on your machine?
  • Linux and OS X
    OS X and almost all Linux distributions include the OpenSSH client tools by default. You can use the ssh-keygen tool to generate SSH key pairs like so:

    $ ssh-keygen 
      Generating public/private rsa key pair.
      Enter file in which to save the key (/home/MyUser/.ssh/id_rsa):  **ACCEPT DEFAULT**
      Enter passphrase (empty for no passphrase):
      Enter same passphrase again:
      Your identification has been saved in ~/.ssh/id_rsa.
      Your public key has been saved in ~/.ssh/id_rsa.pub.
      The key fingerprint is:
      a-long-hex-string user@host


    The ssh-keygen command will generate two files: the private key (usually named id_rsa), and the public key (usually named id_rsa.pub) in the .ssh/ directory under your home directory. The file with the .pub extension is your public key and should be sent for HPC access. The file without an extension is your private key and should be kept secret.
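    A quick sanity check after generation (assuming the default id_rsa file names): list the files, display the public key so it can be pasted into the request form, and print its fingerprint.

    ```shell
    # The .pub file is the one to submit with your account request.
    ls -l ~/.ssh/id_rsa ~/.ssh/id_rsa.pub

    # Show the public key so it can be copied into the request form.
    cat ~/.ssh/id_rsa.pub

    # Print the key's fingerprint (handy when confirming with support).
    ssh-keygen -l -f ~/.ssh/id_rsa.pub
    ```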

  • How to generate SSH keypair in a Windows operating system?
  • We recommend MobaXterm as the most straightforward SSH client. You can download its free Home Edition from: https://mobaxterm.mobatek.net/

    To Generate SSH keys:
    Launch MobaXterm and click the Start local terminal button.

    [Screenshot: MobaXterm local terminal]


    - Now you have a bash prompt that functions almost exactly like the equivalent Linux or macOS prompt, so you can run ssh-keygen as explained in the previous section.
    - The ssh-keygen command will generate the two files that make up an SSH key pair: "id_rsa" (the private portion) and "id_rsa.pub" (the public portion).
    - By default, both files will be saved to "C:\Users\WindowsLoginName\Documents\MobaXterm\home\.ssh\".
    - Protect your private key file, "id_rsa".
    - Choose a strong passphrase when creating it, keep the file safe, and never share it with anyone.
      (Or simply press Enter at the passphrase prompts if you do not want a passphrase prompt when logging into the cluster.)
    - When requesting an account on a cluster, submit the file "id_rsa.pub" as your public key.

    To import an existing SSH keypair:
    Copy the files to C:\Users\WindowsLoginName\Documents\MobaXterm\home\.ssh\.

    To backup your SSH keypair:
    In order to backup your keys and keep them safe:- 
    If you have created your keys via MobaXterm, you can copy the contents of C:\Users\WindowsLoginName\Documents\MobaXterm\home\.ssh\ to a secure location in your computer.

  • How to generate SSH keypair with PuTTYgen?
  • If you need to export the private SSH key generated with PuTTYgen, here are the messy steps. (Since you most likely need to use the key on a Linux system, it's highly recommended to generate the SSH keys under Linux and import the private SSH key into PuTTYgen on Windows. Much easier!)

    You will need to have putty-tools installed on a Linux system.
    1 - On the Windows system, open PuTTYgen and load your ppk ssh key.
    2 - Under the Conversions tab in PuTTYgen choose 'Export ssh.com key'.
    3 - Copy this file to the Linux system with putty-tools installed.
    4 - Change the permissions of the key to be owner read-only: chmod 600 private-key-file
    5 - Assuming the private key file you copied over is named ssh2private, convert it: puttygen ssh2private -O private-openssh -o privateLinux
    6 - You will be prompted for the key's passphrase, if it has one.
    7 - Copy the newly created privateLinux file to your .ssh directory and protect it:

    scp privateLinux ~/.ssh
    chmod 600 ~/.ssh/privateLinux

    8 - If you already have another private SSH key in this location using the default name id_rsa, you will need to create ~/.ssh/config or specify the private key on the ssh command line:

     Host hpc1
     HostName hpc1.cse.ucdavis.edu
     User smith
     IdentityFile ~/.ssh/privateLinux

    Then,

    ssh hpc1

    is the same as the command:

    ssh -l smith -i ~/.ssh/privateLinux hpc1.cse.ucdavis.edu

    If you do not have a ~/.ssh directory already, you may create it with

    ssh-keygen -t rsa

    In this case you will want to rename the private key you created to id_rsa, so you need not create the ~/.ssh/config file to specify which SSH key to use; id_rsa is the default name SSH uses.

  • How do I add my private key in PuTTYgen in order to access cluster via PuTTY?
  • If you have created your SSH keypair in PowerShell, you cannot connect to an HPC cluster via PuTTY right away: PuTTY doesn't know about your keys, so you must import the private key into PuTTYgen. Here are the steps to use the same keypair in PuTTY to connect to an HPC cluster:

    1 - Open "PuTTYgen" and click on the Conversions menu: you will see "Import key" at the top. Click on it and browse to your SSH keypair location (where PowerShell stored your keypair).

    [Screenshot: PuTTYgen Conversions menu]


    2 - Select your private key, click "Save private key", and save it (e.g. to your desktop). This step converts it to the ".ppk" format that PuTTY can read.

    3 - Then close that window and open "PuTTY". Enter the hostname (the HPC cluster address):

    [Screenshot: PuTTY host name field]
     

    4 - Then go to "Connection > SSH > Auth", hit the Browse button to find the private key file (.ppk), and add it:

    [Screenshot: PuTTY SSH Auth settings]


    5 - Click Open and the PuTTY screen will ask for your user ID; enter it and it should let you log in.

  • How to convert .ppk key format to OpenSSH key using PuTTYGen?

  • PuTTY's private SSH key can't be used interchangeably with OpenSSH clients because they use and support different key formats.

    You need to convert PuTTY's key file, which uses the .ppk extension and stands for PuTTY Private Key, to the OpenSSH Private Key Format before you can use it with OpenSSH.

    PuTTYGen or PuTTY Key Generator is a tool you can use to convert the PuTTY Private Key (.ppk) file to the OpenSSH private key format.
    Click on the Load button on PuTTYGen's main interface.

    [Screenshot: PuTTYgen main interface]


    Select your PuTTY private key file, which normally ends with the .ppk extension, and click on the Open button.

    [Screenshot: selecting the .ppk key file]


    Enter the key's passphrase if prompted and then click OK.

    [Screenshot: key passphrase prompt]
  • What are SSH keypair permissions and how can I change configuration options using SSH keypair?
  • There are a few things to keep in mind when using SSH.

    File System Permissions
    You should keep your keys secure from other users on the system. Your private and public keys both should be only readable/writable by you.
    Use the following command to change the read, write and execute permissions of your key:

    chmod 600 ~/.ssh/id_[rd]sa*

    chmod (change mode) is used to change the access permissions of file system objects (files and directories), sometimes known as modes.
    You should also note that nobody other than you (and root) should have write access to your home directory or to the .ssh directory within it:

    chmod go-w ~
    chmod 700 ~/.ssh

    If these permissions allow others to read or write these directories or files, public key authentication will fail.
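    Putting those permissions together (assuming the default RSA key names), a one-time fix-up might look like this:

    ```shell
    # Remove group/other write access from your home directory.
    chmod go-w "$HOME"

    # Restrict the .ssh directory and the private key to you alone;
    # the public key may remain world-readable.
    chmod 700 "$HOME/.ssh"
    chmod 600 "$HOME/.ssh/id_rsa"
    chmod 644 "$HOME/.ssh/id_rsa.pub"
    ```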

    Using Different Usernames
    SSH has some handy configuration options that you can use to make it easier to get to a remote system. You can add these configuration options to a special file called: ~/.ssh/config.

    Changing the default name of your SSH key is strongly discouraged. If you do change it, you will need to create a file named “config” in ~/.ssh naming the SSH private key:

    Host kaos
    HostName kaos.ucdavis.edu
    User tlknight
    IdentityFile ~/.ssh/kaos

    Now, this command:

    ssh kaos

    is equivalent to this command:

    ssh -i ~/.ssh/kaos -l tlknight kaos.ucdavis.edu

    There are many more options as well. You can see them all in the manual page (man ssh_config).

  • How to move and copy SSH keys?
  • We only store one SSH public key per person. The public key can be shared with anyone in the world (thus the name public). Only you should have access to your private key. Keep this very close. Since many people use more than one workstation/laptop you might need to copy your keypair (both your private and public key) to another machine. There are some security implications to doing this.

    Here are a couple to keep in mind:

     - Make sure you trust every workstation that has your private key. These machines need to be kept up to date, be physically secure (i.e. locked rooms, screensavers, etc.), and have trusted administrators.

     - Make sure you move your private key over a secure medium/channel. You can use a medium to physically walk the key from one place to another, encrypted email, SSH, etc. Do not use FTP, unencrypted email (most email is unencrypted), HTTP, etc.

     - Make sure your permissions are set properly on the new machine. Here are some guidelines:
       - ~/.ssh : 700 or 500 (only you should be able to read/write here)
       -  ~/.ssh/id_rsa and ~/.ssh/id_dsa : 600 or 400 (this is your private key, protect it)
       -  ~/.ssh/id_rsa.pub and ~/.ssh/id_dsa.pub : no restrictions (this is your public key)

    Here is the process using an External Drive or USB thumb drive:

    1 - Copy your private key (typically ~/.ssh/id_rsa or ~/.ssh/id_dsa) to a USB stick from your old workstation.
    2 - Copy your public key (typically ~/.ssh/id_rsa.pub or ~/.ssh/id_dsa.pub) to a USB stick from your old workstation.
    3 - Carry the USB stick to your new workstation.
    4 - Make the ~/.ssh directory on the new workstation if it doesn't exist.
    5 - Change the permissions of the ~/.ssh directory to 700.
    6 - Copy your private key from the USB stick to your ~/.ssh directory.
    7 - Change the permissions of the private key to be either 600 or 400.
    8 - Copy your public key from the USB stick to your ~/.ssh directory.
    9 - Test your key by connecting to a remote host.
    10 - Remove the private and public key from the USB stick.
    11 - The process is generally the same if you use an encrypted channel (like SSH). Again, do not use email.
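    Over an encrypted channel, the same transfer can be sketched as follows ("newbox" is a placeholder for your new workstation's hostname):

    ```shell
    # Create the target directory with the right permissions first.
    ssh newbox 'mkdir -p ~/.ssh && chmod 700 ~/.ssh'

    # Copy both halves of the key pair over SSH (never FTP or email).
    scp ~/.ssh/id_rsa ~/.ssh/id_rsa.pub newbox:~/.ssh/

    # Tighten permissions on the copies, then test the key.
    ssh newbox 'chmod 600 ~/.ssh/id_rsa && chmod 644 ~/.ssh/id_rsa.pub'
    ```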

  • How to convert a SSH private key for use in FileZilla?
  • 1. In FileZilla → Edit →  Settings
    2. Select Connection → SFTP
    3. Press the Add key file... button


    4. The button will open the explorer
    5. Select the private key you already use to connect to Farm
    6. If you get a prompt to Convert key file you must choose yes:


    7. This key can then be used to login to Farm with FileZilla.
     

  • How can I connect to clusters using command prompt?
  • In order to connect to HPC clusters, you will need a terminal emulator software to work as a command prompt in order to log into our clusters and run jobs. The software you choose must be able to use SSH keys to connect. This information is typically available in the documentation for the software. Common software choices can be found below.  These software options are free (or have free versions).  UC Davis does not verify or maintain any of these software options.  

    Terminal.app or iTerm 2 are common macOS terminal emulator options.

    MobaXterm is an all-in-one terminal emulator for Windows that gives a very Linux-like terminal environment.

    PuTTY is the most common free and open-source terminal emulator for Windows. (PuTTY Setup Link)

    Windows Subsystem for Linux can provide a Linux terminal within Windows 10. Once you have it installed, you can follow the Linux-based directions to generate a key pair and use ssh at the command-line to connect.

    Windows Terminal for Windows 10 can provide a terminal experience much like Linux or MacOS. 

    Preview code may be available  at: Microsoft's Github.

    Once you have an SSH key and your account has been created, you can connect to Farm. In most text-based terminal emulators (Linux and MacOS), this is how you will connect:

    ssh yourusername@[cluster login node], for example: ssh yourusername@farm.cse.ucdavis.edu

  • I cannot connect to an HPC cluster and I get the ssh connection error "Permission denied (publickey)"
  • If you are getting this error when trying to connect to the cluster, the most likely cause is an incorrect or mismatched public key. Also check:
     - The login node could be down or inaccessible. Be sure to check the maintenance schedule.
     - The user’s public key file must match our records.
     - Users can locate their SSH public key in their SSH client home directory, in a directory called “.ssh”.
     - The typical path to the public key in MobaXterm is C:\Users\WindowsLoginName\Documents\MobaXterm\home\.ssh\.

    If the issue persists, please contact us at hpc-help@ucdavis.edu and include details about the login attempt and the error message.
  • How do I change or reset my password when I want to connect/ log into Genome Center servers?
  • You can only use the password change form if your password is valid.  If your password is expired or forgotten, you'll need to use the
    reset request form instead. Note that SSH keys are not supported for login.
    Visit the new password request page:

       https://computing.genomecenter.ucdavis.edu/account/request_password_reset/

    Once an admin in your lab approves your request, you'll be emailed a token to reset your password.

  • Can I get "root" access to a server to install my own software and modules?
  • Unfortunately it is not possible to delegate root access on nodes connected to the HPC clusters.
    You can always view available/installed modules of the HPC cluster using the command:

    module avail
    module avail 2>&1 | grep -i <moduleName>

    (module writes its listing to stderr, hence the 2>&1 redirect before grep)

    This will show a list of installed modules in the cluster, and you can load a module using:

    module load <moduleName>

    Please contact hpc-help@ucdavis.edu with the module name, version, and a link to the installation package, and we will install it for you.
  • What is the X11 DISPLAY variable, what is xclock, and how do I fix an empty DISPLAY variable so that GUIs may be displayed?
  • The DISPLAY environment variable instructs an X client which X server it is to connect to by default.
    The X display server normally installs itself as display number 0 on your local machine. In PuTTY, the “X display location” box reads localhost:0 by default.

    A display is managed by a server program, known as an X server. The server serves displaying capabilities to other programs that connect to it.

    The remote server knows where it has to redirect the X network traffic via the definition of the DISPLAY environment variable which generally points to an X Display server located on your local computer.

    The SSH protocol has the ability to securely forward X Window System applications over an encrypted SSH connection, so that you can run an application on the SSH server machine and have it put its windows up on your local machine without sending any X network traffic in the clear. $DISPLAY on the remote machine should point to localhost. SSH does the forwarding.


    xclock is a small X11 executable, installed along with X11, which makes it easy to test your DISPLAY setup.
    Log on to hpc1 or hpc2 and type xclock or xcalc.
    If your DISPLAY variable is set, you will see them pop up.
    oschreib@hpc1:~$ which xclock
    /usr/bin/xclock
    oschreib@hpc1:~$ dpkg -S $(which xclock)
    x11-apps: /usr/bin/xclock
    oschreib@hpc1:~$ xclock &
    oschreib@hpc1:~$ xcalc &


    Here is how to troubleshoot an empty DISPLAY variable:
    If xclock does not come up, check the DISPLAY variable on the host with:
    echo $DISPLAY
    If it is empty, check:
    echo $DISPLAY
    on your desktop.
    If that is not empty, connect using:
    ssh -X -Y joeuser@hpc2.engr.ucdavis.edu
    [...]

    Then try again:
    echo $DISPLAY
    xclock

    If DISPLAY is empty on your desktop, it may be that X11 is not included with your Mac:
    https://support.apple.com/en-gb/HT201341
    It is suggested to download XQuartz: https://www.xquartz.org
    After downloading, log out and log back in, then check:
    echo $DISPLAY
    /private/tmp/com.apple.launchd.E6LzlWfKmI/org.xquartz:0

    Then 
    ssh -X -Y joeuser@hpc2.engr.ucdavis.edu

    You can read more on: https://datacadamia.com/ssh/x11/display

 

Farm Access Policy and Storage Rate Information 

  • What is the access policy and current rates for getting storage in Farm cluster?
  • All researchers in CA&ES are entitled to free access to 8 nodes with 24 CPUs and 64GB RAM each (up to a maximum of 192 CPUs and 512GB RAM) in Farm II’s low, medium, and high priority batch queues, as well as 100GB storage space.

    Additional usage and access may be purchased by contributing to Farm III through the node and/or storage rates, or by purchasing equipment and contributing through the rack fee rate.

    Contributors always receive priority access, within one minute, to the resources that they have purchased (the “one-minute guarantee”). Users can also request additional unused resources on a “fair share” basis: someone who contributes twice as much will be able to use twice as many unused resources in the medium or low partitions.
    Updated info on current rates and latest farm hardware specifications can be found at HPC Website : https://hpc.ucdavis.edu/farm-cluster
     

    For access to the cluster, please fill out the Account Request Form. Choose “Farm” for the cluster, and if your PI already has access to Farm, select their name from the dropdown. Otherwise, select “Getchell” as your sponsor and notify Adam Getchell.

    Purchases can be split among different groups, but have to accumulate until a full node is purchased. Use of resources can be partitioned according to financial contribution.

    Additional information for prospective investors

    You will be notified when your hardware has been installed and your account has been updated. Rather than giving you unlimited access to the specific hardware purchased, the account update will give you high-priority access to a “fair share” of resources equivalent to the purchase. For example, if you have purchased one compute node, you will always have high-priority access to one compute node. There is no need to worry about the details of which node you are using or which machine is storing your results. Those details are handled by the system administrators and managed directly by Slurm.

    Slurm is configured so that you get 100% of the resources you paid for within 1 minute in the high partition. Access to unused resources is available through the medium and low partitions. Users get a “fair share” of free resources, so if you contribute twice as much as another user, you get twice as large a share of any free resources.
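    As an illustrative sketch (the script name and resource numbers below are hypothetical), choosing between your purchased high-priority share and opportunistic idle capacity comes down to the Slurm partition flag:

    ```shell
    # Guaranteed access to the resources you contributed (high partition).
    sbatch --partition=high --ntasks=1 --time=01:00:00 myjob.sh

    # Opportunistic use of idle resources; these jobs may be suspended
    # when the owning group reclaims its hardware.
    sbatch --partition=low --ntasks=1 --time=01:00:00 myjob.sh
    ```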

  • What is a Temporary Affiliate Form (TAF) and how can I get a temporary account in Farm cluster?
  • If you don’t have a UC Davis affiliation, you need to have a faculty member sponsor your account. The easiest way to do this is for the sponsor/PI to fill out a Temporary Affiliate Form (TAF). These are typically for researchers who don't have an affiliation or who have a short or contractual appointment with UC Davis. After the TAF is completed, you can create a UC Davis Kerberos account. A TAF is required by UC Davis policies.

    This process grants external constituents (visiting faculty, concurrent students, vendors, and others) access to UC Davis computer resources. By registering for temporary access, affiliates have access to the UC Davis network, a ucdavis.edu email address, and a unique username and password which is used to verify identity and enable subsequent access privileges to various parts of the network.

    When a person has been approved as Temporary Affiliate by a sponsoring department and verified by Information & Educational Technology (IET), they are given an identity in the Identity Management System and will receive the following:

         - An account (commonly referred to as a Kerberos ID)
         - An email address (<someone>@ucdavis.edu)
         - If they are a student, they will be issued a DavisMail account.
         - If they are faculty or staff, they will be issued an Office365 account; non-students get access to Google Apps without Gmail (GAPP permit).
         - Authorization to login to the UC Davis wireless network
         - An entry in the Online Directory (https://directory.ucdavis.edu)
         - An account in the UC Davis central Active Directory (uConnect)
         - Directory information (name, account, etc.) will be set up in both the Active Directory and campus Lightweight Directory Access Protocol (LDAP) systems
         - Access to add any “unrestricted” services offered through http://computingaccounts.ucdavis.edu
         - By default, they are allowed access to the IET Campus Computer Labs (ILAB permit).

    In addition to receiving access to these campus systems, it is important to note that with Kerberos ID and password, Temporary Affiliates may also have access to a number of other University applications owned by various departments. Therefore care should be taken to ensure that affiliate access is granted and reviewed according to University policy.

    Features/Benefits:
    Provides a UC Davis Kerberos account used for campus single sign-on and a UC Davis email address for UC Davis-external users who need to access UC Davis computing resources.
    Any UC Davis Faculty or Staff member may sponsor a person for a TAF account, subject to approval by identified departmental approvers. 
    TAF access should be granted for the minimum duration required by business need, up to 365 days.
    Go to TAF online to register.

    For questions or help with departmental approvers please contact ithelp@ucdavis.edu 

  • How do I check my available storage, nodes in a cluster?
  • The following commands show information about the filesystem and your nodes:
    df -h    >>  shows each filesystem's size, space used, space available, usage percentage, and mount point
    df -h ~<userid>   >> shows the same information for the filesystem holding a specific user's home directory
    sinfo  >> lists the available partitions, their time limits, node states, and node lists
    scontrol show node >>  a Slurm command that shows detailed information about the available nodes: the CPU total, the real memory, etc.
    Please note that the cluster will not assign the same nodes to a user every time; Slurm allocates whichever suitable nodes are free when your job starts.
    More information about Slurm commands can be found at: https://slurm.schedmd.com/man_index.html
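    As a sketch, the df output above can be checked programmatically. This is an illustrative script, not a Farm-provided tool, and it assumes GNU df (as found on the cluster's Ubuntu systems):

```shell
# Warn when the filesystem holding your home directory is nearly full.
# The 90% threshold is an arbitrary example value.
usage=$(df --output=pcent ~ | tail -n 1 | tr -dc '0-9')
if [ "$usage" -ge 90 ]; then
    echo "WARNING: home filesystem is ${usage}% full"
else
    echo "OK: home filesystem is ${usage}% full"
fi
```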
  • How do I request to purchase more storage space in Farm cluster?
  • All researchers in CA&ES are entitled to free access to the original 8 nodes with 24 CPUs each and 64 GB RAM in the low, med, and high partitions.
    Any new nodes purchased will be in the “Farm III” pool separate from existing farm partitions. Currently (June 2019) the “Farm III” pool has 24 “parallel” nodes and 13 “bigmem” nodes.

    Costs to add to farm III:

        Disk space: $1,000 per 10 TB, with compression
        Bigmem node: $25,000 (2 TB RAM, 128 cores/256 threads, 4 TB /scratch)
        GPU: $14,500 1/8th of a GPU node (A100 with 80GB GPU RAM, 16 CPU cores / 32 threads, 128GB system RAM)
        Parallel (CPU) node: $13,500 (512 GB RAM, 128 cores/256 threads, 2 TB /scratch)

    Everyone gets free access to a common storage pool, with 1 TB per user. If you need more, please email help@cse.ucdavis.edu. Unless special arrangements are made, there are no backups; please plan accordingly.

    Researchers with bigger storage needs can purchase storage at a rate of $1,000 per 10TB.

    For access to the cluster, please fill out the Account Request Form. Choose “Farm” for the cluster, and if your PI already has access to Farm, select their name from the dropdown. Otherwise, select “Getchell” as your sponsor and notify Adam Getchell. 
    Purchases can be split among different groups, but have to accumulate until a full node is purchased. Use of resources can be partitioned according to financial contribution.

    If you are a current user in Farm, you can view the additional resource purchase information in the message of the day when logging in to the Farm cluster.
  • What are the technical specifications of Farm cluster?
  • Farm is a research and teaching cluster for the College of Agricultural and Environmental Sciences. This page documents the hardware, software, and policies surrounding this resource. The announcement archives are available online.

    Announcements
    Announcement notifications are sent to an internally maintained mailing list. If you are a user of this cluster you will be added to the mailing lists automatically.

    Operating System
    The Farm cluster runs Ubuntu 18.04 and uses the Slurm batch queue manager. System configuration and management is via Cobbler and Puppet.

    Software
    Requests for any centrally installed software should go to hpc-help@ucdavis.edu. Any software that is available in CentOS/Ubuntu is also available for installation or already installed on this cluster. In many cases we compile and install our own software packages. These custom packages include compilers, MPI layers, open-source packages, commercial packages, HDF, NetCDF, WRF, and others. We use Environment Modules to manage the environment. A quick intro:

    To get a list of available applications and libraries
         - module avail
    To setup your command line or script based environment
         - module load <directory/application>
    Documentation on some of the custom-installed software is at HPC Software Documentation. An (outdated) list is at Custom Software. It is best to use the “module avail” command for the current list of installed software.

    Hardware
    Interconnect

    Farm III: 2 x 36 port 100Gbps Infiniband switches
    Farm II: Three 36 port QDR (40Gbps) Infiniband switches
    Farm I: Collection of 1Gbitx48 switches

    Farm III -
    - 34 Parallel nodes with 64 CPUs and 256GB RAM.
    - 18 1TB Bigmem nodes with 96 CPUs
      - Partitions
        - high2, med2, low2
        - bmh, bmm, bml
        - Specialty: bgpu, gpu, gpum

    Farm II -
    - 101 Parallel nodes with 32 CPUs and 64GB RAM
    - Farm II Interactive head node
    - 9 512GB Bigmem nodes with 64 CPUs and one 1024GB node with 96 CPUs.
    - Partitions:
       - high, med, low
       - bigmemh, bigmemm, bigmeml

    File Servers on Farm II

    - nas-8-0: 98T
    - nas-8-2: 66T
    - nas-8-3: 47T
    - nas-9-0: 41T
    - nas-9-1: 41T
    - nas-9-2: 27T
    - nas-10-1: 198T
    - nas-10-3: 49T
    - nas-11-1: 198T
    - nas-11-2: 196T
    - nas-12-1: 245T

    Total usable disk space (not including file system and RAID overhead) is around 1.2 PB

    File Servers on Farm III
    - nas-5-2: 492T
    - nas-5-3: 492T
    - nas-6-0: 582T
    - nas-6-1: 466T
    - nas-11-0: 194T
    - nas-12-2: 337T
    - nas-12-3: 337T

    Total usable disk space (not including file system and RAID overhead) is around 3.9 PB

    Infiniband interconnect currently on Farm II (32Gbps) (high, medium, and low partitions)
     - 2 48-port HP 1810-48G switches
     - 1 KVM console
     -  2 APC racks
     - 4 managed PDUs
     

    Batch Partitions
    Low priority means that your job might be killed at any time. Great for soaking up unused cycles with short jobs; a particularly good fit for large array jobs with short run times.

    Medium priority means your job might be suspended, but will resume when a high-priority job finishes. *NOT* recommended for MPI jobs. Up to 100% of idle resources can be used.

    High priority means your job will kill/suspend lower-priority jobs as needed and will keep its allocated hardware until it is done or there is a system or power failure. Limited to the number of CPUs your group contributed. Recommended for MPI jobs.

    low- Parallel, Infiniband nodes at low priority
    med- Parallel, Infiniband nodes at medium priority
    high- Parallel, Infiniband nodes at high priority
    bigmeml- Large memory nodes at low priority
    bigmemm- Large memory nodes at medium priority
    bigmemh- Large memory nodes at high priority
    Farm III's bigmem partitions are named bml, bmm, and bmh.

    Monitoring
    1 - To check the status of Farm servers:
         https://status.farm.caes.ucdavis.edu/

    2 - Ganglia is available at
          http://stats.cse.ucdavis.edu/ganglia/?c=Agri&m=load_one&r=hour&s=descending&hc=4&mc=2

  • What software packages or modules are available on Farm cluster?
  • Farm has many software packages available for a wide range of needs. Most packages that are installed are available as environment modules; list them using the module avail command. Use module load <module/version> to load a module and
    module unload <module/version> when done.

    Generally, use as few modules as possible at a time–once you're done using a particular piece of software, unload the module before you load another one, to avoid incompatibilities.

    Many of the most up-to-date Python-based software packages may be found under the bio3 module. Load the module with module load bio3 and run conda list to see a complete and up-to-date list.

    Many additional Python 2 packages may be found under the bio module. Note that the bio and bio3 modules are mutually incompatible with one another, so do not load both at the same time.

    Visit the Environments article for much more information on getting started with software and the modules command on the cluster.

    If you can't find a piece of software on the cluster, you can request an installation for cluster-wide use. Contact hpc-help@ucdavis.edu with the name of the cluster, your username, the name of the software, and a link to the software's website, documentation, or installation directions, if applicable.

  • What  distributions of Python are available on the Farm cluster?
  • You can check the available distributions of Python on the cluster by typing "module avail python".
    Ubuntu comes with python 2.7.6 preinstalled into /usr/bin and is available by default. No module is required to use it.
    Python 2.7.15 and many additional Python packages are available via conda in the bio/1.0 module. If you need other packages not already installed, please contact hpc-help@ucdavis.edu with your request. Python 3.6.8 is also available as a module.

    Use Notes
    To use the conda-based python 2.7.15:

    module load bio/1.0
    Or to use python 3:

    module load python/3.6.8
    Python on Farm II
    Batch files run python scripts using the default version of python as specified by the current Ubuntu release being used on Farm. The Farm installation can be found here: /usr/bin/python. If “module load Python” is added to batch files, python scripts are run using a custom compilation of python maintained on Farm for bioinformatics applications.

    A simple example of running a python script as a batch job:
    user@agri:~/examples/hello_world$ more hello_world.py
    print "Hello, World! \n"

    user@agri:~/examples/hello_world$ more hello_world.sh
    #!/bin/bash -l
    #SBATCH --job-name=hello_world

    # Specify the name and location of i/o files.  
    # “%j” places the job number in the name of those files.
    # Here, the i/o files will be saved to the current directory under /home/user.
    #SBATCH --output=hello_world_%j.out
    #SBATCH --error=hello_world_%j.err

    # Send email notifications.  
    #SBATCH --mail-type=END # other options are ALL, NONE, BEGIN, FAIL
    #SBATCH --mail-user=user@ucdavis.edu

    # Specify the partition.
    #SBATCH --partition=high # other options are low, med, bigmem, serial.

    # Specify the number of requested nodes.
    #SBATCH --nodes=1

    # Specify the number of tasks per node, 
    # which may not exceed the number of processor cores on any of the requested nodes.
    #SBATCH --ntasks-per-node=1 

    hostname # Prints the name of the compute node to the output file.
    srun python hello_world.py # Runs the job.


    user@agri:~/examples/hello_world_sarray$ sbatch hello_world.sh
    Submitted batch job X
    user@agri:~/examples/hello_world$ more hello_world_X.err
    Module BUILD 1.6 Loaded.
    Module slurm/2.6.2 loaded 
    user@agri:~/examples/hello_world$ more hello_world_X.out

    c8-22
    Hello, World! 
    A simple example of an array job:

    user@agri:~/examples/hello_world_sarray$ more hello_world.py
    import sys

    i = int(sys.argv[1])
    print "Hello, World", str(i) + "! \n"
    user@agri:~/examples/hello_world_sarray$ more hello_world.sh
    #!/bin/bash -l
    #SBATCH --job-name=hello_world

    # Specify the name and location of i/o files.
    #SBATCH --output=hello_world_%j.out
    #SBATCH --error=hello_world_%j.err

    # Send email notifications.  
    #SBATCH --mail-type=END # other options are ALL, NONE, BEGIN, FAIL
    #SBATCH --mail-user=user@ucdavis.edu

    # Specify the partition.
    #SBATCH --partition=high # other options are low, med, bigmem, serial.

    # Specify the number of requested nodes.
    #SBATCH --nodes=1

    # Specify the number of tasks per node, 
    # which may not exceed the number of processor cores on any of the requested nodes.
    #SBATCH --ntasks-per-node=1 

    # Specify the number of jobs to be run, 
    # each indexed by an integer taken from the interval given by "array”.
    #SBATCH --array=0-1

    hostname
    echo "SLURM_NODELIST = $SLURM_NODELIST"
    echo "SLURM_NODE_ALIASES = $SLURM_NODE_ALIASES"
    echo "SLURM_NNODES = $SLURM_NNODES"
    echo "SLURM_TASKS_PER_NODE = $SLURM_TASKS_PER_NODE"
    echo "SLURM_NTASKS = $SLURM_NTASKS"
    echo "SLURM_JOB_ID = $SLURM_JOB_ID"
    echo "SLURM_ARRAY_TASK_ID = $SLURM_ARRAY_TASK_ID"

    srun python hello_world.py $SLURM_ARRAY_TASK_ID


    user@agri:~/examples/hello_world_sarray$ sbatch hello_world.sh
    Submitted batch job X


    user@agri:~/examples/hello_world_sarray$ more *.err
    ::::::::::::::
    hello_world_X+0.err
    ::::::::::::::
    Module BUILD 1.6 Loaded.
    Module slurm/2.6.2 loaded 
    ::::::::::::::
    hello_world_X+1.err
    ::::::::::::::
    Module BUILD 1.6 Loaded.
    Module slurm/2.6.2 loaded 

    user@agri:~/examples/hello_world_sarray$ more *.out
    ::::::::::::::
    hello_world_X+0.out
    ::::::::::::::
    c8-22
    SLURM_NODELIST = c8-22
    SLURM_NODE_ALIASES = (null)
    SLURM_NNODES = 1
    SLURM_TASKS_PER_NODE = 1
    SLURM_NTASKS = 1
    SLURM_JOB_ID = 76109
    SLURM_ARRAY_TASK_ID = 0
    Hello, World 0! 

    ::::::::::::::
    hello_world_X+1.out
    ::::::::::::::
    c8-22
    SLURM_NODELIST = c8-22
    SLURM_NODE_ALIASES = (null)
    SLURM_NNODES = 1
    SLURM_TASKS_PER_NODE = 1
    SLURM_NTASKS = 1
    SLURM_JOB_ID = 76110
    SLURM_ARRAY_TASK_ID = 1
    Hello, World 1! 

Managing Data 

  • How can I transfer to/from a cluster?
  • Several file transfer programs are available to copy data between the cluster and the user's computer.

    Filezilla is a multi-platform client commonly used to transfer data to and from the cluster.

    Cyberduck is another popular file transfer client for Mac or Windows computers.

    WinSCP is Windows-only file transfer software.

    Globus is another common solution, especially for larger transfers.

    rsync and scp are command-line tools to transfer data to and from the cluster. These commands should be run on your computer, not on the cluster:

    To transfer something to Farm from your local computer:

    scp -r local-directory username@farm.cse.ucdavis.edu:~/destination/

    Note: outbound scp initiated from Farm is disabled. Please initiate an inbound scp using the above method.

    To transfer something from Farm to your local computer:

    scp -r username@farm.cse.ucdavis.edu:~/farmdata local-directory

    To use rsync to transfer a file or directory from Farm to your local computer:

    rsync -aP -e ssh username@farm.cse.ucdavis.edu:~/farmdata .

    rsync has the advantage that if the connection is interrupted for any reason you can just up-arrow and run the exact same command again and it will resume where it stopped.

    See man scp and man rsync for more information.
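    Before a large transfer, it can help to rehearse rsync's flags between two throwaway local directories; the behavior is the same as a remote transfer over SSH. The paths below are made up for the demonstration:

```shell
# Create a small source tree and sync it to a new destination directory.
mkdir -p /tmp/rsync_demo/src
echo "sample data" > /tmp/rsync_demo/src/results.txt
rsync -aP /tmp/rsync_demo/src/ /tmp/rsync_demo/dst/
cat /tmp/rsync_demo/dst/results.txt   # prints: sample data
```

    The trailing slash on src/ means "the contents of src", so the files land directly in dst/ rather than in dst/src/.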

  • How can I transfer files or entire directory to/from a server?
  • To copy an input file from your desktop to a server:

    desktop:~> cd runs/run12/
    desktop:~/runs/run12> ls
    input1  input2  input3
    desktop:~/runs/run12> scp input1 username@server.dept.ucd.edu:
    input1                                        100% 1024KB   1.0MB/s   00:00    
    desktop:~/runs/run12> 


    Or a few files:

    Desktop:~/runs/run12> scp input* username@server.dept.ucd.edu:
    input1                                        100% 1024KB   1.0MB/s   00:00    
    input2                                        100% 1024KB   1.0MB/s   00:00    
    input3                                        100% 1024KB   1.0MB/s   00:00    


    Or an entire directory tree:

    desktop:~/runs/run12> cd ~/src
    desktop:~/src> scp -r simulation-2.3/ username@server.dept.ucd.edu:src
    HISTORY                                       100%    0     0.0KB/s   00:00    
    input.c                                       100%    0     0.0KB/s   00:00    
    timestep.c                                    100%    0     0.0KB/s   00:00    
    README                                        100%    0     0.0KB/s   00:00    
    output.c                                      100%    0     0.0KB/s   00:00    
    solve.c                                       100%    0     0.0KB/s   00:00    
    Makefile                                      100%    0     0.0KB/s   00:00    
    desktop:~/src> 

  • How to transfer data from HPC server to Box using SFTP?
  • Some users report that outgoing SFTP from HPC clusters to the Box server is prohibited, but it is not. The catch is on the Box server's side: Box supports FTPS, which is a different protocol. Link to the Box FAQ: https://support.box.com/hc/en-us/articles/360043697414-Using-Box-with-FTP-or-FTPS

    Here are the steps:
    1. Create external password on Box.com
    2. Create a folder on Box into which files will be transferred
    3. Log into the cluster and go to the top-level folder containing the
       folder/files you want to copy over
    4. Log into Box from above cluster location:
       lftp -u <Box username which is just ucd email> ftps://ftp.box.com
    5. Enter the external password for Box.com
    6. Use the "mirror" command to copy a folder and its contents to Box; the -R flag
       reverses direction (a "put", sending files from the cluster to Box) and the
       -L (--dereference) flag transfers symbolic links as actual
       files: mirror -RL <source> <target>
    7. You could also look into using the rclone command, which is like rsync, but for cloud accounts. Instructions are here <https://rclone.org/box/>.
    Setup and configuration is more complicated than FTPS; you would need to forward X11 to your local system so rclone can open a browser for you to authenticate to Box. If you use macOS or Linux, just add a -Y flag to your SSH command. If you use MobaXterm on Windows, X11 forwarding should already work.
    8. Exit out of Box with lftp's exit command.

    Here is the link to the article from Rutgers about using lftp to transfer directly to box from the cluster:

    https://it.rutgers.edu/box/knowledgebase/using-box-on-linux/

  • The data transfer rate in the cluster is very slow, how can I speed up the transfer process?
  • One of the most common reasons for slow transfers is the upload speed of your internet connection.
    Please check your upload speed by searching for "speed test" on Google; the result will show the upload and download speed of your internet connection.

    We also always recommend the rsync command for data transfers.
    Rsync is a free and open-source tool for copying local or remote files. It reduces the amount of data sent over the network by sending only the differences between the source files and the existing files in the destination.
    You can use the following option to add a progress indicator when copying files.
    Use the --progress option with the rsync command to see the progress and speed of the transfer:
    rsync --progress <source> <destination>
  • How can I create backup of my data?
  • There is the option to manually copy all of your data to your own machine:

    scp -r username@farm.cse.ucdavis.edu:~/farmdata local-directory

    See the articles about transferring data to/from cluster for more details.
    There is also the other option to keep an archived or zip directory of your data in the cluster, that will shrink the size of data.
    Use the following command to compress an entire directory or a single file. It also compresses every directory inside the directory you specify; in other words, it works recursively.

    tar -czvf name-of-archive.tar.gz /path/to/directory-or-file

    Here’s what those switches actually mean:

    -c: Create an archive.
    -z: Compress the archive with gzip.
    -v: Display progress in the terminal while creating the archive, also known as “verbose” mode.
    The v is always optional in these commands, but it’s helpful.
    -f: Allows you to specify the filename of the archive.
    Let’s say you have a directory named “stuff” in the current directory and you want to save it to a file named archive.tar.gz. You’d run the following command:

    tar -czvf archive.tar.gz stuff
    Or, let’s say there’s a directory at /usr/local/something on the current system and you want to compress it to a file named archive.tar.gz. You’d run the following command:

    tar -czvf archive.tar.gz /usr/local/something
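    The workflow above can be exercised end to end with throwaway paths (illustrative only; /tmp/tar_demo is not a real location on the cluster):

```shell
# Create a small directory, archive it, verify the archive, and extract it.
mkdir -p /tmp/tar_demo/stuff /tmp/tar_demo/restore
echo "example" > /tmp/tar_demo/stuff/notes.txt
cd /tmp/tar_demo
tar -czvf archive.tar.gz stuff      # create (-c) a gzip-compressed (-z) archive
tar -tzf archive.tar.gz             # list (-t) the contents to verify
tar -xzf archive.tar.gz -C restore  # extract (-x) into restore/
cat restore/stuff/notes.txt         # prints: example
```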

  • How can I set files' access permissions for different groups and users?
  • If you want to set read, write, and execute permissions for shared files in a shared directory or for your own files, you can use the chmod command (you must own the file or have root permission).
    The chmod command is used to define or change permissions or modes on files and limit access to only those who are allowed access.

    Example:
    Example:
    If you’re the owner of a file called Confidential and want to change the permissions or modes so that the user (owner) can read, write, and execute, group members can read and execute only, and others can only read, you would run the command below:

    chmod u=rwx,g=rx,o=r Confidential
     

    The command above changes the permissions of the file called Confidential so that the user can read (r), write (w), and execute (x) it, group members can only read (r) and execute (x), and others can only read (r) its content. The same command can be written as shown below using its octal permission notation:

    chmod 754 Confidential

    The following table shows each permission type, its symbol, and its octal value:

    Permission   Symbol   Octal value
    read         r        4
    write        w        2
    execute      x        1

    Each octal digit is the sum of the values for user, group, and others; for example, rwx = 4+2+1 = 7 and r-x = 4+1 = 5, giving the 754 above.


    Read more about directory and file permissions at: https://help.ubuntu.com/community/FilePermissions
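    The symbolic and octal forms above are equivalent, which you can verify on any Linux system with GNU coreutils (no root is needed for files you own; the file name below is a throwaway example):

```shell
# Apply the symbolic form, then confirm it matches the octal form 754.
touch /tmp/Confidential_demo
chmod u=rwx,g=rx,o=r /tmp/Confidential_demo
stat -c %a /tmp/Confidential_demo   # prints: 754
chmod 754 /tmp/Confidential_demo    # octal form, same result
rm /tmp/Confidential_demo
```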

  • I cannot save a file and get the error: "E667: Fsync failed"
  • This error is most often shown when there is not enough storage available. Please check your disk usage via the command:

    df -h ~

    The output shows your disk usage percentage. You can free some space by deleting unnecessary files and then retry saving the new file.
  • My account is locked due to "headnode" abuse, how can I avoid using "headnode"?
  • The basic architecture of a compute cluster consists of a “head node”, which is the computer from which a user submits jobs to run, and “compute nodes”, which are a large number of computers on which the jobs can be run. It is also possible to log into a compute node and run jobs directly from there. Never run a job directly on the head node!

    If you overwhelm the head node you will impact all Farm users; that is why such accounts get suspended.
    You will receive an email stating that your account has been locked; please contact hpc-help@ucdavis.edu to unlock your account.
    To review your running jobs/processes, use the htop command, which gives a top-style view of what processes are running on the head node. If you see your jobs running there when you expect them to be running on a compute node, you can stop them.

    htop -u youruserid
    It opens a friendly, interactive window that shows the processes on the head node.

    In addition, there is another command called pstree, a convenient Linux command that shows running processes as a tree.

    If a user name is specified, all process trees rooted at processes owned by that user are shown. 

    To display all process trees rooted at processes owned by a specific user with their PIDs, use:

    pstree -p username
    To display a tree of processes, use:

    pstree
    To display a tree of processes with PIDs, use:

    pstree -p
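    If pstree or htop happens to be unavailable, plain ps can give similar information (GNU ps options assumed):

```shell
# List your own processes with their IDs, parent IDs, elapsed time,
# and command line; head keeps the output short.
ps -u "$(id -un)" -o pid,ppid,etime,cmd | head -n 5
```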


Available Software and Modules 

  • What are  the available software and modules on HPC clusters?
  • Most HPC clusters have many software packages available for a wide range of needs. Most packages that are installed are available as environment modules. You can list the installed software/modules using the following command:
     
    module avail 
    This command will list all the installed software and modules in your cluster.

    Use:
    module load <module/version>        to load a module
    module unload <module/version>    when done.

    Generally, use as few modules as possible at a time–once you're done using a particular piece of software, unload the module before you load another one, to avoid incompatibilities.

    If you cannot find a piece of software on the cluster, you can request an installation for cluster-wide use. Contact hpc-help@ucdavis.edu with the name of the cluster, your username, the name of the software, and a link to the software's website, documentation, or installation directions, if applicable.

  • What are Environment Modules?
  • A software environment in the context of computing on the cluster is just a set of variables that define the locations of programs, libraries, configuration files, data, or other information that a particular piece of software needs to run. Environments are usually defined by adjusting the environment variables of the login shell.

    Different pieces of software (or different versions of the same software) can have different environment needs that may conflict with one another, which can cause errors or result in the software not functioning as expected – or at all. This has led to a need for ways to manage one or more software environments.

    Environment Modules
    To help minimize software environment conflicts, our clusters use the module command to alter the user's shell environment. This allows for reproducible environment changes that are easily loaded or unloaded at will.

    When a user requests that a piece of software be made available to all users on the cluster, the sysadmins will often write an environment module to make it easily accessible for everyone with the module load command.

    Typical module commands:

    module list - shows what modules you currently have loaded
    module avail - a list of ALL available modules
    module load <modulename> - load a module
    module unload <modulename> - unload a module
    module purge - unload ALL modules
    module whatis <modulename> - show a short description of the module, if available
    module help <modulename> - show more detailed information about a module, if available
    Full help for the module command is available on the cluster by running man module, or on the web at the modules package website.

    Multiple versions of the same software package may be available. By default, module load <modulename> will usually load the latest version. If you want to use a non-default version of a module, specify the version with a / after the module name. For example, to load a specific version of Hmmer: module load hmmer/2.3.2

    The environment variables that get set by a module depend on the requirements of the software. At a minimum, most modules will add the software's executable location to the user's $PATH, but it may also load other modules as dependencies, ensure that specific conflicting modules are not loaded, or take other actions to ensure that the software will run correctly.
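    As a hedged sketch, the main effect of a module on $PATH can be reproduced by hand; the package path below is hypothetical, not an actual location on Farm:

```shell
# What `module load` effectively does for PATH: prepend the software's
# bin directory so its executables are found first.
export PATH="/opt/sw/hmmer/2.3.2/bin:$PATH"
echo "$PATH" | tr ':' '\n' | head -n 1   # prints: /opt/sw/hmmer/2.3.2/bin
```

    A matching `module unload` removes that entry again, which is why modules are safer than editing your shell startup files by hand.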

    If you find an error in a module on one of our clusters, please contact the helpdesk (hpc-help@ucdavis.edu) and cut/paste the entire command line prompt, including your username, clustername, directory, the command you attempted to run and error output you received so that we can start the troubleshooting process to fix the module.

    Virtual Environments
    Virtual environments are another way of defining a custom environment that can be loaded and unloaded at will, while leaving the “base” or default environment intact. Virtual environments are common when a user wants to run a software environment out of their home directory without requiring system-wide installation, special permissions (like root or sudo), or sysadmin intervention.

    Miniconda is the recommended virtual environment and package manager for Python. Users can create unique virtual environments using conda based on the needs of the software they're using, including different versions of Python (Python 2 versus Python 3) and package dependencies such as numpy, scipy, matplotlib, and many others.

    Please be aware that personal virtual environments do take up disk space in your home directory, because every virtual environment installs its own version of Python and related packages.

    Administration of personal virtual environments installed in your home directory is left to the user.

    On some clusters, admin-installed conda packages and/or conda virtual environments with particular software may be available under modules bio (miniconda 2) or bio3 (miniconda 3).

  • What is the usage of /scratch/ Directory and Disk I/O?
  • Disk I/O (input/output) happens when reading from or writing to a file on the hard drive. Please avoid heavy I/O in your home directory, as this degrades file server performance for everyone. If you know that your software is I/O intensive, for example software that rapidly reads/writes many files or performs many small reads/writes, copy your data out of your home directory and onto the compute node as part of your batch job; otherwise the network file system (NFS) can bottleneck, slowing down both your job and everyone else's.

    To prevent NFS bottlenecking, Farm supports the use of the /scratch/ directory on the compute nodes when you have I/O-intensive code that needs temporary file space. Each compute node has its own independent scratch directory of about 1TB.

    Please create a unique directory for each job when you use scratch space, such as /scratch/your-username/job-id/, to avoid collisions with other users or yourself. For example, in your sbatch script, you can use /scratch/$USER/$SLURM_JOBID or /scratch/$USER/$SLURM_JOBID/$SLURM_ARRAY_TASK_ID (for array jobs).

    When your job is finished, copy any results/output that you wrote to your /scratch subdirectory (if any) and remove ALL of your files from your /scratch location.

    Note that /scratch/ is a shared space between everyone who runs jobs on a node, and is a limited resource. It is your responsibility to clean up your scratch space when your job is done or the space will fill up and be unusable by anyone.

    /scratch/ is local to each node and is not shared between nodes or with the login node, so you will need to perform setup and cleanup tasks at the start and end of every job run. If you do not clean up at the end of every run, you will leave remnants behind that will eventually fill the shared space.

    The /scratch/ directory is subject to frequent purges, so do not attempt to store anything there longer than it takes your job to run.
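    The setup and cleanup steps above can be sketched as a fragment of a batch script. This is a minimal sketch: SCRATCH_ROOT is a stand-in variable so the fragment can run anywhere; on a compute node you would use /scratch directly, and SLURM_JOBID is set by Slurm inside a real job.

```shell
#!/bin/bash
# Sketch of the per-job scratch pattern. SCRATCH_ROOT stands in for /scratch
# so this fragment can run outside the cluster; SLURM_JOBID is set by Slurm
# inside a real job, so a manual fallback is provided here.
USER="${USER:-$(id -un)}"
SCRATCH_ROOT="${SCRATCH_ROOT:-/tmp/scratch-demo}"       # /scratch on a node
JOB_SCRATCH="$SCRATCH_ROOT/$USER/${SLURM_JOBID:-manual-$$}"

mkdir -p "$JOB_SCRATCH"
cd "$JOB_SCRATCH"

# ... copy inputs here and run the I/O-intensive work ...

# When done: copy results back home, then remove everything from scratch.
# cp -p results.out "$HOME/"
cd - >/dev/null
rm -rf "$JOB_SCRATCH"
```

    Using a unique directory per job (username plus job ID) is what prevents collisions with other users, and with your own concurrently running jobs.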

    If you would like to purchase additional scratch space for yourself or your lab group, contact hpc-help@ucdavis.edu for more information.

  • How Can I install Python package or add a new Python environment on the cluster?
  • Most of the clusters have different versions of Python installed. Depending on the user's experience and needs, different distributions are already available.
    Check the available versions of Python by entering the following command:
    module avail python

    Users can create new environments and install new packages in Python 3. Follow these instructions to do so.
    You can load Python 3 using the following command and create virtual environments:

    module load python3
    python3 -m venv testenv

    Test to see if it was created:

    ls -l testenv

    It will list the new environment's files.


    To install new packages, use pip3. Users can install packages inside their home directory and then import them:

    module load python3/3.8.2
    pip3 install nltk

    Then enter python3 and import the package:
    >>> import nltk
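    Put together, a typical session looks like this (the module load line applies on the cluster; the rest is standard Python tooling):

```shell
# On the cluster, load Python first with:  module load python3
python3 -m venv testenv         # create the environment
source testenv/bin/activate     # activate it; pip3 now installs into testenv
python3 -c 'import sys; print(sys.prefix)'   # shows the environment's path
# pip3 install nltk             # packages land inside the environment
deactivate                      # leave the environment when done
```

    Activating the environment is what makes pip3 install into testenv rather than attempting a system-wide install.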
     

    If you get issues and questions when installing or adding new packages or environments, you can share details about the required package via
    hpc-help@ucdavis.edu. 
  • How can I set up an interactive session for using the program plink?
  • plink is a command-line program. You cannot create an interactive session with it the way "R" does; plink doesn't load as a module in the same way that R does. Once the module is loaded, you can just run plink commands via srun and work with it that way. It is mostly used for scripted, non-interactive operations.
    Check for more information in the page: https://github.com/RILAB/lab-docs/wiki/Farm-Interactive-Use

  • What does error "INFO: activate-gcc_linux-64.sh made the following environmental changes:" mean and how to resolve it?
  • One of the most common cases in which this error is produced is when a user alters some environment variables in the Conda module or tries to install new software in a Conda virtual environment. This prevents the user from transferring data to/from the cluster or causes some other types of issues.
    In such cases, we suggest users move the hidden file called .bashrc aside.
    The purpose of a .bashrc file is to provide a place where you can set up variables, functions and aliases, define your (PS1) prompt and define other settings that you want to use every time you open a new terminal window. It works by being run each time you open a new terminal window or pane.

    In order to avoid the error message use the following command:
    mv .bashrc .bashrc.bak
    This removes those environment variable changes from new shells and resolves the issue.
  • How can I create a module environment and install a required package?
  • On the Farm cluster, you can build up your own environment and even install your own packages by using the following command:
    pip install --user <package-name>
    Packages installed this way are placed under ~/.local/.
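    You can confirm the per-user install location with Python's site module; on Linux it resolves to a path under ~/.local/:

```shell
# Print the directory that pip's --user installs target.
python3 -c 'import site; print(site.getusersitepackages())'
```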

  • What is a R Package and how do you load/use it?
  • R is a programming language and software environment for statistical computing and graphics. 
    You can obtain R in your environment by loading the R module i.e.:

    module load R
    The command R --version returns the version of R you have loaded:

    R --version
    R version 4.0.2 (2020-06-22) --
    "Taking Off Again"
    Copyright (C) 2020 The R Foundation for Statistical Computing
    Platform: x86_64-pc-linux-gnu (64-bit)
    The command which R returns the location where the R executable resides:

    which R
    /uufs/chpc.utah.edu/sys/installdir/R/4.0.2i/bin/R 
    We also maintain a number of older versions of R. You can list these versions with the command "module spider R", and load a specific version with a command such as "module load R/3.6.1".

  • How to run an R batch script on the cluster?
  • To run an R batch job on the compute nodes, we just need to create a SLURM script/wrapper "around" the R command line.

    Below you will find the content of the corresponding Slurm batch script runR.sl:

    #!/bin/bash
    #SBATCH --time=00:10:00 # Walltime
    #SBATCH --nodes=1          # Use 1 Node     (Unless code is multi-node parallelized)
    #SBATCH --ntasks=1         # We only run one R instance = 1 task
    #SBATCH --cpus-per-task=12 # number of threads we want to run on
    #SBATCH --account=owner-guest
    #SBATCH --partition=ember-guest
    #SBATCH -o slurm-%j.out-%N
    #SBATCH --mail-type=ALL
    #SBATCH --mail-user=$USER@utah.edu   # Your email address
    #SBATCH --job-name=seaIce

    export FILENAME=seaice.R
    export SCR_DIR=/scratch/general/lustre/$USER/$SLURM_JOBID
    export WORK_DIR=$HOME/TestBench/R/SeaIce

    # Load R (default version)
    module load R

    # Take advantage of all the threads (linear algebra)
    # $SLURM_CPUS_ON_NODE returns actual number of cores on node
    # rather than $SLURM_JOB_CPUS_PER_NODE, which returns what --cpus-per-task asks for
    export OMP_NUM_THREADS=$SLURM_CPUS_ON_NODE

    # Create scratch & copy everything over to scratch
    mkdir -p $SCR_DIR
    cd $SCR_DIR
    cp -p $WORK_DIR/* .

    # Run the R script in batch, redirecting the job output to a file
    Rscript $FILENAME > $SLURM_JOBID.out

    # Copy results over + clean up
    cd $WORK_DIR
    cp -pR $SCR_DIR/* .
    rm -rf $SCR_DIR

    echo "End of program at `date`"

  • What is the Screen and how can I start new sessions and shells using Screen?

  • Screen or GNU Screen is a terminal multiplexer. In other words, it means that you can start a screen session and then open any number of windows (virtual terminals) inside that session. Processes running in Screen will continue to run when their window is not visible even if you get disconnected.

    Install Linux GNU Screen
    The screen package is pre-installed on most HPC clusters. You can check if it is installed on your system by typing:

    screen --version
    Screen version 4.06.02 (GNU) 23-Oct-17

    To start a screen session, simply type screen in your console:

    screen

    This will open a screen session, create a new window, and start a shell in that window.
    Now that you have opened a screen session, you can get a list of commands by typing:
    Ctrl+a ?

    Starting Named Session
    Named sessions are useful when you run multiple screen sessions. To create a named session, run the screen command with the following arguments:

    screen -S session_name

    It’s always a good idea to choose a descriptive session name

    Working with Linux Screen Windows
    When you start a new screen session, it creates a single window with a shell in it.

    You can have multiple windows inside a Screen session.

    To create a new window with a shell, type Ctrl+a c; the first available number from the range 0...9 will be assigned to it.


    Below are some of the most common commands for managing Linux Screen windows:

    Ctrl+a c Create a new window (with shell).
    Ctrl+a " List all windows.
    Ctrl+a 0 Switch to window 0 (by number).
    Ctrl+a A Rename the current window.
    Ctrl+a S Split current region horizontally into two regions.
    Ctrl+a | Split current region vertically into two regions.
    Ctrl+a tab Switch the input focus to the next region.
    Ctrl+a Ctrl+a Toggle between the current and previous windows
    Ctrl+a Q Close all regions but the current one.
    Ctrl+a X Close the current region.
    Detach from Linux Screen Session
    You can detach from the screen session at any time by typing:

    Ctrl+a d

    Reattach to a Linux Screen
    To resume your screen session use the following command:

    screen -r

    In case you have multiple screen sessions running on your machine, you will need to append the screen session ID after the -r switch.
     
    To find the session ID list the current running screen sessions with:

    screen -ls

  • How Can I add Julia Packages into Farm Cluster?

  • HPC users can add Julia Packages using the following steps:

    1 - Load Julia using the following command:
    module load julia

    2 - Run Julia using the following command:
    julia

    3 - Then press the right square bracket key:
    ]

    4 - This takes you to a pkg prompt, which lets you add packages using the following command:

    add <<PackageName>>

    Here is detailed information on Julia's official website: - https://docs.julialang.org/en/v1/stdlib/Pkg/

    Example shown in the following image: the package JLD2 added to Julia 1.6.2.
     

    Julia_pkg_Installation
  • What is Conda and how can I create Conda environments in my account (HPC2)?
  • Conda is an open source package and environment management system that runs on Windows, Mac OS and Linux. Conda can quickly install, run, and update packages and associated dependencies. Conda can create, save, load, and switch between project specific software environments on your local computer or a cluster.

    In order to have write access to the Conda environment, you'll need to create it yourself. We have built and tested a conda environment specification that should work for you on HPC2 specifically. In order to use it, you'll need to:

    1 - Load the conda module: module load conda3. This provides access to the system-installed conda environments, and has been configured to allow you to conda create environments in your home directory. It also has the much-faster mamba command built in, which will significantly speed up environment creation.

    2 - Create the environment. First, link the environment file into a directory writable by you:

    ln -s /software/conda3/environment-specs/flow-sumo-2022-09.yml ., which is necessary due to a quirk with Conda; then, create the environment with mamba env create -f flow-sumo-2022-09.yml 
    This will create an environment in your home directory called flow-sumo-2022-09. You can create it under a different name by doing
    mamba env create -f /software/conda3/environment-specs/flow-sumo-2022-09.yml -n [ENVIRONMENT_NAME] 
    This environment includes all the flow dependencies (including the proper version of Ray with RLlib support).

    3 - Activate the environment: conda activate flow-sumo-2022-09 or conda activate [ENVIRONMENT_NAME] if you created it under a different name.
    Install flow: python -m pip install --no-deps git+https://github.com/flow-project/flow. This pulls the latest flow from GitHub and installs it in the environment. Note the --no-deps flag: the working dependencies were already pinned via the environment file, and we don't want pip to mess with them.
    Thereafter, your steps for developing flow simulations will just be module load conda3 followed by the conda activate. Note that the conda3 environment module will conflict with user-installed miniconda or anaconda distributions; if you or your lab members have these local installations, they will need to remove the changes the installer makes to their .bashrc or .zshrc. The conda3 module will warn them and refuse to load if an existing installation is detected.
     

Batch Queue and Job Scheduler

  • What are the available job schedulers in the clusters?
  • All of our computing resources use a batch queue. There are many benefits to using a batch queue on a compute cluster. Slurm is the job scheduler used for batch queue management in the clusters. We no longer support Sun Grid Engine or Condor on the Farm cluster.
  • What is Slurm and how can I submit a job?
  • Slurm is an open-source resource manager (batch queue)  and job scheduler designed for Linux clusters of all sizes.

    The general idea with a batch queue is that you don't have to babysit your jobs. You submit a job, and it'll run until it completes, dies, or hits a problem. You can configure it to notify you via email when that happens. This allows very efficient use of the cluster. You can still babysit/debug your jobs if you wish using an interactive session (e.g., srun --pty bash).

    Our main concern is that all jobs go through the batch queuing system. Do not bypass the batch queue. We don't lock anything down but that doesn't mean we can't or won't. If you need to retrieve files from a compute node feel free to ssh directly to it and get them, but don't impact other jobs that have gone through the queue.

  • What are the useful commands of Slurm that I can use to run my jobs?
  • Here are some useful Slurm commands with their purpose:
    sinfo reports the state of partitions and nodes managed by SLURM. It has a wide variety of filtering, sorting, and formatting options.
    smap reports state information for jobs, partitions, and nodes managed by SLURM, but graphically displays the information to reflect network topology.
    sbatch is used to submit a job script for later execution. The script will typically contain one or more srun commands to launch parallel tasks.
    squeue reports the state of jobs or job steps. It has a wide variety of filtering, sorting, and formatting options. By default, it reports the running jobs in priority order and then the pending jobs in priority order.
    srun is used to submit a job for execution or initiate job steps in real time. srun has a wide variety of options to specify resource requirements, including: minimum and maximum node count, processor count, specific nodes to use or not use, and specific node characteristics (so much memory, disk space, certain required features, etc.).
    A job can contain multiple job steps executing sequentially or in parallel on independent or shared nodes within the job's node allocation.
    scancel is used to stop a job early, for example when you queued the wrong script or you know it's going to fail because you forgot something.
    See more in "Monitoring Jobs" in the Slurm Example Scripts article in the Help Documents.
    More in depth information at http://slurm.schedmd.com/documentation.html
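    As a concrete starting point, here is a minimal job script that ties sbatch and squeue together. The partition name below is a placeholder; use one you have access to. Since #SBATCH directives are shell comments, the body also runs as plain bash:

```shell
#!/bin/bash
#SBATCH --job-name=hello          # name shown by squeue
#SBATCH --partition=med           # placeholder; pick a partition you can use
#SBATCH --ntasks=1                # a single task
#SBATCH --time=00:05:00           # wall-clock limit
#SBATCH --output=slurm-%j.out     # %j expands to the job ID

# The body is ordinary shell; Slurm sets SLURM_* variables at runtime,
# so the fallback below only applies when the script is run by hand.
echo "Running on $(hostname) as job ${SLURM_JOB_ID:-unsubmitted}"
```

    Submit it with sbatch hello.sh, then monitor it with squeue -u $USER.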
  • How are jobs priorities calculated in LSSC0?
  • Priority in the cluster is managed by the lab's current FairShare value. The table below, from "sshare", details the lab's current raw share of the cluster (RawShare/$100 = the total buy-in from the lab), normalized shares (NormShares, the proportion raw shares represent in the whole cluster), normalized usage (NormUsage, the proportion raw usage represents in the whole cluster), the lab's total usage (RawUsage), and the current FairShare value. If your FairShare value = 0.5, you're using as much of the cluster as you have bought in; above 0.5 means you use less than your buy-in, and below 0.5 means you use more than your buy-in.
    Job-priority
  • What is the Slurm Priority Calculation formula?
  • slurm_job_priotiy
  • My jobs are failing on the node and restarting, how can I set the jobs to not return to the queue when a node fails?
  • You can use the following option to tell Slurm not to return your job to the queue on failure:

    --no-requeue

    This can be used when you want to manually resubmit jobs and make the necessary changes to allow the code to conduct a restart properly.
    For fixing the failed jobs, please provide the job ids, so the broken nodes are identified and fixed.
  • How do I extend the time limit of a job?
  • The command scontrol show job <jobnum> shows the job details.
    In its output, the TimeLimit= field shows the time limit that has been set for the job.
    The EndTime= field shows the actual date/time at which Slurm will cut the job off.
    The following command will update the time limit of a Slurm job:
    scontrol update jobid=22698439 TimeLimit=15-00:00:00
    Please note that this command needs root privilege; you can contact the cluster administrators to update the job time limit. Send a request to hpc-help@ucdavis.edu.
  • How do I cancel a job manually?
  • To stop (abort) a job manually, first use squeue to find the job number (JOBID). Then run:

    scancel -u $USER <JOBID>

    If you omit the JOBID, it will cancel all of your jobs.

  • How can I check why a node is down or showing other state?
  • Users can run the following command to check whether nodes are in a down, drained, or failed state:

    sinfo -Rl

    Or they can specify the state(s) and list only the nodes having the given state(s).

    sinfo -t drain

    This will list nodes in the drain state.
     

  • When I SSH to the cluster, the login is very slow, is there any way to connect faster to the cluster?

  • These types of I/O slowdowns typically occur because the network attached storage (NAS) device where your home directory is stored is being heavily used. Because the NAS is a shared device, you will occasionally run into times when other people are using it heavily, which impacts everyone using that NAS. These issues tend to clear themselves up when the I/O-intensive jobs finish.
    If you face such issues consistently, kindly let us know and we will check the NAS where your home directory resides.
  • When I run a job I get the message 'srun: error: Unable to allocate resources: Invalid account or account/partition combination specified'.
  • This error is usually caused by an invalid combination of values for account and partition: not all accounts work on all partitions. Check the spelling in your batch script or interactive command and be sure you have access to the account and partition. To view the combinations you can use, use the sacctmgr command; more information (including example commands) can be found in the "Batch Queue and Job Schedulers" section.
    You can also see what partitions/accounts you have access to by running the myallocation command.

  • How to build pipelines using Slurm dependencies?
  • Differences between stock sbatch and the cluster's sbatch:
    Before discussing job dependencies, we need to point out that sbatch on the cluster, and therefore in the examples below, is a wrapper script that returns just the job ID. That is different from stock sbatch, which returns Submitted batch job 123456. You can think of the wrapper as doing something equivalent to

    #! /bin/bash

    sbr="$(/path/to/real/sbatch "$@")"

    if [[ "$sbr" =~ Submitted\ batch\ job\ ([0-9]+) ]]; then
        echo "${BASH_REMATCH[1]}"
        exit 0
    else
        echo "sbatch failed"
        exit 1

    fi
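    You can exercise that extraction logic locally, without Slurm, by feeding it a sample line of stock sbatch output:

```shell
#!/bin/bash
# Simulate stock sbatch output and extract the job ID the same way the
# wrapper does, using bash's regex match and BASH_REMATCH.
sbr="Submitted batch job 123456"
if [[ "$sbr" =~ Submitted\ batch\ job\ ([0-9]+) ]]; then
    jobid="${BASH_REMATCH[1]}"
    echo "$jobid"   # prints 123456
fi
```

    This is why the pipeline examples below can pass $jid1, $jid2, etc. straight into --dependency: the wrapper's output is the bare job ID.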
    Introduction
    Job dependencies are used to defer the start of a job until the specified dependencies have been satisfied. They are specified with the --dependency option to sbatch or swarm in the format

    sbatch --dependency=<type:job_id[:job_id][,type:job_id[:job_id]]> ...
    Dependency types:

    after:jobid[:jobid...]    job can begin after the specified jobs have started
    afterany:jobid[:jobid...]    job can begin after the specified jobs have terminated
    afternotok:jobid[:jobid...]    job can begin after the specified jobs have failed
    afterok:jobid[:jobid...]    job can begin after the specified jobs have run to completion with an exit code of zero (see the user guide for caveats).
    singleton    jobs can begin execution after all previously launched jobs with the same name and user have ended. This is useful to collate results of a swarm or to send a notification at the end of a swarm.
    See also the Job Dependencies section of the Help Documents.

    To set up pipelines using job dependencies the most useful types are afterany, afterok and singleton. The simplest way is to use the afterok dependency for single consecutive jobs. For example:

    b2$ sbatch job1.sh
    11254323
    b2$ sbatch --dependency=afterok:11254323 job2.sh

    Now when job1 ends with an exit code of zero, job2 will become eligible for scheduling. However, if job1 fails (ends with a non-zero exit code), job2 will not be scheduled but will remain in the queue and needs to be canceled manually.

    As an alternative, the afterany dependency can be used, and checking for successful execution of the prerequisites can be done in the job script itself.

    The sections below give more complicated examples of using job dependencies for pipelines in bash, perl, and python.

    Bash
    The following bash script is a stylized example of some useful patterns for using job dependencies:

    #! /bin/bash

    # first job - no dependencies
    jid1=$(sbatch  --mem=12g --cpus-per-task=4 job1.sh)

    # multiple jobs can depend on a single job
    jid2=$(sbatch  --dependency=afterany:$jid1 --mem=20g job2.sh)
    jid3=$(sbatch  --dependency=afterany:$jid1 --mem=20g job3.sh)

    # a single job can depend on multiple jobs
    jid4=$(sbatch  --dependency=afterany:$jid2:$jid3 job4.sh)

    # swarm can use dependencies
    jid5=$(swarm --dependency=afterany:$jid4 -t 4 -g 4 -f job5.sh)

    # a single job can depend on an array job
    # it will start executing when all arrayjobs have finished
    jid6=$(sbatch --dependency=afterany:$jid5 job6.sh)

    # a single job can depend on all jobs by the same user with the same name
    jid7=$(sbatch --dependency=afterany:$jid6 --job-name=dtest job7.sh)
    jid8=$(sbatch --dependency=afterany:$jid6 --job-name=dtest job8.sh)
    sbatch --dependency=singleton --job-name=dtest job9.sh

    # show dependencies in squeue output:
    squeue -u $USER -o "%.8A %.4C %.10m %.20E"

    And here is a simple bash script that will submit a series of jobs for a benchmark test. This script submits the same job with 1 MPI process, 2 MPI processes, 4 MPI processes ... 128 MPI processes. The Slurm batch script 'jobscript' uses the environment variable $SLURM_NTASKS to specify the number of MPI processes that the program should start. The reason to use job dependencies here is that all the jobs write some temporary files with the same name, and would clobber each other if run at the same time.

    #!/bin/sh

    id=`sbatch --job-name=factor9-1 --ntasks=1 --ntasks-per-core=1 --output=${PWD}/results/x2650-1.slurmout jobscript`
    echo "ntasks 1 jobid $id"

    for n in 2 4 8 16 32 64 128; do
        id=`sbatch --depend=afterany:$id --job-name=factor9-$n --ntasks=$n --ntasks-per-core=1 --output=${PWD}/results/x2650-$n.slurmout jobscript`;
        echo "ntasks $n jobid $id"
    done

    The batch script corresponding to this example:

    #!/bin/bash

    module load  amber/14
    module list

    echo "Using $SLURM_NTASKS cores"

    cd /data/user/amber/factor_ix.amber10

    `which mpirun` -np $SLURM_NTASKS `which sander.MPI` -O -i mdin -c inpcrd -p prmtop
    Perl

    A sample perl script that submits 3 jobs, each one dependent on the completion (in any state) of the previous job.

    #!/usr/local/bin/perl

    $num = 8;

    $jobnum = `sbatch --cpus-per-task=$num myjobscript`;
    chop $jobnum;
    print "Job number $jobnum submitted\n\n";

    $jobnum = `sbatch --depend=afterany:${jobnum} --cpus-per-task=8 --mem=2g mysecondjobscript`;
    chop $jobnum;
    print "Job number $jobnum submitted\n\n";

    $jobnum = `sbatch --depend=afterany:${jobnum} --cpus-per-task=8 --mem=2g mythirdjobscript`;
    chop $jobnum;
    print "Job number $jobnum submitted\n\n";

    system("sjobs");

    Python
    The sample Python script below submits 3 jobs that are dependent on each other, and shows the status of those jobs. (It is written for Python 2, which provides the commands module.)

    #!/usr/local/bin/python

    import commands, os

    # submit the first job
    cmd = "sbatch Job1.bat"
    print "Submitting Job1 with command: %s" % cmd
    status, jobnum = commands.getstatusoutput(cmd)
    if (status == 0 ):
        print "Job1 is %s" % jobnum
    else:
        print "Error submitting Job1"

    # submit the second job to be dependent on the first
    cmd = "sbatch --depend=afterany:%s Job2.bat" % jobnum
    print "Submitting Job2 with command: %s" % cmd
    status,jobnum = commands.getstatusoutput(cmd)
    if (status == 0 ):
        print "Job2 is %s" % jobnum
    else:
        print "Error submitting Job2"

    # submit the third job (a swarm) to be dependent on the second
    cmd = "swarm -f swarmfile --module blast  --depend=afterany:%s" % jobnum
    print "Submitting swarm job  with command: %s" % cmd
    status,jobnum = commands.getstatusoutput(cmd)
    if (status == 0 ):
        print "Job3 is %s" % jobnum
    else:
        print "Error submitting Job3"

    print "\nCurrent status:\n"
    #show the current status with 'sjobs'
    os.system("sjobs")
    Running this script:

    [user@biowulf ~]$ submit_jobs.py
    Submitting Job1 with command: sbatch Job1.bat
    Job1 is 25452702
    Submitting Job2 with command: sbatch --depend=afterany:25452702 Job2.bat
    Job2 is 25452703
    Submitting swarm job  with command: swarm -f swarm.cmd --module blast  --depend=afterany:25452703
    Swarm job is 25452706

    Current status:

    User    JobId            JobName   Part  St  Reason      Runtime  Walltime  Nodes  CPUs  Memory  Dependency      
    ==============================================================================================================
    user    25452702         Job1.bat  norm  PD  ---            0:00   4:00:00      1   1   2GB/cpu
    user    25452703         Job2.bat  norm  PD  Dependency     0:00   4:00:00      1   1   2GB/cpu  afterany:25452702
    user    25452706_[0-11]  swarm     norm  PD  Dependency     0:00   4:00:00      1  12   1GB/node afterany:25452703
    ==============================================================================================================
    cpus running = 0
    cpus queued = 14
    jobs running = 0
    jobs queued = 14

  • What are Srun options?
  • You may also use options to the srun command:

    $ srun [option list] [executable] [args]

    Some srun options

    -c #    The number of CPUs used by each process
    -d    Specify debug level between 0 and 5
    -i file    Redirect input to file
    -o file    Redirect output
    -n #    Number of processes for the job
    -N #    Numbers of nodes to run the job on
    -s    Print usage stats as job exits
    -t    time limit for job, <minutes>, or <hours>:<minutes> are commonly used
    -v -vv -vvv    Increasing levels of verbosity
    -x node-name    Don't run job on node-name (and please report any problematic nodes to help@cse.ucdavis.edu)

    Interactive Sessions
    (takes 30 seconds or so)

    $ srun --partition=partition-name --time=1:00:00 --unbuffered --pty /bin/bash -il 
    When the time limit expires you will be forcibly logged out and anything left running will be killed.

  • What is the maximum memory capacity in Farm nodes we can allocate to our job/s?
  • The low, med, and high nodes in the Farm cluster have 64 GB.
    The nodes in HPC1 have 32 GB as the maximum memory capacity.
    Detailed information about memory allocation is available at https://hpc.ucdavis.edu/clusters 
  • What are Slurm partitions?
  • Generally, there are three SLURM partitions (aka queues) on a cluster. These partitions divide up pools of nodes based on job priority needs.

    low    Low priority means that you might be killed at any time. Great for soaking up unused cycles with short jobs; a particularly good fit for large array jobs when individual jobs have short run times.
    med    Medium priority means you might be suspended, but will resume when a high priority job finishes. *NOT* recommended for MPI jobs. Up to 100% of idle resources can be used.
    hi    Your job will kill/suspend lower priority jobs. High priority means your jobs will keep the allocated hardware until it's done or there's a system or power failure. Limited to the number of CPUs your group contributed. Recommended for MPI jobs.

    There are other types of partitions that may exist, as well.

    bigmem, bm    Large memory nodes. Jobs will keep the allocated hardware until it's done or there's a system or power failure. (bigmems/bms may be further divided into l/m/h partitions, following the same priority rules as low/med/high in the table above.)
    gpu    GPU nodes, will keep the allocated hardware until it's done or there's a system or power failure.
    serial    Older serial nodes, jobs will keep the allocated hardware until it's done or there's a system or power failure.
    Nodes can be in more than one partition, and partitions with similar names generally have identical or near-identical hardware: low/med/high are typically one set of hardware, low2/med2/high2 are another, and so on.

    There may be other partitions based on the hardware available on a particular cluster; not all users have access to all partitions. Consult with your account creation email, your PI, or the helpdesk if you are unsure what partitions you have access to or to use.

  • What does exitcode 137 mean?
  • Users will periodically receive error messages regarding the jobs they have scheduled.
    Exit codes greater than 128 mean the job was terminated by a signal. You can get a list of signals with the command

    kill -l

    To identify which signal, subtract 128 from the exit code.
    In this case, 137 - 128 = 9, which is SIGKILL.
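    You can do this lookup directly in the shell; given a number, kill -l prints the corresponding signal name:

```shell
# Map exit code 137 back to its signal: subtract 128, then look up the name.
echo $((137 - 128))   # prints 9
kill -l 9             # prints KILL (i.e., SIGKILL)
```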

    Error example:
    Slurm Job_id=40746315 Name=gl_counts Ended, Run time 15:12:48, FAILED, ExitCode 137

     
    looking up the job using:
    sacct --format "JobID, MaxVMSize, State, ExitCode" -j 40746315

    Since the job does not appear to fit in 512 GB of RAM, it will be a tough challenge to get it to run. It would be best to find a means to shrink or downsize the job. There are very few nodes with enough RAM to handle a job of this size.

    SIGBUS = 7
    SIGSEGV = 11
     

  • How do I check the status of my job in SLURM?
  • There are several ways to check the status of your jobs in the queue.  Below are a few SLURM commands to make use of.  Use the Linux man command to see the manual of each command and find loads of additional information about these commands.

    squeue <flags>

    -u username
    -j jobid
    -p partition
    -q qos


    Example 1

    squeue -u userid
    The above command will display all the jobs related to a user in the following format:
     
      JOBID PARTITION     NAME     USER  ST       TIME  NODES NODELIST(REASON)
      92311     debug     test      cdc   R       0:08      2 d09n29s02,d16n02
      88915 general-c     GPU_test  cdc   PD      0:00      1 (Priority)

     

    Information about each column displayed:
    JOBID - Shows the id/number given to the Job
    PARTITION - Shows which partition the job is running in
    NAME - The name given to the job
    USER - The user id of the person running the job
    ST - Shows the status of the job, which can be PD (Pending), R (Running), or CD (Completed) 

     

    Example 2

    squeue -u userid --start

    JOBID PARTITION NAME USER ST START_TIME NODES NODELIST(REASON)
    88915 general-c GPU_test cdc PD 2013-07-09T13:09:40 1 (Priority)
    91487 general-c hello_te cdc PD N/A 2 (Priority)

    squeue or stimes - Show the Estimated Start Time of a Job
    squeue <flags> --start


    Example 3

    squeue -j 91487 --start

      JOBID PARTITION     NAME     USER  ST           START_TIME  NODES NODELIST(REASON)
      91487 general-c hello_te      cdc  PD                  N/A      2 (Priority)

    or use stimes for more detailed information:
    stimes <flags - same as squeue>

    Example 4

    stimes -u xdtas

    JOBID       USER      PARTITION        JOB_NAME      REQUEST_TIME NODES  CPUS REASON_FOR_WAIT    PRIORITY   JOB_STARTS_IN
    4753092     xdtas     general-compute  xdmod.benchm         14:00    16   192 (Resources)            1.16     2.11 days
    4753045     xdtas     general-compute  xdmod.app.md          2:00     8    96 (Resources)            1.12     2.11 days


    squeue - Job Reasons
    (Resources) - Job is waiting for compute nodes to become available
    (Priority) - Jobs with higher priority are waiting for compute nodes.  Check this knowledge base article for info about job priority
    (ReqNodeNotAvail) - The compute nodes requested by the job are not available for a variety of reasons, including cluster downtime, nodes offline, or a temporary scheduling backlog

    Example 5

    sinfo -p debug

    PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
    debug        up    1:00:00      1  alloc k05n26
    debug        up    1:00:00      3  idle  d09n29s02,d16n[02-03]

    sinfo - Shows the State of Nodes
    sinfo -p <partition> - Shows the nodes in the given partition

    snodes - Show Node State and Feature Details
    snodes all <cluster>/<partition>

    Example 6

    snodes all general-compute | more

    HOSTNAMES  STATE    CPUS S:C:T    CPUS(A/I/O/T)   CPU_LOAD MEMORY   GRES     PARTITION          FEATURES
    d07n04s01  alloc    8    2:4:1    8/0/0/8         8.02     24000    (null)   general-compute*   IB,CPU-L5630

     

    idle - all cores are available on the compute node; no jobs are running on it
    mix - at least one core is available on the compute node; it has one or more jobs running on it
    alloc - all cores on the compute node are assigned to jobs

  • What are Slurm environment variables and how to use them?
  • When a job scheduled by Slurm starts, it needs to know certain things about how it was scheduled: what its working directory is, and/or what nodes were allocated for it. Slurm passes this information to the job via environment variables. In addition to being available to your job, these are also used by programs like mpirun to set default values. This way, something like mpirun already knows how many tasks to start and on which nodes, without you needing to pass this information explicitly.

    Following is a list of useful Slurm environment variables:
    $SLURM_NODELIST - Nodes assigned to job
    $SLURM_NODE_ALIASES - Node aliases
    $SLURM_NNODES - Number of nodes allocated to job
    $SLURM_JOBID - Job ID
    $SLURM_TASKS_PER_NODE - Number of tasks per node
    $SLURM_JOB_ID - Job ID (same as $SLURM_JOBID)
    $SLURM_SUBMIT_DIR - Submit directory
    $SLURM_JOB_NODELIST - Nodes assigned to job (same as $SLURM_NODELIST)
    $SLURM_CPUS_ON_NODE - Number of cores/node
    $SLURM_CPUS_PER_TASK - Number of cores per task, i.e., the value given to the --cpus-per-task or -c sbatch options. Not set unless one of those options is given.
    $SLURM_JOB_NAME - Job Name
    $SLURM_LOCALID - Node-local task index of the process within the job
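    A job script's payload can read these variables at run time. Here is a minimal Python sketch that collects a few of them (the variable names are the standard ones listed above; the fallback defaults are illustrative, so the script also runs outside a Slurm job):

    ```python
    import os

    def slurm_info():
        """Collect common Slurm environment variables, with fallbacks
        so the script can also be tested outside of a Slurm job."""
        return {
            "job_id": os.environ.get("SLURM_JOB_ID", "none"),
            "job_name": os.environ.get("SLURM_JOB_NAME", "interactive"),
            "nnodes": int(os.environ.get("SLURM_NNODES", "1")),
            "cpus_on_node": int(os.environ.get("SLURM_CPUS_ON_NODE", "1")),
            "submit_dir": os.environ.get("SLURM_SUBMIT_DIR", os.getcwd()),
        }

    info = slurm_info()
    print(f"Job {info['job_id']} ({info['job_name']}): "
          f"{info['nnodes']} node(s), {info['cpus_on_node']} CPU(s) on this node")
    ```

    Inside a real batch job, Slurm sets these variables automatically, so the fallbacks never trigger.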

  • How do I check the status of my job(s)?
  • There are several ways to check the status of your jobs in the queue.  Below are a few SLURM commands to make use of.  Use the Linux 'man' command to find loads of additional information about these commands as well.
    squeue - Show the State of Jobs in the Queue

    squeue <flags>

    -u username
    -j jobid
    -p partition
    -q qos
    Example:
    [ccruser@vortex:/ifs/user/ccruser]$ squeue -u cdc
      JOBID PARTITION     NAME     USER  ST       TIME  NODES NODELIST(REASON)
      92311     debug     test      cdc   R       0:08      2 d09n29s02,d16n02
      88915 general-c GPU_test      cdc  PD       0:00      1 (Priority)
      91716 general-c hello_te      cdc  PD       0:00      2 (Priority)
      91791 general-c hello_te      cdc  PD       0:00      2 (Priority)
      91792 general-c hello_te      cdc  PD       0:00      2 (Priority)

    squeue or stimes - Show the Estimated Start Time of a Job

    squeue <flags> --start
    Shows only jobs in PD (pending) state

    Example:
    [ccruser@vortex:/ifs/user/ccruser]$ squeue -u cdc --start
      JOBID PARTITION     NAME     USER  ST           START_TIME  NODES NODELIST(REASON)
      88915 general-c GPU_test      cdc  PD  2013-07-09T13:09:40      1 (Priority)
      91487 general-c hello_te      cdc  PD                  N/A      2 (Priority)

    [ccruser@vortex:/ifs/user/ccruser]$ squeue -j 91487 --start
      JOBID PARTITION     NAME     USER  ST           START_TIME  NODES NODELIST(REASON)
      91487 general-c hello_te      cdc  PD                  N/A      2 (Priority)

    or use stimes for more detailed information:

    stimes <flags - same as squeue>

    Example:

    [ccruser@vortex:/]$ stimes -u xdtas
    JOBID       USER      PARTITION        JOB_NAME      REQUEST_TIME NODES  CPUS REASON_FOR_WAIT    PRIORITY   JOB_STARTS_IN
    4753092     xdtas     general-compute  xdmod.benchm         14:00    16   192 (Resources)            1.16     2.11 days
    4753045     xdtas     general-compute  xdmod.app.md          2:00     8    96 (Resources)            1.12     2.11 days
    4753053     xdtas     general-compute  xdmod.app.ch          2:00     8    96 (Resources)            1.12     2.11 days
    4753060     xdtas     general-compute  xdmod.app.as         43:00     8    96 (Resources)            1.12     2.11 days
    4753098     xdtas     general-compute  xdmod.benchm         30:00     8    96 (Resources)            1.11     2.11 days
    4753068     xdtas     general-compute  xdmod.app.md          3:00     4    48 (Resources)            1.09     2.11 days
    4753085     xdtas     general-compute  xdmod.benchm          9:00     4    48 (Resources)            1.08     2.11 days
    4753099     xdtas     general-compute  xdmod.app.as         55:00     4    48 (Resources)            1.08     2.11 days
    4753114     xdtas     general-compute  xdmod.app.ch          3:00     4    48 (Resources)            1.08     2.11 days
    4753123     xdtas     general-compute  xdmod.benchm         15:00     4    48 (Resources)            1.08     2.11 days
    4753052     xdtas     general-compute  xdmod.app.ch          3:00     2    24 (Resources)            1.08     1.56 days
    4753155     xdtas     general-compute  xdmod.benchm          2:00     4    48 (Resources)            1.07     1.56 days
    4753070     xdtas     general-compute  xdmod.app.ch          4:00     1    12 (Resources)            1.07     1.40 days
    4753115     xdtas     general-compute  xdmod.benchm          7:00     2    24 (Resources)            1.07     1.57 days
    4753121     xdtas     general-compute  xdmod.app.md          4:00     2    24 (Resources)            1.06     1.57 days
    4753122     xdtas     general-compute  xdmod.benchm          8:00     2    24 (Resources)            1.06     1.57 days
    4753134     xdtas     general-compute  xdmod.app.as       1:03:00     2    24 (Resources)            1.06     1.58 days
    4753164     xdtas     general-compute  xdmod.app.md          6:00     1    12 (Resources)            1.05     1.40 days


    squeue - Show Jobs Running on Compute Nodes

    squeue --nodelist=f16n35,f16n37

    squeue - Job States
    R - Job is running on compute nodes
    PD - Job is waiting on compute nodes
    CG - Job is completing

    squeue - Job Reasons
    (Resources) - Job is waiting for compute nodes to become available
    (Priority) - Jobs with higher priority are waiting for compute nodes.  Check this knowledge base article for info about job priority
    (ReqNodeNotAvail) - The compute nodes requested by the job are not available for a variety of reasons, including:
    cluster downtime
    nodes offline
    temporary scheduling backlog

    sinfo - Show the State of Nodes

    sinfo -p partition

    Example:
    [ccruser@vortex:/ifs/user/ccruser]$ sinfo -p debug
    PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
    debug        up    1:00:00      1  alloc k05n26
    debug        up    1:00:00      3  idle  d09n29s02,d16n[02-03]

    snodes - Show Node State and Feature Details

    snodes all <cluster>/<partition>

    Example:
    [ccruser@vortex:/ifs/user/ccruser]$ snodes all general-compute | more
    HOSTNAMES  STATE    CPUS S:C:T    CPUS(A/I/O/T)   CPU_LOAD MEMORY   GRES     PARTITION          FEATURES
    d07n04s01  alloc    8    2:4:1    8/0/0/8         8.02     24000    (null)   general-compute*   IB,CPU-L5630
    d07n04s02  alloc    8    2:4:1    8/0/0/8         7.97     24000    (null)   general-compute*   IB,CPU-L5630

    ...

    sinfo and snodes - Node States

    idle - all cores are available on the compute node; no jobs are running on the compute node
    mix - at least one core is available on the compute node; the compute node has one or more jobs running on it
    alloc - all cores on the compute node are assigned to jobs
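    If you need to post-process this output in a script, it can help to turn each row into a record. Below is a minimal Python sketch that parses squeue's default listing, assuming the column layout shown above (for serious scripting, squeue's --noheader and --format options give more robust machine-readable output):

    ```python
    def parse_squeue(output):
        """Parse the whitespace-delimited default output of `squeue`
        into a list of dicts keyed by the header columns."""
        lines = output.strip().splitlines()
        header = lines[0].split()
        # Limit splitting so the final NODELIST(REASON) field stays intact.
        return [dict(zip(header, line.split(None, len(header) - 1)))
                for line in lines[1:]]

    # Sample output as shown in the examples above.
    sample = """\
    JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
    92311 debug test cdc R 0:08 2 d09n29s02,d16n02
    88915 general-c GPU_test cdc PD 0:00 1 (Priority)"""

    jobs = parse_squeue(sample)
    pending = [j["JOBID"] for j in jobs if j["ST"] == "PD"]
    print(pending)  # → ['88915']
    ```

    In practice you would feed it real output, e.g. from `subprocess.run(["squeue", "-u", user], capture_output=True, text=True)`.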

  • How to monitor jobs in Slurm?
  • Users can use the following command to monitor their jobs in Slurm:
    $ squeue -u $USER
    CODE    STATE    DESCRIPTION
    CA    CANCELLED    Job was cancelled by the user or system administrator
    CD    COMPLETED    Job completed
    CF    CONFIGURING    Job has been allocated resources, but are waiting for them to become ready
    CG    COMPLETING    Job is in the process of completing
    F    FAILED    Job terminated with non-zero exit code
    NF    NODE_FAIL    Job terminated due to failure of one or more allocated nodes
    PD    PENDING    Job is awaiting resource allocation
    R    RUNNING    Job currently has an allocation
    S    SUSPENDED    Job has an allocation, but execution has been suspended
    TO    TIMEOUT    Job terminated upon reaching its time limit

    Information about a job

    root@gauss:~# squeue -l -j  93659 
    Thu Dec  6 16:51:37 2012
      JOBID PARTITION     NAME     USER    STATE       TIME TIMELIMIT  NODES NODELIST(REASON)
      93659     debug  aa_b[1]   isudal  RUNNING      33:49 UNLIMITED      1 c0-10

    Other detailed information about a job

    $ scontrol show -d job <JOBID>

  • How do I cancel a job/all jobs, jobs in a specific partition or specific qos?
  • Use the scancel command:

    [user@farm:~]$ scancel --help
    Usage: scancel [OPTIONS] [job_id[_array_id][.step_id]]

    To cancel a job:
    scancel <jobid>

    To cancel all of your jobs:

    scancel -u <userid>

    To cancel all of your jobs on a specific partition:

    scancel -u <userid> -p <partition>

    To cancel all of your jobs using a specific qos:

    scancel -u <userid> -q <qos>

  • What are array job lists in Slurm?
  • A SLURM job array is a collection of jobs that differ from each other by only a single index parameter. Creating a job array provides an easy way to group related jobs together. The newest version of Slurm supports array jobs. For example:

    $ cat test-array.sh
    #!/bin/bash
    hostname
    echo $SLURM_ARRAY_TASK_ID
    # Submit a job array with index values between 0 and 10,000 on all free CPUs:
    $ sbatch --array=0-10000 --partition=low test-array.sh

    On the Farm cluster the maximum array size is 10001.

    More information at http://www.schedmd.com/slurmdocs/job_array.html
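    Inside each array element, $SLURM_ARRAY_TASK_ID tells the script which slice of the work it owns. A minimal Python sketch (the input file names are hypothetical, and the default of 0 is only so the script also runs outside Slurm):

    ```python
    import os

    # Hypothetical parameter sweep: each array task handles one input file.
    inputs = ["sample_a.txt", "sample_b.txt", "sample_c.txt"]

    # Slurm sets SLURM_ARRAY_TASK_ID for each element of the array;
    # default to 0 so the script can also be run outside of a job.
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
    print(f"task {task_id} -> {inputs[task_id]}")
    ```

    Submitted with `sbatch --array=0-2 task.sh`, this runs three independent jobs, one per input file.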

  • What are common advanced (optional) squeue usages?
  • The squeue command has some additional command flags that can be passed to better monitor your jobs, if necessary.

    This section involves some Linux shell knowledge and an understanding of environment variables. If you are unsure ask an administrator for help.

    The default output fields of squeue are defined in the slurm module, but these can be overridden with the --format flag. An example of the standard output of squeue -u <username>:

    JOBID PARTITION     NAME     USER  ST        TIME  NODES CPU MIN_ME NODELIST(REASON)
    12345       med    myjob  username  R  1-22:20:42      1 22  24000M c10-67
    These fields are defined by default using the following format codes:

    %.14i %.9P %.8j %.8u %.2t %.11M %.6D %3C %6m %R
    A full explanation of the formatting codes that may be used can be found in man squeue under the -o, --format=<output_format> section.

    To see the time and date that your jobs are scheduled to end, and how much time is remaining:

    squeue --format="%.14i %9P %15j %.8u %.2t %.20e %.12L" -u <username>
    Sample output:

    JOBID PARTITION NAME     USER     ST  END_TIME             TIME_LEFT
    1234  med       myjob    username  R  2019-06-10T01:12:28  5-21:50:53
    For convenience, you can add an alias to your ~/.bash_aliases file with this command and it will be available next time you log in. Here's an example of a helpful alias:

    alias jobtimes="squeue --format=\"%.14i %9P %15j %.8u %.2t %.20e %.12L\" -u"
    Next time you log in, the command “jobtimes <yourusername>” will be available and will display the information as above.

    See the squeue man page for other fields that squeue can output.

    The default squeue formatting is stored in the environment variable $SQUEUE_FORMAT, which can be altered using the same format codes as the --format option on the command line. PLEASE be cautious when altering environment variables. Use module show slurm to see the default setting for $SQUEUE_FORMAT.

  • What are Slurm tasks and how to use .bashrc or .bash_profile aliases for the Farm cluster?
  • A task is to be understood as a process. A multi-process program is made of several tasks.
    By contrast, a multithreaded program is composed of only one task, which uses several CPUs.
    Tasks are requested/created with the --ntasks option, while CPUs, for the multithreaded programs, are requested with the --cpus-per-task option.
    Tasks cannot be split across several compute nodes, so requesting several CPUs with the --cpus-per-task option will ensure all CPUs are allocated on the same compute node.
    By contrast, requesting the same number of CPUs with the --ntasks option may lead to CPUs being allocated on several distinct compute nodes.

    Here are useful aliases information for the Farm cluster:
    alias sq="squeue -u $(whoami)"     ##to check on your own running jobs
    alias sqb="squeue | grep bigmem"   ##to check on the jobs on bigmem partition
    alias sqs="squeue | grep serial"   ##to check on the jobs on serial partition
    alias sjob="scontrol show -d job"  ##to check detailed information about a running job. USAGE: sjob 134158

  • How to compress data to decrease I/O Traffic and save space?
  • In order to decrease a job's I/O traffic, it is customary to stage it in a fast scratch file system, compress the output files, and only then send them to the owner's permanent repository for later perusal and space-minimizing archiving.
    Therefore in a typical Slurm script it is suggested to have these lines and structure:

    #!/bin/sh -l
    [...]
    #SBATCH --time=blah
    set -x

    # These are contextual commands to help diagnosis
    date
    pwd
    ls -lart

    # This is the staging part
    module load blah
    mkdir /tmp/myScratchDirectory
    cp blah /tmp/myScratchDirectory

    # This is the run part
    cd /tmp/myScratchDirectory
    myExec

    # This is the compression part
    gzip *.out

    # This is the save part
    cp *.gz ~/
    exit

    Then, to read, simply do:
    zcat *.gz | less

    For more information on gzip compression, type
    man gzip
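    The same compression is also available from within a program via Python's standard gzip module, which can be handy for checking results programmatically. A small in-memory sketch (the sample data is illustrative):

    ```python
    import gzip

    # Repetitive output compresses well; this mimics a chatty log file.
    data = b"step completed\n" * 1000

    compressed = gzip.compress(data)
    assert gzip.decompress(compressed) == data  # the round-trip is lossless
    print(f"{len(data)} -> {len(compressed)} bytes")
    ```

    To read an existing .gz file as text, `gzip.open("results.out.gz", "rt")` works like the built-in open().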

  • How do I rerun the job that was killed by the "cgroup out-of-memory handler" after running for some hours?

  • The normal route is to request more RAM with the --mem= SBATCH argument. If you use the special argument 0 (--mem=0) then you will request all available memory on the node. 
    First, you need to find out how much RAM the system you are using has. You can do that using the command:

    egrep 'MemTotal|MemFree|MemAvailable' /proc/meminfo

    For example, some of the accessible systems in the Farm cluster have 63 GB of usable RAM. But the more you request, the longer it will take for your job to get scheduled due to resource contention. If you cannot fit within a single node, then you need to look at breaking your job into smaller pieces that do fit.
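    The same check can be scripted. A small Python sketch that parses /proc/meminfo-style text (values there are reported in kB; the sample string below is illustrative):

    ```python
    def meminfo_gb(text, key="MemAvailable"):
        """Return one /proc/meminfo field, converted from kB to GB."""
        for line in text.splitlines():
            name, _, value = line.partition(":")
            if name.strip() == key:
                return int(value.split()[0]) / 1024 / 1024  # kB -> GB
        raise KeyError(key)

    sample = """MemTotal:       65842176 kB
    MemFree:         1203456 kB
    MemAvailable:   52428800 kB"""

    print(f"{meminfo_gb(sample):.1f} GB available")  # → 50.0 GB available
    ```

    On a real node, pass `open("/proc/meminfo").read()` instead of the sample string.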
  • How to use "srun" for interactive Slurm session and parallel SSH?
  • The "srun" command is used to run a parallel job on the cluster and to create an interactive session to run Slurm jobs.
    An example:
    A user is not getting a graphical interface when running the Matlab software after doing "ssh -X", and wants to keep the Matlab session running without leaving the SSH session open. There are two ways to deal with this issue:

    1 - Using "srun" - On Farm, the following command pops up an xterm:
    srunx -p gpum --time=9:00:00 --cpus-per-task=4 --mem=5G
    You can then run xclock, or other X apps, through it. This is standard X forwarding over SSH; it is a bit slow though.

    2 - Using VNC can also help. The user runs a VNC desktop on a node, then tunnels the VNC traffic through SSH. This gets you VNC speeds, but requires a full desktop on the nodes.


    root@farm.cse:~# cat /usr/bin/srunx
    salloc $@ xterm -e 'ssh -X `srun hostname`'

    oschreib@hpc2:~$ salloc --nodes=1 --ntasks-per-node=1 --ntasks-per-core=1 --threads-per-core=1 --time=00:29:00 --partition=high xterm -e 'ssh -X -i .ssh/id_rsaUCDavisOlivierSchreiber.pem `srun hostname`'
    salloc: Granted job allocation 220921
    oschreib@agate-15:~$ xclock

     

Parallel Computing Support in HPC Clusters

  • What is parallel computing software and what types does HPC have to offer?
  • Parallel computing refers to the process of breaking down larger problems into smaller, independent, often similar parts that can be executed simultaneously by multiple processors communicating via shared memory, the results of which are combined upon completion as part of an overall algorithm. The primary goal of parallel computing is to increase available computation power for faster application processing and problem solving.
    The performance of parallel computing is strongly affected by the hardware configuration:
    - Memory or Processor architecture
    - Number of cores/processors
    - Network speed and architecture
     
  • What are the differences between the OpenMP and OpenMPI parallel libraries?
  • MPI stands for Message Passing Interface; it is available as an API (Application Programming Interface) in library form for C, C++, and Fortran.
    OpenMP stands for Open Multiprocessing.
    MPI is available as a library API, while OpenMP is an add-on in a compiler such as the GNU compiler.

    OpenMP is a specification for a set of compiler directives, library routines, and environment variables that can be used to specify high-level parallelism in Fortran and C/C++ programs.  This is useful for parallelizing code.  Programmers find areas of code whose instructions can be shared among the processors.  Sometimes processes running on one processor will need the output from a different processor.  OpenMP simplifies the timing and retrieval of these shared resources to allow for true parallel processing.  This is commonly used for parallelizing tasks over a multi-core processor where the cores share memory with each other.  This could include cache memory, RAM, hard disk memory, etc.
    OpenMPI (an implementation of MPI) provides you the ability to parallelize code over a distributed system (e.g., a supercomputer).  This allows for the parallelization of a program across a network of computers or nodes which communicate over the same network.  Since these nodes are essentially separate computers, they have their own memory layout and their own set of cores.

    Some explanations

    A)
    Solvers like OpenFOAM are capable of parallel processing using Shared Memory Parallel (SMP)
    and/or Distributed Memory Parallel (DMP) parallelization libraries.

    B) Underlying Hardware and Software Notions

    It is important to distinguish hardware components of a system and the actual computations being performed using them.
    On the hardware side, one can identify:

    1. Cores, the Central Processing Units (CPU) capable of arithmetic operations.
    2. Processors, the four, six, eight and up core socket-mounted devices.
    3. Nodes, the hosts associated with one network interface and address.

    With current technology, nodes are implemented on boards in a chassis or blade rack-mounted enclosure.
    The board may comprise two sockets or more.
    From the software side, one can identify:

    1. Processes: execution streams having their own address space.
    2. Threads: execution streams sharing address space with other threads.

    Therefore, it is important to note that processes and threads created to compute a solution on a system will be deployed in different ways on the underlying nodes through the processors and cores’ hardware hierarchy.

    C) Parallelism Background
    Parallelism in scientific/technical computing exists in two paradigms implemented separately but
    sometimes combined in ‘hybrid’ codes: Shared Memory Parallelism (SMP) appeared in the 1980’s with
    the strip mining of ‘DO loops’ and subroutine spawning via memory-sharing threads.
    In this paradigm, parallel efficiency is affected by the relative importance of arithmetic operations
    versus data access referred to as ‘DO loop granularity.’
    In the late 1990’s, Distributed Memory Parallelism (DMP) Processing was introduced and proved very
    suitable for performance gains because of its coarser grain parallelism design. It consolidated on the
    MPI Application Programming Interface. In the meantime, SMP saw adjunction of mathematical libraries
    already parallelized using efficient implementation through Open Multi-Processing (OpenMP TM )
    and Pthreads standard API’s.
    Both DMP and SMP (with some limitations) programs can be run on the two commonly available types of
    hardware systems:
    • Shared Memory systems, or single nodes with multiple cores sharing a single memory address space and
    a single instance of the operating system.
    • Distributed Memory systems, otherwise known as clusters, comprised of nodes with separate local
    memory address spaces and a dedicated instance of the operating system per node.

    Note: SMP programs because of their single memory spaces cannot execute across clusters. Inversely,
    DMP programs can run perfectly well on a Shared Memory system. Since DMP has coarser granularity than
    SMP, it is therefore preferable, on a Shared Memory system, to run DMP rather than SMP despite what
    the names may imply at first glance. SMP and DMP processing may be combined together, in what is
    called ‘hybrid mode’.

    D) Distributed Memory Parallel Implementations
    DMP is implemented by decomposing the problem at hand into domains. Depending on the physics
    involved in their respective industry, the domains could be geometry, finite elements, matrix,
    frequency, load cases or right hand side of an implicit method. Parallel inefficiency from
    communication costs is affected by the boundaries created by the partitioning. Load balancing is also
    important so that all MPI processes perform the same number of computations during the solution and
    therefore finish at the same time. Deployment of the MPI processes across the computing resources
    can be adapted to each architecture with ‘rank’ or ‘round-robin’ allocation.

    E) Parallelism Metrics
    Amdahl’s Law, ‘Speedup yielded by increasing the number of parallel processes of a program is bounded
    by the inverse of its sequential fraction’ is also expressed by the following formula
    (where P is the program portion that can be made parallel, 1-P is its serial complement and N is the
    number of processes applied to the computation):
    Amdahl Speedup=1/[(1-P)+P/N]
    A derived metric is: Efficiency=Amdahl Speedup/N
    A trend can already be deduced by the empirical fact that the parallelizable fraction of an
    application depends more on CPU speed, and the serial part, comprising overhead tasks, depends more
    on RAM speed or I/O bandwidth. Therefore, an application running on a higher CPU speed system will
    have a larger 1-P serial part and a smaller P parallel part causing its Amdahl Speedup to decrease.
    This can lead to a misleading assessment of different hardware configurations as shown by this example
    where, say System B has faster CPU speed than System A:
    N        System A elapsed seconds    System B elapsed seconds
    1                  1000                        810
    10                  100                         90
    Speedup              10                          9

    System A and System B could show parallel speedups of 10 and 9, respectively, even though System B has
    faster raw performance across the board. Normalizing speedups with the slowest system serial time
    remedies this problem:
    Speedup              10                      11.11

    A computational solution of a particular dataset is said to exhibit strong scalability if elapsed
    execution time decreases when number of processors increases. While computational solution of
    increasing dataset sizes is said to exhibit weak scalability when elapsed execution time can remain
    bounded through an increase of number of processors.
    It may be preferable, in the end, to use a throughput metric, especially if several jobs are running
    simultaneously on a system:
    Number of jobs/hour/system = 3600/(Job elapsed time)
    The system could be a chassis, rack, blade, or any hardware provisioned as a whole unit.
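    The formulas above can be checked numerically. A short Python sketch of Amdahl's Law, the derived efficiency, and the throughput metric (the 90% parallel fraction is just an example value):

    ```python
    def amdahl_speedup(p, n):
        """Amdahl's Law: speedup for parallelizable fraction p on n processes."""
        return 1.0 / ((1.0 - p) + p / n)

    def efficiency(p, n):
        """Derived metric: Efficiency = Amdahl Speedup / N."""
        return amdahl_speedup(p, n) / n

    def jobs_per_hour(job_elapsed_seconds):
        """Throughput metric: number of jobs/hour/system."""
        return 3600.0 / job_elapsed_seconds

    # A program that is 90% parallelizable is capped at a speedup of 10,
    # no matter how many processes are applied:
    for n in (10, 100, 1000):
        print(n, round(amdahl_speedup(0.9, n), 2), round(efficiency(0.9, n), 3))
    ```

    The loop shows the speedup approaching, but never exceeding, 1/(1-P) = 10 while efficiency collapses, which is why adding processes beyond a point wastes allocation.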
  • What are available GPU resources in different clusters?
  • GPU (Graphics Processing Unit) nodes/resources are designed to handle graphics operations, which include 2D and 3D calculations. GPU nodes are designed as co-processors to the CPU or host. GPUs have their own DRAM (device memory, or global memory in CUDA parlance).
    GPU node performance in HPC is affected by the:
    - Amount of GPU RAM
    - Number of CUDA cores/capabilities
    HPC clusters offer different types of GPU resources. Here are brief counts of GPU nodes available in different clusters:
    Farm: 17
    HPC1: 23
    LSSC0: 20 - There is no multi-GPU node in LSSC0
    Peloton: 95

    For more detailed information visit our "Clusters" page on: https://hpc.ucdavis.edu/clusters     

Compilers and Interpreters

  • What are available compilers in HPC1 and HPC2?
  • There are many compilers installed and made available in HPC2, such as the GNU C compiler (gcc) and Fortran compiler (gfortran).
    HPC1 gives researchers and users access to an Intel license, but HPC2 does not have an active Intel license:


    Type in the following command and it will show details of the Intel compiler:

    user@hpc1:~$ module show intel/2019

    as shown by its license enabling:

    user@hpc1:~$ /share/apps/CD-adapco/FLEXlm/11_13_0_0/bin/lmutil lmstat -a -c 28518@impact.cse.ucdavis.edu

    lmutil - Copyright (c) 1989-2014 Flexera Software LLC. All Rights Reserved.

    Flexible License Manager status on Thu 8/26/2021 15:31

    License server status: 28518@impact.cse.ucdavis.edu

       License file(s) on impact.cse.ucdavis.edu: /opt/intel/licenses/license.lic:

    impact.cse.ucdavis.edu: license server UP (MASTER) v11.15.0

    Vendor daemon status (on impact.cse.ucdavis.edu):

        INTEL: UP v11.15.0

    Feature usage info:

    Users of IC45FB71A:  (Total of 2 licenses issued;  Total of 0 licenses in use)

     

    Note that, in case lmutil does not work:

    user@hpc2:~$ file /share/software/ads/ADS2021_Update2/Licensing/2020.02/linux_x86_64/bin/lmutil

    /share/software/ads/ADS2021_Update2/Licensing/2020.02/linux_x86_64/bin/lmutil: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-lsb-x86-64.so.3, for GNU/Linux 2.6.18, stripped

    It might be missing the interpreter, which can be installed like this:

    sudo apt-get install lsb

    So, Intel is installed on HPC2:

    user@hpc2:~$ module show intel/2019

    /software/modules/4.6.1/ucdhpc-20.04/modulefiles/intel/2019:

    but its license is not enabled:

    user@hpc2:~$ /software/ads/ADS2021_Update2/Licensing/2020.02/linux_x86_64/bin/lmutil lmstat -a -c 28518@impact.cse.ucdavis.edu

    lmutil - Copyright (c) 1989-2019 Flexera. All Rights Reserved.

    Flexible License Manager status on Tue 7/27/2021 16:08

    Error getting status: Cannot connect to license server system. (-15,570:115 "Operation now in progress")
     

    A request has not been made for it because Intel optimizations do not benefit HPC2's AMD processors.
    Intel may not derive any benefit from having people run its compilers and libraries on AMD systems, because it might result in Intel selling fewer processors if AMD can showcase good performance. Intel might still derive revenue from licensing.

    The Intel compiler may still give acceptable performance if Intel AVX2 and AVX512 run-time optimization is disabled like this:

    egrep -q AMD /proc/cpuinfo; test $? = 0 && export MKL_DEBUG_CPU_TYPE=5 || echo no AMD

    In any case, the PGI compilers may be better suited because of the underlying AMD processors. They are packaged by NVIDIA, installed on HPC2, and do not need a license:

    user@hpc2:~$ module show nvhpc/21.2

    /software/modules/4.6.1/ucdhpc-20.04/modulefiles/nvhpc/21.2:

    [...]

    They have been used in builds of OpenFoam.

  • What are available GNU Compilers in HPC clusters?
  • We support several GNU compilers: the one that ships with Ubuntu, and the latest stable GNU gcc and gfortran compilers.

    GCC includes front ends for C, C++, Objective-C, Fortran, Java, and Ada, as well as libraries for these languages.

    Prerequisites
    Below is a list of prerequisites for using the GNU compilers:

    GMP - GMP is a free library for arbitrary precision arithmetic, operating on signed integers, rational numbers, and floating point numbers.
    MPFR - The MPFR library is a C library for multiple-precision floating-point computations with correct rounding. It is a prerequisite for GCC.
    MPC - GNU MPC is a C library for the arithmetic of complex numbers with correct rounding. It is a prerequisite for GCC.
    Installed Versions
    We support multiple versions of GNU compilers. Some of the versions we currently have installed include:

    GCC 4.5
    GCC 4.7.3
    GCC 4.7
    GCC 4.9.3
    GCC 5.5.0
    GCC 6.3.1
    GCC 7.2.0
    GCC 7.3.0
    Using
    The default version of GCC will be loaded automatically. To load the different versions first determine which version you want to use.

    Version 4.7
    This is the default version of GCC and will be loaded automatically.

    To load it manually type:

    module load gcc
    Version 4.6
    To load it manually type:

    module load gcc/4.6
    Version 4.5
    To load it manually type:

    module load gcc/4.5
    Version 4.4
    To load it manually type:

    module load gcc/4.4
    Building
    We built GCC using the following options.

    Version 4.4.2:

    configure --with-mpfr-include=$MPFR_HOME/include
    Documentation
    There are man pages available once you load the module but if you prefer to browse online documentation you can look on GNU's website.

  • What types of Python interpreters do clusters support?
  • Ubuntu comes with python 2.7.6 preinstalled into /usr/bin and is available by default. No module is required to use it.
    Python 2.7.15 and many additional Python packages are available via conda in the bio/1.0 module. If you need other packages not already installed, please contact help@cse.ucdavis.edu with your request.
    Python 3.6.8 is also available as a module.

    Use Notes
    To use the conda-based python 2.7.15:
    module load bio/1.0
    Or to use python 3:
    module load python/3.6.8

    Python on Farm II
    Batch files run python scripts using the default version of python as specified by the current Ubuntu release being used on Farm. The Farm installation can be found here: /usr/bin/python. If “module load Python” is added to batch files, python scripts are run using a custom compilation of python maintained on Farm for bioinformatics applications.

    A simple example of running a python script as a batch job:
    user@agri:~/examples/hello_world$ more hello_world.py
    print "Hello, World! \n"


    user@agri:~/examples/hello_world$ more hello_world.sh
    #!/bin/bash -l
    #SBATCH --job-name=hello_world

    # Specify the name and location of i/o files.  
    # “%j” places the job number in the name of those files.
    # Here, the i/o files will be saved to the current directory under /home/user.
    #SBATCH --output=hello_world_%j.out
    #SBATCH --error=hello_world_%j.err

    # Send email notifications.  
    #SBATCH --mail-type=END # other options are ALL, NONE, BEGIN, FAIL
    #SBATCH --mail-user=user@ucdavis.edu

    # Specify the partition.
    #SBATCH --partition=hi # other options are low, med, bigmem, serial.

    # Specify the number of requested nodes.
    #SBATCH --nodes=1

    # Specify the number of tasks per node, 
    # which may not exceed the number of processor cores on any of the requested nodes.
    #SBATCH --ntasks-per-node=1 

    hostname # Prints the name of the compute node to the output file.
    srun python hello_world.py # Runs the job.


    user@agri:~/examples/hello_world_sarray$ sbatch hello_world.sh
    Submitted batch job X
    user@agri:~/examples/hello_world$ more hello_world_X.err
    Module BUILD 1.6 Loaded.
    Module slurm/2.6.2 loaded 
    user@agri:~/examples/hello_world$ more hello_world_X.out
    c8-22
    Hello, World! 
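Note that hello_world.py above uses the Python 2 print statement, which is a syntax error under the python/3.6.8 module. As a sketch, a version that runs unchanged under either interpreter:

```python
# "from __future__" must be the first statement; it is a no-op on Python 3
# and turns print into a function on Python 2.
from __future__ import print_function

def greeting():
    return "Hello, World!"

print(greeting())
```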


    A simple example of an array job:
    user@agri:~/examples/hello_world_sarray$ more hello_world.py
    import sys

    i = int(sys.argv[1])

    print "Hello, World", str(i) + "! \n"


    user@agri:~/examples/hello_world_sarray$ more hello_world.sh
    #!/bin/bash -l
    #SBATCH --job-name=hello_world

    # Specify the name and location of i/o files.
    #SBATCH --output=hello_world_%j.out
    #SBATCH --error=hello_world_%j.err

    # Send email notifications.  
    #SBATCH --mail-type=END # other options are ALL, NONE, BEGIN, FAIL
    #SBATCH --mail-user=user@ucdavis.edu

    # Specify the partition.
    #SBATCH --partition=hi # other options are low, med, bigmem, serial.

    # Specify the number of requested nodes.
    #SBATCH --nodes=1

    # Specify the number of tasks per node, 
    # which may not exceed the number of processor cores on any of the requested nodes.
    #SBATCH --ntasks-per-node=1 

    # Specify the number of jobs to be run, 
    # each indexed by an integer taken from the interval given by "array".
    #SBATCH --array=0-1

    hostname
    echo "SLURM_NODELIST = $SLURM_NODELIST"
    echo "SLURM_NODE_ALIASES = $SLURM_NODE_ALIASES"
    echo "SLURM_NNODES = $SLURM_NNODES"
    echo "SLURM_TASKS_PER_NODE = $SLURM_TASKS_PER_NODE"
    echo "SLURM_NTASKS = $SLURM_NTASKS"
    echo "SLURM_JOB_ID = $SLURM_JOB_ID"
    echo "SLURM_ARRAY_TASK_ID = $SLURM_ARRAY_TASK_ID"

    srun python hello_world.py $SLURM_ARRAY_TASK_ID
    user@agri:~/examples/hello_world_sarray$ sbatch hello_world.sh
    Submitted batch job X
    user@agri:~/examples/hello_world_sarray$ more *.err
    ::::::::::::::
    hello_world_X+0.err
    ::::::::::::::
    Module BUILD 1.6 Loaded.
    Module slurm/2.6.2 loaded 
    ::::::::::::::
    hello_world_X+1.err
    ::::::::::::::
    Module BUILD 1.6 Loaded.
    Module slurm/2.6.2 loaded 

    user@agri:~/examples/hello_world_sarray$ more *.out
    ::::::::::::::
    hello_world_X+0.out
    ::::::::::::::
    c8-22
    SLURM_NODELIST = c8-22
    SLURM_NODE_ALIASES = (null)
    SLURM_NNODES = 1
    SLURM_TASKS_PER_NODE = 1
    SLURM_NTASKS = 1
    SLURM_JOB_ID = 76109
    SLURM_ARRAY_TASK_ID = 0
    Hello, World 0! 

    ::::::::::::::
    hello_world_X+1.out
    ::::::::::::::
    c8-22
    SLURM_NODELIST = c8-22
    SLURM_NODE_ALIASES = (null)
    SLURM_NNODES = 1
    SLURM_TASKS_PER_NODE = 1
    SLURM_NTASKS = 1
    SLURM_JOB_ID = 76110
    SLURM_ARRAY_TASK_ID = 1
    Hello, World 1! 
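Instead of passing $SLURM_ARRAY_TASK_ID on the command line, a script can read it directly from the environment that Slurm sets for each array element. A sketch of that variant (the fallback to 0 for interactive runs is our choice, not Slurm behavior):

```python
import os

def array_task_id(environ=None):
    """Return the Slurm array index, or 0 when run outside an array job."""
    if environ is None:
        environ = os.environ
    return int(environ.get("SLURM_ARRAY_TASK_ID", "0"))

# Under 'sbatch --array=0-1' Slurm sets SLURM_ARRAY_TASK_ID per element;
# run interactively, this falls back to 0.
print("Hello, World %d!" % array_task_id())
```

With this approach the srun line in the batch file needs no argument: `srun python hello_world.py`.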

  • What types of R software do HPC clusters support?
    R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows, and macOS.

    We have installed the latest stable R (v2.15), compiled with current compilers and with additional packages, alongside the default CentOS version.

    Prerequisites
    Below is a list of prerequisites for using R:

    OpenMPI if using Rmpi.
    Using
    The CentOS R is in the default path. To use the CSE version, run:

    module load R
    To submit a serial job, use a script like:

    #!/bin/bash -l
    #SBATCH -J MyJob

    module load R
    R CMD BATCH --no-save --no-restore <name of R script>.R
    Then to run the above example:

    $ sbatch -p serial <scriptname.sh>
    To submit an R+Rmpi parallel job:

    #!/bin/bash -l
    #SBATCH -J MyJob
    module load R Rmpi
    R CMD BATCH --no-save --no-restore brute_force.R

    Then to submit the job:

    $ sbatch -p serial -n 11 <scriptname.sh>
    The example code is available at: http://math.acadiau.ca/ACMMaC/Rmpi/brute_force.R

    After the job completes, the directory you submitted it from should contain a file of the form <scriptname>.o<jobID> with the output of the job, as well as a PDF file named Rplots.pdf.