Introduction to High Performance Computing: Glossary

Key Points

Introduction to HPC
  • High Performance Computing (HPC) typically involves connecting remotely to a cluster of computers.

  • HPC systems can be used to do work that would be either impossible or much slower in a desktop environment.

  • Typical HPC workflows involve submitting “jobs” to a scheduler, which queues and prioritises the “jobs” of all users.

  • The standard method of interacting with HPC systems is via a Linux-based command line interface called “the shell”.

Connect to the HPC
  • We connect to remote servers, such as HPC systems, using a terminal.

  • SSH is a secure protocol for connecting to remote servers.

  • To connect to a server, you need its address, an open port, and your user ID.

  • With OnDemand, you can upload and download files; create, edit, submit, and monitor jobs; run GUI applications; and connect via SSH, all via a web browser.
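
As a rough illustration of the connection details listed above, the following sketch assembles an ssh command from an address, a port, and a user ID. The user name and host name are hypothetical placeholders, and port 22 is only the usual SSH default; your site's documentation gives the real values.

```shell
# Hypothetical connection details; replace them with the values for your cluster.
user="yourUsername"            # your user ID on the cluster (placeholder)
host="cluster.example.org"     # the login node's address (placeholder)
port=22                        # the usual default SSH port

# Assemble the command and print it so it can be checked before use.
ssh_command="ssh -p ${port} ${user}@${host}"
echo "${ssh_command}"
```

Running the printed command in a terminal would then start the SSH session and prompt for your credentials.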

Using a cluster: Introduction
  • A cluster is a set of networked machines.

  • Clusters typically provide a login node and a set of worker nodes.

  • Files saved on one node are typically available on every node of the cluster, because storage is shared.

Using a cluster: Scheduling jobs
  • The scheduler handles how compute resources are shared between users.

  • Everything you do should be run through the scheduler.

  • A job is just a shell script.

  • If in doubt, request more resources than you will need.
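
To make "a job is just a shell script" concrete, here is a minimal sketch of a job script. Schedulers differ between sites; this example assumes Slurm, whose resource requests are written as lines beginning with #SBATCH. The file name, job name, and resource values are illustrative only.

```shell
# Write a small job script to a file (the file name is a placeholder).
cat > example-job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=example
#SBATCH --ntasks=1
#SBATCH --time=00:05:00

# The work itself: report which machine ran the job.
echo "This job ran on $(hostname)"
EOF

# On a Slurm cluster the script would be submitted with:
#   sbatch example-job.sh
# The #SBATCH lines are ordinary comments to the shell, so the same file
# also runs with plain bash, which is a quick way to catch typos.
bash example-job.sh
```

The scheduler reads the #SBATCH lines to decide what resources to reserve, then runs the remaining commands on a worker node.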

Using a cluster: Accessing software
  • Load software with module load softwareName.

  • The module system handles software versioning and package conflicts for you automatically.

  • You can edit your .bashrc file to automatically load a software package.

  • When using software on any HPC system, check the software's documentation for details on how to use it effectively.
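
A minimal sketch of loading a package defensively, assuming the cluster provides the module command as a shell function or executable. softwareName is the placeholder used above, not a real package, and the module_status variable is introduced here purely for illustration.

```shell
# Only call "module" if the command exists in this shell.
if command -v module >/dev/null 2>&1; then
    # The module system is available, so try to load the package.
    if module load softwareName; then
        module_status="loaded"
    else
        module_status="load failed"
    fi
else
    # No module system here (for example, on a laptop).
    module_status="module command not available"
fi
echo "softwareName: ${module_status}"
```

If you copy the guarded module load into your .bashrc, leave out the final echo, since output printed from .bashrc can interfere with non-interactive tools such as scp.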

Using a cluster: Using resources effectively
  • The smaller your job, the faster it will schedule.

  • Don’t run stuff on the login node.

  • Again, don’t run stuff on the login node.

  • Don’t be a bad person and run stuff on the login node.

Basic UNIX Commands
  • scp (The Secure Copy Program) is a standard way to securely transfer data to remote HPC systems.

  • File permissions are an important component of a shared computing space and can be controlled with chmod.

  • Scripts are mostly just lists of commands from the command line in the order they are to be performed.
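
The points above can be sketched together: write a script as a list of commands, use chmod to make it executable, and run it. The file name is a placeholder, and the scp line is left as a comment because its user and host names are hypothetical.

```shell
# Write a small script: a list of commands in the order they should run.
cat > list-files.sh <<'EOF'
#!/bin/bash
echo "Files in $(pwd):"
ls
EOF

# Give the file's owner (u) permission to execute it (+x), then run it.
chmod u+x list-files.sh
./list-files.sh

# Copying the script to a cluster would look something like this
# (hypothetical user and host):
#   scp list-files.sh yourUsername@cluster.example.org:~/
```

Without the chmod step, running ./list-files.sh would normally fail with a "Permission denied" error, because newly created files are not executable by default.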

Glossary

FIXME