Data transfer: Globus

Globus manages file transfers between two machines: two servers, a server and a personal machine, or two personal machines. It is well suited to large files and is available on many institutional clusters and networks. Files can be transferred between Mount Moran/Bighorn, the ARCC petaLibrary, ORNL, TACC, NCAR, and other institutions around the world.

When the amount of data to transfer exceeds roughly 100 GB, other methods such as scp, sftp, or rsync may be too slow. Globus is typically faster for collections of files because it transfers them in parallel.
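As a rough illustration of why transfer rate matters at this scale, the sketch below estimates how long a transfer takes at a sustained rate. The function name and both rates are assumptions chosen for illustration only; they are not measurements of scp, rsync, or Globus on any ARCC system.

```python
def transfer_hours(size_gb: float, rate_mb_per_s: float) -> float:
    """Hours needed to move size_gb gigabytes at a sustained rate in MB/s."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    seconds = size_gb * 1000 / rate_mb_per_s  # decimal units: 1 GB = 1000 MB
    return seconds / 3600


# Hypothetical sustained rates, not benchmarks of any real tool or network.
EXAMPLE_RATES = [
    ("one stream at 50 MB/s", 50.0),
    ("parallel streams at 400 MB/s in total", 400.0),
]

for label, rate in EXAMPLE_RATES:
    print(f"100 GB, {label}: {transfer_hours(100, rate):.2f} h")
```

Actual rates depend on the network, the storage systems at both ends, and the number and size of the files, so the output should be read as an order-of-magnitude estimate.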

Globus Advantages:

  • Secure, handles errors, verifies integrity of transferred files
  • Automatically resumes after interruption
  • Emails the user when a transfer completes or an error occurs
  • Accelerated transfer rates: transfers run in parallel where available
  • Web and command line interfaces for transfer
  • Links to major HPC sites (ORNL, TACC, NCAR, etc.)

Links:

Terminology:

  • A Globus endpoint is a location that data can be transferred to/from. A server endpoint is set up by an administrator to provide access to a system via Globus. A personal endpoint is set up by an individual (using the Globus Connect Personal software from the link above) to transfer between their personal machine and other endpoints.

To use Globus with ARCC resources and other commonly used systems, the following public Globus endpoints are available:

System Name                                     Globus Endpoint Name
----------------------------------------------  --------------------
ARCC Bighorn (storage system for Mount Moran)   ARCC Bighorn
ARCC petaLibrary                                ARCC petaLibrary
NCAR-Yellowstone (XSEDE)                        ncar#gridftp
dtn01 (University of Utah)                      uofuchpc#dtn01
CI-Water project                                uofuciwater#dtn01
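
With an endpoint name from the list above, a transfer can also be started from the command line interface. The sketch below only assembles the commands as strings and does not run them. It assumes the current Globus CLI (`globus`) is installed and logged in; that CLI addresses endpoints by UUID rather than by display name, so the name is looked up first with `globus endpoint search`. The helper functions, UUIDs, paths, and label are hypothetical placeholders.

```python
import shlex


def search_command(endpoint_name: str) -> str:
    """Build the command that looks up an endpoint's UUID by display name."""
    return shlex.join(["globus", "endpoint", "search", endpoint_name])


def transfer_command(src_id: str, src_path: str, dst_id: str, dst_path: str,
                     label: str, recursive: bool = True) -> str:
    """Build the command that submits a transfer between two endpoints."""
    args = ["globus", "transfer"]
    if recursive:
        args.append("--recursive")  # needed when the source is a directory
    args += ["--label", label, f"{src_id}:{src_path}", f"{dst_id}:{dst_path}"]
    return shlex.join(args)


# Placeholder values: substitute the UUIDs reported by the search command.
SRC_ID = "SOURCE-ENDPOINT-UUID"
DST_ID = "DEST-ENDPOINT-UUID"

print(search_command("ARCC Bighorn"))
print(transfer_command(SRC_ID, "/project/data/", DST_ID, "/archive/data/",
                       "bighorn to petaLibrary"))
```

Building the commands with `shlex.join` keeps names that contain spaces, such as "ARCC Bighorn", correctly quoted when the printed line is pasted into a shell.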