Linux Tutorial
Login to the Greene cluster as described in the section on connecting to the HPC cluster.
Available file systems on Greene
Files Systems for usage:
The NYU HPC clusters have multiple file systems for user's file storage needs. Each file system is configured differently to serve a different purpose.
Space | Environment Variable | Purpose | Flushed? | Allocation (per user) |
---|---|---|---|---|
/home | $HOME | Program Development space; For storing small files, source code, scripts etc that are backed up | NO | 20GB |
/scratch | $SCRATCH | Computational Workspace; For storing large files/data, infrequent reads and writes | YES Files not accessed for 60 days are deleted | 5TB |
/archive | $ARCHIVE | Long Term Storage ( Cold storage ) | NO | 2TB |
Basic Linux Commands
Navigating the directory structure
We've already seen ssh, which takes us from the host we are on to a different host, and hostname, which tells us which host we are on now. Mostly you'll move around filesystems and directories, which resemble inverted tree structures as shown below schematically:
ls
- To list files in the current directory
If this is your first time using the HPC Cluster, ls
probably won't return anything, because you have no files to list.
There are a couple of useful options for ls:
-
ls -l
lists the directory contents in long format, one file or directory per line, with extra information about who own the file, how big it is, and what permissions are set. -
ls -a
lists hidden files. In Unix, files whose names begin with dot (.
) are hidden. This does not stop anything from using those files, it simply instructsls
not to show the files unless the-a
option is used.
A command typed at the Unix command prompt, looks something like this:
pwd
- print working directory, or "where am i now ?" in the directory space.
In Unix, filesystems and directories are arranged in a hierarchy. **A forward slash "/
" is the directory separator, and the topmost directory visible to a host is called "/
". Filesystems are also mounted into this directory structure, so you can access everything that is visible on this host by moving around in the directory hierarchy.
You should see something like /home/NetID
cd
- To change to a different directory,
use "cd" ("change directory"). You'll need to give it the path to the directory you wish to change into, eg "cd /scratch/NetID". You can go up one directory with "cd ..
".
If you run "cd
" with no arguments, you will be returned to your home directory and if you run "cd -
", you will be returned to the directory you were in most recently.
mkdir
- To create a new location, use "mkdir new_directory_name
".
rmdir
- To remove a directory, use "rmdir new_directory_name
".
man
- Manual page. This command provides more information about a command eg., "man ls
"
cat
- Prints the content of the file eg., "cat filename
"
Copying, moving or deleting files locally
Copying: The "cp
" command makes a duplicate copy of files and directories within a cluster or machine. The general usage is "cp source destination
":
command | Explanation |
---|---|
cp test_file.txt test_file2.txt | Makes a duplicate copy of test_file.txt with a new name test_file2.txt |
cp -r subdir subdir2 | That is, a new directory "subdir2" is created and each file under subdir is copied recursively to the new subdir2 |
Moving: The "mv
" command renames files and directories within a cluster or machine. The general usage is "mv source_dir destination_dir
"
command | Explanation |
---|---|
mv dummy_file.txt test_file.txt | Renames dummy_file.txt as test_file.txt |
mv subdir new_subdir | Renames the directory "subdir" to a new directory "new_subdir" |
Deleting files: The "rm
" ( remove ) command deletes files and optionally directories within a cluster or machine.
There is no undelete in Unix. Once it is removed, it's gone !
command | Explanation |
---|---|
rm dummy_file.txt | Remove a file |
rm -i dummy_file.txt | If you use -i you will be prompted for confirmation before each file is deleted |
rm -f serious_file.txt | Forcibly remove a file |
rmdir subdir/ | Remove subdir only if it's empty |
rm -r subdir/ | Recursively delete the directory subdir and everything else in it. Use it with care ! |
Text Editor
nano
is a friendly text editor that can be used to edit the content of an existing file or create a new file. Here are some options used in nano editor.
Options | Explanation |
---|---|
Ctrl + O | Save Changes |
Ctrl + X | Exit nano |
Ctrl + K | Cut single line |
Ctrl + U | Paste the text |
Writing Scripts
An entire sequence of commands can be captured in a script for repeated or later execution. This is the mechanism by which batch jobs are run on the HPC clusters. The essential elements of a script are illustrated in the example below:
#!/bin/bash
# the first line should begin with #! and the path to the interpreter under which the script should run
# do stuff as if it were an interactive session:
cd $HOME/some_place
date
ls -l
pwd
# scripts can use loops and conditionals. See 'man bash' for syntax
for f in `ls`; do
echo "found a file called $f"
done
There are two ways to run scripts:
- Give the script execute permission. and run it as a command:
chmod u+x my_script
./my_script
- Run a shell, and pass the script as an argument
bash my_script
Notice in the first example that to run the script, we prefixed it with "./
". If the script is not somewhere in the shell's $PATH
, it won't find it to run unless it's location is explicitly specified. This is even true when the script is in the normally in the $PATH
. Therefore, we specify that the script is in the current directory with ./
.
Setting execute permission with chmod
In Unix, a file has three basic permissions, each of which can be set for three levels of user. The permissions are:
-
Read permission (" r ") - numeric value 4
-
Write permission (" w ") - numeric value 2
-
Execute permission (" x ") - numeric value 1.
When applied to a directory, execute permission refers to whether the directory can be entered with 'cd'
The three levels of user are:
-
The user who owns the file ( the "user", referred to with "u")
-
The group to which the file belongs - referred to with "g". Each user has a primary group and is optionally a member of other groups, when a user creates a file it is normally associated with the user's primary group. At NYU HPC all users are in a group named " users ", so group permissions has little meaning
-
All other users are referred to with " o "
You grant permissions with "chmod who+what file
" and revoke them with "chmod who-what file
".
The first has "+" and the second "-"
Here "who" some combination of "u", "g" and "o" and what is some combination of "r", "w" and "x". So to set execute permission, as in the example above, we use:
chmod u+x my_script