BCH709 Introduction to Bioinformatics: 4_Linux command line and cloud

Assignments

Please finish below assignment before due date

Clean up and start new

Before we start other session, let’s clean up folders

$ cd ~/
$ ls 

Please do it first and check the solution below.

Clean the folder with contents

How can we clean up bch709_test Hello ?

rm -R  <folder name>

Downloading file

Let’s downloading one file from website. There are several command to download file. Such as ‘wget’,’curl’,’rsync’ etcs. but this time we will use curl. File location

https://raw.githubusercontent.com/plantgenomicslab/BCH709/6055e8c5faf400119dc78b4d36f6b93814d9f76d/bch709_student.txt

How to use curl? curl

How to download from web file?

curl -L -o <output_name> <link>
curl -L -o bch709_student.txt https://raw.githubusercontent.com/plantgenomicslab/BCH709/6055e8c5faf400119dc78b4d36f6b93814d9f76d/bch709_student.txt
ls bch709_student.txt

Viewing file contents

There are various commands to print the contents of the file in bash. Most of these commands are often used in specific contexts. All these commands when executed with filenames displays the contents on the screen. Most common ones are less, more, cat, head and tail. less FILENAME try this: less bch709_student.txt Displays file contents on the screen with line scrolling (to scroll you can use arrow keys, PgUp/PgDn keys, space bar or Enter key). When you are done press q to exit.
more FILENAME try this: more bch709_student.txt Like less command, also, displays file contents on the screen with line scrolling but uses only space bar or Enter key to scroll. When you are done press q to exit.
cat FILENAME try this: cat bch709_student.txt Simplest form of displaying contents. It catalogs the entire contents of the file on the screen. In case of large files, entire file will scroll on the screen without pausing
head FILENAME try this: head bch709_student.txt Displays only the starting lines of a file. The default is first ten lines. But, any number of lines can be displayed using –n option (followed by required number of lines).
tail FILENAME try this: tail bch709_student.txtSimilar to head, but displays the last 10 lines. Again –n option can be used to change this. More information about any of these commands can be found in man pages (man command)

Let’s Download bigger data

Please go to below wesite.

ftp://ftp.arabidopsis.org/home/tair/Sequences/ATH_cDNA_EST_sequences_FASTA/

Please download yellow marked file. download

How to download

download2

curl -L -O ftp://ftp.arabidopsis.org/home/tair/Sequences/ATH_cDNA_EST_sequences_FASTA/ATH_cDNA_sequences_20101108.fas

Viewing file contents

What is the difference in less, tail, more and head?

What are flags (parameters / options)?

A “flag” in Unix terminology is a parameter added to the command. See for example

$ ls

versus

$ ls -l

Typically flags make programs behave differently or report their information in different ways.

How to find out what flags are available?

You can use the manual to learn more about a command:

$ man ls

will produce man

What if the tools does not have manual page?

Only tools that come with Unix will have a manual. For other software the -h, -help or –help command will typically print instructions for their use.

$ curl --help

If the help flag is not recognized you will need to find the manual for the software.

What are flag formats?

Traditionally Unix tools use two flag forms:

  • short form: single minus - then one letter, like -o, ’-L‘

  • long form: double minus – then a word, like –output, –Location In each case the parameter may stand alone as a toggle (on/off) or may take additional values after the flag. -o or -L Now some bioinformatics tools do not follow this tradition and use a single - character for both short and long options. -g and -genome. **Using flags is a essential to using Unix. Bioinformatics tools typically take a large number of parameters specified *via* flags**. The correctness of the results critically depends on the proper use of these parameters and flags.

How many lines does the file have?

You can pipe the output of you stream into another program rather than the screen. Use the | (Vertical Bar) character to connect the programs. For example, the wc program is the word counter.

cat bch709_student.txt | wc

prints the number of lines, words, and characters in the stream:

    41     141     900

How many lines?

cat bch709_student.txt | wc -l
41

of course we can use wc directly

wc -l bch709_student.txt 

That is equivalent to:

cat bch709_student.txt  | wc -l

In general, it is a better option to open a stream with cat then pipe the flow into the next program. Later you will see that it is easier to design, build and understand more complex pipelines when the data stream is opened at the beginning as a separate step.

Again, let’s do head.

cat bch709_student.txt  | head

head

is this equivalent to?

head bch709_student.txt

grep

Globally search a Regular Expression and Print is one of the most useful commands in UNIX and it is commonly used to filter a file/input, line by line, against a pattern eg., to print each line of a file which contains a match for pattern. Please check option with :

grep --help

With options, syntax is

grep [OPTIONS] PATTERN FILENAME

Let’s find how many people use macOS? First we need to check file.

$ wc -l bch709_student.txt
$ less bch709_student.txt

How can we find the macOS people? Now we can use grep

$ cat bch709_student.txt | grep MacOS

How can we count?

$ cat bch709_student.txt | grep MacOS | wc -l

How about this?

$ grep MacOS bch709_student.txt | wc -l

How about this?

$ grep -c MacOS bch709_student.txt

With other options (flags)

$ grep -v Windows  bch709_student.txt

With multiple options (flags)

$ grep -c -v Windows  bch709_student.txt
$ grep --color  -i macos  bch709_student.txt

How do I store the results in a new file?

The > character is the redirection.

$ grep  -i macos  bch709_student.txt > mac_student
$ cat bch709_student.txt | grep -i windows > windows_student

Please check with cat or less

Do you want to check difference?

$ diff mac_student windows_student
$ diff -y  mac_student windows_student

How can I select name only? (cut)

$ cat bch709_student.txt  | cut -f 1
$ cut -f 1 bch709_student.txt

How can I sort it ? (sort)

cut -f 1 bch709_student.txt | sort
$ sort bch709_student.txt
$ sort -k 2 bch709_student.txt
$ sort -k 1 bch709_student.txt
$ sort -k 1 bch709_student.txt | cut -f 1

Save?

sort -k 1 bch709_student.txt | cut -f 1 > name_sort

uniq

uniq command removes duplicate lines from a sorted file, retaining only one instance of the running matching lines. Optionally, it can show only lines that appear exactly once, or lines that appear more than once. uniq requires sorted input since it compares only consecutive lines.

$ cut -f 2 bch709_student.txt > os.txt
$ uniq -c os.txt

uniq

$ sort os.txt | uniq -c

uniq2 Of course you can use sort independently.

$ sort os.txt > os_sort.txt
$ uniq -c os_sort.txt
$ uniq --help

uniq3

diff

diff (difference) reports differences between files. A simple example for diff usage would be ` $ diff FILEA FILEB ` When you try to use new command, please check with --help. diff

$ diff  os.txt  os_sort.txt
$ diff -y os.txt  os_sort.txt
$ sort os.txt | diff -y - os_sort.txt

There are still a lot of command that you can use. Such as paste, comm, join, split etc.

One liners

Oneliner, textual input to the command-line of an operating system shell that performs some function in just one line of input. This need to be done with “|”. For advanced usage, please check this

FASTA format

The original FASTA/Pearson format is described in the documentation for the FASTA suite of programs. It can be downloaded with any free distribution of FASTA (see fasta20.doc, fastaVN.doc or fastaVN.me—where VN is the Version Number).

The first line in a FASTA file started either with a “>” (greater-than; Right angle braket) symbol or, less frequently, a “;” (semicolon) was taken as a comment. Subsequent lines starting with a semicolon would be ignored by software. Since the only comment used was the first, it quickly became used to hold a summary description of the sequence, often starting with a unique library accession number, and with time it has become commonplace to always use “>” for the first line and to not use “;” comments (which would otherwise be ignored).

Following the initial line (used for a unique description of the sequence) is the actual sequence itself in standard one-letter character string. Anything other than a valid character would be ignored (including spaces, tabulators, asterisks, etc…). Originally it was also common to end the sequence with an “*” (asterisk) character (in analogy with use in PIR formatted sequences) and, for the same reason, to leave a blank line between the description and the sequence.

fasta

Description line

The description line (defline) or header/identifier line, which begins with ‘>’, gives a name and/or a unique identifier for the sequence, and may also contain additional information. In a deprecated practice, the header line sometimes contained more than one header, separated by a ^A (Control-A) character. In the original Pearson FASTA format, one or more comments, distinguished by a semi-colon at the beginning of the line, may occur after the header. Some databases and bioinformatics applications do not recognize these comments and follow the NCBI FASTA specification.

FASTA file handling with command line.

Please check one fasta file

$ ls ATH_cDNA_sequences_20101108.fas

previous example

download2

curl -L -O ftp://ftp.arabidopsis.org/home/tair/Sequences/ATH_cDNA_EST_sequences_FASTA/ATH_cDNA_sequences_20101108.fas

count cDNA

How many cDNA in this fasta file? Please use grep wc to find number.

fasta count

fasta_count

count DNA letter

How many sequences (DNA letter) in this fasta file? Please use grep wc to find number.

what is GI and GB?

NCBI identifiers NCBI identifiers

Collect GI

How can I collect GI from FASTA description line? Please use grep cut to find number.

Sequence redundancy

Does GI have any redundancy? Please use grep wc diff to solve.

GFF file

The GFF (General Feature Format) format consists of one line per feature, each containing 9 columns of data, plus optional track definition lines. The following documentation is based on the Version 3 (http://gmod.org/wiki/GFF3) specifications.

Please download below file

http://www.informatics.jax.org/downloads/mgigff3/MGI.gff3.gz

What is .gz ?

file MGI.gff3.gz

Compression

There are several options for archiving and compressing groups of files or directories. Compressed files are not only easier to handle (copy/move) but also occupy less size on the disk (less than 1/3 of the original size). In Linux systems you can use zip, tar or gz for archiving and compressing files/directories.

ZIP compression/extraction

zip OUTFILE.zip INFILE.txt ## Compress INFILE.txt  
zip -r OUTDIR.zip DIRECTORY ## Compress all files in a DIRECTORY into one archive file (OUTDIR.zip)  
zip -r OUTFILE.zip . -i \*.txt ## Compress all txt files in a DIRECTORY into one archive file (OUTFILE.zip) 
unzip SOMEFILE.zip  ## Decompress

TAR compression/extraction

tar (tape archive) utility saves many files together into a single archive file, and restores individual files from the archive. It also includes automatic archive compression/decompression options and special features for incremental and full backups.
tar -xzvf SOMEFILE.tar.gz # extract contents of SOMEFILE.tar
tar -czvf OUTFILE.tar.gz DIRECTORY #extract contents of gzipped archive SOMEFILE.tar.gz
tar -czvf OUTFILE.tar.gz \*.txt #archive and compress all files in a directory into one archive file
tar -czvf backup.tar.gz BACKUP_WORKSHOP #archive and compress all ".txt" files in current directory into one archive file

Gzip compression/extraction

gzip (gnu zip) compression utility designed as a replacement for compress, with much better compression >and no patented algorithms. The standard compression system for all GNU software. gzip SOMEFILE compress SOMEFILE (also removes uncompressed file) gunzip SOMEFILE.gz uncompress SOMEFILE.gz (also removes compressed file)

gzip the file MGI.gff3.gz and examine the size. gunzip it back so that you can use this file for thelater exercises.

$ gunzip MGI.gff3.gz
$ ls –lh
$ gzip MGI.gff3
$ ls -lh
$ gunzip MGI.gff3.gz

GFF3 Annotations

Print all sequences annotated in a GFF3 file.

cut -s -f 1,9 MGI.gff3 | grep $'\t' | cut -f 1 | sort | uniq

Determine all feature types annotated in a GFF3 file.

grep -v '^#' MGI.gff3 | cut -s -f 3 | sort | uniq

Determine the number of genes annotated in a GFF3 file.

grep -c $'\tgene\t' MGI.gff3

Extract all gene IDs from a GFF3 file.

grep $'\tgene\t' MGI.gff3 | perl -ne '/ID=([^;]+)/ and printf("%s\n", $1)'

Print all CDS.

cat MGI.gff3 | cut -f 3 | grep CDS | 

Print CDS and ID

    cat MGI.gff3 | cut -f 1,3,4,5,7,9 | head  
    cat MGI.gff3 | cut -f 1,3,4,5,7,9 | grep CDS | head  
    cat MGI.gff3 | cut -f 1,3,4,5,7,9 | grep CDS | sed 's/;.*//g' | head  
    cat MGI.gff3 | cut -f 1,3,4,5,7,9 | grep CDS | sed 's/;.*//g' | sed 's/ID=//g' | head  
    cat MGI.gff3 | cut -f 1,3,4,5,7,9 | grep $'\tCDS\t' | sed 's/;.*//g' | sed 's/ID=//g' | head  

Print length of each gene in a GFF3 file.

grep $'\tgene\t' MGI.gff3 | cut -s -f 4,5 | perl -ne '@v = split(/\t/); printf("%d\n", $v[1] - $v[0] + 1)'

Extract all gene IDs from a GFF3 file.

grep $'\tgene\t' MGI.gff3 | perl -ne '/ID=([^;]+)/ and printf("%s\n", $1)'

Time and again we are surprised by just how many applications it has, and how frequently problems can be solved by sorting, collapsing identical values, then resorting by the collapsed counts. The skill of using Unix is not just that of understanding the commands themselves. It is more about recognizing when a pattern, such as the one that we show above, is the solution to the problem that you wish to solve. The easiest way to learn to apply these patterns is by looking at how others solve problems, then adapting it to your needs.

This manual was adapted from Linode. Linode is the BEST knowledge site ever.

, a tutorial that teaches you how to work at the command-line. You’ll learn all the basic skills needed to start being productive in the UNIX terminal.

This manual was adapted from Linode. Linode is the BEST knowledge site ever.

A tutorial that teaches you how to work at the command-line. You’ll learn all the basic skills needed to start being productive in the UNIX terminal.

Basic commands

Category comnmand
Navigation cd, ls, pwd
File creation touch,nano,mkdir,cp,mv,rm,rmdir
Reading more,less,head,tail,cat
Compression zip,gzip,bzip2,tar,compress
Uncompression unzip,gunzip,bunzip2,uncompress
Permissions chmod
Help man

There are still a lot of command that you can use. Such as paste, comm, join, split etc.

Quick reminder

Everyday use

Obscure but useful

Recent unix command

Modern Unix

bat

A cat clone with syntax highlighting and Git integration.

exa

A modern replacement for ls.

lsd

The next gen file listing command. Backwards compatible with ls.

delta

A viewer for git and diff output

dust

A more intuitive version of du written in rust.

duf

A better df alternative

broot

A new way to see and navigate directory trees

fd

A simple, fast and user-friendly alternative to find.

ripgrep

An extremely fast alternative to grep that respects your gitignore

ag

A code searching tool similar to ack, but faster.

fzf

A general purpose command-line fuzzy finder.

mcfly

Fly through your shell history. Great Scott!

choose

A human-friendly and fast alternative to cut and (sometimes) awk

jq

sed for JSON data.

sd

An intuitive find & replace CLI (sed alternative).

cheat

Create and view interactive cheatsheets on the command-line.

tldr

A community effort to simplify man pages with practical examples.

bottom

Yet another cross-platform graphical process/system monitor.

glances

Glances an Eye on your system. A top/htop alternative for GNU/Linux, BSD, Mac OS and Windows operating systems.

gtop

System monitoring dashboard for terminal.

hyperfine

A command-line benchmarking tool.

gping

ping, but with a graph.

procs

A modern replacement for ps written in Rust.

httpie

A modern, user-friendly command-line HTTP client for the API era.

curlie

The power of curl, the ease of use of httpie.

xh

A friendly and fast tool for sending HTTP requests. It reimplements as much as possible of HTTPie's excellent design, with a focus on improved performance.

zoxide

A smarter cd command inspired by z.

dog

A user-friendly command-line DNS client. dig on steroids

Oneliners

Oneliner, textual input to the command-line of an operating system shell that performs some function in just one line of input. This need to be done with “|”. For advanced usage, please check this

FASTA format

The original FASTA/Pearson format is described in the documentation for the FASTA suite of programs. It can be downloaded with any free distribution of FASTA (see fasta20.doc, fastaVN.doc or fastaVN.me—where VN is the Version Number).

The first line in a FASTA file started either with a “>” (greater-than; Right angle braket) symbol or, less frequently, a “;” (semicolon) was taken as a comment. Subsequent lines starting with a semicolon would be ignored by software. Since the only comment used was the first, it quickly became used to hold a summary description of the sequence, often starting with a unique library accession number, and with time it has become commonplace to always use “>” for the first line and to not use “;” comments (which would otherwise be ignored).

Following the initial line (used for a unique description of the sequence) is the actual sequence itself in standard one-letter character string. Anything other than a valid character would be ignored (including spaces, tabulators, asterisks, etc…). Originally it was also common to end the sequence with an “*” (asterisk) character (in analogy with use in PIR formatted sequences) and, for the same reason, to leave a blank line between the description and the sequence.

fasta

Description line

The description line (defline) or header/identifier line, which begins with ‘>’, gives a name and/or a unique identifier for the sequence, and may also contain additional information. In a deprecated practice, the header line sometimes contained more than one header, separated by a ^A (Control-A) character. In the original Pearson FASTA format, one or more comments, distinguished by a semi-colon at the beginning of the line, may occur after the header. Some databases and bioinformatics applications do not recognize these comments and follow the NCBI FASTA specification.

FASTA file handling with command line.

Please check one fasta file

$ ls ATH_cDNA_sequences_20101108.fas

what is GI and GB?

NCBI identifiers NCBI identifiers

Collect GI

How can I collect GI from FASTA description line? Please use grep cut to find number.

Sequence redundancy

Does GI have any redundancy? Please use grep wc diff to solve.

Split fasta

    	awk '/^>/{f=++d".fasta"} {print > f}'  ATH_cDNA_sequences_20101108.fas

Merge fasta

    cat 1.fasta 2.fasta 3.fasta >> myfasta.fasta
    cat ?.fasta 
    cat ??.fasta 

Search fasta


    grep -n  --color "GAATTC" ATH_cDNA_sequences_20101108.fas
    grep -n  --color -E 'GAA?TTC' ATH_cDNA_sequences_20101108.fas

Regular Expression

A regular expression is a pattern that the regular expression engine attempts to match in input text. A pattern consists of one or more character literals, operators, or constructs. Please play this

GFF file

The GFF (General Feature Format) format consists of one line per feature, each containing 9 columns of data, plus optional track definition lines. The following documentation is based on the Version 3 (http://gmod.org/wiki/GFF3) specifications.

Please download below file

http://www.informatics.jax.org/downloads/mgigff3/MGI.gff3.gz

What is .gz ?

file MGI.gff3.gz

Check file with less or cat

less ** Q**

cat Ctrl+C can be used to stop any command in terminal safely.

GFF3 file format

Returns all lines on Chr 10 between 5.2MB and 5.45MB in MGI.gff3. (assumes) chromosome in column 1 and position in column 4:

    cat MGI.gff3 | awk '$1=="10"' | awk '$4>=5200000' | awk '$4<=5450000' 

    cat MGI.gff3 | awk '$1=="10"' | awk '$4>=5200000' | awk '$4<=5450000' |  grep mRNA 

    cat MGI.gff3 | awk '$1=="10"' | awk '$4>=5200000' | awk '$4<=5450000' |  grep mRNA | awk '{print $0,$5-$4}' 

Returns specific lines


    sed -n '1,10p' MGI.gff3

    sed -n '52p' MGI.gff3

Time and again we are surprised by just how many applications it has, and how frequently problems can be solved by sorting, collapsing identical values, then resorting by the collapsed counts. The skill of using Unix is not just that of understanding the commands themselves. It is more about recognizing when a pattern, such as the one that we show above, is the solution to the problem that you wish to solve. The easiest way to learn to apply these patterns is by looking at how others solve problems, then adapting it to your needs.

HPC and Cloud

This exercise mainly deals with using HPC clusters for large scale data (Next Generation Sequencing analysis, Genome annotation, evolutionary studies etc.). These clusters have several processors with large amounts of RAM (compared to typical desktop/laptop), which makes it ideal for running programs that are computationally intensive. The operating system of these clusters are primarily UNIX and are mainly operated via command line. All the commands that you have learned in the previous exercises can be used on HPC.

SSH

The ssh command is pre-installed. It means Secure Shell.

    ssh <YOURID>@pronghorn.rc.unr.edu

ssh

pronghorn_login

pronghorn

Windows user Desktop location

/mnt/c/Users/[WINDOWS LOGIN ID]/Desktop/

Mac user Desktop location

~/Desktop/

Slurm Quick Start Tutorial

Resource sharing on a supercomputer dedicated to technical and/or scientific computing is often organized by a piece of software called a resource manager or job scheduler. Users submit jobs, which are scheduled and allocated resources (CPU time, memory, etc.) by the resource manager.

Slurm is a resource manager and job scheduler designed to do just that, and much more. It was originally created by people at the Livermore Computing Center, and has grown into a full-fledge open-source software backed up by a large community, commercially supported by the original developers, and installed in many of the Top500 supercomputers.

Gathering information Slurm offers many commands you can use to interact with the system. For instance, the sinfo command gives an overview of the resources offered by the cluster, while the squeue command shows to which jobs those resources are currently allocated.

By default, sinfo lists the partitions that are available. A partition is a set of compute nodes (computers dedicated to… computing) grouped logically. Typical examples include partitions dedicated to batch processing, debugging, post processing, or visualization.

sinfo

sinfo
PARTITION      AVAIL  TIMELIMIT  NODES  STATE NODELIST
cpu-s2-core-0     up 14-00:00:0      2    mix cpu-[8-9]
cpu-s2-core-0     up 14-00:00:0      7  alloc cpu-[1-2,4-6,78-79]
cpu-s2-core-0     up 14-00:00:0     44   idle cpu-[0,3,7,10-47,64,76-77]
cpu-s3-core-0*    up    2:00:00      2    mix cpu-[8-9]
cpu-s3-core-0*    up    2:00:00      7  alloc cpu-[1-2,4-6,78-79]
cpu-s3-core-0*    up    2:00:00     44   idle cpu-[0,3,7,10-47,64,76-77]
gpu-s2-core-0     up 14-00:00:0     11   idle gpu-[0-10]
cpu-s6-test-0     up      15:00      2   idle cpu-[65-66]
cpu-s1-pgl-0      up 14-00:00:0      1    mix cpu-49
cpu-s1-pgl-0      up 14-00:00:0      1  alloc cpu-48
cpu-s1-pgl-0      up 14-00:00:0      2   idle cpu-[50-51]

In the above example, we see two partitions, named batch and debug. The latter is the default partition as it is marked with an asterisk. All nodes of the debug partition are idle, while two of the batch partition are being used.

The sinfo command also lists the time limit (column TIMELIMIT) to which jobs are subject. On every cluster, jobs are limited to a maximum run time, to allow job rotation and let every user a chance to see their job being started. Generally, the larger the cluster, the smaller the maximum allowed time. You can find the details on the cluster page.

You can actually specify precisely what information you would like sinfo to output by using its –format argument. For more details, have a look at the command manpage with man sinfo.

squeue

The squeue command shows the list of jobs which are currently running (they are in the RUNNING state, noted as ‘R’) or waiting for resources (noted as ‘PD’, short for PENDING).

squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
            983204 cpu-s2-co    neb_K jzhang23  R 6-09:05:47      1 cpu-6
            983660 cpu-s2-co   RT3.sl yinghanc  R   12:56:17      1 cpu-9
            983659 cpu-s2-co   RT4.sl yinghanc  R   12:56:21      1 cpu-8
            983068 cpu-s2-co Gd-bound   dcantu  R 7-06:16:01      2 cpu-[78-79]
            983067 cpu-s2-co Gd-unbou   dcantu  R 1-17:41:56      2 cpu-[1-2]
            983472 cpu-s2-co   ub-all   dcantu  R 3-10:05:01      2 cpu-[4-5]
            982604 cpu-s1-pg     wrap     wyim  R 12-14:35:23      1 cpu-49
            983585 cpu-s1-pg     wrap     wyim  R 1-06:28:29      1 cpu-48
            983628 cpu-s1-pg     wrap     wyim  R   13:44:46      1 cpu-49

SBATCH

Now the question is: How do you create a job?

A job consists in two parts: resource requests and job steps. Resource requests consist in a number of CPUs, computing expected duration, amounts of RAM or disk space, etc. Job steps describe tasks that must be done, software which must be run.

The typical way of creating a job is to write a submission script. A submission script is a shell script, e.g. a Bash script, whose comments, if they are prefixed with SBATCH, are understood by Slurm as parameters describing resource requests and other submissions options. You can get the complete list of parameters from the sbatch manpage man sbatch.

Important

The SBATCH directives must appear at the top of the submission file, before any other line except for the very first line which should be the shebang (e.g. #!/bin/bash). The script itself is a job step. Other job steps are created with the srun command. For instance, the following script, hypothetically named submit.sh,

 nano submit.sh
#!/bin/bash
#SBATCH --job-name=test
#SBATCH --cpus-per-task=1
#SBATCH --time=10:00
#SBATCH --mem-per-cpu=1g
#SBATCH --mail-type=begin
#SBATCH --mail-type=end
#SBATCH --mail-user=wyim@unr.edu
echo "Hello Pronghorn"
seq 1 80000

would request one CPU for 10 minutes, along with 1g of RAM, in the default queue. When started, the job would run a first job step srun hostname, which will launch the UNIX command hostname on the node on which the requested CPU was allocated. Then, a second job step will start the sleep command. Note that the –job-name parameter allows giving a meaningful name to the job and the –output parameter defines the file to which the output of the job must be sent.

Once the submission script is written properly, you need to submit it to slurm through the sbatch command, which, upon success, responds with the jobid attributed to the job. (The dollar sign below is the shell prompt)

$ chmod 775 submit.sh
$ sbatch submit.sh
sbatch: Submitted batch job 99999999

Yim_basic_setting

bashrc

nano ~/.bashrc
######.bashrc
HISTSIZE=5000
HISTFILESIZE=9000

alias vi='vim'
alias grep='grep --color=auto'
alias egrep='egrep --color=auto'
alias ll='ls -alhg'
alias la='ls -A'
alias du='du -h --max-depth=1'

tty -s && export PS1="\[\033[38;5;164m\]\u\[$(tput sgr0)\]\[\033[38;5;15m\] \[$(tput sgr0)\]\[\033[38;5;231m\]@\[$(tput sgr0)\]\[\033[38;5;15m\] \[$(tput sgr0)\]\[\033[38;5;2m\]\h\[$(tput sgr0)\]\[\033[38;5;15m\] \[$(tput sgr0)\]\[\033[38;5;172m\]\t\[$(tput sgr0)\]\[\033[38;5;15m\] \[$(tput sgr0)\]\[\033[38;5;2m\]\w\[$(tput sgr0)\]\[\033[38;5;15m\]\n \[$(tput sgr0)\]"

Nano RC

nano ~/.nanorc
set nowrap
set softwrap
set const
## Nanorc files
include "/usr/share/nano/nanorc.nanorc"

## C/C++
include "/usr/share/nano/c.nanorc"

## HTML
include "/usr/share/nano/html.nanorc"

## TeX
include "/usr/share/nano/tex.nanorc"

## Quoted emails (under e.g. mutt)
include "/usr/share/nano/mutt.nanorc"

## Patch files
include "/usr/share/nano/patch.nanorc"

## Manpages
include "/usr/share/nano/man.nanorc"

## Groff
include "/usr/share/nano/groff.nanorc"

## Perl
include "/usr/share/nano/perl.nanorc"

## Python
include "/usr/share/nano/python.nanorc"

## Ruby
include "/usr/share/nano/ruby.nanorc"

## Java
include "/usr/share/nano/java.nanorc"

## Assembler
include "/usr/share/nano/asm.nanorc"

## Bourne shell scripts
include "/usr/share/nano/sh.nanorc"

## POV-Ray
include "/usr/share/nano/pov.nanorc"