Bash for beginners

Who is this course for

This course is for biologists aiming to learn how to use the Linux Command Line to perform bioinformatics analyses. It requires no prior knowledge of the command line.

Avoid this course if...

If you are already able to use basic shell commands, and in particular to tell programs where your files are, you probably don't need this course.

How we will work

You'll have full access to a Linux server, so the course relies heavily on this website to provide workshop notes that you can follow at your own pace from your office (or living room :-D ). Each lesson introduces new concepts, reviews the problems encountered during the previous session, and gets you started on the new assignment. Because you'll have 24/7 access to the remote Linux server, the time we spend together is mainly for introducing concepts and discussing problems, while the real “tutorial” is self-paced and can be done during the week whenever you have some spare time.

The course spans three weeks: two lessons in the first week, then one lesson in each of the following two weeks.

Bring your laptop (any model will work). The course is also fully accessible from non-NBI laptops.

A small "preview"

Watch a short video describing the content of the course.

Each day we will meet in a room at QIB (details to be confirmed), from 9.30 to 12.00. Bring your own laptop; if you need one, contact Andrea in advance to arrange an alternative.

Day 1: The first tour

A brief introduction to the core concepts of the command line as a good environment for bioinformatics analysis, followed by a hands-on tutorial on logging in to a remote Linux server from each participant's laptop and testing the first commands. Our main goal is to set up your clients so that you can access the workshop off-site!

  1. What is a terminal: the “terminal prompt”
  2. Accessing a remote server using ssh (Mac or Linux) or the program PuTTY (from Windows)
  3. Using screen to set up a persistent session on a remote server
  4. The filesystem: using pwd and ls to interact with it from the terminal. Tab completion.
  5. mkdir to create directories
  6. A command line text editor: nano
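
To give a flavour of Day 1, here is a minimal sketch of the commands listed above. The server address, user name and session name are placeholders (the real ones will be given in class), so the remote steps appear only as comments; the directory names are also made up for illustration.

```shell
# Remote login, shown as comments because host and user are placeholders:
#   ssh username@server.example.org   # log in to the remote server
#   screen -S workshop                # start a persistent session named "workshop"
#   screen -r workshop                # reattach to it after reconnecting

# The commands below run on any Linux or Mac terminal.
workdir=$(mktemp -d)        # a scratch directory, so none of your files are touched
cd "$workdir"
pwd                         # print the directory you are currently in
mkdir day1                  # create a new directory
mkdir -p day1/notes/drafts  # -p also creates any missing parent directories
ls                          # list the contents of the current directory
ls day1                     # list the contents of day1
# To edit a file you would type:  nano day1/notes/todo.txt
```

Inside a screen session, pressing Ctrl-a followed by d detaches it and leaves your programs running on the server.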

Day 2: Basic commands

Understanding the "file system", relative and absolute paths, and the commands to organize the directories, to list files, to copy and move files and directories. Introduction to commands to interact with text files and view them. We will use the FASTA and FASTQ file formats as examples.

  1. ls (with some parameters), wildcards, cd, rmdir, find
  2. Interactive visualization of text files: less
  3. Viewing text files with cat, head and tail
  4. Counting lines and characters with wc
  5. Selecting lines with patterns using grep
  6. Redirecting a command output into a text file
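
As a small preview of Day 2, the sketch below builds a tiny FASTA file and explores it with the commands listed above. The sequences and file names are invented for illustration and are not part of the course datasets.

```shell
workdir=$(mktemp -d)        # scratch directory for the example
cd "$workdir"

# Create a tiny FASTA file with three made-up sequences
printf '>seq1\nACGTACGT\n>seq2\nGGGCCC\n>seq3\nTTTTAAAA\n' > example.fasta

ls -l *.fasta                         # wildcard: every file ending in .fasta
head -n 2 example.fasta               # show the first two lines
tail -n 2 example.fasta               # show the last two lines
wc -l example.fasta                   # count the lines in the file
n_seqs=$(grep -c '>' example.fasta)   # count header lines, i.e. sequences
echo "Sequences: $n_seqs"
grep '>' example.fasta > headers.txt  # redirect the matching lines into a file
cat headers.txt                       # print the new file
```

Note the quotes around '>' in the grep commands: without them the shell would treat > as a redirection and overwrite the file.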

Day 3: Extracting data from text files

Using terminal commands to interact with bioinformatics files. We'll introduce the SAM file format used to store NGS mapping, the VCF format (for SNP calling), and some tabular annotation files (GFF, GTF).

  1. Recap of previous commands, applied to SAM files
  2. Downloading datasets with wget
  3. The sort, uniq, cut commands
  4. Command pipes: combining multiple commands
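
The sketch below illustrates how sort, uniq and cut can be chained with pipes. It uses a made-up three-column table rather than a real GFF or SAM file, and the download URL is a placeholder, so that step appears only as a comment.

```shell
workdir=$(mktemp -d)        # scratch directory for the example
cd "$workdir"

# Downloading, shown as a comment because the URL is a placeholder:
#   wget https://example.org/annotation.gff

# A made-up tab-separated table: chromosome, feature type, start position
printf 'chr1\tgene\t100\nchr2\tgene\t50\nchr1\texon\t120\nchr1\tgene\t900\n' > features.tsv

cut -f 1 features.tsv                    # keep only the first column
cut -f 1 features.tsv | sort             # pipe it into sort
cut -f 1 features.tsv | sort | uniq -c   # count how often each value occurs
n_chr=$(cut -f 1 features.tsv | sort | uniq | wc -l)
echo "Distinct chromosomes: $n_chr"
sort -k3,3n features.tsv > by_start.tsv  # sort numerically on the third column
cat by_start.tsv
```

uniq only collapses adjacent identical lines, which is why sort comes before it in the pipe.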

Day 4: The bioinformatician's perspective

Using short-read alignment as a theme, we'll introduce how to install new software to be used from the command line, users and file permissions, and of course the alignment program bwa and samtools, the Swiss Army knife for manipulating SAM files.

  1. Using bwa to align short reads
  2. Using samtools to operate with SAM/BAM files
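
As a rough preview of Day 4, the sketch below touches on file permissions and on a typical bwa plus samtools sequence. It is one possible workflow, not necessarily the one followed in class. The function name align_reads and all file names are hypothetical; the alignment is wrapped in a function so that nothing runs until you call it with real files on a machine where bwa and samtools are installed.

```shell
workdir=$(mktemp -d)        # scratch directory for the example
cd "$workdir"

# File permissions: give yourself permission to execute a small script
printf '#!/bin/sh\necho "hello from my script"\n' > hello.sh
chmod u+x hello.sh          # u+x = add execute permission for the file's owner
ls -l hello.sh              # the permission string now contains an x
[ -x hello.sh ] && echo "hello.sh is executable"

# Hypothetical helper: align reads to a reference and summarise the result
align_reads() {
    ref=$1; reads=$2; prefix=$3
    bwa index "$ref"                                     # build the reference index
    bwa mem "$ref" "$reads" > "$prefix.sam"              # align, writing SAM output
    samtools sort -o "$prefix.sorted.bam" "$prefix.sam"  # sort and convert to BAM
    samtools index "$prefix.sorted.bam"                  # index the sorted BAM file
    samtools flagstat "$prefix.sorted.bam"               # print mapping statistics
}
# Example call (file names are placeholders):
#   align_reads reference.fasta reads.fastq sample1
```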