Unix essentials for NGS bioinformatics





This course introduces Unix to students as the most convenient tool for working with big data in the biological sciences, such as next-generation sequencing (NGS) data. NGS technologies produce a massive amount of data in each run, which is difficult to handle with GUI-based tools; even opening the raw files is difficult. That is why sequencing data are produced and stored in text format for easy handling and processing.

Unix skills are an asset for bioinformatics. They are easy to learn, convenient, and save a lot of time. People skilled in bioinformatics know very well how to analyze data with programming languages such as Perl and Python, but not all of them realize that it is not necessary to write a program every time. With the help of Unix utilities, data handling and processing, input formatting for software, and text processing of results for easier interpretation can be performed without high-end programming skills or special software. You will still need software and programming skills for advanced bioinformatics analyses. Unix is a great skill for bioscience researchers, scientists, and NGS beginners. Unix skills will help you build pipelines in which you combine different software to meet your own objectives, such as:

  • Counting and formatting of FASTA and FASTQ sequences

  • Converting multi-line FASTA sequences to single-line FASTA sequences

  • Extraction of desired FASTA and FASTQ sequences from a whole dataset

  • Splitting and subsetting of large sequence files

  • Formatting of BLAST, Pfam, and InterPro output for analysis

  • Extraction of subsequences from genome files

  • Sequence file cleaning: trimming and filtering of sequences

  • Random dataset generation

  • Bulk data processing for common tasks

  • and many more common tasks
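As a rough illustration of the first two tasks in the list, here is a minimal shell sketch. The file names (demo.fasta, demo.fastq, demo.single.fasta), the toy sequences, and the helper function names are all made up for this example and are not taken from the course material. The FASTQ count assumes the common 4-lines-per-record layout with unwrapped sequences.

```shell
#!/bin/sh
# Hypothetical toy input files, created only for this demonstration.
cat > demo.fasta <<'EOF'
>seq1
ACGT
ACGT
>seq2
GGCC
EOF

cat > demo.fastq <<'EOF'
@read1
ACGT
+
IIII
@read2
GGCC
+
IIII
EOF

# Count FASTA sequences: every record starts with a '>' header line.
count_fasta() {
    grep -c '^>' "$1"
}

# Count FASTQ reads: assumes exactly 4 lines per record.
# Counting '^@' lines is unsafe because quality strings may also start with '@'.
count_fastq() {
    echo $(( $(wc -l < "$1") / 4 ))
}

# Join wrapped sequence lines so each record becomes one header line
# followed by one sequence line.
linearize_fasta() {
    awk '/^>/ { if (seq != "") print seq; print; seq = ""; next }
         { seq = seq $0 }
         END { if (seq != "") print seq }' "$1"
}

count_fasta demo.fasta
count_fastq demo.fastq
linearize_fasta demo.fasta > demo.single.fasta
```

Real sequencing files are often compressed, so in practice the same commands would typically read from a decompression step through a pipe rather than from a plain file.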

Here, I intend to cover only the specific aspects of Unix required for NGS data processing and project management. The whole course is divided into 4 modules, from basic commands to scripts. In this course, you will have plenty of opportunities to practice. Over 4 days, you will learn through tutorials, video lectures, and practice assignments. There are several ways to teach and learn this material, but I have used the easiest and simplest approach, focusing on developing a way of thinking about data processing rather than on advanced and compact use of commands. In the guide to practicing commands, I have given multiple approaches to performing a single task, so you will also have the opportunity to use the compact and advanced options of commands.


Day 1 - Introduction to NGS and UNIX

  • Course introduction

  • Brief description of NGS and UNIX (video).

  • Unix: how to start, basic commands (directories and files: creation, removal, navigation, listing, writing/retrieval, and unpacking of NGS data files)

  • System information commands and their usage

  • Quick revision

  • Practice assignments

  • Challenge of the day

Day 2 - NGS bioinformatics data excursion

  • NGS: data sources, files, and file formats.

  • Unix commands for the excursion

  • Smart tricks to solve complex problems

  • Quick revision

  • Practice assignments (with common NGS data processing related tasks)

  • Challenge of the day

Day 3 - Flying with commands

  • File streaming and redirection, stream editor, pipe, filters

  • Permissions, symbolic linking, construction of pipelines on the terminal

  • Practice assignments (with common NGS data processing related tasks)

  • Challenge of the day

Day 4 - Bulk data processing

  • Brief introduction to shell scripting

  • Pattern matching, variables, subshells, and loops

  • Practice assignments (with common NGS data processing related tasks)

  • Challenge of the day
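To give a flavor of the Day 4 topics (pattern matching, variables, command substitution, and loops), here is a small sketch of bulk processing. The temporary directory, the sample file names, and the counts.tsv output name are hypothetical and chosen only for this example; it assumes a system where mktemp -d is available.

```shell
#!/bin/sh
# Hypothetical working directory and input files, created only for this demo.
workdir=$(mktemp -d)
printf '>a\nAC\n>b\nGT\n' > "$workdir/sample1.fasta"
printf '>c\nTTGA\n' > "$workdir/sample2.fasta"

# Loop over every file matching the glob pattern and record how many
# sequences each one contains.
for f in "$workdir"/*.fasta; do
    name=$(basename "$f" .fasta)   # strip the directory and the .fasta suffix
    n=$(grep -c '^>' "$f")         # command substitution captures grep's output
    printf '%s\t%s\n' "$name" "$n"
done > "$workdir/counts.tsv"       # redirect the whole loop's output to one file

cat "$workdir/counts.tsv"
```

Redirecting once after done, rather than appending inside the loop, keeps the output file from accumulating old rows when the script is rerun.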


A course to develop Unix skills for next-generation sequencing data handling and processing


What you will learn
  • Knowledge and understanding of Unix commands for NGS data processing
  • Skills to use non-GUI tools for large data processing
  • Smart tricks for using commands in NGS data handling

Rating: 3.75

Level: Beginner Level

Duration: 3 hours

Instructor: Sandeep Kushwaha

