Differences

bsbdenovo-datasets2 [2017/11/21 10:54]
telatina
bsbdenovo-datasets2 [2020/02/07 09:51] (current)
Line 45:
 </code>
 In total we have about 95 Mbp, which for an //E. coli// genome means we produced a shotgun dataset with roughly 20X coverage. Not bad!
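The coverage figure is simply the total number of sequenced bases divided by the genome size. A minimal sketch of that arithmetic, assuming an //E. coli// genome of about 4.6 Mbp (an assumed, rounded value) and the 95 Mbp total quoted here:

```shell
# Total sequenced bases (from the text) and genome size (assumed ~4.6 Mbp).
total_bp=95000000
genome_bp=4600000

# Integer division is enough for a rough coverage estimate.
coverage=$((total_bp / genome_bp))
echo "Estimated coverage: ${coverage}X"
```

Integer division rounds down, which is acceptable for a back-of-the-envelope estimate like this one.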
 +
 +==== What do reads look like? ====
 +Datasets vary. A simple way to inspect one is the ''less'' command. Remember that ''less'' is interactive: navigate with the arrow keys and page up/down, and press ''q'' to exit.
 +Example:
 +<code bash>
 +less -S /bsb/denovo/datasets/454/SRP001673.fastq
 +</​code>​
 +The ''-S'' option disables line wrapping, so each sequence stays on a single line (use the left/right arrows to scroll horizontally).
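To look at a single read without paging through the whole file, ''head'' can be used as well, since a FASTQ record normally spans four lines. A minimal sketch: the sample file and its two reads are made up for illustration, and the four-lines-per-record layout is an assumption that does not hold for multi-line FASTQ files.

```shell
# Create a tiny, made-up FASTQ file so the example is self-contained.
sample=$(mktemp)
printf '@read1\nACGTACGT\n+\nIIIIIIII\n@read2\nTTGGCCAA\n+\nIIIIIIII\n' > "$sample"

# Print the first record (4 lines: header, sequence, separator, qualities).
first_record=$(head -n 4 "$sample")
echo "$first_record"

# Count the reads, assuming exactly 4 lines per record.
lines=$(wc -l < "$sample")
reads=$((lines / 4))
echo "Reads: $reads"

# Remove the temporary file.
rm -f "$sample"
```

On a real dataset, replace the temporary file with the path of the FASTQ file you want to inspect.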
 +
 +==== Where does this data come from? ====
 +Most datasets were downloaded from the [[https://www.ncbi.nlm.nih.gov/sra|NCBI Sequence Read Archive]] (SRA) and keep their accession ID as the filename. This means you can search the SRA for the accession, which in most cases identifies a single sequencing run from a project; for the 454 dataset it is the accession of an [[https://www.ncbi.nlm.nih.gov/sra/?term=SRP001673|entire project]].
 +