Assembly



The Pseudo-nitzschia multistriata B856 genome is approximately 60 Mbp in size. The assembly (accession number PRJEB9419) can be visualized at http://apollo.tgac.ac.uk/Pseudo-nitzschia_multistriata_V1_4_browser/sequences (username and password: pnitzschia).
The genome can also be accessed via the Stazione Zoologica Anton Dohrn Bioinforma portal (http://bioinfo.szn.it/pmultistriata/).
The strain comes from a Pseudo-nitzschia multistriata (Takano) Takano pedigree starting from two strains collected in 2009 (Fig. 1). The genome sequence was produced from an axenic offspring of two F1 siblings, which were obtained by crossing two wild-type strains isolated in the Gulf of Naples (Italy). The genome was assembled from a total of 172 million 101 bp overlapping paired-end reads with ~175 bp inserts, 117 million 100 bp paired-end reads with ~450 bp inserts, 72 million ~68 bp (after trimming) mate-pair reads with ~1.2 kbp inserts, and 5.4 million ~156 bp (after trimming) mate-pair reads with ~4.5 kbp inserts. The final size of the assembled Pseudo-nitzschia multistriata genome is 59.3 Mbp, including ambiguous bases.
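The read figures above allow a rough estimate of the total sequence yield and nominal sequencing depth. The following is a minimal sketch of that arithmetic, assuming that each quoted count refers to individual reads rather than read pairs (the text does not say which) and using the trimmed read lengths as given. The function names are illustrative only.

```python
# Read libraries as described in the text: (label, read count, read length in bp).
# Assumption: each count refers to individual reads, not read pairs.
LIBRARIES = [
    ("paired-end, ~175 bp inserts", 172_000_000, 101),
    ("paired-end, ~450 bp inserts", 117_000_000, 100),
    ("mate-pair, ~1.2 kbp inserts", 72_000_000, 68),
    ("mate-pair, ~4.5 kbp inserts", 5_400_000, 156),
]
ASSEMBLY_SIZE_BP = 59_300_000  # 59.3 Mbp, including ambiguous bases


def total_bases(libraries):
    """Sum read count x read length over all libraries."""
    return sum(count * length for _, count, length in libraries)


def nominal_coverage(libraries, genome_size_bp):
    """Total sequenced bases divided by the genome size."""
    return total_bases(libraries) / genome_size_bp


if __name__ == "__main__":
    print(f"Total bases: {total_bases(LIBRARIES) / 1e9:.1f} Gbp")
    print(f"Nominal coverage: {nominal_coverage(LIBRARIES, ASSEMBLY_SIZE_BP):.0f}x")
```

Under these assumptions the libraries sum to roughly 34.8 Gbp. If the counts instead refer to read pairs, the totals would double.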


Figure 1: Pseudo-nitzschia multistriata (Takano) Takano pedigree starting from two strains collected in 2009. Strain B856 was chosen for genome sequencing. Picture taken from Basu et al., 2017.



Gene Annotations



Ab initio gene prediction for the Pseudo-nitzschia multistriata genome, supported by genomic alignments of proteins from other species and RNA-seq reads from multiple samples, resulted in the prediction of 12,152 transcripts in 12,008 genes. The transcripts generated were used as training data for AUGUSTUS (Stanke et al., 2006). The model built on the training data was applied to the entire repeat-masked assembly, together with external support from homologous proteins aligned using EXONERATE (Slater & Birney, 2005). The predicted gene models were annotated using ANNOCRIPT (Musacchia et al., 2015).
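Gene and transcript totals like those quoted above can be recomputed from an annotation file. The sketch below assumes the annotation is available as GFF3 with `gene` and `mRNA` feature types in the third column; the source does not state the file format distributed by the portal. The example records and identifiers are hypothetical and exist only to make the snippet self-contained.

```python
from collections import Counter

# Hypothetical GFF3 records for illustration only; not taken from the real annotation.
EXAMPLE_GFF3 = """\
##gff-version 3
scaffold_1\tAUGUSTUS\tgene\t100\t900\t.\t+\t.\tID=g1
scaffold_1\tAUGUSTUS\tmRNA\t100\t900\t.\t+\t.\tID=g1.t1;Parent=g1
scaffold_1\tAUGUSTUS\tmRNA\t100\t850\t.\t+\t.\tID=g1.t2;Parent=g1
scaffold_2\tAUGUSTUS\tgene\t50\t400\t.\t-\t.\tID=g2
scaffold_2\tAUGUSTUS\tmRNA\t50\t400\t.\t-\t.\tID=g2.t1;Parent=g2
"""


def count_features(gff3_text):
    """Count GFF3 feature types (column 3), skipping comments and blank lines."""
    counts = Counter()
    for line in gff3_text.splitlines():
        if not line or line.startswith("#"):
            continue
        columns = line.split("\t")
        if len(columns) >= 3:
            counts[columns[2]] += 1
    return counts


if __name__ == "__main__":
    counts = count_features(EXAMPLE_GFF3)
    print(counts["gene"], "genes,", counts["mRNA"], "transcripts")
```

For the figures reported here, 12,152 transcripts in 12,008 genes corresponds to about 1.01 transcripts per gene.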



Transcriptomics



BioProject Accession    Description
PRJEB19371              Comparison of MT+ and MT- monocultures and sexualized strains
PRJEB28137              Comparison of WT and a strain displaying sex reversal
PRJEB37110              Time-lapse of sexually reproducing cells and parental monocultures
 
 
 
 
 
 
 
 

Repetitive Elements



Repeats were identified using REPET. The TEDENOVO pipeline (Flutre et al., 2011) was used to build a library of consensus sequences of repetitive elements in the genome assembly. The TEANNOT pipeline (Quesneville et al., 2005) was then run with default settings to annotate the genome, using the sequences from the filtered combined library as probes. Full-length long terminal repeats (LTRs) were identified using LTRHARVEST and LTRDIGEST (Gremme et al., 2013). The relative age of LTR insertions was estimated using the evolutionary distance method of Kimura (1980).
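Kimura (1980) describes a two-parameter substitution distance, K = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitions and transversions between two aligned sequences. A common way to date an LTR insertion is to compute K between the 5' and 3' LTRs of the same element and convert it to time as T = K / (2r), where r is the substitution rate per site per year. The sketch below illustrates that general approach. It is not necessarily the exact procedure used for this genome, the function names are illustrative, and the source gives no rate, so any value passed for r is an assumption.

```python
import math


def kimura_2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned, equal-length sequences.

    Alignment columns containing gaps or ambiguous bases are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    purines = {"A", "G"}
    valid = {"A", "C", "G", "T"}
    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in valid or b not in valid:
            continue
        sites += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
        if (a in purines) == (b in purines):
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    # math.log raises ValueError if the sequences are too divergent
    # for the distance to be defined.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def insertion_age_years(distance, rate_per_site_per_year):
    """T = K / (2r): the two LTRs accumulate substitutions independently."""
    return distance / (2 * rate_per_site_per_year)


if __name__ == "__main__":
    # Hypothetical aligned LTR pair: one transition (A/G) and one transversion (T/A).
    ltr_5 = "ACGTACGTACGTACGTACGT"
    ltr_3 = "GCGTACGTACGTACGTACGA"
    k = kimura_2p_distance(ltr_5, ltr_3)
    # Placeholder rate, not taken from the source.
    print(k, insertion_age_years(k, rate_per_site_per_year=1e-8))
```

Because the absolute age scales inversely with the assumed rate, comparing K values directly gives the relative ordering of insertions without committing to a particular r.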