Phaeodactylum tricornutum

Assembly


An international group of researchers, in collaboration with the DOE Joint Genome Institute, sequenced the genome of P. tricornutum using whole-genome shotgun (WGS) sequencing. The sequenced clone, CCAP1055/1, is available from the Culture Collection of Algae and Protozoa (http://www.ccap.ac.uk). It is a monoclonal culture derived in May 2003 from a fusiform cell of strain CCMP632, which was originally isolated in 1956 off Blackpool (U.K.). It has been maintained continuously in culture in F/2 medium. The 27.4 Mb genome assembly contains 33 chromosomes and 55 scaffolds.

Reference: Bowler C, Allen AE, Badger JH, et al. The Phaeodactylum genome reveals the evolutionary history of diatom genomes. Nature. 2008 Nov;456(7219):239-244.
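
The assembly figures quoted above (total size and number of sequences) can be recomputed from any FASTA copy of the assembly. The following is a minimal sketch, not part of the published pipeline: the sequence names in the toy input are hypothetical, and a real run would pass the lines of a downloaded assembly file instead.

```python
from typing import Dict, Iterable


def fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {sequence_name: length_in_bp} for FASTA-formatted lines."""
    lengths: Dict[str, int] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: use the first whitespace-delimited token as the name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def summarize(lengths: Dict[str, int]) -> Dict[str, float]:
    """Count sequences and total assembly size in bp and Mb."""
    total = sum(lengths.values())
    return {"n_sequences": len(lengths), "total_bp": total, "total_mb": total / 1e6}


# Toy input standing in for a real assembly file (names are hypothetical).
demo = """>chr_1 toy chromosome
ACGTACGTAC
GGCC
>scaffold_34
ACGT
""".splitlines()

demo_lengths = fasta_lengths(demo)
demo_summary = summarize(demo_lengths)
print(demo_summary)
```

For a file on disk, `fasta_lengths(open(path))` works the same way, since the function only iterates over lines.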


In 2021, Oxford Nanopore Technologies long-read sequencing was used to update and validate the quality and contiguity of the P. tricornutum genome. Although repetitive DNA sequences still caused problems for current genome assembly algorithms, the long reads resolved previously uncertain genomic regions and allowed complex structural variation to be characterized further.

Reference: Filloramo, G. V., Curtis, B. A., Blanche, E., & Archibald, J. M. (2021). Re-examination of two diatom reference genomes using long-read sequencing. BMC Genomics, 22(1), 1-25.


Gene Annotations


Phatr2

The genome of P. tricornutum was annotated using the JGI annotation pipeline, which combines several gene prediction, annotation and analysis tools. In total, 10,402 gene models were predicted; 86% of the genes are supported by ESTs and 60-65% show homology to proteins in SwissProt.

There are two parts to the P. tricornutum genome sequence assembly and annotation reported here: the Phatr2 "finished chromosomes" and the Phatr2_bd "unmapped sequence". The finished chromosomes consist of the finished genome sequence that could be reliably assembled into chromosomes. The "unmapped sequence" consists of assembled scaffolds that could neither be mapped to finished chromosomes nor assigned to organelles, but that could be aligned to P. tricornutum ESTs which were not represented in the finished chromosomes.

Phatr3

Phatr3 is a reannotation of the Phaeodactylum tricornutum genome with 12,089 predicted gene models. It was produced with the MAKER2 annotation pipeline, using existing gene models, expression data and protein sequences from related species to train the SNAP and Augustus gene prediction programs. The inputs were:
  • 10,402 gene models from a previous genebuild from JGI
  • 13,828 non-redundant ESTs
  • 42 libraries of RNA-Seq generated using Illumina technology
  • 49 libraries of RNA-Seq data generated under various iron conditions using SoLiD technology
  • 93,206 Bacillariophyta ESTs from dbEST, 22,502 Bacillariophyta and 118,041 Stramenopiles protein sequences from UniProt.
Reference: Rastogi, A., Maheswari, U., Dorrell, R.G. et al. Integrative analysis of large scale transcriptome data draws a comprehensive landscape of Phaeodactylum tricornutum genome and evolutionary origin of diatoms. Sci Rep 8, 4834 (2018).


Proteogenomics pipeline annotation

Using mass spectrometry-based proteomics data, approximately 8,300 Phatr2 genes were confirmed, and 606 novel proteins, 506 revised genes and 94 splice variants were identified.

Reference: Yang, M., Lin, X., Liu, X., Zhang, J., & Ge, F. (2018). Genome annotation of a model diatom Phaeodactylum tricornutum using an integrated proteogenomic pipeline. Molecular Plant, 11(10), 1292-1307.



Transcriptomics


| BioProject Accession | Description |
|---|---|
| PRJNA278661 | Response to phosphate fluctuations |
| PRJNA279965 | Response to nitrogen starvation in a nitrate reductase knock-down line |
| PRJEB11970 | Comparison of cells grown in replete conditions (collected at 4, 8, 20, and 36 h) with nitrate-starved cells (collected at 4, 8, and 20 h), dark treatment for 8 h, nocodazole treatment for 20 h, and phosphate starvation for 36 h |
| PRJNA310815 | Comparison of low, intermediate, and high levels of dissolved Fe over light:dark cycles |
| PRJNA311568 | Response to shifts in nitrogen source and availability in wild-type and nitrate reductase KO lines |
| PRJNA319251 | Effect of competition with Alteromonas macleodii |
| PRJNA322663 | Comparison of different CO2 and light conditions |
| PRJNA349063 | Response to additions of glufosinate-ammonium (GSA), L-methionine sulfoximine (MSX), and rapamycin (Rapa) |
| PRJNA360572 | Response to grazing stress, imitated by injection of Acartia tonsa culture medium |
| PRJNA376114 | Transcriptomes of alternative oxidase (AOX) knock-down lines |
| PRJNA377534 | Comparison of wild-type and CryP knock-down lines (Na1 and Ta3) under prolonged darkness (72 h) and one hour after onset of blue light |
| PRJNA382762 | Comparison of nitrate reductase KO and wild-type cells over a 10-day time course |
| PRJNA484278 | Comparison of ambient and high pCO2 |
| PRJNA551018 | Comparison of low-light and high-light acclimated wild type |
| PRJNA554372 | Response of SPX KO lines to different P conditions |
| PRJEB26173 | Comparison of 3 morphotypes |
| PRJEB34512 | Comparison of ambient and high pCO2 |
| PRJNA595993 | Response to trans,trans-2,4-decadienal (DD) during the light cycle |
| PRJNA610772 | Comparison of different exposures to naphthenic acids |
| PRJNA625589 | Comparison of early and stationary growth phase |
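
The BioProject accessions listed above can be used to look up the underlying sequencing runs. The sketch below only builds a query URL for the ENA Portal API file report; it makes no network call. The endpoint, the `read_run` result type and the field names reflect the ENA API as commonly documented and should be treated as assumptions to check against the current ENA documentation. The helper name `ena_filereport_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumed ENA Portal API endpoint for per-project file reports.
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def ena_filereport_url(project_accession,
                       fields=("run_accession", "sample_accession", "fastq_ftp")):
    """Build a file-report URL listing the runs of one BioProject as TSV."""
    query = urlencode({
        "accession": project_accession,
        "result": "read_run",          # one row per sequencing run
        "fields": ",".join(fields),    # columns requested in the report
        "format": "tsv",
    })
    return f"{ENA_FILEREPORT}?{query}"


# Example with one accession from the list above.
url = ena_filereport_url("PRJEB11970")
print(url)
```

The returned URL can be opened in a browser or fetched with any HTTP client. NCBI-registered projects (PRJNA accessions) are generally mirrored in ENA, but that should be verified per project.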


Characterized Ecotypes



| Accession | Strain reference in permanent collections | Characteristics of collection site | Major morphotype | Accession numbers of characterized strains |
|---|---|---|---|---|
| Pt1 | CCMP632 | Coastal water, close to estuaries | Fusiform 95%–100% | CCAP 1055/3, CCAP 1055/1c |
| Pt2 | CCAP 1052/1A (PCC 100, SAG 1090‐1a, UTCC 162) | Coastal water, close to estuaries | Fusiform 95%–100% | CCMP2557 |
| Pt3 | CCAP 1052/1B (SAG 1090‐1b) | Clonal bacteria‐free isolation. Growth on freshwater, soil extract | Oval 60%–75% Clonal | CCMP2558 |
| Pt4 | CCAP 1052/6 (UTEX 646, SAG 1090‐6) | Brackish water, isolated from a supralittoral rock pool | Fusiform 95%–100% Clonal | CCMP2559 |
| Pt5 | CCMP630 (NEPCC 738) | Shallow tidal creek with wide ranging salinity | Fusiform 95%–100% Clonal | CCAP 1055/2 |
| Pt6 | CCMP631 (NEPCC 31) | Seawater tank with wide ranging salinity | Fusiform 95%–100% | CCAP 1055/4 |
| Pt7 | CCMP1327 | Polluted seawater (duck industry). Enclosed bay with low salinity. Isolated during bloom of Nannochloropsis/Nannochloris | Fusiform 95%–100% | CCAP 1055/6 |
| Pt8 | NEPCC 640 | Coastal water. Sample collected with a plankton net 2 feet from surface | Triradiate 80%–85% | CCAP 1055/7, CCMP2560 |
| Pt9 | CCMP633 | From the shore, water temperature 25°C | Oval 60%–75% at 15°C–19°C; but fusiform 80%–95% at 25°C–28°C Clonal | CCAP 1055/5 |
| Pt10 | MACC B228 | Polluted seawater (industrial area and seaside resort) | Fusiform 95%–100% | CCAP 1055/8, CCMP2928 |

References:

Epigenetics


Methylome:

The first whole-genome methylome was obtained by digestion with the methyl-sensitive endonuclease McrBC, followed by hybridization to a McrBC-chip tiling array of the P. tricornutum genome. Bisulfite deep sequencing was subsequently used to compare DNA methylation under low and replete nitrogen conditions.

References:

Repetitive Elements



Repeats were collectively found to contribute ~3.4 Mb (12%) of the assembly, including transposable elements (TEs), unclassified and tandem repeats, as well as fragments of host genes.
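
As a quick consistency check, the quoted percentage follows from the two sizes given on this page. The sketch assumes the 27.4 Mb assembly size from the Assembly section is the denominator.

```python
assembly_mb = 27.4  # assembly size quoted in the Assembly section
repeat_mb = 3.4     # approximate repeat content quoted above

# Fraction of the assembly occupied by repeats.
repeat_fraction = repeat_mb / assembly_mb
print(f"{repeat_fraction:.1%}")  # 12.4%, i.e. roughly the 12% quoted
```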




| Code | Number | Type | Class | Order | Superfamily |
|---|---|---|---|---|---|
| Copia | 1371 | Transposable Elements (TEs) | Class I | LTR retrotransposons | Copia |
| Copia_withOTHER | 106 | Transposable Elements (TEs) | Class I | LTR retrotransposons | Copia_withOTHER |
| DIRS | 14 | Transposable Elements (TEs) | Class I | LTR retrotransposons | DIRS |
| SINE | 13 | Transposable Elements (TEs) | Class I | Non-LTR retrotransposons | Putative SINE |
| TRIM_LARD | 27 | Transposable Elements (TEs) | Class I | LTR retrotransposons | TRIM_LARD |
| CoDi1 | 121 | Transposable Elements (TEs) | Class I | LTR retrotransposons | Ty1/Copia-like elements from Diatoms group 1 |
| CoDi2 | 411 | Transposable Elements (TEs) | Class I | LTR retrotransposons | Ty1/Copia-like elements from Diatoms group 2 |
| CoDi3 | 177 | Transposable Elements (TEs) | Class I | LTR retrotransposons | Ty1/Copia-like elements from Diatoms group 3 |
| CoDi4 | 437 | Transposable Elements (TEs) | Class I | LTR retrotransposons | Ty1/Copia-like elements from Diatoms group 4 |
| CoDi5 | 117 | Transposable Elements (TEs) | Class I | LTR retrotransposons | Ty1/Copia-like elements from Diatoms group 5 |
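
The element counts listed above can be tallied by order. This is a minimal sketch over the ten entries shown here only; the full table continues on further pages that are not reproduced, so the totals are partial and should not be read as genome-wide figures.

```python
from collections import defaultdict

# (code, number, order) for the entries listed above.
repeat_rows = [
    ("Copia", 1371, "LTR retrotransposons"),
    ("Copia_withOTHER", 106, "LTR retrotransposons"),
    ("DIRS", 14, "LTR retrotransposons"),
    ("SINE", 13, "Non-LTR retrotransposons"),
    ("TRIM_LARD", 27, "LTR retrotransposons"),
    ("CoDi1", 121, "LTR retrotransposons"),
    ("CoDi2", 411, "LTR retrotransposons"),
    ("CoDi3", 177, "LTR retrotransposons"),
    ("CoDi4", 437, "LTR retrotransposons"),
    ("CoDi5", 117, "LTR retrotransposons"),
]

# Sum the element counts per order.
totals_by_order = defaultdict(int)
for _code, number, order in repeat_rows:
    totals_by_order[order] += number

for order, total in sorted(totals_by_order.items()):
    print(f"{order}: {total}")
```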

small RNAs




lincRNAs



In a study of responses to P depletion, long intergenic non-protein-coding RNAs (lincRNAs) were defined as sequences with a length of ≥ 200 nucleotides and a predicted open reading frame (ORF) of ≤ 100 amino acids.
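
The two thresholds in this definition can be expressed as a simple filter. The sketch below is an illustration under stated assumptions, not the method used in the study: the ORF predictor used there is not specified here, so a naive scan for the longest ATG-to-stop ORF over all six reading frames stands in for it, and the intergenic criterion (no overlap with annotated genes) is not checked. All function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_orf_aa(seq):
    """Length (amino acids, stop excluded) of the longest ATG..stop ORF.

    Scans three frames on each strand; ORFs without a stop codon are ignored.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA input as well
    best = 0
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            start = None
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    best = max(best, (i - start) // 3)
                    start = None
    return best


def is_lincrna_candidate(seq, min_len=200, max_orf_aa=100):
    """Apply the length (>= 200 nt) and ORF (<= 100 aa) thresholds."""
    return len(seq) >= min_len and longest_orf_aa(seq) <= max_orf_aa


# Toy sequences: a 121-codon ORF, a short ORF in a long transcript, a short transcript.
coding_like = "ATG" + "GCT" * 120 + "TAA"
noncoding_like = "ATG" + "GCT" * 10 + "TAA" + "C" * 200
too_short = "ACGT" * 10

print(is_lincrna_candidate(coding_like))     # False: ORF of 121 aa
print(is_lincrna_candidate(noncoding_like))  # True: 236 nt, ORF of 11 aa
print(is_lincrna_candidate(too_short))       # False: only 40 nt
```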



Gene supporting information



The file below contains supporting information for the Phatr3 genes: corresponding IDs (Phatr2, NCBI...), KEGG, GO, domains, targeting predictions and evolutionary origins.



Genome browser with the telomere-to-telomere assembly



PhaeoEpiView is a browser that allows the visualization of epigenome data and transcripts on an updated and contiguous reference genome.