Mulberry Genome Sequencing

The Morus Genome Project is organized by the State Key Laboratory of Silkworm Genome Biology of China, Institute of Sericulture and Systems Biology, Southwest University. The project is also supported by the Beijing Genomics Institute (BGI) at Shenzhen, the Zhejiang Academy of Agricultural Sciences, the Chinese Academy of Forestry, and the Sericulture and Farm Produce Processing Research Institute of GAAS. It was initiated in January 2011 using a whole-genome shotgun (WGS) sequencing approach.

Sequencing was performed on Illumina HiSeq 2000 platforms. Twelve sequencing libraries were constructed, yielding a total of 78.34 billion high-quality bases (236-fold genome coverage). The reads were assembled with SOAPdenovo into a 330.79 Mb mulberry genome with a scaffold N50 of 390,115 bp and a contig N50 of 34,476 bp. Over 80% of the assembly is represented by 681 scaffolds, and 93.96% of bases are covered by more than 20 reads. In this assembly, we identified 127.98 Mb of repetitive sequences by combining de novo repeat prediction with homology-based searches. Gene prediction yielded 29,338 protein-coding genes, 223 miRNA genes, 560 tRNA genes, 81 rRNA genes, and 311 snRNA genes in the mulberry genome.
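The coverage and repeat figures quoted here follow from simple ratios, and a minimal Python sketch can make the arithmetic explicit. It assumes that fold coverage is the effective data divided by the final assembly length (330,791,087 bp, from Table 1); that assumption is not stated in the text, though it does reproduce the reported values. The function and constant names are illustrative only.

```python
# Assumption: coverage = effective bases / final assembly length.
# All numeric inputs are taken from the text and Table 1; names are hypothetical.

ASSEMBLY_BP = 330_791_087          # final assembly length (Table 1)
EFFECTIVE_MB = {
    "short-insert (170-800 bp)": 54_625.60,
    "mate-pair (2,000-20,000 bp)": 23_713.73,
    "all libraries": 78_339.33,
}
REPEAT_MB = 127.98                 # repetitive sequence identified
ASSEMBLY_MB = 330.79               # assembly size as quoted in the text


def fold_coverage(effective_mb: float, genome_bp: int) -> float:
    """Return sequencing depth as effective bases divided by genome length."""
    return effective_mb * 1e6 / genome_bp


def percent(part: float, whole: float) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


if __name__ == "__main__":
    for label, mb in EFFECTIVE_MB.items():
        print(f"{label}: {fold_coverage(mb, ASSEMBLY_BP):.2f}-fold")
    print(f"repeat content: {percent(REPEAT_MB, ASSEMBLY_MB):.1f}% of the assembly")
```

Under this assumption the repeat content works out to a little under 39% of the assembled sequence.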

Table 1. Global statistics of the M. notabilis (chuansang) genome sequencing and assembly.

Assembly step        Insert size (bp)  Read length (bp)  Raw data (Mb)  Effective data (Mb)  Sequence coverage (fold)  N50a (bp)  Total length (bp)
Contig and scaffold  170-800           100               76,884.40      54,625.60            165.14                    5,719      280,787,257
Scaffold             2,000-20,000      49                49,803.50      23,713.73            71.69                     394,221    332,102,025
Gap-closure          170-800           100               76,884.40      54,625.60            --                        --         --
Final result         --                --                126,687.90     78,339.33            236.82                    390,115    330,791,087

a N50 is the length such that sequences (contigs or scaffolds) of that length or longer contain half of the total sequence length.
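The N50 definition can be illustrated with a short Python sketch. This is a generic implementation of the usual definition, not the project's own tooling, and the example lengths are toy values rather than real mulberry scaffolds.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a set of sequence lengths.

    Sequences are sorted from longest to shortest and their lengths are
    accumulated; N50 is the length of the sequence at which the running
    total first reaches half of the overall length.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50() needs at least one sequence length")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if 2 * running >= total:   # integer form of running >= total / 2
            return length
    raise AssertionError("unreachable: running total always reaches total")


if __name__ == "__main__":
    toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]   # hypothetical lengths, total 32
    print(n50(toy_lengths))
```

In the toy example the two longest sequences (8 + 8) already account for half of the total length of 32, so the N50 is 8.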

All M. notabilis genome sequences have been deposited in the NCBI database under BioProject no. PRJNA202089 and GenBank accession no. ATGF00000000.1.

 

References:

He N., Zhang C., Qi X., et al. 2013. Draft Genome Sequence of the Mulberry Tree Morus notabilis. Nat Commun, 4:2445.