Distinguish virulent and temperate phage-derived sequences in metavirome data with a deep learning approach




INTRODUCTION        
DeePhage is designed to classify virome sequence fragments as temperate phage-derived or virulent phage-derived. The program calculates a score between 0 and 1 for each input fragment. A fragment with a score higher than 0.5 is regarded as virulent phage-derived, and a fragment with a score lower than 0.5 is regarded as temperate phage-derived. DeePhage can run either in a virtual machine or on a physical host. For users without a computing background, we recommend running the virtual machine version of DeePhage on a local PC, since it requires no installation of dependency packages. If a GPU is available, you can instead run the physical host version, which automatically uses the GPU for acceleration and is better suited to large-scale data.
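The scoring rule described above can be illustrated with a minimal Python sketch. This is not DeePhage's own code: the function name `classify_fragment`, the fragment names, and the example scores are hypothetical, and because the description does not say how a score of exactly 0.5 is treated, the sketch labels that case "uncertain" as an assumption.

```python
def classify_fragment(score: float, threshold: float = 0.5) -> str:
    """Map a score in [0, 1] to a phage lifestyle label (hypothetical helper)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must lie in [0, 1], got {score}")
    if score > threshold:
        return "virulent"
    if score < threshold:
        return "temperate"
    # The description does not cover a score of exactly 0.5;
    # returning "uncertain" is an assumption of this sketch.
    return "uncertain"


# Made-up scores for three hypothetical fragments.
scores = {"frag_1": 0.91, "frag_2": 0.12, "frag_3": 0.5}
labels = {name: classify_fragment(s) for name, s in scores.items()}
```

Keeping the threshold as a parameter makes it easy to apply a stricter cutoff in downstream analysis, although 0.5 is the only cutoff stated here.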

Please direct your questions or comments to wu-shufang@pku.edu.cn or hqzhu@pku.edu.cn

DOWNLOAD        
DATA        
All data used to train and test DeePhage, along with related results and scripts, are stored here.
CITATION        
Shufang Wu, Zhencheng Fang, Jie Tan, Mo Li, Chunhui Wang, Qian Guo, Congmin Xu, Xiaoqing Jiang and Huaiqiu Zhu. DeePhage: distinguish virulent and temperate phage-derived sequences in metavirome data with a deep learning approach.
