DRAGO pipeline

Unigene sequences from 33 plant species were translated into putative protein sequences using ESTScan version 3.0.2 with default parameters and the Arabidopsis thaliana codon usage/log-odds probability matrices. The resulting translations were then screened with the BLAST algorithm for sequence similarity to at least one resistance protein in the ‘reference’ dataset, using a stringent e-value cut-off of 1E-15. Domain analysis of the selected sequences was performed with InterProScan version 4.8, using standard options and the latest InterPro database release. Genes were assigned to five previously described classes according to their domains and gene structure. The resulting set of sequences was loaded into the PRG database. The accuracy of the Disease Resistance Analysis and Gene Orthology (DRAGO) predictor was evaluated by running the pipeline on the hand-curated dataset. The comparison showed a perfect match between the manual classification of the reference genes and the DRAGO predictions.
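The BLAST screening step can be illustrated with a minimal sketch. It is not the DRAGO implementation. It assumes that the BLAST results are available in the standard 12-column tabular format (query ID in the first column, e-value in the eleventh), that the cut-off is read as 1E-15, and that a hit exactly at the cut-off is retained. The function name and the sample identifiers are hypothetical.

```python
import csv
import io

# Assumed reading of the e-value cut-off given in the text.
EVALUE_CUTOFF = 1e-15


def queries_with_reference_hit(blast_tabular, cutoff=EVALUE_CUTOFF):
    """Return the IDs of query sequences with at least one hit at or below the cut-off.

    `blast_tabular` is any iterable of lines in BLAST 12-column tabular format.
    """
    kept = set()
    for row in csv.reader(blast_tabular, delimiter="\t"):
        # Skip blank lines and comment lines (present in commented tabular output).
        if not row or row[0].startswith("#"):
            continue
        query_id = row[0]
        evalue = float(row[10])  # column 11 holds the e-value
        if evalue <= cutoff:
            kept.add(query_id)
    return kept


# Hypothetical hits: only unigene_1 has a hit below the cut-off.
SAMPLE = (
    "unigene_1\tref_A\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-40\t150\n"
    "unigene_1\tref_B\t40.0\t90\t54\t2\t1\t90\t5\t94\t2e-05\t45\n"
    "unigene_2\tref_A\t35.0\t80\t52\t3\t1\t80\t10\t89\t3e-10\t60\n"
)

selected = queries_with_reference_hit(io.StringIO(SAMPLE))
print(sorted(selected))
```

Only the translations returned by such a filter would be passed on to the domain analysis step.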

