2010 | Doron Betel1, Anjali Koppal2, Phaedra Agius1, Chris Sander1, Christina Leslie1*
The paper introduces miRanda-mirSVR, a new machine learning method for ranking microRNA target sites based on their down-regulation scores. The algorithm trains a regression model using sequence and contextual features extracted from miRanda-predicted target sites. In a large-scale evaluation, miRanda-mirSVR performs competitively with other target prediction methods in identifying target genes and predicting the extent of their downregulation at the mRNA or protein levels. Importantly, the method identifies a significant number of experimentally determined non-canonical and non-conserved sites. The mirSVR scoring model is calibrated to correlate linearly with the extent of downregulation, enabling accurate scoring of genes with multiple target sites. The model also correctly identifies genes regulated by multiple endogenous microRNAs and revisits the concept of seed hierarchy, showing that different seed types have broad and overlapping ranges of efficiencies. The inclusion of non-canonical sites in the model is evaluated using biochemically determined sites from PAR-CLIP experiments, demonstrating that miRanda-mirSVR correctly identifies a significant number of these sites. The results suggest that mirSVR provides a more comprehensive and accurate approach to microRNA target prediction, incorporating a wide range of features and avoiding the limitations of traditional methods that rely solely on seed complementarity and conservation.
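
Because the site scores are described as scaling linearly with the extent of downregulation, one plausible way to score a gene with several target sites is to add up its site-level scores. The sketch below illustrates that idea only; it is not the authors' implementation. The gene names, microRNA names, score values, and the helper functions `gene_scores` and `rank_targets` are all hypothetical, and the sign convention (more negative means stronger predicted downregulation) is an assumption made for the example.

```python
from collections import defaultdict

# Hypothetical site-level predictions: (gene, microRNA, site score).
# Assumed convention: more negative = stronger predicted downregulation.
site_predictions = [
    ("GENE_A", "miR-1", -0.5),
    ("GENE_A", "miR-1", -0.25),
    ("GENE_B", "miR-1", -0.625),
    ("GENE_C", "miR-1", -0.125),
    ("GENE_C", "miR-2", -0.5),
]


def gene_scores(predictions):
    """Sum site-level scores for each (gene, microRNA) pair."""
    totals = defaultdict(float)
    for gene, mirna, score in predictions:
        totals[(gene, mirna)] += score
    return dict(totals)


def rank_targets(predictions, mirna):
    """Return (gene, summed score) pairs for one microRNA, strongest first."""
    scores = gene_scores(predictions)
    hits = [(gene, total) for (gene, m), total in scores.items() if m == mirna]
    return sorted(hits, key=lambda pair: pair[1])


ranked = rank_targets(site_predictions, "miR-1")
for gene, total in ranked:
    print(f"{gene}\t{total:.3f}")
```

In this toy data, GENE_A has two weaker sites whose summed score ranks it above GENE_B, which has a single stronger site. An additive gene score of this kind is only meaningful when site scores are on a linear scale, which is the property the summary attributes to the mirSVR calibration.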