 
 
About siDRM

    siDRM is an implementation of the DRM rule sets for selecting effective siRNAs. These rule sets were obtained in three steps: (1) A survey of significant features was conducted on the large and diverse dataset compiled from siRecords, covering all features known to be implicated in siRNA efficacy. The survey produced a list of features associated with a significant upward shift of the siRNA efficacy distribution and/or a significantly higher chance of achieving high siRNA efficacy. (2) A feature combination analysis was performed to exploit the positive cooperativity among the significant features, that is, their joint effect in boosting siRNA efficacy. This yielded a list of feature combinations (also called rules) that lead to substantial improvement in efficacy in the training dataset. (3) The DRM (disjunctive rule merging) algorithm was applied to merge and reorganize these rules, resulting in a bundle of rule sets (disjunctions of rules) ordered by decreasing stringency.

 
 
 
References

   • Ren, Y., Gong, W., Xu, Q., Zheng, X., Lin, D., Wang, Y. and Li, T. (2006) siRecords: an extensive database of mammalian siRNAs with efficacy ratings. Bioinformatics, 22, 1027-1028.
   • Gong, W., Ren, Y., Xu, Q., Wang, Y., Lin, D., Zhou, H. and Li, T. (2006) Integrated siRNA design based on surveying of features associated with high RNAi effectiveness. BMC Bioinformatics, 7, 516.
   • Gong, W., Ren, Y., Zhou, H., Wang, Y., Kang, S. and Li, T. (2008) siDRM: an effective and generally applicable online siRNA design tool. Bioinformatics, 24, 2405-2406.
   • Ren, Y., Gong, W., Zhou, H., Wang, Y., Xiao, F. and Li, T. (2009) siRecords: a database of mammalian RNAi experiments and efficacies. Nucleic Acids Res, 37, D146-149.

 

Help

Significant features. The survey of features leading to significant improvement in siRNA efficacy was performed on a dataset of 3669 siRNA experiments (Set A). A total of 276 binary features (including all features known to be implicated in siRNA efficacy) were examined. For each feature, a Wald test of monotone trend and two permutation tests of odds ratios (one for achieving >70% efficacy, the other for achieving >90% efficacy) were performed. A feature was regarded as significant if the P-value of the Wald test was less than 0.01 and, at the same time, at least one of the two permutation-test P-values was less than 0.01. To improve confidence in the significant features, 200 bootstrapped datasets were generated from Set A and the survey was repeated on each of them. A feature was finally deemed significant only if it was found significant in at least 95% of the bootstrapped datasets. The non-redundant list of significant features found in this survey is shown in the table below:

Feature Index    Feature Name

F1               1st nucleotide = G
F2               6th nucleotide = T
F3               9th nucleotide = C
F4               17th nucleotide = A
F5               18th nucleotide ≠ C
F6               19th nucleotide = A
F7               At least three (A/U)s in the seven nucleotides at the 3' end
F8               No occurrences of four or more identical nucleotides in a row
F9               No occurrences of G/C stretches of length 7 or longer
F10              G/C content is between 35 and 60%
F11              Binding energy of N16-N19 > -9 kcal/mol
F12              Local folding potential (mean) ≥ -22.31 kcal/mol
F13              Target site is not on the 5'UTR

Standard rulesets. The standard rule sets were obtained by investigating combinations of the 13 significant features (F1 - F13), followed by DRM analysis (merging and organizing the rules).
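
To make the structure concrete: a rule is a conjunction of features, a rule set is a disjunction of rules, and the rule sets are tried in order of decreasing stringency. The sketch below shows that logic only. The two rule sets in it are invented for illustration and are not the actual DRM rule sets, and all function and variable names are hypothetical.

```python
def rule_matches(rule, features):
    """A rule (set of feature names) matches when every feature is carried."""
    return all(features.get(name, False) for name in rule)


def ruleset_matches(ruleset, features):
    """A rule set (list of rules) matches when at least one rule matches."""
    return any(rule_matches(rule, features) for rule in ruleset)


def most_stringent_match(bundle, features):
    """Return the name of the first matching rule set, or None.

    `bundle` is a list of (name, ruleset) pairs ordered from the most
    stringent rule set to the least stringent one.
    """
    for name, ruleset in bundle:
        if ruleset_matches(ruleset, features):
            return name
    return None


# Hypothetical rule sets, NOT the published ones.
DEMO_BUNDLE = [
    ("strict", [{"F1", "F6", "F8", "F10"}, {"F3", "F4", "F7", "F9"}]),
    ("loose", [{"F1", "F8"}, {"F7", "F9", "F10"}]),
]

candidate = {"F1": True, "F8": True, "F10": False}
selected_by = most_stringent_match(DEMO_BUNDLE, candidate)
```

Here the candidate fails both rules of the hypothetical "strict" set but satisfies the first rule of the "loose" set, so selected_by is "loose".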

RS2:

[Table RS2: Rules 1-7 (rows) against features F1-F13 (columns). The marks showing which features each rule requires are not preserved.]

High sensitivity rulesets. Rule-based siRNA design tools are often prone to low sensitivity, i.e., they tend to produce only a limited number of candidate siRNAs for a given gene. The DRM rule sets are no exception. One strategy for countering low sensitivity is to organize the rules into disjunctive sets and to select siRNAs with the rule sets rather than with individual rules, since the sensitivity of a disjunctive rule set is approximately the sum of the sensitivities of the rules it contains. This strategy is already built into the DRM methodology. A second strategy is to restrict the occurrence of features with low carrying rates. Among the significant features identified in the survey are 5 direct sequence features, F1, F2, F3, F4 and F6, each of which has a distinctly low expected carrying rate of 25%. This is because each of the 4 nucleotides A, U, G and C can appear at any position of a 19-mer sequence with roughly equal probability. Thus, the feature F1 (1st nucleotide = G) is expected to be carried by only 25% of all 19-mer sequences, and the same reasoning applies to the other four features. We repeated the feature combination analysis and rule merging procedure with the additional restriction that no more than 3 of the 5 low-carrying-rate features may occur in any rule. As expected, the resulting rule sets show substantially improved sensitivity. These rule sets are denoted RS_HS1 through RS_HS4 (where HS stands for "high sensitivity").
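
The arithmetic behind both strategies can be checked with a small calculation. The sketch below is illustrative only: it assumes that the four nucleotides are equally likely and independent across positions, it ignores every feature other than the five positional ones, and the example rules and all names are hypothetical. Under these assumptions a rule with k of the five positional features is carried by a fraction 0.25^k of all 19-mers, and the carrying rate of a disjunctive rule set follows from inclusion-exclusion.

```python
from fractions import Fraction
from itertools import combinations

LOW_RATE_FEATURES = {"F1", "F2", "F3", "F4", "F6"}  # each carried by ~25%
RATE = Fraction(1, 4)


def rule_rate(rule):
    """Expected carrying rate of one rule, counting positional features only."""
    return RATE ** len(set(rule) & LOW_RATE_FEATURES)


def ruleset_rate(rules):
    """Exact carrying rate of a disjunction of rules (inclusion-exclusion)."""
    total = Fraction(0)
    for k in range(1, len(rules) + 1):
        sign = 1 if k % 2 == 1 else -1
        for subset in combinations(rules, k):
            joint = set().union(*subset) & LOW_RATE_FEATURES
            total += sign * RATE ** len(joint)
    return total


def within_hs_restriction(rule, limit=3):
    """True if the rule uses at most `limit` of the low-carrying-rate features."""
    return len(set(rule) & LOW_RATE_FEATURES) <= limit


# Hypothetical rules, NOT the published ones.
demo_rules = [{"F1", "F2"}, {"F3", "F4"}, {"F1", "F6"}]
sum_of_rates = sum(rule_rate(r) for r in demo_rules)  # 3/16 = 0.1875
exact_rate = ruleset_rate(demo_rules)                 # 169/1024, about 0.165
```

For these three hypothetical rules the simple sum (3/16) slightly overestimates the exact rate (169/1024) because the rules overlap, which is why the sum is only an approximation. A rule using four of the five positional features would be carried by at most 1/256 of all 19-mers, which motivates the limit of three.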

RS_HS1:

[Table RS_HS1: Rules 1-3 (rows) against features F1-F13 (columns). The marks showing which features each rule requires are not preserved.]

RS_HS4:

[Table RS_HS4: Rules 1-9 (rows) against features F1-F13 (columns). The marks showing which features each rule requires are not preserved.]

Fast rulesets. Different features take different amounts of time to compute. For all features except F12 (Local folding potential (mean) ≥ -22.31 kcal/mol), the feature values can be calculated almost instantly, but computing F12 takes much longer because of the secondary structure calculation (using MFold) involved. For an mRNA of average length, the F12 calculation typically takes about 1-2 hours. To obtain rule sets that can be evaluated more rapidly, we repeated the feature combination and rule merging analysis with F12 excluded from the list of features used. The result is a bundle of "fast rule sets", denoted RS_Fast1 through RS_Fast4.

RS_Fast2:

[Table RS_Fast2: Rules 1-3 (rows) against features F1-F13 (columns). The marks showing which features each rule requires are not preserved.]

RS_Fast4:

[Table RS_Fast4: Rules 1-7 (rows) against features F1-F13 (columns). The marks showing which features each rule requires are not preserved.]

Filter for innate immune responses. It has been reported that siRNA duplexes can activate innate immune responses by interacting with certain toll-like receptors on the cell surface or in the endosomes (Hornung, et al., 2005; Judge, et al., 2005). Triggering these responses requires the presence of specific motifs, 5'-UGUGU-3' or 5'-GUCCUUCAA-3', in the guide strand of the siRNA duplex. The effect is found in only a small proportion of gene silencing experiments, and transcribed shRNAs are believed not to be susceptible to these responses. A filter is put in place to avoid the occurrence of these two motifs in the selected siRNAs.

Filter for cell toxic effects. In a recent study, the presence of the motif 5'-UGGC-3' in the siRNA duplex was found to lead to strong cell toxicity (Fedorov, et al., 2006). A filter is put in place in the siDRM design tool to avoid the presence of this motif in the selected siRNAs.
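
The two motif filters described above amount to simple substring checks. The following is a minimal sketch, not siDRM's own code. It assumes that the candidate is given as the 19-nt sense (target-identical) sequence, that the guide strand is its reverse complement, and that "in the siRNA duplex" means on either strand; the function names are hypothetical.

```python
IMMUNE_MOTIFS = ("UGUGU", "GUCCUUCAA")  # checked on the guide strand
TOXIC_MOTIF = "UGGC"                    # checked on both strands (assumption)

_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def to_rna(seq):
    """Upper-case the sequence and write it in the RNA alphabet."""
    return seq.upper().replace("T", "U")


def reverse_complement(rna):
    """Reverse complement of an RNA sequence written 5'->3'."""
    return rna.translate(_COMPLEMENT)[::-1]


def passes_motif_filters(sense):
    """True if neither the immune motifs nor the toxicity motif occur."""
    sense = to_rna(sense)
    guide = reverse_complement(sense)
    if any(motif in guide for motif in IMMUNE_MOTIFS):
        return False
    if TOXIC_MOTIF in sense or TOXIC_MOTIF in guide:
        return False
    return True


ok = passes_motif_filters("GAAGGAAGGAAGGAAGGAA")        # no motif present
immune_hit = passes_motif_filters("GGACACAGGAAGGAAGGAA")  # guide has UGUGU
toxic_hit = passes_motif_filters("GAAUGGCGAAGGAAGGAAG")   # sense has UGGC
```

In this example ok is True, while immune_hit and toxic_hit are both False.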

Filter for off-target activities. Substantial off-target inhibition can take place when an siRNA and a non-targeted gene match exactly at all but the last two positions, which are tolerant to mismatches (Birmingham, et al., 2006; Dahlgren, et al., 2008). Off-target inhibition can also be induced when matches at the seed region (positions 2-7 and 2-8) are followed by several mismatched nucleotides (Birmingham, et al., 2006; Jackson, et al., 2006). For each candidate siRNA, siDRM checks and reports whether (i) its sequence has full homology to the whole transcript (5'UTR or CDS or 3'UTR), (ii) its 17-mer subsequence (all but the last two positions) has full homology to the 3'UTR region of another transcript, (iii) its seed region (positions 2-8) has full homology to the 3'UTR region of another transcript, or (iv) its seed region (positions 2-8) has full homology to the 3'UTR region of another transcript, and this homology region is followed by four consecutive mismatches.
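
Checks (ii) and (iii) can be sketched as exact substring searches against the 3'UTRs of other transcripts. This is a rough illustration under several assumptions that the text does not confirm: positions are counted from the 5' end of the guide strand, "full homology" means that the 3'UTR contains the exact site complementary to the guide subsequence, and all sequences are written 5' to 3' in the RNA alphabet. Checks (i) and (iv) are omitted, and every name in the code is hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Reverse complement of an RNA sequence written 5'->3'."""
    return rna.translate(_COMPLEMENT)[::-1]


def off_target_hits(guide, utrs):
    """Report transcripts whose 3'UTR contains a seed or 17-mer site.

    guide: 19-nt guide strand (5'->3', RNA alphabet).
    utrs:  dict mapping transcript id -> 3'UTR sequence of a non-targeted
           transcript (5'->3', RNA alphabet).
    """
    seed_site = reverse_complement(guide[1:8])  # guide positions 2-8
    site_17 = reverse_complement(guide[:17])    # all but the last two
    hits = {"seed": [], "17mer": []}
    for tx_id, utr in sorted(utrs.items()):
        if seed_site in utr:
            hits["seed"].append(tx_id)
        if site_17 in utr:
            hits["17mer"].append(tx_id)
    return hits


demo_guide = "UUCCUUCCUUCCUUCCUUC"
demo_utrs = {
    "tx1": "CCCC" + "GGAAGGA" + "CCCC",          # seed site only
    "tx2": "CC" + "AGGAAGGAAGGAAGGAA" + "CC",    # full 17-mer site
    "tx3": "CCCCCCCCCC",                         # no site
}
report = off_target_hits(demo_guide, demo_utrs)
```

With these made-up sequences, report lists tx1 and tx2 as seed hits and tx2 as the only 17-mer hit. A practical tool would search an indexed transcript database rather than scanning strings one by one.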

Updated Analysis

Additional analysis of the updated DRM rule sets conducted in March 2007 can be found here.

References

Birmingham, A., Anderson, E.M., Reynolds, A., Ilsley-Tyree, D., Leake, D., Fedorov, Y., Baskerville, S., Maksimova, E., Robinson, K., Karpilow, J., Marshall, W.S. and Khvorova, A. (2006) 3' UTR seed matches, but not overall identity, are associated with RNAi off-targets. Nat Methods, 3, 199-204.

Dahlgren, C., Zhang, H.Y., Du, Q., Grahn, M., Norstedt, G., Wahlestedt, C. and Liang, Z. (2008) Analysis of siRNA specificity on targets with double-nucleotide mismatches. Nucleic Acids Res.

Fedorov, Y., Anderson, E.M., Birmingham, A., Reynolds, A., Karpilow, J., Robinson, K., Leake, D., Marshall, W.S. and Khvorova, A. (2006) Off-target effects by siRNA can induce toxic phenotype. RNA, 12, 1188-1196.

Holen, T., Moe, S.E., Sorbo, J.G., Meza, T.J., Ottersen, O.P. and Klungland, A. (2005) Tolerated wobble mutations in siRNAs decrease specificity, but can enhance activity in vivo. Nucleic Acids Res, 33, 4704-4710.

Hornung, V., Guenthner-Biller, M., Bourquin, C., Ablasser, A., Schlee, M., Uematsu, S., Noronha, A., Manoharan, M., Akira, S., de Fougerolles, A., Endres, S. and Hartmann, G. (2005) Sequence-specific potent induction of IFN-alpha by short interfering RNA in plasmacytoid dendritic cells through TLR7. Nat Med, 11, 263-270.

Jackson, A.L., Bartz, S.R., Schelter, J., Kobayashi, S.V., Burchard, J., Mao, M., Li, B., Cavet, G. and Linsley, P.S. (2003) Expression profiling reveals off-target gene regulation by RNAi. Nat Biotechnol, 21, 635-637.

Jackson, A.L., Burchard, J., Schelter, J., Chau, B.N., Cleary, M., Lim, L. and Linsley, P.S. (2006) Widespread siRNA "off-target" transcript silencing mediated by seed region sequence complementarity. RNA, 12, 1179-1187.

Judge, A.D., Sood, V., Shaw, J.R., Fang, D., McClintock, K. and MacLachlan, I. (2005) Sequence-dependent stimulation of the mammalian innate immune response by synthetic siRNA. Nat Biotechnol, 23, 457-462.

Lin, X., Ruan, X., Anderson, M.G., McDowell, J.A., Kroeger, P.E., Fesik, S.W. and Shen, Y. (2005) siRNA-mediated off-target gene silencing triggered by a 7 nt complementation. Nucleic Acids Res, 33, 4527-4535.

Please address questions and suggestions to The siDRM Team.
Copyright 2006-13 Biolead.org. All rights reserved.