Microarray Probe Design

Probe Design Algorithm

Our proprietary probe design software searches for specific probes at the genomic level. Each probe candidate is compared to all other expressed sequences from the same organism, and the thermodynamic parameters (free energy and melting temperature) are computed for every possible hybridization between the probe and perfectly or imperfectly complementary sequences. If all of these values fall below a predetermined threshold, the probe is considered specific to its target. Probes that can fold into stable secondary structures, which may interfere with hybridization, are excluded. Probes with low sequence complexity or long stretches of the same base are also rejected.
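The last two rejection criteria can be illustrated with a minimal sketch. It is not the proprietary software described above: the use of Shannon entropy as the complexity measure, the function names, and the cutoffs max_run and min_entropy are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def longest_homopolymer(seq: str) -> int:
    """Return the length of the longest run of one repeated base."""
    longest = run = 0
    previous = ""
    for base in seq:
        run = run + 1 if base == previous else 1
        longest = max(longest, run)
        previous = base
    return longest


def shannon_entropy(seq: str) -> float:
    """Base-composition entropy in bits (0 for a single base, 2 for equal A/C/G/T)."""
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def passes_sequence_filters(seq: str, max_run: int = 5, min_entropy: float = 1.5) -> bool:
    """Reject candidates with long single-base stretches or low complexity.

    max_run and min_entropy are hypothetical cutoffs, not values from the source.
    """
    return longest_homopolymer(seq) <= max_run and shannon_entropy(seq) >= min_entropy
```

A candidate such as "A" repeated 45 times fails both checks, while a balanced sequence with no long runs passes. A real pipeline would apply these inexpensive filters before the costlier thermodynamic comparison.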

Catalog Microarray Probe Design

In terms of sensitivity and specificity, the optimal size for an oligonucleotide synthesized directly on a microarray and used for gene expression analysis is between 50 and 60 bases (Kane 2000, Hughes 2001). However, the 10 nucleotides closest to the chip's surface are unlikely to be involved in hybridization because of steric interference (Shchepinov 1997). There is therefore no reason to consider these nucleotides during the design process, and especially during the specificity computation, as long as a spacer sequence of sufficient length is inserted between the target sequence and the chip surface during fabrication.

Accordingly, we design probes that are 45 to 47 nucleotides long. Allowing a range of probe lengths lets our probe design algorithm fit a narrower melting temperature (Tm) range and thereby achieve better hybridization uniformity. Our probes are synthesized on top of a spacer arm that holds them away from the glass substrate. The spacer has a length equivalent to a 15-mer sequence, which is longer than the 10 nucleotides recommended by Shchepinov et al.
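The benefit of a variable probe length can be shown with a short sketch. It assumes a crude GC-content Tm approximation (Tm = 64.9 + 41 x (G+C - 16.4) / N) as a stand-in for the thermodynamic calculation used by the actual software. The function names and the target Tm are hypothetical.

```python
def approx_tm(seq: str) -> float:
    """Rough GC-content Tm estimate in degrees C (assumed stand-in, not the source's model)."""
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)


def best_length_probe(region: str, start: int, target_tm: float,
                      lengths=(45, 46, 47)):
    """From one start position, pick the 45- to 47-mer whose Tm is closest to target_tm.

    Returns None when no allowed length fits inside the region.
    """
    candidates = [region[start:start + n] for n in lengths
                  if start + n <= len(region)]
    if not candidates:
        return None
    return min(candidates, key=lambda probe: abs(approx_tm(probe) - target_tm))
```

Extending or trimming a probe by one or two bases shifts its estimated Tm slightly, so choosing the best of the three lengths at each position pulls the probe set toward a common Tm.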

Eukaryotic messenger RNAs are polyadenylated, and because this feature is often used to anchor reverse transcription during labeling, we have limited the search space for probes to the last 1500 nucleotides of the input sequences. This limit avoids picking probes in a region that might not be reverse transcribed under suboptimal experimental conditions. The input sequence is searched in a 3' to 5' direction to give preference to potential probe sequences located as close as possible to the messenger's 3' end. For archaea and bacteria, whose mRNAs are not polyadenylated, reverse transcription is usually primed with short random primers. This leads to a better representation of the 5' end of the mRNAs in the cDNA population. The input sequence is therefore searched in a 5' to 3' direction to preferentially pick probes close to the messenger's 5' end.
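The two search strategies can be sketched as a generator of candidate start positions. The function name and parameters are hypothetical. The sketch also assumes that the 1500-nucleotide window applies only to polyadenylated transcripts, since the text states no window for archaea and bacteria.

```python
def candidate_starts(seq_len: int, probe_len: int = 45,
                     polyadenylated: bool = True, window: int = 1500):
    """Yield 0-based probe start positions in order of preference.

    Polyadenylated (eukaryotic) input: scan 3' to 5', keeping the whole probe
    inside the last `window` nucleotides. Otherwise: scan 5' to 3' over the
    full sequence (no window is an assumption of this sketch).
    """
    last_start = seq_len - probe_len
    if last_start < 0:
        return  # sequence is shorter than one probe
    if polyadenylated:
        first_start = max(0, seq_len - window)
        # Start nearest the 3' end and walk toward the 5' end.
        yield from range(last_start, first_start - 1, -1)
    else:
        # Start at the 5' end and walk toward the 3' end.
        yield from range(0, last_start + 1)
```

A design loop would take positions from this generator in order and stop at the first candidate that passes the specificity and structure checks, which favors probes near the preferred end of the transcript.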

References:
Hughes, T.R., Mao, M., Jones, A.R., Burchard, J., Marton, M.J., Shannon, K.W., Lefkowitz, S.M., Ziman, M., Schelter, J.M., Meyer, M.R. et al. (2001) Expression profiling using microarrays fabricated by an ink-jet oligonucleotide synthesizer. Nat Biotechnol, 19, 342-347.
Kane, M.D., Jatkoe, T.A., Stumpf, C.R., Lu, J., Thomas, J.D. and Madore, S.J. (2000) Assessment of the sensitivity and specificity of oligonucleotide (50mer) microarrays. Nucleic Acids Res, 28, 4552-4557.
Shchepinov, M.S., Case-Green, S.C. and Southern, E.M. (1997) Steric factors influencing hybridisation of nucleic acids to oligonucleotide arrays. Nucleic Acids Res, 25, 1155-1161.