Capillary lengths of 36 cm, 50 cm, and 80 cm on the 3100 Genetic Analyzer allow flexible run times and read lengths. The system has 16 capillaries, and the run time can be chosen to suit the insert size. Typically, a rapid run reads tag sequences to 550 bp with 98.5% accuracy. For long-read runs on inserts of 1 kb or larger, we recommend the 80-cm array. (See Table 1 for throughput.)
Under standard run conditions, the 50-cm array on the ABI PRISM 3700 DNA Analyzer provides long reads with POP-5 polymer in a 3-hour run. Three 24-hour running periods will therefore yield almost all the sequence information required to profile one tissue type (see Table 1).
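The run count implied by these figures can be worked out with a short calculation. The sketch below is illustrative only: the 3-hour run time and the three 24-hour periods come from the text, while back-to-back runs with no changeover time and the function name `runs_in_period` are assumptions.

```python
# Back-of-the-envelope run count for the figures quoted in the text.
# Assumption: runs are started back to back, with no changeover time.

def runs_in_period(period_hours: float, run_hours: float) -> int:
    """Return the number of complete runs that fit into one running period."""
    if run_hours <= 0:
        raise ValueError("run_hours must be positive")
    return int(period_hours // run_hours)

RUN_HOURS = 3      # from the text: 3-hour run with POP-5 polymer
PERIOD_HOURS = 24  # from the text: one 24-hour running period
PERIODS = 3        # from the text: three running periods

runs_per_period = runs_in_period(PERIOD_HOURS, RUN_HOURS)
total_runs = runs_per_period * PERIODS

print(f"{runs_per_period} runs per period, {total_runs} runs in total")
```

Multiplying the total number of runs by the number of capillaries per run, and by the number of tags obtained per read, gives the tag yield; those values should be taken from Table 1 rather than from this sketch.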
When sequencing and basecalling are complete, the investigator needs to assess the presence and frequency of ditags. Various software tools are available to analyze the ditag sequences and determine transcript abundance. Once all sequence files have been processed, it is possible to view tag abundances, match tags to reference sequences, and compare tag abundances across multiple projects. Analysis generally involves matching the tag information against the GenBank-UniGene5 database and generating a reference list. Project tags are then compared with this list to identify matches to known genes and other sequences.
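The tag-counting and matching steps described above can be sketched in a few lines. This is a minimal illustration, not the algorithm of any particular SAGE software package. It assumes that ditags in a read are punctuated by the CATG anchoring-enzyme site, that tags are 10 bases long, and that ditags are 20 to 26 bases long. The function names and the `reference` dictionary are hypothetical, and duplicate-ditag filtering and quality checks are omitted.

```python
from collections import Counter

ANCHOR = "CATG"                 # assumed anchoring-enzyme recognition site
TAG_LEN = 10                    # assumed tag length in bases
MIN_DITAG, MAX_DITAG = 20, 26   # assumed range of plausible ditag lengths

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_tags(read: str):
    """Yield the two tags of every well-formed ditag in one basecalled read."""
    pieces = read.upper().split(ANCHOR)
    # The first and last pieces are flanking sequence, not bounded by an
    # anchor site on both sides, so they are skipped.
    for ditag in pieces[1:-1]:
        if not MIN_DITAG <= len(ditag) <= MAX_DITAG:
            continue  # too short or too long to be a ditag
        if set(ditag) - set("ACGT"):
            continue  # contains ambiguous basecalls such as N
        yield ditag[:TAG_LEN]                        # tag read left to right
        yield reverse_complement(ditag[-TAG_LEN:])   # tag from the other strand


def count_tags(reads) -> Counter:
    """Tally tag abundances over all reads in a project."""
    counts = Counter()
    for read in reads:
        counts.update(extract_tags(read))
    return counts


def annotate(counts: Counter, reference: dict) -> dict:
    """Pair each tag count with its entry in a tag-to-gene reference list."""
    return {tag: (n, reference.get(tag, "no match")) for tag, n in counts.items()}


# Toy input, invented for illustration only.
toy_read = "GGCATG" + "AAAAAAAAAACCGGGGGGGGGG" + "CATG" + "ACGTACGTACTTTTTTTTTT" + "CATGTT"
toy_reference = {"AAAAAAAAAA": "hypothetical gene A"}
toy_counts = count_tags([toy_read])
for tag, (n, name) in sorted(annotate(toy_counts, toy_reference).items()):
    print(tag, n, name)
```

To compare projects, the same counting can be run per project and the resulting counts normalized to the total number of tags in each project before comparison.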
The National Institutes of Health (NIH) has established SAGEmap, a public repository and resource for SAGE data.6 Originally developed by researchers at the National Center for Biotechnology Information (NCBI) in Bethesda, MD, SAGEmap was designed to archive SAGE data produced through the Cancer Genome Anatomy Project (CGAP). However, it also accepts submissions of SAGE sequence data from any source. The National Cancer Institute (NCI) has chosen to use SAGE exclusively for the Human Tumor Gene Index (hTGI) initiative,