The SNPMaP package has been designed to handle the processing of SNPMaP data from the CEL files generated by the Affymetrix GeneChip Command Console (AGCC) or GeneChip Operating Software (GCOS), through to the RAS (Relative Allele Scores: the pooling equivalent of a relative allele frequency) used in most analyses. This can be as simple as typing
ras <- snpmap()
at the R prompt. The package will identify and read in the CEL files from the current directory, extract the relevant probe intensities and calculate a mean RAS for each SNP on each chip, returning a SNPMaP S4 object containing the scores.

Plotting functions such as plot() and boxplot() help with quality control.
Given the amount of data generated by current SNP arrays, even with the relatively modest numbers of arrays (tens) typical of SNPMaP experiments, we have provided the option of a memory-mapping approach (using the R.huge package), which allows analysis to be done on a PC with 2 GB of memory (at the cost of some speed). If memory limits are exceeded in the course of the analysis, SNPMaP attempts to switch automatically from storing objects in memory to storing them on disk.
S4 methods for generic R functions such as summary(), plot() and boxplot() make it easy to query the SNPMaP object and visualize the data it contains. Accessors provide convenient access to the data. All functions are documented through the R help system. For example, typing ?snpmap will bring up a page describing the snpmap() function and its usage. Similarly, package?SNPMaP and class?SNPMaP will bring up help pages for the SNPMaP package and the SNPMaP class respectively.
Although the SNPMaP object is intended to be usable for further analyses, the data can also easily be extracted to a matrix using
rasData <- as.matrix(ras)
A user who wants CEL files transformed into a spreadsheet of RAS in the simplest possible way need not use R interactively at all; example scripts that can be invoked from various shells are available, including a point-and-click front end for Windows. These steps comprise the simplest route from CEL files to the RAS used for association analysis.
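As a rough illustration of the matrix-to-spreadsheet step, the following base R sketch writes a matrix of scores to a CSV file that a spreadsheet program can open. It is not one of the distributed example scripts. The matrix is a made-up stand-in for the result of as.matrix(ras), and its layout (SNPs in rows, arrays in columns) and all names are assumptions.

```r
# Stand-in for as.matrix(ras). The values, the SNP and array names, and the
# layout (rows = SNPs, columns = arrays) are assumptions for illustration.
rasData <- matrix(c(0.52, 0.48,
                    0.91, 0.89,
                    0.10, 0.14),
                  ncol = 2, byrow = TRUE,
                  dimnames = list(c("SNP_1", "SNP_2", "SNP_3"),
                                  c("array1", "array2")))

# Write the scores to a CSV file in the session's temporary directory.
outFile <- file.path(tempdir(), "ras.csv")
write.csv(rasData, file = outFile)

# Read the file back in, using the first column as row names.
check <- as.matrix(read.csv(outFile, row.names = 1))
```

In real use, outFile would point at a location of the user's choosing rather than a temporary directory.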
Advanced SNPMaP
On the other hand, a user who wants to examine all steps of the analysis and experiment with new methods has access to the data in a straightforward and convenient form. This flexibility is one of the major strengths of an implementation in R, given the impressive array of cutting-edge statistical techniques already implemented in the R environment.

The image() method uses the signal intensities from a "raw" format SNPMaP object to draw a pseudoimage of the array. This is useful for checking the surface for artifacts, such as the bubble here. A fastRender option allows rapid initial screening of all the arrays in the study, and the image can be coloured using custom palettes.
A more involved approach might begin by extracting the raw probe intensities from the CEL files (running the workflow function cel2raw rather than the default cel2rasS):
raw <- snpmap(lowMemory = FALSE, RUN = 'cel2raw')
This allows the user to plot the raw probe intensities and generate pseudoimages of the processed chips using the image() method, making it possible to check for scanning artifacts such as dust or fingerprints. The raw intensities can be further processed to RAS (this time one RAS per probe quartet, rather than the mean RAS per SNP on each array calculated before) by a workflow function:
ras <- raw2ras(raw)
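The difference between quartet-level scores and a single summary score can be sketched in base R. This is not SNPMaP's internal code. It assumes the usual pooling definition RAS = A / (A + B), where A and B are the intensities for the two alleles. The intensities and the helper name quartet_ras are made up for illustration.

```r
# Hypothetical perfect-match intensities for one SNP on one array:
# five probe quartets, each with a value for allele A and for allele B.
pmA <- c(1200, 1500, 900, 1100, 1300)
pmB <- c(800, 500, 600, 900, 700)

# Assumed definition: RAS = A / (A + B), computed once per quartet.
# quartet_ras is a hypothetical helper, not a SNPMaP function.
quartet_ras <- function(a, b) a / (a + b)

# One score per probe quartet.
rasPerQuartet <- quartet_ras(pmA, pmB)

# Averaging the quartet-level scores gives one summary RAS for the SNP.
meanRas <- mean(rasPerQuartet)
```

Keeping the quartet-level scores makes it possible to inspect the spread within a SNP before collapsing it to a mean.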
Other options available at the snpmap() or workflow stage include normalize, which quantile-normalizes the raw probe intensities across chips, log.intensities, which causes SNPMaP to use the natural logarithm of the probe intensities, and useMM, which causes SNPMaP to subtract mismatch probe intensities (where available) before calculating RAS.
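To make these three options concrete, here is a base R sketch of the idea behind each one. It is a generic illustration, not SNPMaP's implementation. The intensities are made up, quantile_normalize is a hypothetical helper, the score is assumed to be RAS = A / (A + B), and the three steps are shown independently because the order in which SNPMaP applies them is not described here.

```r
# Made-up raw intensities: rows are probes, columns are arrays (chips).
intensities <- matrix(c(5, 2, 3, 4,
                        4, 1, 4, 2,
                        3, 4, 6, 8),
                      nrow = 4,
                      dimnames = list(paste0("probe", 1:4),
                                      paste0("chip", 1:3)))

# Quantile normalization: give every chip the same intensity distribution
# by replacing each value with the mean, across chips, of the values that
# share its rank. Ties are broken by order of appearance.
quantile_normalize <- function(x) {
  sorted <- apply(x, 2, sort)                        # sort each chip
  target <- unname(rowMeans(sorted))                 # mean at each rank
  ranks  <- apply(x, 2, rank, ties.method = "first") # rank within chip
  out    <- apply(ranks, 2, function(r) target[r])   # map rank to target
  dimnames(out) <- dimnames(x)
  out
}
normalized <- quantile_normalize(intensities)

# Natural logarithm of the intensities (the idea behind log.intensities).
logged <- log(intensities)

# Mismatch subtraction (the idea behind useMM): remove the mismatch signal
# from the perfect-match signal before scoring. How non-positive
# differences should be treated is not covered by this sketch.
pmA <- 1200; mmA <- 200
pmB <- 800;  mmB <- 300
rasMM <- (pmA - mmA) / ((pmA - mmA) + (pmB - mmB))
```

After normalization every column of normalized contains the same set of values, arranged according to the original within-chip ranks.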
Sample Data
For testing purposes, here are some Affymetrix CEL files from a real SNPMaP experiment. Each download is a zip file containing two arrays, each with DNA from 30 individuals pooled on it.
| Mapping 250K Sty Array | 54MB | download |
| Mapping 250K Nsp Array | 51MB | download |
| Genome-Wide Human SNP Array 5.0 | 44MB | download |
| Genome-Wide Human SNP Array 6.0 | 69MB | download |
