When I load my array data, each array is said to have a varying number of outliers. What are these outliers?
The outlier counts are reported by the Affymetrix libraries that are used to read the CEL files. We report this number but have never done anything with it, and we don't exclude any probes on this basis. If an array had an unusual number of outliers it might be a cause for concern.
Each SNP on the 500K array set is interrogated by a variable number of
quartets, with the number ranging from 6 to 10 quartets per SNP.
However, the intensity matrix (created using as.matrix(snpmap_object))
always holds 6 intensities for each allele per SNP. What about the rest?
The variable numbers of quartets are handled by creating one matrix
for each group. The groups are selected with the "set" argument to
snpmap(). On the 500K array set, set=1 corresponds to the SNPs with
6 quartets and set=2 to the SNPs with 10.
Alternatively, specifying set=0 returns a list with both matrices as
separate SNPMaP objects. The thinking behind this is that users will
likely want to treat the 6-quartet and 10-quartet SNPs separately, and
fitting both into one big matrix would leave a lot of empty space,
which matters with such large objects. If you calculate RAS scores
and convert the SNPMaP objects to matrices, you can of course rbind
them back together.
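
To make the empty-space point and the rbind step concrete, here is a
minimal sketch in base R. It does not call the SNPMaP package at all:
the matrices are toy stand-ins, and their layouts (one row per SNP, one
column per quartet for the raw data, one column per array for the
summaries) are assumptions made purely for illustration, as are all the
object names.

```r
# Toy stand-ins built with base R; nothing here calls the SNPMaP package.
# Hypothetical raw layout: one row per SNP, one column per quartet intensity.
raw6  <- matrix(1, nrow = 4, ncol = 6)    # four SNPs with 6 quartets
raw10 <- matrix(1, nrow = 2, ncol = 10)   # two SNPs with 10 quartets

# Forcing both groups into one matrix means padding the 6-quartet rows with NA.
padded <- rbind(cbind(raw6, matrix(NA, nrow = 4, ncol = 4)), raw10)
empty_cells <- sum(is.na(padded))         # cells that hold no data

# Hypothetical per-SNP summaries with matching columns (one per array).
ras6  <- matrix(0.5, nrow = 4, ncol = 3,
                dimnames = list(paste0("snp6_", 1:4), paste0("array", 1:3)))
ras10 <- matrix(0.5, nrow = 2, ncol = 3,
                dimnames = list(paste0("snp10_", 1:2), paste0("array", 1:3)))

# rbind() works because both matrices have the same number of columns.
ras_all <- rbind(ras6, ras10)
```

In the toy case every 6-quartet row carries four padding cells, which
is the waste that keeping the groups in separate matrices avoids. The
final rbind() only succeeds because the two summary matrices share the
same columns; with real data you would want to check that first.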
