Tuesday, 20 August 2013

Reducing batch effects in methylation analysis

In our sequencing facility we have processed four Illumina 450K methylation chips for the same study at two different times. We took care to spread cases and controls across all the chips. Even so, we can see a clear time-of-processing batch effect:

The chips were run on two different days one month apart: the first 24 samples on one day and the other 24 samples on the other.

The negative controls of the second batch have a wider range.

And when looking at the MDS and PCA plots we can see that the main factor grouping the samples is the run day (the two chips starting with 835... were run on the same day and the other two on another day).

I have used the nice RnBeads R package for reporting the batch effects.
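
For readers without RnBeads at hand, the same kind of check can be sketched in base R. This is only an illustration on simulated data, not the RnBeads report: the matrix `mvals`, the factor `run_day` and the size of the artificial shift are all assumptions made up for the example.

```r
# Simulated stand-in for the real data: 200 probes x 48 samples.
# All names and values here are hypothetical.
set.seed(1)
run_day <- factor(rep(c("day1", "day2"), each = 24))
mvals <- matrix(rnorm(200 * 48), nrow = 200)

# Add an artificial shift to the samples of the second run day
mvals[, run_day == "day2"] <- mvals[, run_day == "day2"] + 1

# PCA on samples: prcomp() expects samples in rows, so transpose
pca <- prcomp(t(mvals), center = TRUE, scale. = FALSE)
scores <- pca$x[, 1:2]

# Colour the samples by run day; a batch effect shows up as
# two separate clouds along one of the first components
plot(scores, col = as.integer(run_day), pch = 19,
     xlab = "PC1", ylab = "PC2")
legend("topright", legend = levels(run_day), col = 1:2, pch = 19)
```

If the two run days separate along the first components, as they do in our real data, the processing date is driving most of the variance.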

Now I am exploring two methods of correcting the batch effect:
  1. limma, using the batch date as a block effect in the design matrix
    • design <- model.matrix(~ Block + Treatment)
  2. ComBat, from the R sva package
I will post the results soon. Meanwhile, any comments on dealing with batch effects in 450K chips are more than welcome.
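
The two approaches above can be sketched roughly as follows. This is a minimal example on simulated data, not our actual analysis: the sample sheet `pheno`, the matrix `mvals` and the choice of M-values as input are assumptions for illustration. The limma and sva steps only run if those packages are installed.

```r
# Hypothetical sample sheet: two run days of 24 samples each,
# with cases and controls balanced within each day.
set.seed(1)
pheno <- data.frame(
  Block     = factor(rep(c("day1", "day2"), each = 24)),
  Treatment = factor(rep(c("control", "case"), times = 24),
                     levels = c("control", "case"))
)

# Simulated methylation values (assumed M-values), 200 probes x 48 samples,
# with an artificial shift added to the second run day
mvals <- matrix(rnorm(200 * 48), nrow = 200)
mvals[, pheno$Block == "day2"] <- mvals[, pheno$Block == "day2"] + 1

# 1. limma: the run day enters the linear model as a covariate
design <- model.matrix(~ Block + Treatment, data = pheno)
if (requireNamespace("limma", quietly = TRUE)) {
  fit <- limma::eBayes(limma::lmFit(mvals, design))
  # Case vs control effect, adjusted for run day
  top <- limma::topTable(fit, coef = "Treatmentcase", number = 10)
}

# 2. ComBat: adjust the data matrix itself, protecting the
#    Treatment effect through the model matrix 'mod'
if (requireNamespace("sva", quietly = TRUE)) {
  mod <- model.matrix(~ Treatment, data = pheno)
  mvals_combat <- sva::ComBat(dat = mvals, batch = pheno$Block, mod = mod)
}
```

The practical difference is that limma leaves the data untouched and accounts for the run day inside the model, while ComBat returns an adjusted matrix that can be used for plots and other downstream tools.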