Modeling Data From High-Throughput ChIP-Seq Experiments

Abstract: The advent of high-throughput sequencing of short DNA fragments has made it possible to study DNA-protein interactions in great detail through ChIP-Seq experiments. The premise of a ChIP-Seq experiment is conceptually simple: DNA fragments in direct physical interaction with transcription factors are isolated by chromatin immunoprecipitation (ChIP) and then partially sequenced. However, many details must be addressed to ensure that the data obtained from ChIP-Seq experiments are interpreted correctly. In this talk, I will describe a natural null model for the background noise in ChIP-Seq data and propose simple methods for estimating the null model and for hypothesis testing when the data contain both noise and signal. If time permits, I will also discuss the important but difficult-to-formulate problem of sample-size (power) calculations.