Prediction of CpG Islands as an intrinsic clustering property found in many Eukaryotic DNA sequences and its relation to DNA methylation

Cristina Gómez-Martín, Ricardo Lebrón, José L. Oliver, Michael Hackenberg

Research output: Chapter in Book/Report/Conference proceeding › Chapter › Academic › peer-review

3 Citations (Scopus)

Abstract

The promoter regions of around 70% of all genes in the human genome overlap a CpG island (CGI). CGIs have known functions in transcription initiation and distinctive compositional features, such as high G+C content and high CpG ratios compared to bulk DNA. We have shown previously that CGIs manifest as clusters of CpGs in mammalian genomes and can therefore be detected using clustering methods. These techniques have several advantages over sliding-window approaches, which apply thresholds on compositional properties. In this protocol we show how to determine the local (CpG islands) and global (distance distribution) clustering properties of CG dinucleotides and how to generalize this analysis to any k-mer or combination of k-mers. In addition, we illustrate how to intersect the output of a CpG island prediction algorithm with our methylation database to detect differentially methylated CGIs. The analysis is presented as a step-by-step protocol; all necessary programs are provided in a virtual machine or, alternatively, the software can be downloaded and easily installed.
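The distance-based idea in the abstract can be pictured with a minimal Python sketch. It is not the authors' software and does not reproduce their statistical model: the function names, the fixed `max_gap` threshold, the `min_count` cutoff, and the toy sequence are all assumptions made for illustration. It collects the positions of a k-mer (the "global" view is the list of distances between neighbouring occurrences) and then chains occurrences that lie close together into clusters (the "local" view).

```python
def kmer_positions(seq, kmer="CG"):
    """Return 0-based start positions of every occurrence of kmer (overlaps allowed)."""
    seq, kmer = seq.upper(), kmer.upper()
    positions = []
    start = seq.find(kmer)
    while start != -1:
        positions.append(start)
        start = seq.find(kmer, start + 1)
    return positions


def neighbour_distances(positions):
    """Start-to-start distances between consecutive occurrences."""
    return [b - a for a, b in zip(positions, positions[1:])]


def cluster_by_distance(positions, max_gap, min_count=2):
    """Chain occurrences whose start-to-start distance is <= max_gap.

    max_gap and min_count are hypothetical parameters for this sketch.
    Returns (first_start, last_start, n_occurrences) for each cluster.
    """
    clusters = []
    current = []
    for pos in positions:
        # A gap larger than max_gap closes the cluster being built.
        if current and pos - current[-1] > max_gap:
            if len(current) >= min_count:
                clusters.append((current[0], current[-1], len(current)))
            current = []
        current.append(pos)
    if len(current) >= min_count:
        clusters.append((current[0], current[-1], len(current)))
    return clusters


# Toy sequence (made up): two CpG-dense stretches separated by 12 T's.
toy = "CGCGACG" + "T" * 12 + "CGTCG"
cg = kmer_positions(toy, "CG")
print(cg)                                   # [0, 2, 5, 19, 22]
print(neighbour_distances(cg))              # [2, 3, 14, 3]
print(cluster_by_distance(cg, max_gap=5))   # [(0, 5, 3), (19, 22, 2)]
```

Passing a different `kmer` argument applies the same steps to any DNA word. In a real analysis the distance threshold would be derived from the observed distance distribution rather than fixed by hand as it is here.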

Original language: English
Title of host publication: Methods in Molecular Biology
Publisher: Humana Press Inc.
Pages: 31-47
Number of pages: 17
Publication status: Published - 2018

Publication series

Name: Methods in Molecular Biology
Volume: 1766

Keywords

  • Clustering
  • CpG islands
  • DNA methylation
  • DNA words
  • Virtual machine
