Homo sapiens Comprehensive Model Collection (v9 - March 2012)

[NEW] Circular tree of HOCOMOCO models (A-C quality)

UPGMA clustering; pairwise similarity computed by MACRO-APE at a P-value of 0.0005; uniform PWM normalization; picture by jsPhyloSVG.
The motifs are then grouped into clusters (shown in alternating colors) so that the minimal similarity between members of a cluster is at least 0.05.
Download the list of clusters, cluster logos, similarity matrix [xlsx], [txt].
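
The clustering described above can be illustrated with a minimal sketch. It assumes that a UPGMA (average-linkage) tree is built from the pairwise similarity matrix and that a subtree is accepted as a cluster once the lowest pairwise similarity among its leaves reaches 0.05; the exact cutting procedure used for the picture is not stated here. The motif names and similarity values are made up for illustration, and the function names are hypothetical.

```python
from itertools import combinations

# Hypothetical motif names and made-up similarity values, for illustration only.
NAMES = ["M1", "M2", "M3", "M4"]
SIM = [
    [1.00, 0.80, 0.02, 0.01],
    [0.80, 1.00, 0.03, 0.02],
    [0.02, 0.03, 1.00, 0.60],
    [0.01, 0.02, 0.60, 1.00],
]

def upgma(sim):
    """Build a UPGMA tree from a symmetric similarity matrix.

    Each node is a pair (leaf_indices, children); children is None for a leaf.
    """
    nodes = [([i], None) for i in range(len(sim))]

    def average_similarity(pair):
        # UPGMA linkage: unweighted mean over all leaf pairs between two nodes.
        left, right = nodes[pair[0]][0], nodes[pair[1]][0]
        return sum(sim[i][j] for i in left for j in right) / (len(left) * len(right))

    while len(nodes) > 1:
        # Merge the two nodes with the highest average similarity.
        a, b = max(combinations(range(len(nodes)), 2), key=average_similarity)
        merged = (nodes[a][0] + nodes[b][0], (nodes[a], nodes[b]))
        nodes = [n for k, n in enumerate(nodes) if k not in (a, b)] + [merged]
    return nodes[0]

def cut(node, sim, min_similarity):
    """Split the tree top-down until every cluster meets the similarity floor.

    This cutting rule is an assumption, not the documented procedure.
    """
    leaves, children = node
    worst = min((sim[i][j] for i, j in combinations(leaves, 2)), default=1.0)
    if worst >= min_similarity:
        return [leaves]
    left, right = children
    return cut(left, sim, min_similarity) + cut(right, sim, min_similarity)

clusters = [[NAMES[i] for i in c] for c in cut(upgma(SIM), SIM, 0.05)]
print(clusters)  # [['M1', 'M2'], ['M3', 'M4']]
```

With the toy matrix, M1/M2 and M3/M4 end up in separate clusters because their cross-similarities (0.01 to 0.03) fall below the 0.05 floor.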

Details

HOCOMOCO-AD curated: high-confidence models

The 'A to D' part of the curated collection contains manually curated TFBS models.

We used the high-performance motif discovery tool ChIPMunk to produce TFBS models for human TFs, integrating data from different sources.
ChIPMunk was run in 4 different modes, searching for optimal models with lengths from 7 to 22 bp.
We then manually selected the most reasonable models for each transcription factor and assigned quality ratings from 'A' (best) to 'F' (fail).
The collection of 'A' to 'D' quality models represents the most reasonable subset to be used for further analysis.

HOCOMOCO-E curated: low-confidence models

The 'E' part of the curated collection contains low-confidence models.

HOCOMOCO-FULL: full collection

The whole collection, listing 4 models (f1, f2, si, do) for each TF.

Downloads

HOCOMOCO-AD

download the AD curated collection, small-BiSMark XML format, uniform background for PWMs, one-file-per-model
download the AD curated collection, small-BiSMark XML format, hg19 background for PWMs, one-file-per-model
download the AD curated collection, plain text format, weighted position count matrices (WPCM), rows as letters (ACGT)
download the AD curated collection, plain text format, weighted position count matrices (WPCM), columns as letters (ACGT)
download the AD curated collection, MEME text format, probability matrices
download the AD curated collection, TRANSFAC text format, frequency matrices
download the AD curated collection, plain text format, probability matrices, rows as letters (ACGT)
download the AD curated collection, plain text format, probability matrices, columns as letters (ACGT)
download the AD curated collection, plain text format, PWMs, uniform background, rows as letters (ACGT)
download the AD curated collection, plain text format, PWMs, hg19 background, rows as letters (ACGT)
download the AD curated collection, plain text format, PWMs, uniform background, columns as letters (ACGT)
download the AD curated collection, plain text format, PWMs, hg19 background, columns as letters (ACGT)
download precomputed PWM thresholds for the AD curated collection, plain text format, uniform background PWMs, one-file-per-model
download precomputed PWM thresholds for the AD curated collection, plain text format, hg19 background PWMs, one-file-per-model
download precomputed PWM threshold-to-P-value conversion tables for the AD curated collection, plain text format, uniform background PWMs, one-file-per-model
download precomputed PWM threshold-to-P-value conversion tables for the AD curated collection, plain text format, hg19 background PWMs, one-file-per-model
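
As an illustration of what a threshold-to-P-value conversion table expresses, here is a minimal sketch. It assumes that the P-value of a threshold is the fraction of all words of the motif length that score at or above it under a uniform background, and it builds the table by brute-force enumeration for a tiny toy PWM. The PWM values and function names are hypothetical, and the layout of the downloadable files is not reproduced here.

```python
from bisect import bisect_left
from itertools import product

# Toy 3-position PWM: one row per position, columns in A, C, G, T order.
# The scores are made up for illustration and are not taken from HOCOMOCO.
PWM = [
    [1.0, -1.0, -1.0, -1.0],
    [-1.0, 1.0, -1.0, -1.0],
    [-1.0, -1.0, 1.0, -1.0],
]

def threshold_pvalue_table(pwm):
    """Return (thresholds, pvalues) with thresholds ascending.

    pvalues[k] is the fraction of all 4**len(pwm) words scoring >= thresholds[k],
    i.e. the P-value under a uniform background.
    """
    scores = sorted(
        sum(row[letter] for row, letter in zip(pwm, word))
        for word in product(range(4), repeat=len(pwm))
    )
    total = len(scores)
    thresholds, pvalues = [], []
    for i, score in enumerate(scores):
        if not thresholds or score != thresholds[-1]:
            thresholds.append(score)
            pvalues.append((total - i) / total)
    return thresholds, pvalues

def pvalue_for(threshold, table):
    """P-value of an arbitrary score threshold, looked up in the table."""
    thresholds, pvalues = table
    i = bisect_left(thresholds, threshold)
    return pvalues[i] if i < len(thresholds) else 0.0

def threshold_for(pvalue, table):
    """Lowest tabulated threshold whose P-value does not exceed pvalue, or None."""
    thresholds, pvalues = table
    for t, p in zip(thresholds, pvalues):
        if p <= pvalue:
            return t
    return None

table = threshold_pvalue_table(PWM)
print(pvalue_for(0.0, table))      # 0.15625 (10 of 64 words score >= 0)
print(threshold_for(0.05, table))  # 3.0
```

Brute-force enumeration is only practical for very short motifs; for models of 7 to 22 bp a dynamic-programming computation of the score distribution, or the precomputed tables themselves, would be the realistic choice.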

Supplementary data

[xlsx] [txt] UniProt mappings (HUMAN and MOUSE) for HOCOMOCOv9 motifs
download full details on human curation for HOCOMOCOv9, Excel table
download details on precomputed threshold files for HOCOMOCOv9 PWMs
download alignments used to produce the full collection, plain text format, masked TRANSFAC entries; scores are given for the 'uniform' PWM background
download source data used to produce the full collection, ChIPMunk-ready multifasta, no TRANSFAC entries

HOCOMOCO-E

download the E part of the curated collection, small-BiSMark XML format, uniform background for PWMs
download the E part of the curated collection, small-BiSMark XML format, hg19 background for PWMs
download the E part of the curated collection, plain text format, weighted position count matrices (WPCM), rows as letters (ACGT)
download the E part of the curated collection, plain text format, weighted position count matrices (WPCM), columns as letters (ACGT)
download the E part of the curated collection, MEME text format, probability matrices
download the E part of the curated collection, TRANSFAC text format, frequency matrices
download the E part of the curated collection, plain text format, probability matrices, rows as letters (ACGT)
download the E part of the curated collection, plain text format, probability matrices, columns as letters (ACGT)
download the E part of the curated collection, plain text format, PWMs, uniform background, rows as letters (ACGT)
download the E part of the curated collection, plain text format, PWMs, hg19 background, rows as letters (ACGT)
download the E part of the curated collection, plain text format, PWMs, uniform background, columns as letters (ACGT)
download the E part of the curated collection, plain text format, PWMs, hg19 background, columns as letters (ACGT)

HOCOMOCO-FULL

download the whole collection, small-BiSMark XML format, uniform background for PWMs
download the whole collection, small-BiSMark XML format, hg19 background for PWMs
download the whole collection, plain text format, weighted position count matrices (WPCM), rows as letters (ACGT)
download the whole collection, plain text format, weighted position count matrices (WPCM), columns as letters (ACGT)
download the whole collection, MEME text format, probability matrices
download the whole collection, TRANSFAC text format, frequency matrices
download the whole collection, plain text format, probability matrices, rows as letters (ACGT)
download the whole collection, plain text format, probability matrices, columns as letters (ACGT)
download the whole collection, plain text format, PWMs, uniform background, rows as letters (ACGT)
download the whole collection, plain text format, PWMs, hg19 background, rows as letters (ACGT)
download the whole collection, plain text format, PWMs, uniform background, columns as letters (ACGT)
download the whole collection, plain text format, PWMs, hg19 background, columns as letters (ACGT)
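
The matrix variants offered in the download lists above are related by simple transformations, sketched below. The sketch assumes that "rows as letters" means one row per nucleotide (A, C, G, T) and "columns as letters" means one row per motif position. The log-odds formula, with a pseudocount of 1 and the natural logarithm, is a common convention chosen here as an assumption and is not necessarily the one used to produce the HOCOMOCO files. The counts and function names are made up for illustration.

```python
import math

# Toy count matrix in the "rows as letters" layout: one row per nucleotide.
# The counts are made up for illustration.
COUNTS_ROWS_AS_LETTERS = [
    [8.0, 1.0, 0.0],  # A
    [1.0, 7.0, 1.0],  # C
    [1.0, 1.0, 9.0],  # G
    [0.0, 1.0, 0.0],  # T
]

UNIFORM = (0.25, 0.25, 0.25, 0.25)

def transpose(matrix):
    """Switch between the 'rows as letters' and 'columns as letters' layouts."""
    return [list(column) for column in zip(*matrix)]

def to_probabilities(pcm):
    """Count matrix (one row per position) -> probability matrix."""
    return [[count / sum(row) for count in row] for row in pcm]

def to_pwm(pcm, background=UNIFORM, pseudocount=1.0):
    """Count matrix (one row per position) -> log-odds weight matrix.

    Assumed formula: ln((count + pseudocount * b) / ((N + pseudocount) * b)),
    where N is the column total and b the background frequency of the letter.
    """
    pwm = []
    for row in pcm:
        n = sum(row)
        pwm.append([
            math.log((count + pseudocount * b) / ((n + pseudocount) * b))
            for count, b in zip(row, background)
        ])
    return pwm

pcm = transpose(COUNTS_ROWS_AS_LETTERS)  # now one row per position
ppm = to_probabilities(pcm)
pwm = to_pwm(pcm)
print(ppm[0])  # [0.8, 0.1, 0.1, 0.0]
```

Passing a non-uniform tuple as `background` gives the analogue of the hg19-background PWMs; the actual hg19 nucleotide frequencies are not given here and would have to be supplied by the user.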

Updates

[24 APR 2012] Improved visual representation of the circular tree for HOCOMOCO-AC motifs.
[20 APR 2012] Circular tree for HOCOMOCO-AC motifs is available.
[10 APR 2012] UniProt mappings (HUMAN and MOUSE) for HOCOMOCO motifs are available for separate download.
[17 MAR 2012] HOCOMOCO v9 released. Many minor annotation fixes, downloads section updated.
[16 DEC 2011] E-collection list is now available online.
[15 DEC 2011] Complete separate model details pages for the HOCOMOCO-AD collection.
[14 DEC 2011] Precomputed thresholds for the HOCOMOCO-AD are available for download.
[13 DEC 2011] WPCM (weighted PCM) tag added to the small-BiSMark files. Downloads section updated.
[13 DEC 2011] Weighted PCMs are now available to download in the plain text format.

Contact

Ivan Kulakovskiy ivan-dot-kulakovskiy-at-gmail-dot-com
Yulia Medvedeva yulia-dot-medvedeva-at-kaust-dot-edu-dot-sa