
Date Palm Draft Sequence

New Draft Sequence Version 3

  • PDK30_README.txt
  • PdactyKAsm30_r20101206.fasta.gz
    a gzipped multi-fasta file with all scaffold sequences.
  • PDK30.gbf.gz
    a gzipped GENBANK format file with annotation information including sequences, mRNA sequences, protein sequences, Gene Ontology Info and Enzyme Commission Numbers
  • PDK30-mrna.fsa
    a gzipped multi-fasta DNA file with ~28,000 gene predictions
  • PDK30-pep.fsa
    a gzipped multi-fasta amino acid file with ~28,000 translated gene predictions
  • PDK30_9genomes_SNPs.tab.txt.gz
    a gzipped, tab delimited file of SNP calls at ~3.5M polymorphic sites for each of 9 genomes
  • PDK30_CNVs.tab.txt
    a gzipped, tab delimited file of CNVs/ISCRs (Imbalanced Sequence Count Regions) that overlap gene regions
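
The gzipped multi-fasta files listed above can be read without unpacking them first. The following is a minimal Python sketch, not part of the release: the function names read_fasta and scaffold_lengths are hypothetical, and it assumes a standard FASTA layout in which each record begins with a ">" title line followed by one or more sequence lines.

```python
import gzip
from typing import Dict, Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (title, sequence) pairs from an open multi-fasta text handle."""
    title = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new title line closes the previous record, if any.
            if title is not None:
                yield title, "".join(chunks)
            title = line[1:].strip()
            chunks = []
        elif title is not None:
            chunks.append(line)
    if title is not None:
        yield title, "".join(chunks)


def scaffold_lengths(path: str) -> Dict[str, int]:
    """Map each record name (first word of the title) to its sequence length."""
    lengths = {}
    with gzip.open(path, "rt") as handle:  # "rt" decompresses to text
        for title, sequence in read_fasta(handle):
            name = title.split()[0] if title else ""
            lengths[name] = len(sequence)
    return lengths
```

For example, scaffold_lengths("PdactyKAsm30_r20101206.fasta.gz") would return one entry per scaffold. Note that each sequence is held in memory while its record is read, which may matter for very long scaffolds.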

New Draft Sequence Version 2

  • PDK20_README.txt
  • PDK20.fsa.gz
    a gzipped multi-fasta file with all scaffold sequences.
  • PDK20.gbf.gz
    a gzipped GENBANK format file with ALL annotation information including sequences, mRNA sequences, protein sequences, SNPs, Enzyme Commission Numbers, Gene Ontology annotation, etc.
    NOTE: THIS EXPANDS to ~1.5 GB.
  • PDK20.mRNA.fsa
    a multi-fasta file of all 19,414 predicted genes (full and partial). The sequences are spliced and titles contain functional annotation.
  • PDK20.pep.fsa
    a multi-fasta file of all 19,414 predicted genes translated as proteins. The titles contain functional annotations.
  • PDK20.snp.txt.gz
    a tab-delimited text file of all SNP locations with PDK20 assembly coordinates. Please do not confuse these with V1.0 coordinates. See MAQ or the V1.0 README for more detail. The columns are, in order: scaffold name, position, reference base, consensus base, Phred-like consensus quality, read depth, the average number of hits of reads covering this position, the highest quality of reads covering the position, the minimum consensus quality in the 3bp flanking regions at each side of this site, the second best call, the log likelihood ratio of the second best and the third best call, and the third best call.
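
The column order described for PDK20.snp.txt.gz can be mapped onto named fields as follows. This is a minimal Python sketch under stated assumptions: the field names, the SnpRecord type and the parse_snps function are hypothetical, the file is assumed to have no header row, and the numeric types (integers for position, depth and qualities; floats for the average hit count and the log likelihood ratio) are a guess to be checked against the README.

```python
import gzip
from typing import Iterator, NamedTuple, TextIO


class SnpRecord(NamedTuple):
    # Field names are illustrative; the order follows the column list above.
    scaffold: str
    position: int
    ref_base: str
    consensus_base: str
    consensus_quality: int     # Phred-like consensus quality
    read_depth: int
    avg_hits: float            # average number of hits of covering reads
    max_read_quality: int      # highest quality of covering reads
    min_flank_quality: int     # minimum consensus quality in 3bp flanks
    second_best_call: str
    log_likelihood_ratio: float
    third_best_call: str


def parse_snps(handle: TextIO) -> Iterator[SnpRecord]:
    """Yield one SnpRecord per non-empty, tab-delimited line."""
    for line_number, line in enumerate(handle, start=1):
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        if len(fields) < 12:
            raise ValueError(f"line {line_number}: expected 12 columns, got {len(fields)}")
        yield SnpRecord(
            fields[0], int(fields[1]), fields[2], fields[3],
            int(fields[4]), int(fields[5]), float(fields[6]),
            int(fields[7]), int(fields[8]), fields[9],
            float(fields[10]), fields[11],
        )


def load_snps(path: str, min_quality: int = 0) -> list:
    """Read a gzipped SNP file, keeping calls at or above min_quality."""
    with gzip.open(path, "rt") as handle:
        return [r for r in parse_snps(handle) if r.consensus_quality >= min_quality]
```

A call such as load_snps("PDK20.snp.txt.gz", min_quality=40) would keep only the higher-confidence calls; the threshold of 40 is an arbitrary illustration, not a recommendation from the release notes.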

Draft Sequence Version 1