New Draft Sequence Version 3
a gzipped multi-fasta file with all scaffold sequences.
a gzipped GENBANK format file with annotation
information including sequences, mRNA sequences, protein sequences, Gene
Ontology Info and Enzyme Commission Numbers
a gzipped multi-fasta DNA file with ~28,000 gene predictions
a gzipped multi-fasta amino acid file with ~28,000 translated gene predictions
a gzipped, tab-delimited file of SNP calls at ~3.5M polymorphic sites for each of 9 genomes
a gzipped, tab-delimited file of CNVs/ISCRs (Imbalanced Sequence Count Regions) that overlap gene regions
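
The multi-fasta files listed above are gzipped, so they can be read as a
stream without unpacking them to disk first. The following is a minimal
sketch in Python using only the standard library. The function name
read_fasta_gz and the path passed to it are hypothetical and not part of
this distribution. The sketch assumes the conventional FASTA layout of a
">" title line followed by one or more sequence lines.

```python
import gzip
from typing import Iterator, Tuple


def read_fasta_gz(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (title, sequence) pairs from a gzipped multi-fasta file.

    Hypothetical helper: assumes each record starts with a '>' title
    line and that all following lines up to the next '>' are sequence.
    """
    title = None
    chunks = []
    # "rt" opens the gzip stream in text mode, decompressing on the fly.
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A new title closes the previous record, if any.
                if title is not None:
                    yield title, "".join(chunks)
                title = line[1:]
                chunks = []
            elif title is not None and line:
                chunks.append(line)
    # Emit the final record once the file is exhausted.
    if title is not None:
        yield title, "".join(chunks)
```

Because the function is a generator, only one record is held in memory at
a time, which matters for the larger scaffold files.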
New Draft Sequence Version 2
a gzipped multi-fasta file with all scaffold sequences.
a gzipped GENBANK format file with ALL annotation information
including sequences, mRNA sequences, protein sequences, SNPs,
Enzyme Commission Numbers, Gene Ontology annotation, etc.
NOTE: THIS EXPANDS to ~1.5 GB.
a multi-fasta file of all 19,414 predicted genes (full
and partial). The sequences are spliced and titles contain functional
annotations.
a multi-fasta file of all 19,414 predicted genes translated as proteins.
The titles contain functional annotations.
a tab-delimited text file of all SNP locations with PDK20
coordinates. Do not confuse these with V1.0 coordinates. See the MAQ
documentation or the V1.0 README for more detail. Essentially the
columns are: scaffold name, position, reference base, consensus base,
Phred-like consensus quality, read depth, the average number of hits
of reads covering this position, the highest quality of reads covering
the position, the minimum consensus quality in the 3bp flanking regions
at each side of this site, the second best call, the log likelihood
ratio of the second best and the third best call, and the third best
call.
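
The twelve columns described above can be mapped onto named fields when
reading the SNP file. The following is a minimal sketch in Python. The
field names, the numeric types chosen for each column, and the sample
row in the usage note are assumptions made for illustration. They are
not taken from the file itself, so check them against the real data
before relying on them.

```python
from typing import NamedTuple


class SnpCall(NamedTuple):
    """One row of the SNP file; field names are hypothetical."""
    scaffold: str
    position: int
    ref_base: str
    consensus_base: str
    consensus_quality: int
    read_depth: int
    avg_hits: float
    max_read_quality: int
    min_flank_quality: int
    second_best_call: str
    llr_second_third: float
    third_best_call: str


def parse_snp_line(line: str) -> SnpCall:
    """Split one tab-delimited line into a SnpCall.

    Assumes exactly 12 columns in the order listed in this README.
    """
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 12:
        raise ValueError(f"expected 12 columns, found {len(fields)}")
    return SnpCall(
        scaffold=fields[0],
        position=int(fields[1]),
        ref_base=fields[2],
        consensus_base=fields[3],
        consensus_quality=int(fields[4]),
        read_depth=int(fields[5]),
        avg_hits=float(fields[6]),
        max_read_quality=int(fields[7]),
        min_flank_quality=int(fields[8]),
        second_best_call=fields[9],
        llr_second_third=float(fields[10]),
        third_best_call=fields[11],
    )
```

For example, with a made-up row,
parse_snp_line("scaffold_1\t1234\tA\tG\t45\t12\t1.00\t60\t30\tR\t20\tA")
returns a record whose position field is the integer 1234 and whose
consensus_base field is "G".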
Draft Sequence Version 1