Hi UCSC/Ensembl/RefSeq folks,
I'm happy to let you know that the two mouse T2T genomes for the
C57BL/6J and CAST/EiJ strains are now available from INSDC via:
C57BL/6J: GCA_964188535
CAST/EiJ: GCA_964188545
Some notes:
- these assemblies came from an F1 cross mouse (C57BL/6J x CAST/EiJ),
where the reads were separated by strain during the assembly
process.
- C57BL/6J is the paternal haplotype, so it lacks an X chromosome, and
we didn't assemble the Y chromosome (it is likely in pieces in the
unassembled scaffolds).
- CAST/EiJ is the maternal haplotype, so it has an X chromosome and no
Y chromosome.
- the C57BL/6J genome is substantially more complete than GRCm39:
all of the chromosomes are T2T and close every gap in GRCm39 (except
two!). If you are interested, here is a talk from the TAGC conference
that gives a good overview of the genomes.
- we did not include an MT chromosome (they are already available for
both strains; I can flag the accessions if useful)
- we did do a gene build with a combination of Liftoff and BRAKER3.
This was good enough for us for paper writing, but likely not
comprehensive enough for the Genome Browsers. There is plenty of
strain-specific RNA-Seq for both strains; happy to point you to this
if helpful.
- we are in the final stages of preparing the manuscript, and I
hope to submit it by the end of the summer/August. It would be
really excellent if the genomes were available, or close to being
available, when the paper is published.
Please ask if you have more questions. I really hope you can find
the time to look at loading these, but I completely understand there's
a long queue of new and interesting genomes!
Rgds,
Thomas
Omics Section Head
Research and Services Team Leader
EMBL-EBI