A poster describing GenCat and the Genome Catalogue, two products of the Genomic Standards Consortium (GSC), will be presented at the Asia Pacific Bioinformatics Conference 2008 (APBC 2008). If you are attending APBC 2008 and are interested in the work of the GSC, please contact Tanya Gray at tgra at ceh.ac.uk.

About the GSC:

The Genomic Standards Consortium (GSC) was formed in 2005 with the aim of promoting methods to standardize the description of genomes and the exchange and integration of genomics data.

The GSC is an open-membership international working body. Participants in the GSC include biologists, computer scientists, those building genomic databases and conducting large-scale comparative genomic analyses, and those with experience of building community-based standards.

For more information, please visit http://gensc.org


The GSC has released a number of products:

Minimum Information about a Genome Sequence/Metagenome Sequence (MIGS/MIMS) specification. Provides an extension to the minimum information already captured by the primary nucleotide databases (DDBJ/EMBL/GenBank) (Field et al., 2007).

Genomic Contextual Data Markup Language. Incorporates the MIGS/MIMS specification and provides an extended data capture and exchange mechanism for integrating a wide range of information relevant to the in-depth description of genomes and metagenomes.

Genome Catalogue
A repository of genome reports that are compliant with the MIGS/MIMS specification. The Genome Catalogue is based on the GenCat software.

GenCat
A generic XML data catalogue tool that supports the development of data standards by providing a data repository and input forms that are auto-generated from successive versions of the XML schema files used to define a data standard (http://gencat.sf.net).

Dawn Field and Susanna-Assunta Sansone guest-edited a special issue of the journal OMICS on data standards as an output of the 2nd GSC workshop. There were 22 invited papers, including 5 in the area of standardization of genomic data (see: Special Issue of OMICS).
An overview of the issue and its goals is captured in the foreword: http://www.liebertonline.com/doi/pdfplus/10.1089/omi.2006.10.84
The entire issue was open access, and many of its papers continue to be at the top of the most downloaded list: http://www.liebertonline.com/action/showMostReadArticles?journalCode=omi

Special issue of OMICS from the 5th GSC Workshop

Dawn Field and George Garrity have been asked by the OMICS Editor-in-Chief, Eugene Kolker, to produce a special issue of OMICS based on the 5th GSC Workshop. After gauging interest in this prior to the workshop, and in response to developments at the workshop, we are going to accept this invitation. We are now considering proposals from the participants of the 5th GSC workshop (and their colleagues) for contributions on several key topics of special interest to the GSC.

Further information:

GSC web site: http://gensc.org

The latest release of GenCat is available to download from the project SourceForge SVN repository:

gencat release-1.1

GenCat is built on Orbeon Forms, eXist, AJAX, XForms and XML Pipeline Language (XPL/XProc). It provides a generic XML data catalogue whose input forms are generated on the fly from XML schema files, so that the XML instances it captures are schema-compliant.
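
As a rough illustration of the idea of deriving input forms from an XML schema, the sketch below walks a small schema and lists one form field per typed element. It is not how GenCat works internally (GenCat relies on Orbeon Forms and XForms). The schema fragment, the element names and the form_fields helper are all hypothetical and chosen only for this example.

```python
import xml.etree.ElementTree as ET

# Namespace prefix ElementTree uses for XML Schema elements.
XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema fragment; it is NOT taken from MIGS/MIMS or GenCat.
EXAMPLE_XSD = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="report">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="organism" type="xs:string"/>
        <xs:element name="latitude" type="xs:decimal" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def form_fields(xsd_text):
    """Return one field description per typed xs:element in the schema."""
    root = ET.fromstring(xsd_text)
    fields = []
    for element in root.iter(XS + "element"):
        xsd_type = element.get("type")
        if xsd_type is None:
            # Container element with an inline complex type: no input field.
            continue
        fields.append({
            "name": element.get("name"),
            "type": xsd_type,
            # XML Schema treats minOccurs as 1 when it is not given.
            "required": element.get("minOccurs", "1") != "0",
        })
    return fields


fields = form_fields(EXAMPLE_XSD)
for field in fields:
    print(field)
```

Because the field list is recomputed from the schema each time, a new schema version changes the form without any hand-written form code, which is the property the description above relies on.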

GenCat has been implemented as the Genome Catalogue, an online repository from the Genomic Standards Consortium (GSC) containing MIGS/MIMS-compliant reports.

A meeting report summarizing the proceedings of the “eGenomics: Cataloguing our Complete Genome Collection III” workshop held September 11–13, 2006, at the National Institute for Environmental eScience (NIEeS), Cambridge, United Kingdom, is now available:

download meeting report

Comparative and Functional Genomics
Volume 2007 (2007), Article ID 47304, 7 pages

Meeting Report

eGenomics: Cataloguing Our Complete Genome Collection III

Dawn Field, George Garrity, Tanya Gray, Jeremy Selengut, Peter Sterk, Nick Thomson, Tatiana Tatusova, Guy Cochrane, Frank Oliver Glöckner, Renzo Kottmann, Allyson L. Lister, Yoshio Tateno, and Robert Vaughan

This 3rd workshop of the Genomic Standards Consortium was divided into two parts. The first half of the three-day workshop was dedicated to reviewing the genomic diversity of our current and future genome and metagenome collection, and exploring linkages to a series of existing projects through formal presentations. The second half was dedicated to strategic discussions. Outcomes of the workshop include a revised “Minimum Information about a Genome Sequence” (MIGS) specification (v1.1), consensus on a variety of features to be added to the Genome Catalogue (GCat), agreement by several researchers to adopt MIGS for imminent genome publications, and an agreement by the EBI and NCBI to input their genome collections into GCat for the purpose of quantifying the amount of optional data already available (e.g., for geographic location coordinates) and working towards a single, global list of all public genomes and metagenomes.


The rhesus macaque monkey (Macaca mulatta) genome was described in the 13th April issue of Science Magazine.

Genome report for Macaca mulatta:

http://darwin.nox.ac.uk/gsc/gcat/report/002485_GCAT/html

Genome wiki page:

http://darwin.nerc-oxford.ac.uk/gc_wiki/index.php/ /gcat/report/002485_GCAT/html

Genome mashup for macaque genome:

The query http://darwin.nox.ac.uk/gsc/gcat/mashup?query=macaque%20genome returned the following three videos from YouTube:

Macaque Genome video part I

The first four GCat identifiers to appear in print are now online in PLoS Biology, in a description of four viral metagenomes:

The Marine Viromes of Four Oceanic Regions
Florent E. Angly, Ben Felts, Mya Breitbart, Peter Salamon, Robert A. Edwards, Craig Carlson, Amy M. Chan, Matthew Haynes, Scott Kelley, Hong Liu, Joseph M. Mahaffy, Jennifer E. Mueller, Jim Nulton, Robert Olson, Rachel Parsons, Steve Rayhawk, Curtis A. Suttle, Forest Rohwer

Synopsis: Metagenomics Offers a Big-Picture View of the Diversity and Distribution of Marine Viruses

Here is the snippet from the paper:

The Genome Projects Database (http://www.ncbi.nlm.nih.gov/Genomes) accession numbers for the sequences are 17765 (GOM), 17767 (BBC), 17769 (Arctic), and 17771 (SAR); the Genome Catalogue (http://gensc.sf.net) accession numbers are 000002_GCAT (GOM), 000003_GCAT (BBC), 000004_GCAT (Arctic), and 000005_GCAT (SAR); and the GOLD database (http://www.genomesonline.org) GOLDstamps are GM00060 (GOM), GM00061 (BBC), GM00062 (Arctic), and GM00063 (SAR).
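
The Genome Catalogue accession numbers quoted above all look like six digits followed by "_GCAT". The sketch below checks an identifier against that pattern and builds a report address from it. Both the pattern and the URL layout are assumptions: the pattern is inferred only from the examples on this page, the URL layout is modelled on the Macaca mulatta report address, and the function names are hypothetical.

```python
import re

# Assumed identifier pattern, inferred from examples such as 000002_GCAT.
GCAT_ID = re.compile(r"([0-9]{6})_GCAT")

# Assumed URL layout, modelled on the Macaca mulatta report address.
REPORT_URL = "http://darwin.nox.ac.uk/gsc/gcat/report/{identifier}/html"


def parse_gcat_id(identifier):
    """Return the numeric part of a GCat identifier, or raise ValueError."""
    match = GCAT_ID.fullmatch(identifier)
    if match is None:
        raise ValueError("not a GCat identifier: %r" % identifier)
    return int(match.group(1))


def report_url(identifier):
    """Build the assumed report address for a valid GCat identifier."""
    parse_gcat_id(identifier)  # validates, raises ValueError if malformed
    return REPORT_URL.format(identifier=identifier)


# The four marine virome identifiers quoted in the paper.
viromes = {"GOM": "000002_GCAT", "BBC": "000003_GCAT",
           "Arctic": "000004_GCAT", "SAR": "000005_GCAT"}
for region, identifier in viromes.items():
    print(region, parse_gcat_id(identifier), report_url(identifier))
```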

