GSC 8 Outcomes

The GSC 8 meeting was hosted at the Joint Genome Institute on September 9-11, 2009. It was preceded by an ISA-Tab-GCDML workshop. The agenda can be seen on the GSC 8 meeting homepage.

The first report on the meeting to appear is in BioInform.

The JGI is also producing an article and will post all talks as videos online soon.

Some of the GSC 8 photos are online at the JGI website.

A formal meeting report is being written by the organizers and will appear in the GSC’s journal, Standards in Genomic Sciences.

The GSC is launching an open-access online journal to support its mission. The “Standards in Genomic Sciences” (SIGS) journal will provide a forum for publishing genome and metagenome notes structured according to the GSC’s MIGS/MIMS specification. It will also support the community by providing a venue for a wide range of articles that are standards-compliant and standards-supportive. For example, this will include Standard Operating Procedures (SOPs) such as those describing the annotation pipelines of the major sequencing centres. Submissions on a range of topics are now welcome. For more information, please see the links below.

SIGS: website
SIGS: Editorial Board
About SIGS

Another MIGS-compliant report has been published as a supplement to an article; see Reference 137 and Table S1, listed in the Materials and Methods.

“Adaptations to Submarine Hydrothermal Environments Exemplified by the Genome of Nautilia profundicola”

See the full paper here.

The GSC proposal to host a SIG this year at ISMB 2009 in Sweden has been successful. Further details will follow soon, but the one-day meeting will be held on Saturday, June 27th, before the start of the main ISMB / ECCB 2009 meeting in Stockholm.

The official title of the SIG is “Metagenomics, Metadata and Metaanalysis (M3)”. There will be a set of invited speakers along with a call for papers and posters.

Some further background about the spirit of the meeting is given below. The “M3” homepage will appear here in time.


ISMB SIG: Metagenomics, Metadata and Metaanalysis (M3), June 27th, Stockholm

There are now thousands of genomes and metagenomes available for study (see the Genomes Online Database). Interest in improved sampling of diverse environments (e.g. ocean, soil, sediment, and a range of hosts), combined with advances in the development and application of ultra-high-throughput sequencing methodologies, is set to vastly accelerate the pace at which new metagenomes are generated. For example, in 2007 the Global Ocean Sampling (GOS) expedition published scientific analyses of 41 metagenomes, and as of October 2008 the number of user-generated metagenomes submitted to the public MG-RAST annotation server had surpassed 1300. We have also entered an era of ‘mega-sequencing projects’ that now includes funded projects like the Genomic Encyclopedia of Bacteria and Archaea (GEBA) project and the Human Microbiome Initiative (e.g. HMP), with many more visionary projects on the horizon.

While genomes represent the full genetic (DNA) complement of a single organism, metagenomes represent the DNA of an entire community of organisms. Metagenomes are partial samples of complex and largely unknown communities that can only be poorly assembled. Genomes and metagenomes are now also being complemented with studies of metatranscriptomes (community transcript profiles) and metaproteomes (community protein profiles). The comparative study of these datasets, including multi-omic datasets from the same community, brings with it the need for new computational approaches.

These data hold the promise of unparalleled insights into fundamental questions across a range of fields including evolution, ecology, environmental biology, health and medicine. Advances stem from an improved understanding of the combinations, abundances and functions of the organisms in these communities and of their genes and pathways. We are just starting to exploit these technologies to understand the microbial world and have only scratched the surface in sampling natural microbial diversity across space and time.

This SIG will explore the latest concepts, algorithms, tools, informatics pipelines, databases and standards being developed to cope with the analysis of vast quantities of metagenomic data. Through a series of invited and contributed talks, a panel discussion, and flash talks associated with a poster session, we aim to highlight scientific advances in the field and identify core computational challenges facing the wider community.

We aim to bring together researchers collecting samples for metagenomic analysis, those building the computational infrastructure required to fully exploit them, and those thinking about the implementation of standards. In particular, we aim to encourage the participation of researchers already using metadata to detect patterns of biological interest. Examples include researchers detecting trends in collections of genomes or metagenomes based on habitat or geographic location, or by analogy, in microbiome studies, based on host or anatomical location, as well as studies that integrate substantial environmental, phenotypic, or epidemiological data.

This SIG will be hosted by the Genomic Standards Consortium (GSC). It will place emphasis on understanding the community’s needs for the complementary standards required to drive metagenomic science forward and support comparative studies. A SIG report will be published in the GSC’s online, open-access journal “Standards in Genomic Sciences”. We encourage contributed submissions, based on talks and posters, describing studies that demonstrate the power of using curated (e.g. habitat or host) and measured (e.g. geographic location, salinity, temperature, or pH) contextual data in comparative metagenomic studies of large numbers of samples. Likewise, papers describing new approaches, tools, databases, standards, ontologies or substantial new sets of curated metadata that aid in the integration and interoperability of disparate datasets are welcome.

Information about all GSC projects is now being collated on a single page in the GSC Wiki. Please log in during the workshop and help update pages relevant to you and your project(s). The GSC core projects are MIGS/MIMS, the Genome Catalogue, GCDML, GRS, Habitat-Lite, and the GSC eJournal and SOP central repository. In addition, the GSC projects page includes descriptions of the larger community efforts in which the GSC participates (MIBBI, MINSEQE, EnvO, OBI, ISA-Tab).

GSC Projects

We are looking forward to seeing everyone this week, starting with the GCDML technical workshop participants. We hope to get a lot done on GCDML on Monday and Tuesday before the main workshop starts.

The agendas for both meetings are now available on the web and have been ‘wikified’ to add live links and other relevant information so we can use them as a central source of information during the workshop.


The three-day workshop is packed with short talks in the first two days to maximize the exchange of information and the formulation of ideas, but there is also time for discussion. The third day, as always, is set aside for discussion and strategy development and is really when much of the lasting work gets done. We will split the morning into technical (GCDML) and strategic sessions and then rejoin to map out the next stage of the GSC roadmap.

As background to the workshop, the GSC homepage has a new page to serve as a portal into all GSC projects.

GSC Projects

Links to the MIGS paper in Nat Biotech and to the OMICS special issue, which includes papers on the core GSC projects, can be found on the home page.

At the heart of the MIGS specification is a call for all genomes and metagenomes to be reported with geographic location (lat/long). We were therefore very pleased to see a Nature Editorial calling for ALL biological samples to be tagged with lat/long. The GSC responded with a Nature Correspondence to underscore our collaborative interest in seeing all molecules (in particular genomes, metagenomes and phylogenetic markers like 16S/18S ribosomal RNA genes) placed into a geospatial framework. The location of isolation is an increasingly relevant piece of contextual data for interpreting collections of sequences. In addition to the GSC, the Correspondence was signed by the INSDC, CBOL, ICoMM, and EnvO.

The original Editorial can be found here: http://www.nature.com/nature/journal/v453/n7191/full/453002a.html

Our response is here (see bottom of article for all signatories including the GSC): http://www.nature.com/nature/journal/v453/n7198/full/453978b.html

Both pieces are available as free content and can be accessed without a subscription to Nature.

The full list of Signatories on the Correspondence can be seen here.

