Genomics is a young science that has boomed in recent years thanks to advanced DNA sequencing technologies, advances in bioinformatics and increasingly sophisticated techniques for analysing whole genomes. This article discusses whole genomes and how they are sequenced, with a focus on the Human Genome Project, which produced the sequence of the human genome.
WHY DO WE SEQUENCE?
Sequencing is the set of biochemical methods and techniques used to determine the order of nucleotides (A, T, C and G) in DNA. Its goal is to obtain, in order, all the nucleotides in the DNA of an organism.
Eukaryotic sequencing projects followed one another quickly: Caenorhabditis elegans (a nematode) was sequenced in 1998, Drosophila melanogaster (the fruit fly) in 2000 and the human genome in 2001.
But why do we sequence? In the case of the human genome, the motivation is to gain the knowledge needed to help alleviate or prevent diseases.
Some of the sequenced organisms are model organisms, chosen for their:
- Medical importance: some are pathogens, and we know the diseases they can cause.
- Economic importance: organisms that humans eat can be improved with molecular techniques.
- Value for the study of evolution: in 2007 the genomes of 12 Drosophila species were compared to understand the evolutionary relationships between their chromosomes. Similar comparisons have been made in mammals (ENCODE Project).
WHAT DO WE MEAN BY A SEQUENCED GENOME?
The human genome is organised into 46 chromosomes, that is, 23 chromosome pairs (22 pairs of autosomes and 1 pair of sex chromosomes, XX in females and XY in males).
The sequenced human genome is about 3,200 Mb long and covers 23 chromosomes (the 22 autosomes and the X chromosome) plus the Y chromosome.
The human genome sequence was obtained from a mixture of genomes from several people, so that it would be representative of humanity as a whole.
PARADOXES WE FIND IN GENOMES
A paradox is a statement that, despite apparently sound reasoning from true premises, leads to a self-contradictory or a logically unacceptable conclusion. In genomes we find two clear paradoxes.
The first one concerns the C-value, which is the amount of DNA in the genome. We would expect larger and more complex organisms to have larger genomes. However, no such correlation exists. The reason is that the genome contains not only protein-coding genes but also repetitive DNA. In addition, the most compact genomes are found in less complex organisms.
The second paradox concerns the G-value, which is the number of genes. There is no correlation between the number of genes an organism has and its complexity. A clear example is that the human genome has around 20,000 genes, while Arabidopsis thaliana (a herbaceous plant) has about 25,000. The explanation lies in the RNA world, which is more complex than the gene count suggests and is related to gene regulation.
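The two paradoxes can be illustrated with a small calculation. The sketch below is only an illustration: the gene counts are the ones quoted above, while the genome sizes (about 3,200 Mb for human and about 135 Mb for Arabidopsis thaliana) are rounded, commonly cited values that are assumed here, and the names `genomes` and `genes_per_mb` are made up for the example.

```python
# Approximate figures. Gene counts are the ones quoted in the text;
# genome sizes are rounded, commonly cited values (an assumption here).
genomes = {
    "Homo sapiens": {"size_mb": 3200, "genes": 20000},
    "Arabidopsis thaliana": {"size_mb": 135, "genes": 25000},
}


def genes_per_mb(size_mb, genes):
    """Return the gene density in genes per megabase."""
    return genes / size_mb


# Gene density for each organism.
densities = {name: genes_per_mb(**values) for name, values in genomes.items()}

for name, density in densities.items():
    print(f"{name}: {density:.2f} genes per Mb")
```

With these figures the plant has a much smaller genome than ours but a higher gene count and a far higher gene density, which is consistent with much of the human genome being non-coding and repetitive DNA.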
THE HUMAN GENOME PROJECT (HGP)
The Human Genome Project has been the most important biomedical research project in history. It had a budget of 3 billion dollars and was carried out by an International Public Consortium formed by the United States, the United Kingdom, Japan, France, Germany, China and other countries. Its ultimate objective was to obtain the complete sequence of the human genome.
It started in 1990, but things got complicated when, in 1998, a private company appeared: Celera Genomics, headed by the scientist J. Craig Venter, who launched the challenge of obtaining the human sequence in record time, ahead of the Public Consortium's schedule.
In the end the race was declared a draw. The Public Consortium sped up its work and obtained its draft at almost the same time. On 26 June 2000, at a ceremony at the White House with President Bill Clinton, the leading representatives of the two competing parties met: Craig Venter for Celera and Francis Collins, director of the Public Consortium. They announced the completion of two drafts of the complete human genome sequence (Video 1). It was a historic moment, comparable to the discovery of the double helix or the first Moon landing.
Video 1. Human Genome announcement at the White House (Source: YouTube)
The corresponding publications did not appear until February 2001. The Public Consortium published its sequence in the journal Nature, while Celera published in Science (Figure 1). Three years later, in 2004, the Consortium published the finished version of the human genome.
The 2001 genome is the reference genome. Since then we have entered the era of personal genomes, genomes with a first name and a surname. Craig Venter was the first person to have his own genome sequenced, and the next was James Watson, one of the discoverers of the double helix.
It took 13 years to sequence the reference genome. Sequencing Craig Venter’s genome took less time, and Watson’s took only a few months.
CLINICAL APPLICATIONS OF SEQUENCING
Disease-causing genes have been identified without sequencing the entire genome, by sequencing only the exome. An exome is not the whole genome, but the part of the genome that corresponds to exons.
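The idea that the exome is only the exonic part of the genome can be sketched with a toy gene. Everything in this example is invented for illustration: the sequence, the exon coordinates and the helper name `extract_exonic` are assumptions, not real annotation data.

```python
# Toy gene: uppercase letters mark exons, lowercase letters mark introns.
# The sequence and the coordinates are invented for illustration.
gene = "ATGGCCgtaagtttcagGATTGCgtgagtcctagTTCTAA"

# Exons as half-open (start, end) intervals, 0-based, as in Python slicing.
exons = [(0, 6), (17, 23), (34, 40)]


def extract_exonic(sequence, exon_intervals):
    """Concatenate the exon slices of a sequence, skipping the introns."""
    return "".join(sequence[start:end] for start, end in exon_intervals)


exonic = extract_exonic(gene, exons)
fraction = len(exonic) / len(gene)

print(exonic)
print(f"{fraction:.0%} of the toy gene is exonic")
```

In this toy gene almost half of the sequence is exonic. In the real human genome the exome is a far smaller share, commonly cited as roughly 1 to 2% of the total, which is why sequencing only the exome is so much cheaper than sequencing the whole genome.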
An example is the case of Nicholas Volker (Figure 2), regarded as the first case of genomic medicine. This child had a severe, intractable inflammatory bowel disease of unknown cause. Exome sequencing revealed a mutation in the XIAP gene on the X chromosome that replaced a functionally important amino acid with another. A bone marrow transplant saved the patient's life.
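A mutation that replaces one amino acid with another is called a missense mutation. The following sketch shows the principle on a toy coding sequence. It is not the actual XIAP variant: the sequence, the position of the change and the names `CODON_TABLE`, `translate` and `changes` are assumptions made for the example, and the codon table contains only the codons the example needs.

```python
# Partial codon table (standard genetic code), limited to the codons
# used in this toy example.
CODON_TABLE = {"ATG": "M", "TGT": "C", "TAT": "Y", "GAA": "E", "TAA": "*"}


def translate(dna):
    """Translate a coding DNA sequence codon by codon ('*' marks a stop)."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# Toy reference sequence: ATG TGT GAA TAA.
reference = "ATGTGTGAATAA"

# Single nucleotide change G>A at index 4, turning codon TGT into TAT.
variant = reference[:4] + "A" + reference[5:]

ref_protein = translate(reference)
var_protein = translate(variant)

# List the amino acid positions (1-based) where the two proteins differ.
changes = [
    (position, ref_aa, var_aa)
    for position, (ref_aa, var_aa) in enumerate(zip(ref_protein, var_protein), start=1)
    if ref_aa != var_aa
]

print(ref_protein, var_protein, changes)
```

A single changed nucleotide is enough to swap a cysteine (C) for a tyrosine (Y) in the toy protein, while the rest of the protein stays the same. Comparing a patient's exome against the reference genome looks for differences of this kind.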