One of the greatest acts of exploration the world has ever seen took 13 years to complete. And yet it discovered no new lands, nor did it scour the depths of oceans or search the vastness of space. This was an exploration that focused on the very essence of who we, as humans, are.
The Human Genome Project was not as dazzling as the Apollo mission to the moon in 1969, or as earth-conquering as the first summiting of Mt Everest in 1953, but it was one of the most important pieces of research our species has ever undertaken. The project to map the human genome, our complete set of DNA, began in October 1990 and was completed in April 2003. And for the very first time, we were able to look at the complete Homo sapiens blueprint.
Even 18 years on it remains one of the largest collaborative biological projects the world has ever seen, with 20 separate institutions across 4 continents taking part – this was very much a team achievement. But really, the mapping of our genome was just the beginning. Now that we have the blueprint, we can further our understanding of topics ranging from molecular medicine to human evolution. The Human Genome Project itself didn’t change the world, but in years to come, we may well look back on it as the start of something very special.
The Human Genome
Let’s start with the human genome. Every organism has a genome, which is a complete set of deoxyribonucleic acid (DNA), a chemical compound containing genetic information and instructions to develop and direct activities within that living organism.
You may well be familiar with the image of the double helix, and even if you don’t know exactly what it is, you’ll probably associate it with DNA. The double helix represents the structure of a DNA molecule composed of two twisting, paired strands. These strands are built from four chemical units, called nucleotide bases: adenine (A), thymine (T), guanine (G) and cytosine (C). Each base always pairs with a base on the opposite strand, and always the same one, so an A always pairs with a T, and a C always with a G. The pairs are held together by hydrogen bonds and set around the sugar-phosphate backbone which forms the structural framework of the molecule.
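The pairing rule is simple enough to write down as a lookup table. Here is a minimal sketch in Python; the names `PAIRS` and `complement` are made up for illustration, and the sketch ignores the direction in which each strand is read.

```python
# Pairing rules from the text: A always pairs with T, C always pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base sitting opposite each base of `strand`."""
    return "".join(PAIRS[base] for base in strand)


print(complement("ATGC"))
```

Because each base has exactly one partner, knowing one strand is enough to work out the other.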
We are of course talking about minute sizes, far smaller than the human eye could ever hope to see. One study focusing on human DNA found that the chain measured 2.2 to 2.6 nanometres wide, and one nucleotide unit measured 0.33 nanometres in length. To put nanometres in some kind of perspective, a banana has a diameter of roughly 40,000,000 nanometres – around 4 cm (1.5 inches). However, if you could stretch the DNA in one cell as far as it could go, it would measure roughly 2 metres (6.5 feet), while all the DNA in all our cells combined would be about twice the diameter of the Solar System – that’s around 574 billion km (356 billion miles).
The human genome has just over 3 billion of these base pairs, which are spread across the 23 pairs of chromosomes found in our cells. This DNA contains all of the biological, evolutionary and genetic instructions for development, functioning, growth and reproduction. Every single living being has it, and so do many viruses. This is essentially a map of living beings, where they’ve come from, who they are, and where they might be going.
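As a rough sanity check on the two-metre figure, we can multiply the numbers already quoted. This is a back-of-the-envelope sketch, not a measurement: it rounds the genome to 3 billion base pairs and assumes each cell carries two copies (one from each parent), which is my assumption rather than something the figures above state.

```python
NM_PER_BASE_PAIR = 0.33                 # length of one nucleotide unit, from the text
BASE_PAIRS_PER_GENOME = 3_000_000_000   # "just over 3 billion", rounded down
COPIES_PER_CELL = 2                     # assumption: one genome copy from each parent


def stretched_length_m(base_pairs: int, copies: int = COPIES_PER_CELL) -> float:
    """Length in metres of all the DNA in one cell laid end to end."""
    nanometres = base_pairs * copies * NM_PER_BASE_PAIR
    return nanometres * 1e-9  # 1 nanometre = one billionth of a metre


print(round(stretched_length_m(BASE_PAIRS_PER_GENOME), 2))
```

Under those assumptions the answer comes out just under 2 metres, which lines up nicely with the figure above.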
A Swiss physician by the name of Friedrich Miescher was the first to isolate DNA back in 1869. As with many early groundbreaking medical findings, there was a slice of luck involved, with Miescher first sighting the microscopic substance in the pus of discarded surgical bandages.
Our understanding of DNA grew steadily, but slowly, over the next 80 or so years, culminating in 1953 with the first accurate model of DNA, the famed double helix, set out by Francis Crick and James Watson. The story goes that on 28th February 1953, Crick stood proudly in the centre of the Eagle Pub in Cambridge at lunchtime to announce that he and Watson had “discovered the secret of life”. Nearly a decade later, they and a colleague, Maurice Wilkins, won the Nobel Prize in Physiology or Medicine for their groundbreaking work.
It wasn’t until 1985 that serious discussion surrounding the sequencing of DNA began. Much of this originated in the United States, which came with advantages and disadvantages in the early stages. Having the financial clout of the only real remaining superpower on board was vital. I know the Soviet Union didn’t collapse for another six years, but let’s be honest, at this point, it was in free fall and more concerned with an unwinnable war in Afghanistan than human genetics.
But the U.S. agencies initially involved, the Department of Energy and the National Institutes of Health, needed to convince a sceptical public and indeed many sceptics in government about the importance of sequencing DNA.
As soon as financial estimates emerged that included the word ‘billions’, many outside the scientific community quickly labelled it as a waste of time and money. For many, it was difficult to see the immediate benefit and considering that 18 years after the completion of the sequencing we’ve still only begun to scratch the surface of what might be achievable, you can certainly see their reasoning. But as I mentioned earlier, the Human Genome Project was never designed as a short-term wonder achievement, but rather the keys to the much larger Pandora’s Box that is human DNA.
Then there was the touchy topic of genetic manipulation and genetic discrimination, both of which were still a long way off considering the mapping hadn’t even begun. But they were points raised nonetheless, and it wasn’t until 1986, after a long and difficult process, that the pieces of this extraordinary biological project began to fall into place.
The Human Genome Project
I did say “began” to fall into place, because it was another four years until the project was officially announced by the Department of Energy and the National Institutes of Health as a $3 billion project (adjusted for inflation it comes to around $5 billion). This included almost unparalleled collaboration around the world involving geneticists in the USA, the UK, France, Germany, Australia, China and Japan.
The project itself was divided into two phases, the shotgun phase and the finishing phase – you can probably guess which order they came in.
The shotgun phase had three different steps:
- Obtaining a DNA clone to sequence
- Sequencing the DNA clone
- Assembling sequence data from multiple clones to determine overlap and establish a contiguous sequence
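To give a feel for that third step, here is a toy sketch of stitching overlapping reads back together. It uses a simple greedy strategy (always merge the two pieces with the biggest overlap), which I've picked for illustration and which is not necessarily how the project's own assembly software worked. The function names and the tiny reads are made up.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of pieces with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            return contigs  # nothing left to merge; any gaps stay as gaps
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Three made-up reads that overlap by four letters each.
print(greedy_assemble(["TTGATCGG", "TCGGCCAT", "CCATTA"]))
```

When reads don't overlap enough, the function returns more than one piece. Highly repetitive stretches are also a problem for this kind of approach, because many different reads can look like they overlap.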
Before any sequencing could begin, geneticists needed human DNA. Now, you might be wondering how exactly they were able to map human DNA. After all, aren’t we all unique human beings? Well, yes and no. Yes in that no two humans are exactly alike (even identical twins will have tiny variants), but considering we all share 99.9% of the same DNA, we can broadly say we’re pretty much the same. The project used the DNA from 4 or 5 anonymous donors, so we will never know who the Human Genome Project is technically based on. It’s probably easier to just say that it’s all of us.
Firstly, geneticists divided human chromosomes into DNA segments of an appropriate size. These were then subdivided into even smaller fragments that overlapped with one another. And here is where the long arduous process of sequencing DNA really begins.
Sequencing involves a process known as electrophoresis which separates pieces of DNA that differ in length by only one base. This is done by placing the sample DNA onto a gelatin-like substance and placing electrodes at either end. When an electrical current is applied, it causes the DNA molecules to move through the gel. The smaller the molecules the faster they move, so this process separates the bands according to their size.
Unfortunately, this is a painstakingly slow process as electrophoresis can only separate about 500 bases into clear bands, which explains why the DNA needs to be chopped up so small, and probably why the project took 13 years to complete. And that’s with the help of machines. It’s estimated that a human doing this work might be able to produce a finished sequence of 20,000 to 50,000 bases in a single year, whereas a machine could do it in just a few hours.
The latest advancement in sequencing machines involves the fully automated Capillary Sequencers which use a robotic arm to do most of the work that humans once did.
This is an incredibly complex process that I could talk for hours about, but most likely many would simply begin to doze off. But I’ll finish with one more point before we move on to talk about the finishing stage. These machines that sequence DNA can’t actually see the DNA directly. To remedy this, geneticists need to use fluorescent dyes which correspond to the four different DNA bases.
But before any electrophoresis races can even begin, the DNA is first copied several times and divided into four batches. These batches are then copied again but with a small amount of chemically modified base added to each batch, so each batch contains either modified T, A, C or G. Whenever one of these modified bases is added to a growing chain, the chain stops growing, which leaves each batch full of DNA fragments that all end in the same DNA letter.
In the second round of copying, fluorescent dyes are added, blue, red, yellow and green, which all correspond to a DNA letter. The four batches are then sent through a sequencing machine and as they emerge and move through the gel, a small laser illuminates the molecules and their colour. This is then read by the machine which begins to match colours and DNA letters until it has a rough draft sequence of that particular part of DNA.
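The logic of reading the letters off the gel can be sketched in a few lines. This is a deliberately simplified toy model: it pretends we get exactly one fragment stopping at every position, it ignores the fact that the copied strand is really the complement of the original, and the colour-to-letter mapping in `DYES` is an assumption of mine, since any consistent mapping works.

```python
# Assumed mapping of dye colours to letters; only consistency matters.
DYES = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
LETTER_FOR_DYE = {colour: base for base, colour in DYES.items()}


def terminated_fragments(template: str) -> list[tuple[int, str]]:
    """One fragment per position: (fragment length, dye colour of its last base)."""
    return [(end, DYES[template[end - 1]]) for end in range(1, len(template) + 1)]


def read_gel(fragments: list[tuple[int, str]]) -> str:
    """Shortest fragments move fastest, so read the colours in order of length."""
    ordered = sorted(fragments, key=lambda fragment: fragment[0])
    return "".join(LETTER_FOR_DYE[colour] for _, colour in ordered)


fragments = terminated_fragments("TTGATCGG")
print(read_gel(fragments))
```

The key idea is that the gel sorts fragments by length, and the colour at the end of each fragment tells the machine which letter sits at that position.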
And what does it look like? Well, far less visually interesting than you might think because it’s basically just the same four letters. But it could be something like – TTGATCGGCCATTA.
The Finishing Phase
If you thought the Shotgun Phase sounded long and arduous, the finishing phase can often take even longer. The first stage of this process provided a huge amount of hard data, but with plenty of gaps, 147,821 to be exact, and even mistakes that may have arisen due to machine or human error. The finishing phase focused on filling in these gaps, while also correcting any obvious errors.
This was done by 16 different genome centres around the world and took several years to complete. This resulted in 99% of the human genome being mapped in its final form containing roughly 3 billion nucleotides and only 341 gaps. And these gaps remained for a while. For the purists out there, this might leave a nagging sense of incompletion, but essentially the technology used to sequence the DNA could not always be used effectively, especially in highly repetitive sequences. We have one sequence inside our chromosomes’ centromeres that is 171 letters long and repeats over and over, which scientists often describe as being like a stutter. We don’t really know what it is or why it repeats, and we certainly didn’t have the technology to sequence it accurately back in the 1990s. Things have improved now, and as of 2015, the number of gaps was down to 160.
A rough draft of the sequencing project was unveiled jointly by President Bill Clinton and Prime Minister Tony Blair in 2000, but the project wasn’t officially declared complete until April 2003. Back in March 2000, President Clinton had declared that the gene information gathered would remain patent-free, which no doubt was music to the ears of those who feared a chaotic genetic free-for-all, but bad news for Celera, a private company that had also been working on the sequencing at a much faster rate than the Human Genome Project (because it was using data taken from the project). Celera’s stock immediately plummeted and the biotechnology sector lost about $50 billion in market capitalization in just two days. If private companies were dreaming of making a killing off the mapping of the human genome, it wasn’t to be.
So after all of that sequencing and all of those years, what did we discover? Some of the most striking findings came with how alike we are to other animals and indeed things in our world. Remember how your teacher used to say we are essentially the same as a chimpanzee? Well, they were pretty spot on. Technically we’re 96% identical to our evolutionary cousins, which is quite amazing when you think about it. What’s also amazing is that we share 70% of our DNA with a slug – and 50% of us is the same as a banana. That’s right, we’re all half banana, or all bananas are half-human, whichever way you want to look at it.
It also found that significant disease can be caused by a single nucleotide change in a single gene, rather than on a larger scale as had been previously thought. This has hugely improved our understanding of the molecular mechanisms involved in human diseases.
The project also revealed that we have around 22,300 protein-coding genes inside of us and this is where much of the work has been done after the Human Genome Project. These protein-coding genes account for just 1% or so of our DNA, with the remaining 99% categorised as non-coding DNA. And you can probably guess which seems to be more important. Non-coding DNA does not provide instructions for making proteins and for a long time it was considered virtually useless. However, through the findings from the Human Genome Project, we’ve been able to paint a much clearer picture of this DNA and in particular how it helps to regulate protein-coding genes and often determines when and where genes are turned on and off.
And as I mentioned earlier, the project found that our genome has roughly three billion base pairs of DNA. Now, that’s a big number, but think of it like this: if you were to type eight hours a day at 60 words per minute, it would take around 50 years to type the whole human genome.
If people thought back in the 1980s that the successful sequencing of the human genome would immediately lead to a barrage of new discoveries and groundbreaking advancements in medicine, then they would have been severely disappointed.
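If you want to check the typing figure yourself, the arithmetic is easy to sketch. The five-letters-per-word convention and typing every single day of the year are my assumptions, not part of the original estimate.

```python
WORDS_PER_MINUTE = 60
CHARS_PER_WORD = 5      # assumption: the usual typing-test convention
HOURS_PER_DAY = 8
DAYS_PER_YEAR = 365     # assumption: no weekends or holidays off


def years_to_type(letters: int) -> float:
    """Years needed to type `letters` characters, one per base pair."""
    letters_per_day = WORDS_PER_MINUTE * CHARS_PER_WORD * 60 * HOURS_PER_DAY
    return letters / (letters_per_day * DAYS_PER_YEAR)


print(round(years_to_type(3_000_000_000)))
```

Under these assumptions three billion letters take a little under 60 years, which is in the same ballpark as the 50-year figure. Typing both copies of the genome in a cell, around six billion letters, would roughly double that.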
One point to be clear here is that much of the hype that came with the Human Genome Project decades ago came from the hyperactive media who often made grand claims that this was the first step to a major advancement of the human race. And there is still a good chance that that will be the case, but what many failed to point out is that it will likely take decades of work before any major revelations or life-changing findings appear.
While mapping the human genome was an extraordinary achievement, it is but the first step in a long journey. We now have a hugely complex blueprint of human DNA, but that doesn’t immediately equate to medical breakthroughs. If anything, the hard part is just beginning: we now have the map, we just need to understand what it all means.