A cloning factory - an incredible notion borrowed straight from science fiction. But here in Shenzhen, in what was an old shoe factory, this rising power is creating a new industry.
The scale of ambition is staggering. BGI is not only the world’s largest centre for cloning pigs - it’s also the world’s largest centre for gene sequencing.
In neighbouring buildings, there are rows of gene sequencers - machines the size of fridges operating 24 hours a day crunching through the codes for life.
To illustrate the scale of this operation, Europe’s largest gene sequencing centre is the Wellcome Trust Sanger Institute near Cambridge. It has 30 machines. BGI has 156 and has even bought an American company that makes them.
China is on a trajectory that will see it emerging as a giant of science: it has a robotic rover on the Moon, it holds the honour of having the world’s fastest supercomputer and BGI offers a glimpse of what industrial scale could bring to the future of biology.
But of course the big goal for us is to understand the genes and the regions of DNA that evolved to make a passenger pigeon the bird that it is, and to begin recreating those elements in a living pigeon genome, the band-tailed pigeon.
This week’s diamond jubilee of the discovery of DNA’s molecular structure rightly celebrates how Francis Crick, James Watson and their collaborators launched the ‘genomic age’ by revealing how hereditary information is encoded in the double helix. Yet the conventional narrative — in which their 1953 Nature paper led inexorably to the Human Genome Project and the dawn of personalized medicine — is as misleading as the popular narrative of gene function itself, in which the DNA sequence is translated into proteins and ultimately into an organism’s observable characteristics, or phenotype.
Sixty years on, the very definition of ‘gene’ is hotly debated. We do not know what most of our DNA does, nor how, or to what extent, it governs traits. In other words, we do not fully understand how evolution works at the molecular level.
That sounds to me like an extraordinarily exciting state of affairs, comparable perhaps to the disruptive discovery in cosmology in 1998 that the expansion of the Universe is accelerating rather than decelerating, as astronomers had believed since the late 1920s. Yet, while specialists debate what the latest findings mean, the rhetoric of popular discussions of DNA, genomics and evolution remains largely unchanged, and the public continues to be fed assurances that DNA is as solipsistic a blueprint as ever.
The more complex picture now emerging raises difficult questions that this outsider knows he can barely discern. But I can tell that the usual tidy tale of how ‘DNA makes RNA makes protein’ is sanitized to the point of distortion. Instead of occasional, muted confessions from genomics boosters and popularizers of evolution that the story has turned out to be a little more complex, there should be a bolder admission — indeed a celebration — of the known unknowns.
by Philip Ball
Image by Andrew Rae
Ball P. (2013). DNA: Celebrate the unknowns, Nature, 496 (7446) 419-420. DOI: 10.1038/496419a
Yonggang Ke, Luvena L. Ong, William M. Shih and Peng Yin described an interesting brick model for DNA that is analogous to LEGO® brick structures:
We describe a simple and robust method to construct complex three-dimensional (3D) structures by using short synthetic DNA strands that we call “DNA bricks.” In one-step annealing reactions, bricks with hundreds of distinct sequences self-assemble into prescribed 3D shapes. Each 32-nucleotide brick is a modular component; it binds to four local neighbors and can be removed or added independently. Each 8–base pair interaction between bricks defines a voxel with dimensions of 2.5 by 2.5 by 2.7 nanometers, and a master brick collection defines a “molecular canvas” with dimensions of 10 by 10 by 10 voxels. By selecting subsets of bricks from this canvas, we constructed a panel of 102 distinct shapes exhibiting sophisticated surface features, as well as intricate interior cavities and tunnels.
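To get a feel for the numbers in the abstract, here is a minimal sketch that works only at the voxel level. It takes the voxel size (2.5 by 2.5 by 2.7 nanometers) and the 10 by 10 by 10 canvas from the text; the function names and the hollow-cube example shape are my own illustrative assumptions, not anything from the paper, and the sketch says nothing about which DNA strands would actually be needed.

```python
from itertools import product

# Figures quoted in the abstract.
VOXEL_NM = (2.5, 2.5, 2.7)      # size of one voxel in nanometers
CANVAS_VOXELS = (10, 10, 10)    # the "molecular canvas"
BRICK_NT = 32                   # nucleotides per brick
DOMAIN_NT = 8                   # nucleotides per binding region


def canvas_size_nm(voxels=CANVAS_VOXELS, voxel_nm=VOXEL_NM):
    """Physical size of the canvas along each axis, in nanometers."""
    return tuple(n * d for n, d in zip(voxels, voxel_nm))


def all_voxels(voxels=CANVAS_VOXELS):
    """Every voxel coordinate on the canvas."""
    return set(product(*(range(n) for n in voxels)))


def carve(canvas, remove):
    """Hypothetical helper: a shape is the canvas minus the removed voxels."""
    return {v for v in canvas if not remove(v)}


# Illustrative shape (an assumption): a cube with a 4 x 4 x 4 interior cavity.
def in_cavity(voxel):
    return all(3 <= c <= 6 for c in voxel)


canvas = all_voxels()
hollow_cube = carve(canvas, in_cavity)

print("canvas size (nm):", canvas_size_nm())
print("voxels on canvas:", len(canvas))
print("voxels in hollow cube:", len(hollow_cube))
print("binding regions per brick:", BRICK_NT // DOMAIN_NT)
```

The point of the sketch is only the idea of selecting a subset from a fixed canvas: every shape in the paper is carved out of the same master collection, much as the cavity is carved out of the cube here.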
The first picture is taken from the accompanying popular-science article:
(A) A DNA brick consists of four regions of 8 nucleotides each and corresponds to a two-stud LEGO brick. Half–DNA-bricks corresponding to one-stud LEGO bricks are used for edges. DNA bricks are connected by an 8–base pair hybrid, causing a 90° shift between two layers. (B) Ke et al. used one- and two-stud bricks (represented by the LEGO bricks in the blue frame) to assemble a 10 by 10 by 10 voxel cuboid (C). With subsets of the bricks used for the cuboid, the authors also assembled many other shapes, such as a space shuttle–like structure, shown both as a LEGO (D) and DNA model (E). The extra bricks in the red-framed section in (B) are the boundary and protector bricks required for formation of the space shuttle structure.
The second image is taken from the research paper:
Ke Y., Ong L.L., Shih W.M. & Yin P. (2012). Three-Dimensional Structures Self-Assembled from DNA Bricks, Science, 338 (6111) 1177-1183. DOI: 10.1126/science.1227268
Artists have made impressive works of art with some pretty bizarre materials, but this set of minuscule characters blows them all away. The 107 letters, numbers and pictures – including some pretty silly smileys – were created by researchers at Harvard University from the building blocks of life: DNA. Each finished character is just 64 nanometers by 103 nanometers.
The researchers created these impressive little characters by effectively “unzipping” the familiar double helix shape of DNA. This turned them into a sort of ultra-tiny building block that was then used to arrange strands of DNA into specific shapes. The result is fun and kind of silly, but there is a very serious side to this project. It could be used to create molecular-scale medications and other devices that could dramatically enhance science and medicine.
Epigenetics, literally translated, means “above the genome.” It’s the study of how different environmental factors can alter the way our genetic code is expressed. Interested? Watch this episode of NOVA all about it!
a, Ideograms of the 12 pseudochromosomes of potato (in Mb scales). Each of the 12 pachytene chromosomes from DM was digitally aligned with the ideogram (the amount of DNA in each unit of the pachytene chromosomes is not in proportion to the scales of the pseudochromosomes). b, Gene density represented as number of genes per Mb (non-overlapping, window size = 1 Mb). c, Percentage of coverage of repetitive sequences (non-overlapping windows, window size = 1 Mb). d, Transcription state. The transcription level for each gene was estimated by averaging the fragments per kb exon model per million mapped reads (FPKM) from different tissues in non-overlapping 1-Mb windows. e, GC content was estimated by the per cent G+C in 1-Mb non-overlapping windows. f, Distribution of the subtelomeric repeat sequence CL14_cons.
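Panel e of the caption describes GC content as the per cent G+C in non-overlapping 1-Mb windows. A minimal sketch of that calculation could look like the following; the function name is hypothetical, and skipping ambiguous bases such as N is my own assumption rather than something the caption states.

```python
def gc_percent_windows(sequence, window_size=1_000_000):
    """Per cent G+C in consecutive non-overlapping windows.

    Only A, C, G and T are counted (an assumption); a window with no
    such bases yields None. The last window may be shorter than the rest.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    seq = sequence.upper()
    percentages = []
    for start in range(0, len(seq), window_size):
        chunk = seq[start:start + window_size]
        gc = chunk.count("G") + chunk.count("C")
        at = chunk.count("A") + chunk.count("T")
        called = gc + at
        percentages.append(100.0 * gc / called if called else None)
    return percentages


# Toy example with 4-base windows instead of 1 Mb.
print(gc_percent_windows("GGCCATATGCAT", window_size=4))
```

The same windowing loop would serve for panels b and c (gene density and repeat coverage), with a different quantity counted inside each window.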
The paper can be summarized using the figures published in it:
Comparative analyses and evolution of the potato genome
a, Clusters of orthologous and paralogous gene families in 12 plant species as identified by OrthoMCL. Gene family number is listed in each of the components; the number of genes within the families for all of the species within the component is noted within parentheses. b, Genome duplication in dicot genomes as revealed through 4DTv analyses. c, Syntenic blocks between A. thaliana, potato, and V. vinifera (grape) demonstrating a high degree of conserved gene order between these taxa.
Haplotype diversity and inbreeding depression
a, Plants and tubers of DM and RH showing that RH has greater vigour. b, Illumina K-mer volume histograms of DM and RH. The volume of K-mers (y-axis) is plotted against the frequency at which they occur (x-axis). The leftmost truncated peaks at low frequency and high volume represent K-mers containing essentially random sequencing errors, whereas the distribution to the right represents proper (putatively error-free) data. In contrast to the single modality of DM, RH exhibits clear bi-modality caused by heterozygosity. c, Genomic distribution of premature stop, frameshift and presence/absence variation mutations contributing to inbreeding depression. The hypothetical RH pseudomolecules were solely inferred from the corresponding DM ones. Owing to the inability to assign heterozygous PS and FS of RH to a definite haplotype, all heterozygous PS and FS were arbitrarily mapped to the left haplotype of RH. d, A zoom-in comparative view of the DM and RH genomes. The left and right alignments are derived from the euchromatic and heterochromatic regions of chromosome 5, respectively. Most of the gene annotations, including PS and RH-specific genes, are supported by transcript data.
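The K-mer histogram in panel b can be sketched in a few lines. Here I assume that the "volume" at a given frequency is the number of distinct K-mers seen that many times, multiplied by that frequency; the caption does not spell this out, and the function names are hypothetical.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every K-mer (substring of length k) across all reads."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def kmer_volume_histogram(reads, k):
    """Map each frequency to its K-mer volume.

    Volume is taken here (an assumption) as
    frequency * number of distinct K-mers with that frequency.
    """
    distinct_by_freq = Counter(count_kmers(reads, k).values())
    return {freq: freq * n for freq, n in sorted(distinct_by_freq.items())}


# Toy example: one short read, K = 4.
print(kmer_volume_histogram(["ACGTACGT"], k=4))
```

On real data, sequencing errors pile up at very low frequencies, which is the truncated left peak the caption mentions. A heterozygous genome such as RH would be expected to show two peaks to the right of it, since K-mers present on only one haplotype occur at roughly half the frequency of those shared by both.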
Gene expression of selected tissues and genes
a, KTI gene organization across the potato genome. Black arrows indicate the location of individual genes on six scaffolds located on four chromosomes. b, Phylogenetic tree and KTI gene expression heat map. The KTI genes were clustered using all potato and tomato genes available with the Populus KTI gene as an out-group. The tissue specificity of individual members of the highly expanded potato gene family is shown in the heat map. Expression levels are indicated by shades of red, where white indicates no expression or lack of data for tomato and poplar. c, A model of starch synthesis showing enzyme activities is shown on the left. AGPase, ADP-glucose pyrophosphorylase; F16BP, fructose-1,6-biphosphatase; HexK, hexokinase; INV, invertase; PFK, phosphofructokinase; PFPP, pyrophosphate-fructose-6-phosphate-1-phosphotransferase; PGI, phosphoglucose isomerase; PGM, phosphoglucomutase; SBE, starch branching enzyme; SP, starch phosphorylase; SPP, sucrose phosphate phosphatase; SS, starch synthase; SuSy, sucrose synthase; SUPS, sucrose phosphate synthase; UDP-GPP, UDP-glucose pyrophosphorylase. The grey background denotes substrate (sucrose) and product (starch) and the red background indicates genes that are specifically upregulated in RH versus DM. On the right, a heat map of the genes involved in carbohydrate metabolism is shown. ADP-glucose pyrophosphorylase large subunit, AGPase (l); ADP-glucose pyrophosphorylase small subunit, AGPase (s); ADP-glucose pyrophosphorylase small subunit 3, AGPase 3 (s); cytosolic fructose-1,6-biphosphatase, F16BP (c); granule bound starch synthase, GBSS; leaf type L starch phosphorylase, Leaf type SP; plastidic phosphoglucomutase, pPGM; starch branching enzyme II, SBE II; soluble starch synthase, SSS; starch synthase V, SSV; three variants of plastidic aldolase, PA.
Finally, I quote the paper’s future directions:
Given the pivotal role of potato in world food production and security, the potato genome provides a new resource for use in breeding. Many traits of interest to plant breeders are quantitative in nature and the genome sequence will simplify both their characterization and deployment in cultivars. Whereas much genetic research is conducted at the diploid level in potato, almost all potato cultivars are tetraploid and most breeding is conducted in tetraploid material. Hence, the development of experimental and computational methods for routine and informative high-resolution genetic characterization of polyploids remains an important goal for the realization of many of the potential benefits of the potato genome sequence.