Welcome to GeneticCodes

The genetic code

The genetic code is the rule of correspondence between nucleic acid and protein sequences. It defines how genes are read in an organism to produce a functional response. In nature, genes can be exchanged between species, so that the functions of certain genes can be reproduced in a non-native organism. Since most organisms read genes in a universal manner, genetic information can travel across the biosphere in a natural way. The transfer of genes from one organism to another can also be carried out in biochemical laboratories. But what if the genes were read differently? Indeed, a few genetic codes allow for a somewhat different read-out of genes. Understanding these differences is very important from a scientific perspective. Is it likely that genes read in a different manner would still produce a functional response?

Science is about numbers, and addressing this question scientifically requires a numerical approach. The purpose of this web site is to provide a database that allows numerical comparison between genetic codes, whether natural, artificial, or even purely hypothetical.

This website is a resource that is meant to assist genetic code research by providing a database, a platform, and a tool at the same time.

1) Database

The web site hosts a user-generated database for genetic codes and amino acids, encompassing:

  • The standard genetic code.
  • Other natural codes found in organisms and organelles.
  • Genetic codes engineered in synthetic biology labs.
  • Speculative or theoretical codes not (yet) implemented in any organism.

The list of genetic codes is not static, as more and more natural codes are being discovered and new engineered codes are being made in the lab. To keep up with the growing number of codes, registered users can add new genetic codes to the database.

The standard code uses triplets of four nucleotide bases. Each triplet (codon) specifies an amino acid or a start or stop signal. In principle, genetic codes are possible whose codons consist not of three but of (one,) two, four, or even more nucleotide bases. Recent work on xeno nucleic acids (XNA) has shown that it should be possible to use a set of bases different from the four bases used in the standard code (AGCT in DNA, or AGCU in RNA). In this database, at least for now, we only include genetic codes with triplets and the four naturally occurring bases.
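As an illustration of the triplet structure described above, the following minimal Python sketch enumerates all 4^3 = 64 codons over the four DNA bases and pairs them with the standard code. The compact one-letter-per-codon layout (base order TCAG, `*` for stop) is a common table convention; the names `BASES`, `STANDARD_CODE`, and `translate` are illustrative assumptions and are not part of this web site.

```python
from itertools import product

# DNA alphabet in the base order used by the common compact table layout.
BASES = "TCAG"

# Standard genetic code, one letter per codon; '*' marks a stop codon.
STANDARD_AA = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# All triplets over four bases: TTT, TTC, TTA, TTG, TCT, ...
CODONS = ["".join(triplet) for triplet in product(BASES, repeat=3)]

# Mapping codon -> amino acid (hypothetical helper structure for this sketch).
STANDARD_CODE = dict(zip(CODONS, STANDARD_AA))


def translate(dna, code=STANDARD_CODE):
    """Translate a DNA string codon by codon, stopping at the first stop codon."""
    protein = []
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    for i in range(0, usable, 3):
        amino_acid = code[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(len(CODONS))                # 64
print(translate("ATGGCCTGGTAA"))  # MAW
```

Changing `repeat=3` to another codon length n gives 4^n possible codons, which is why doublet or quadruplet codes would have 16 or 256 codons respectively.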

The standard genetic code uses 20 (or 22, counting selenocysteine and pyrrolysine) canonical (coded, proteinogenic) amino acids to construct proteins. It has been demonstrated that the genetic code can be modified so that it also accepts non-canonical (non-coded, non-proteinogenic) amino acids. This database contains the 20 canonical amino acids and a larger number of non-canonical amino acids for constructing novel codes. Users can add further non-canonical amino acids to the database and use them in new genetic codes.

2) Platform

The web site provides a platform to measure the (dis)similarity between any two genetic codes.

The metric used to measure the (dis)similarity between two codes is ∆code, as described in (Schmidt and Kubyshkin 2021). The ∆code value between each genetic code and every other genetic code is calculated and can be viewed either in the interactive 3D visualization (a 3D projection of a higher-dimensional metric space) or by downloading the ∆code matrix in the download section.
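To make the idea of a pairwise distance matrix concrete, here is a minimal Python sketch. It does not implement ∆code, whose definition is given in the cited paper and is not reproduced here. As a deliberately simplified stand-in, it counts the codons that two codes assign differently. The function name `codon_mismatches` and the `TOY` code (the TAG stop codon reassigned to a placeholder non-canonical amino acid `X`) are hypothetical, introduced only for illustration.

```python
from itertools import product

BASES = "TCAG"
CODONS = ["".join(triplet) for triplet in product(BASES, repeat=3)]

# Standard genetic code, one letter per codon; '*' marks a stop codon.
STANDARD = dict(zip(CODONS, (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)))

# Vertebrate mitochondrial code: four codons are read differently.
VERT_MITO = {**STANDARD, "AGA": "*", "AGG": "*", "ATA": "M", "TGA": "W"}

# Hypothetical toy code: TAG reassigned to a placeholder amino acid "X".
TOY = {**STANDARD, "TAG": "X"}


def codon_mismatches(code_a, code_b):
    """Count codons with different assignments (NOT the ∆code metric)."""
    return sum(code_a[codon] != code_b[codon] for codon in CODONS)


codes = {"standard": STANDARD, "vert_mito": VERT_MITO, "toy": TOY}
names = list(codes)

# Symmetric pairwise matrix with zeros on the diagonal.
matrix = [[codon_mismatches(codes[r], codes[c]) for c in names] for r in names]

for name, row in zip(names, matrix):
    print(f"{name:10s}", row)
```

A mismatch count treats every reassignment as equally severe. A metric intended to reflect functional consequences would presumably weight each difference by how dissimilar the amino acids involved are, which this sketch makes no attempt to do.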

3) Tool

The site contains tools to support the design, build, test, learn (DBTL) cycle for future genetic code development.

Alternative genetic codes have a number of potential uses, ranging from basic research (e.g. assessing the contribution of horizontal gene flow to evolutionary success) to engineering solutions, such as organisms that cannot be infected by bacteriophages or other viruses, or novel biosafety mechanisms that inhibit horizontal gene flow to other organisms.

Which genetic code would be the most useful to design and build experimentally? How far away should the newly designed code be in relation to other codes? These questions require a metric to measure the distance between existing and potential genetic codes, and this is exactly what the data on this website seeks to provide.

Figure 1

Image - based on Flammarion engraving - by Markus Schmidt and Ege Kökel.

Creative Commons License
This work is licensed under a
Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License