Viruscraft: Genetic model connected to a tree visualisation

The genetic model we were working on previously has now been ported into a browser compatible form and connected to a new tree visualisation that displays the species that emerge as the host population adapts to a virus infection. It’s still a prototype with rough edges, but have a play with it here. Some example pics:


This is one of the earlier attempts, which I like the look of, but the later, cleaned up versions make it a bit clearer what is going on.



Firstly, the genetic part is working, in that the population evolves to get gradually better at coping with the virus infection (the fitness score increases). It takes a bit too long at the moment, but it’s great to be able to see this working in realtime, with new species branching off older ones as it happens.

One early observation is that this has the potential to show why diversity is beneficial. If you modify the virus (fitness function) at a point when there are lots of different species present, the chances are that a few of them will be resilient enough to the infection to expand into the new environmental niche, and things eventually continue as before. If you alter the virus when there is only a single really well evolved species that is a bit too good at coping with the existing virus – the chances are that you will cause the population to go extinct as it won’t be able to adapt. This is analogous to the situation with bananas: “To carry on growing the same genetic banana is stupid”.

A big chunk of the work here was actually spent optimising the code. It’s pretty amazing how well developed browsers are as development tools – I’ve been using the profiler in the Chromium browser to locate slowness and keep the frame rate as near to 60 frames per second as possible. I just noticed, while taking the screenshot for this post, that the slowest single part seems to be the debug text rendering, strangely enough.


Viruscraft: building a ‘reasonably accurate’ genetic game world simulation

The concept for the viruscraft game is to have a realtime genetic model or simulation of the host evolution which is adapting to the properties of a virus you are building (either on screen or via a tangible interface as part of an exhibit). This model needs to be realistic, but only up to a point – it can be more of a caricature of biology than a research model would need to be, as our intentions are educational rather than biological research.

Using our previous species prototype as a starting point, we have a network of connected locations that can be inhabited by organisms. These organisms can jump to neighbouring locations and be infected by others in the same place at the same time. Now we need to figure out how different species of these organisms could emerge over time that evolve immunity to a virus – so we can build up a family tree (phylogeny) similar to the ones we created for the egglab game but that is responding to the viruses that you create in realtime as you play. The evolution itself also has to happen fast enough that you can see effects of your actions ‘quickly enough’, but we’ll worry more about that later.
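As a rough illustration of the starting point, here is a minimal Python sketch of organisms hopping between connected locations. The three-node map, the `p_move` value and the function names are all made up for the example, and the real prototype may be organised quite differently.

```python
import random
from collections import defaultdict

# A tiny made-up map: each location lists the locations it is connected to.
NEIGHBOURS = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}

def move(location, rng, p_move=0.1):
    """With probability p_move, jump to a randomly chosen neighbouring location."""
    if NEIGHBOURS[location] and rng.random() < p_move:
        return rng.choice(NEIGHBOURS[location])
    return location

def occupants_by_location(positions):
    """Group organism ids by location, so we know who can infect whom this step."""
    grouped = defaultdict(list)
    for organism, location in positions.items():
        grouped[location].append(organism)
    return dict(grouped)

rng = random.Random(0)
positions = {1: "a", 2: "a", 3: "c"}
positions = {organism: move(loc, rng) for organism, loc in positions.items()}
together = occupants_by_location(positions)
```

Organisms that end up in the same list in `together` are the ones that can pass the virus to each other during that time step.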

For a job like this we need to move back from fancy visualisations and graphics and try to get some fundamental aspects working, using standard tools like graphviz to understand what is going on to save time.

The first thing to do is to add a fixed length genetic string to each individual organism. This is currently 40 elements long and is made from the biologically based nucleotide symbols A, T, C and G. We chose these so we can use biological analysis tools to test the system as we go along, just like any other genetic process (more on that below). The organisms can also reproduce by spawning copies of themselves. When they do this they introduce random errors into the genetic code of their offspring, which represents mutation.
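A minimal sketch of the genome and the mutation step might look like this in Python. The 40 base length and the four symbols come from the description above; the per-base mutation `rate` of 0.05 and the function names are assumptions for illustration only.

```python
import random

BASES = "ATCG"
GENOME_LENGTH = 40  # length used in the post

def random_genome(rng, length=GENOME_LENGTH):
    """Return a random fixed-length string of nucleotide symbols."""
    return "".join(rng.choice(BASES) for _ in range(length))

def mutate(genome, rng, rate=0.05):
    """Copy a genome, redrawing each base at random with probability `rate`.

    A redrawn base can come out the same as before, so the effective
    mutation rate is slightly lower than `rate` (assumed value).
    """
    return "".join(
        rng.choice(BASES) if rng.random() < rate else base
        for base in genome
    )

rng = random.Random(1)
parent = random_genome(rng)
child = mutate(parent, rng)  # offspring genome with a few copying errors
```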

Previously we were using a ‘SIRS’ model for virus infection (susceptible -> infected -> recovered -> susceptible), based on 4 global parameters that determined the probability of jumping from one state to the next. Using the genetics, the probability of infection is now different for every individual based on:

1. Is a virus infected individual in the current location?

2. If so, use our genetic code to determine the probability of catching it. Currently we use the ratio of A’s to T’s in the genetic string as a totally arbitrary place-holder ‘fitness function’, the lower the number the better. AAAAAAT is bad (fitness: 6) while TTTTTTA is good (fitness: 0.1666) – so we would expect the A’s to disappear over time and the T’s increase in the genetic strings. This number also determines the probability of dying from the disease and (inversely) the probability of gaining immunity to it.

3. A very small ‘background infection’ probability which overrides this, so the virus is always present at a low level and can’t die out.
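The three rules above can be sketched roughly as follows. The A to T ratio is the place-holder fitness function described in the text, but the way it is squashed into a probability (`f / (1 + f)`), the handling of genomes with no T’s, and the `background` value are all assumptions made for this example. It also treats the background infection as a floor on the probability, which is only one way to read ‘overrides’.

```python
def fitness(genome):
    """Place-holder fitness: ratio of A's to T's, lower is better.

    Assumption: a genome with no T's is scored as if it had one,
    to avoid dividing by zero.
    """
    return genome.count("A") / max(genome.count("T"), 1)

def infection_probability(genome, infected_present, background=0.001):
    """Chance of catching the virus in the current location this time step."""
    if not infected_present:
        # Rule 3: the virus is always around at a low level.
        return background
    f = fitness(genome)
    # Assumed mapping from the open-ended ratio onto the range 0..1.
    p = f / (1.0 + f)
    return max(p, background)

bad = fitness("AAAAAAT")    # 6.0
good = fitness("TTTTTTA")   # roughly 0.1666
```

With this mapping the ‘bad’ genome above is infected about 6 times out of 7 when exposed, and the ‘good’ one about 1 time in 7.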

The next thing we need is a life cycle for the organism – this needs to include the possibility of death and the disease model is now a ‘SIR’ one, as once recovered, individuals cannot go back to being susceptible again.
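As a sketch, the life cycle can be written as a small state machine that is stepped once per time step for each organism. The state names, the order in which the checks happen and the default `p_natural_death` are assumptions here; the per-individual probabilities would come from the genetic code as described above.

```python
import random
from enum import Enum

class State(Enum):
    SUSCEPTIBLE = "S"
    INFECTED = "I"
    RECOVERED = "R"
    DEAD = "D"

def step(state, rng, p_infect, p_die, p_recover, p_natural_death=0.001):
    """Advance one organism by one time step and return its new state."""
    if state is State.DEAD:
        return state
    if rng.random() < p_natural_death:
        return State.DEAD
    if state is State.SUSCEPTIBLE and rng.random() < p_infect:
        return State.INFECTED
    if state is State.INFECTED:
        r = rng.random()
        if r < p_die:
            return State.DEAD
        if r < p_die + p_recover:
            return State.RECOVERED
    # Recovered organisms stay recovered: this is what makes it SIR, not SIRS.
    return state

rng = random.Random(2)
state = step(State.SUSCEPTIBLE, rng, p_infect=0.5, p_die=0.1, p_recover=0.2)
```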


All the other non virus related probabilities in the simulation (spawning offspring, moving location, natural death) are currently globally set – to make sure we are seeing evolution based only on disease related behaviour for now.

This model as it is could form the foundation of a world level visualisation – seeing organisms running around from place to place catching and spreading your virus and evolving resistance to it. However this is only half the story we want to tell in the game, as it doesn’t include our time based ‘phylogenetic’ family tree view. For this, we still need to figure out how to group individuals into species so we can fully visualise the effects of your virus on the evolution of all the populations as a whole.

First we need to decide exactly what a species is – which turns out to be quite an arbitrary concept. The rather coarse approach that seems to work here is to say that two organisms represent two distinct species if more than a quarter of their genes are different between them.

We can now check each organism as it’s born, and compare its genome against a ‘blueprint’ one that represents the species that its parent belongs to. If it’s similar enough we add it to its parent’s species; if it’s too different we create a new species for it. This new species will have a copy of its genome as the ‘blueprint’ to compare all its descendants with. This should mean we can build up a set of related species over time.
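The blueprint check can be sketched like this. The quarter threshold comes from the rule above; the class and method names are hypothetical, and differences are counted position by position, which assumes all genomes are the same length.

```python
from dataclasses import dataclass
from typing import Optional

def differences(a, b):
    """Number of positions at which two equal-length genomes differ."""
    return sum(x != y for x, y in zip(a, b))

@dataclass
class Species:
    id: int
    blueprint: str            # genome of the species' first individual
    parent_id: Optional[int]  # species it branched from, None for founders
    members: int = 0

class SpeciesRegistry:
    def __init__(self, threshold=0.25):
        self.threshold = threshold
        self.species = []

    def found(self, genome, parent_id=None):
        """Create a new species with this genome as its blueprint."""
        new = Species(len(self.species), genome, parent_id, members=1)
        self.species.append(new)
        return new

    def assign(self, genome, parent_species):
        """Place a newborn in its parent's species, or branch off a new one."""
        limit = self.threshold * len(genome)
        if differences(genome, parent_species.blueprint) > limit:
            return self.found(genome, parent_species.id)
        parent_species.members += 1
        return parent_species

registry = SpeciesRegistry()
root = registry.found("A" * 40)
same = registry.assign("A" * 30 + "T" * 10, root)     # 10 of 40 differ: same species
branch = registry.assign("A" * 29 + "T" * 11, root)   # 11 of 40 differ: new species
```

Storing `parent_id` on each new species is what records the branch points needed to draw the family tree later.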

If we run the simulation for 5000 time steps we can generate a phylogenetic family tree at the end, using the branch points between species to connect them. We are hiding species with only 1 member to make it simpler, and the population is started off with 12 unique individuals, only one of which (species 10) is successful – all the later species are descendants of that one:


The numbers here are the ID, fitness and size of population for each species. The colours are an indication of population size. The fitness seems to increase towards the right (as the number drops) – which is what we’d expect if new species are emerging that cope better with the virus. You can imagine changing the virus will cause all this to shift dramatically. The “game mechanic” for viruscraft will all be about tinkering with the virus in different ways that changes the underlying fitness function of the host, and thus the evolution of the populations.
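For reference, a tree like this can be produced by writing the species out as Graphviz DOT text. This is only a sketch: the input layout (a dict of id to parent, fitness and size), the choice to attach a species to its nearest visible ancestor when its direct parent is hidden, and the example numbers are all invented for illustration.

```python
def to_dot(species, min_size=2):
    """Build Graphviz DOT text for a species tree.

    species maps id -> (parent_id or None, fitness, size).
    Species smaller than min_size are hidden from the output.
    """
    visible = {sid for sid, (_, _, size) in species.items() if size >= min_size}

    def visible_ancestor(sid):
        # Walk up the tree until we reach a species that is being drawn.
        parent = species[sid][0]
        while parent is not None and parent not in visible:
            parent = species[parent][0]
        return parent

    lines = ["digraph phylogeny {", "  rankdir=LR;"]
    for sid in sorted(visible):
        _, fit, size = species[sid]
        # Node label: ID, fitness and population size.
        lines.append(f'  s{sid} [label="{sid} {fit:.2f} {size}"];')
        parent = visible_ancestor(sid)
        if parent is not None:
            lines.append(f"  s{parent} -> s{sid};")
    lines.append("}")
    return "\n".join(lines)

# Made-up example data, not results from the simulation.
example = {0: (None, 1.2, 50), 1: (0, 0.9, 1), 2: (1, 0.6, 30)}
dot = to_dot(example)
```

The resulting text can be saved to a file and rendered with the `dot` command line tool that comes with Graphviz.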

As we used standard biological symbols for our genetic code, we can also convert each species into an entry in a FASTA format text file. These are used by researchers to determine population structure from limited information contained in genetic samples:

> 1 0.75 6
> 3 0.46153846153846156 5
> 5 0.6153846153846154 171

In the FASTA file in the example above, the numbers after the ‘>’ are just used as identifiers and are the same as those in the tree above (ID, fitness and population size). The second line of each entry is the blueprint genome for the species (its first individual). We can now visualise these with one of many online tools for biological analysis:


This analysis is attempting to rebuild the first tree in a way, but it doesn’t have as much information to go on as it’s only looking at similarity of the genetic code. Also 40 bases is not really enough to do this accurately with such a high mutation rate – but I think it’s a good practice to keep information in such a way that it can be analysed like this.
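Writing the FASTA entries is simple enough to sketch in a few lines. The tuple layout and function name here are assumptions, and the genome in the example is made up. This version also leaves out the space after the ‘>’, as the first word after it is normally read as the sequence identifier.

```python
def to_fasta(species):
    """Format species as FASTA text.

    species is an iterable of (id, fitness, size, blueprint) tuples.
    Each entry is a header line followed by the blueprint genome.
    """
    return "".join(
        f">{sid} {fit} {size}\n{blueprint}\n"
        for sid, fit, size, blueprint in species
    )

# Made-up blueprint genome, only to show the layout of an entry.
text = to_fasta([(1, 0.75, 6, "ATCGATCGATCGATCGATCGATCGATCGATCGATCGATCG")])
```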

Viruscraft next steps

Following on from the first viruscraft workshop, we can now start planning the viruscraft game. The field of virology, from genetics and interactions on the microscopic scale to the spread of disease and its effects on the ecosystem, is huge, so we used the workshop primarily to identify the core things that are the most important to convey, and promising ways we can use to explain them. Getting high quality feedback so early on has allowed us to get a good sense of what is important with a diverse mix of people – the things that they picked up on (and just as importantly the things that they didn’t) save a lot of time and sharpen our focus right from the start.

1. Phylogeny

Phylogeny is the name given to a kind of family tree that shows the evolution and development of species over time. Ben’s work is concerned with how viruses can jump across species, so the concept of phylogeny is central to his work. There is also something very concrete and humbling about hearing the time scales involved here – the nearest common ancestors of related species of fruit flies (his model organism for study) being tens of millions of years distant, while you need to go back 800 million years to find their common ancestor with us. These numbers are hard to grasp, but at the same time put things into perspective. Playing with and visualising long time spans will be an important aspect of this project.

2. Interaction of hosts and viruses

Viruses and their hosts are very different: hosts can be any creatures, from bacteria to plants or animals, while viruses seem little more than self replicating geometric shapes. Despite these differences, viruses and hosts have huge effects on each other’s evolution – viruses need to spread and infect as many individuals as possible, but if they get ‘too good’ they kill off their hosts and die off too. We’ve dealt with the co-evolution of hosts and diseases before in the red king sonification project, and again here the dynamics between competing organisms need to be a central theme.

3. Shape matching/arms race

Our workshop participants found it surprising that one of the defining aspects of a virus’s success is down to shape – e.g. whether you succumb to a cold or not is down to the ‘lock and key’ connection which needs to happen for the virus to attach to receptors on a cell’s external structure. These physical forms are the most promising area for viruscraft in terms of game mechanics – particularly if we are thinking about physical, tangible interfaces.

As an arms race situation between the host and the virus, our butterfly mimicry game for Cambridge university is a good reference for how a game mechanic can work in this context, as it accelerates evolution over the course of a minute or two as you play.

Given these themes we can now fill out our initial sketch a little with the new scientific information we learnt from the workshop as well as the feedback and ideas. One major addition is to add phylogeny as a kind of racing game in reverse, with time heading backwards so you can see the effects of your virus on a population, which also gives us a way to visualise your virus skipping between species. Like the Inca, we travel backwards into the future, giving us a view on biological history.

The primary game mechanic controlling the species jumping and virus success in general is combining shapes based on receptors on the host cells. This is very roughly shown above: the virus is currently attaching to the spherical receptors and infecting those host populations, and if one of the players plugs in the shapes they are holding it will jump across to infect one of the other species too. The precise nature of this needs to be realistic enough that players get the idea that this is actually how things work rather than a metaphor, but schematic and abstract enough to understand within a few minutes of play. The other aspect of the shape matching is that the relatedness of host species should be reflected in the receptors in some sensible way (closely related ones should be similar) – so we need a procedural ‘receptor combination generator’ to make matching interesting, with zillions of possible combinations.

Another limitation is that of feasibility with regards to building a tangible interface. Luckily we have a few stepping stones that give us a range in terms of cost and risk. One of the activities from the workshop was building a large virus capsid out of bamboo – which led to a large scale modular origami climbing frame crossed with a dance mat as a grand vision, incorporating feedback consisting of haptics, lighting or projection mapping. In terms of practicalities (and budget) we need to get there a step at a time.

We can start by building a screen based system for manipulating the virus structure and shape matching, as it needs to be able to work in browsers anyway for accessibility – so this is a good place to begin. Once we have this working we can test the game properly and move on to building a smaller scale tangible interface, like in the sketch above – perhaps based on the pattern matrix technology, primarily designed for exhibitions and family groups to play with together.

The underlying model or simulation that is being manipulated is unusual for us in that it does not have to be a scientific research model, so we can design something that is realistic enough for educational purposes but no more. This is the next priority, and we can build on our viruscraft prototypes by having a model based on individuals navigating a mesh of connected nodes that alter to represent dynamic geographical changes (formation of land bridges and islands). The phylogeny chart can then be a separate visualisation of this same process (with references to futuristic driving games such as F-Zero and Wipeout). Time may run matched to the earth’s time line, so perhaps every day starts at the appearance of mammals and ends with the present day – the world map could match earth, or could be an alien planet. We can have external events to liven the game up: perhaps if you are doing too well your hosts could be impacted by climate change, asteroid strikes or tectonic shifts.