Functions, abstraction and compiling

More puzzles from my compiler writing experiments – this time figuring out how function calls work. They are so ingrained as part of computer languages – it seems somehow surprising to me that they are quite challenging to implement as bytecode.

Conceptually, functions are primarily a means for humans to break problems down into smaller parts in order to better reason about them – the process of doing this is called abstraction. What is good for human reasoning, is also good for efficient use of finite space in memory – so good functions will be able to be reused in as many places as possible (to be general, in other words).

At the low level, when a function is called we jump to a new location of the code, do some stuff, and then return to where we came from. We also need to deal with arguments passed to the function (giving it different values to work on), and then we need to keep things like the return address and result of the function somewhere safe.

The Jellyfish Virtual Machine is a stack machine – so everything needs to be stored on a single stack we can push and pop values to and from (vectors in this case). Calling a function is a case of:

1. Calculating and pushing the return address to the stack.
2. Calculating and pushing all the arguments onto the stack in order.
3. Jump to the function pointer.

;; return a list of vector instructions to generate 
;; a function call to an existing function pointer
(define (emit-fncall x addr)
  ;; generate instructions to eval and push all the arguments
  (let ((args (emit-expr-list (cdr x))))
     ;; first push the address we want to return to to the stack
     ;; skip the argument pushing code and the following function jump
     ;; we need to tell the linker to fix the address later on:
     (emit (list 'add-abs-loc 'this 1
                 ;; adds absolute location to the offset
                 (vector ldl (+ (length args) 2) 0)))
     args ;; code to push arguments to stack
     (emit (vector jmp addr 0))))) ;; jump to the function pointer

One complexity is that we won’t know the exact address of the function until the compilation process is finished, so we need to tag the relevant instruction to be modified in a second pass later on – this is called linking, which stitches the addresses together properly after compiling.

When we are in the function with the stack set up correctly we need to:

1. Pop all the arguments from the stack and put them somewhere where they can be referred to by name.
2. Do the body of the function.
3. Store the end result on the stack.
4. Jump back to the return address.

When finished, the only thing should be the result of the function call at the top of the stack. The code to generate this is provided by the core “lambda” function (which can be assigned to a name for calling later or called directly – anonymously).

;; lambda - create a function, and return the pointer to it
(define (emit-lambda x)
  (let* ((body
          ;; first we need to get all the arguments out of the
          ;; stack so the function body can refer to them
           (map ;; map over the argument names
            (lambda (arg)
              (make-variable! arg)
              ;; store the value where it can be looked up by the name
              (vector sta (variable-address arg) 0))
            (cadr x))
           ;; now args are stored, do the body
           (emit-expr-list (cddr x))
           ;; swap ret pointer with the result 
           ;; so it's the top of the stack
           (emit (vector swp 0 0))
           ;; jump to the address on the stack
           (emit (vector jms 0 0))))
         ;; make-function! stores the code in a table 
         ;; and returns the address the function will exist at
         (loc (make-function! body)))
      ;; use the linker again to offset 
      ;; from function code section start
      (list 'add-abs-loc 'function-code 1
            (vector ldl loc 0))))))

One thought on “Functions, abstraction and compiling”

Leave a Reply

Your email address will not be published. Required fields are marked *