Wednesday, August 12, 2015
Some reflections on my coding style after 10 years of ICFP Programming Contests
The first ICFP Programming Contest that I participated in was the 2005 Cops & Robbers problem, organized by Robby Findler and the PLT group. I was just learning OCaml at the time and used it for my solution. Prior to OCaml, the only other functional languages I had used were Lisp and Scheme.
Although I used Haskell in 2015, I used OCaml for the 2014 contest, so I find it useful to compare my code between 2005 and 2014.
One of the things I notice in my code from that first contest is how much it reflects my background of stateful, imperative languages. For example, most of the data types I defined were full of records with mutable fields, like this:
I didn't understand at the time that it is easy to copy records where you just modify the fields you want to change.
Another symptom of my stateful thinking is that I used refs and List.iter a lot when there were better ways to do it. For example, here's how I computed the distance to the nearest cop in the Cops & Robbers game:
A better option here might be to define a function to return the minimum value in a list, and then map a get_distance function over all the cops. Something like this:
In 2014, the contest involved writing code to run on a virtual machine based on the SECD Machine. I wrote a compiler in OCaml to convert a simple form of Lisp to the instructions for the GCC (General Compute Coprocessor - the SECD-like machine). The compiler generates an abstract syntax tree (AST) and then generates code from the tree. Here is a snippet of the definitions for the AST:
When processing the AST, I need to keep track of a lot of state information for the code generator. This is the state type:
Notice that there aren't any mutable fields any more. Instead, I pass the state to various functions and they return a modified version of the state. For example, this function is how I add new instructions to the current function code I am generating:
And as an example, here is how I generate the instructions for a NOT operation:
If you're curious, the "LDC 0" loads the constant 0 onto the data stack, and CEQ does an equality comparison.
Where this way of processing state looks a little ugly is when I chain a lot of operations together. For example:
A nicer way to string these calls together is to define the functions to take state as the final parameter. Here are some simple examples:
Then, you can use the pipe-forward or reverse-application operator (|>), which I like to call the "chicken operator", to chain the calls together, like this:
Because state is the last parameter, I just omit it from the calls to the other functions and they are treated as partially applied functions where the |> operator will supply the final state parameter. This operator wasn't in OCaml in 2005, but I could have just defined something like it (I think).
Although I used Haskell in 2015, I used OCaml for the 2014 contest, so I find it useful to compare my code between 2005 and 2014.
One of the things I notice in my code from that first contest is how much it reflects my background of stateful, imperative languages. For example, most of the data types I defined were full of records with mutable fields, like this:
type player_rec = { mutable player_name : string; mutable ptype : player_type; mutable ploc : int; };;
I didn't understand at the time that it is easy to copy records where you just modify the fields you want to change.
Another symptom of my stateful thinking is that I used refs and List.iter a lot when there were better ways to do it. For example, here's how I computed the distance to the nearest cop in the Cops & Robbers game:
let distance_from_nearest_cop loc = let distance = ref !max_distance in let get_cop_distance cop = let dist = ((cop_distance_selector cop) (!the_world.nodes.(cop.ploc))).(loc) in ((Printf.eprintf "Distance from cop %s is %d\n" cop.player_name dist); if dist < !distance then distance := dist) in List.iter get_cop_distance (get_cops()); !distance;;
A better option here might be to define a function to return the minimum value in a list, and then map a get_distance function over all the cops. Something like this:
let minimum l = fold_left min (hd l) (tl l) let distance_from_nearest_cop loc cops = minimum (map (get_distance loc) cops)
In 2014, the contest involved writing code to run on a virtual machine based on the SECD Machine. I wrote a compiler in OCaml to convert a simple form of Lisp to the instructions for the GCC (General Compute Coprocessor - the SECD-like machine). The compiler generates an abstract syntax tree (AST) and then generates code from the tree. Here is a snippet of the definitions for the AST:
type defs = def list and def = | Defun of string * (string list) * (statement list) | Defun_Tail of string * (string list) * (statement list) | Defconst of string * int and statement = | Let of (let_env list) * (statement list) | If of (statement * statement * statement) | Begin of statement list | Function_Call of string * (statement list) | Tail_Function_Call of string * (statement list) | Cons of statement * statement | Car of statement
When processing the AST, I need to keep track of a lot of state information for the code generator. This is the state type:
type state = { current_function_code : string list; environment_stack : string list list; function_list : (string*(string list)) list; next_temp_number : int; symbol_table : (string * symbol_table_entry) list; pc : int; is_tail_recursive : bool; top_level_env : string list }
Notice that there aren't any mutable fields any more. Instead, I pass the state to various functions and they return a modified version of the state. For example, this function is how I add new instructions to the current function code I am generating:
let add_code_lines state code_lines = { state with current_function_code = (List.append state.current_function_code code_lines) }
And as an example, here is how I generate the instructions for a NOT operation:
let generate_not state = add_code_lines state [ "LDC 0"; "CEQ" ]
If you're curious, the "LDC 0" loads the constant 0 onto the data stack, and CEQ does an equality comparison.
Where this way of processing state looks a little ugly is when I chain a lot of operations together. For example:
generate_function state fn env_stack env statements return_type = let curr_code = state.current_function_code in let state = { state with current_function_code=[]; environment_stack=env::env_stack} in let state = add_code_line state (";FN="^fn) in let state = generate_statements state statements in let state = add_code_line state return_type in let state = { state with function_list = (fn, state.current_function_code)::state.function_list} in let state = add_symbol state fn (FunctionSymbol (-1)) in { state with current_function_code = curr_code; environment_stack=List.tl state.environment_stack }
A nicer way to string these calls together is to define the functions to take state as the final parameter. Here are some simple examples:
let fun1 tn state = { state with next_temp_number = tn } let fun2 code state = { state with current_function_code = List.append state.current_function_code code } let fun3 newpc istr state = { state with is_tail_recursive = istr; pc = newpc }
Then, you can use the pipe-forward or reverse-application operator (|>), which I like to call the "chicken operator", to chain the calls together, like this:
let process_something state = state |> fun1 4 |> fun2 ["LDC 0"; "CEQ"] |> fun3 123 false
Because state is the last parameter, I just omit it from the calls to the other functions and they are treated as partially applied functions where the |> operator will supply the final state parameter. This operator wasn't in OCaml in 2005, but I could have just defined something like it (I think).