Wednesday, August 12, 2015


Some reflections on my coding style after 10 years of ICFP Programming Contests

The first ICFP Programming Contest that I participated in was the 2005 Cops & Robbers problem, organized by Robby Findler and the PLT group. I was just learning OCaml at the time and used it for my solution. Prior to OCaml, the only other functional languages I had used were Lisp and Scheme.

Although I used Haskell in 2015, I used OCaml for the 2014 contest, so I find it useful to compare my code between 2005 and 2014.

One of the things I notice in my code from that first contest is how much it reflects my background of stateful, imperative languages. For example, most of the data types I defined were full of records with mutable fields, like this:
type player_rec = {
  mutable player_name : string;
  mutable ptype : player_type;
  mutable ploc : int;

I didn't understand at the time that it is easy to copy records where you just modify the fields you want to change.

Another symptom of my stateful thinking is that I used refs and List.iter a lot when there were better ways to do it. For example, here's how I computed the distance to the nearest cop in the Cops & Robbers game:
let distance_from_nearest_cop loc =
  let distance = ref !max_distance in
  let get_cop_distance cop =
    let dist = ((cop_distance_selector cop) (!the_world.nodes.(cop.ploc))).(loc) in
      ((Printf.eprintf "Distance from cop %s is %d\n" cop.player_name dist);
      if dist < !distance then
    distance := dist)
    List.iter get_cop_distance (get_cops());

A better option here might be to define a function to return the minimum value in a list, and then map a get_distance function over all the cops. Something like this:
let minimum l = fold_left min (hd l) (tl l)

let distance_from_nearest_cop loc cops =
  minimum (map (get_distance loc) cops)

In 2014, the contest involved writing code to run on a virtual machine based on the SECD Machine. I wrote a compiler in OCaml to convert a simple form of Lisp to the instructions for the GCC (General Compute Coprocessor - the SECD-like machine). The compiler generates an abstract syntax tree (AST) and then generates code from the tree. Here is a snippet of the definitions for the AST:
type defs = def list
def =
    | Defun of string * (string list) * (statement list)
    | Defun_Tail of string * (string list) * (statement list)
    | Defconst of string * int
statement =
    | Let of (let_env list) * (statement list)
    | If of (statement * statement * statement)
    | Begin of statement list
    | Function_Call of string * (statement list)
    | Tail_Function_Call of string * (statement list)
    | Cons of statement * statement
    | Car of statement

When processing the AST, I need to keep track of a lot of state information for the code generator. This is the state type:
type state = {
   current_function_code : string list;
   environment_stack : string list list;
   function_list : (string*(string list)) list;
   next_temp_number : int;
   symbol_table : (string * symbol_table_entry) list;
   pc : int;
   is_tail_recursive : bool;
   top_level_env : string list

Notice that there aren't any mutable fields any more. Instead, I pass the state to various functions and they return a modified version of the state. For example, this function is how I add new instructions to the current function code I am generating:
let add_code_lines state code_lines =
    { state with current_function_code = (List.append state.current_function_code code_lines) }

And as an example, here is how I generate the instructions for a NOT operation:
let generate_not state =
    add_code_lines state [ "LDC 0"; "CEQ" ]

If you're curious, the "LDC 0" loads the constant 0 onto the data stack, and CEQ does an equality comparison.

Where this way of processing state looks a little ugly is when I chain a lot of operations together. For example:
generate_function state fn env_stack env statements return_type =
    let curr_code = state.current_function_code in
    let state = { state with current_function_code=[]; environment_stack=env::env_stack} in
    let state = add_code_line state (";FN="^fn) in
    let state = generate_statements state statements in
    let state = add_code_line state return_type in
    let state = { state with function_list = (fn, state.current_function_code)::state.function_list} in
    let state = add_symbol state fn (FunctionSymbol (-1)) in
    { state with current_function_code = curr_code; state.environment_stack }

A nicer way to string these calls together is to define the functions to take state as the final parameter. Here are some simple examples:
let fun1 tn state = { state with next_temp_number = tn }
let fun2 code state = {
  state with current_function_code = List.append state.current_function_code code }
let fun3 newpc istr state = {
  state with is_tail_recursive = istr; pc = newpc }

Then, you can use the pipe-forward or reverse-application operator (|>), which I like to call the "chicken operator", to chain the calls together, like this:
let process_something state =
  state |> fun1 4 |> fun2 ["LDC 0"; "CEQ"] |> fun3 123 false

Because state is the last parameter, I just omit it from the calls to the other functions and they are treated as partially applied functions where the |> operator will supply the final state parameter. This operator wasn't in OCaml in 2005, but I could have just defined something like it (I think).

Thursday, May 22, 2008


Adding 1 to a binary number in sed

I have an odd fascination with binary clocks. I just recently built my own, based on an AVR microcontroller, which I will post more about soon. I also have a binary watch that was a present from Ceal a few years ago. During a particular boring meeting yesterday, I was watching it count the seconds, and I realized that to increment a binary number by 1, you are really looking for the pattern: 01*$ (the last part of the number consisting of 0 followed by only 1s to the end of the line). When you find that pattern, you just invert the bits.

For example, to add 1 to 10011 (19 decimal), you are only interested in the 011 at the end. You invert that part and get 10100 (20 decimal). The only time the 01*$ pattern doesn't work is when the number is all 1s. In that case, you can insert a leading 0 and again just invert the bits. For example, 1111 (15 decimal) first becomes 01111, then inverted becomes 10000 (16 decimal).

So, after that little epiphany, I began to wonder if I could write a sed script that functioned as a binary adder. At first I didn't think it could be done, because I was thinking purely in terms of doing substitutions. I have since learned much more about the capabilities of sed and found it was possible. Here is the script, along with an explanation of what each line does:


s/^1*$/0&/If the number is all 1s, insert a leading 0
hCopy the current line to the hold buffer
s/\(.*\)01*$/\1/Strip off the 01*$ pattern from the end
xExchange the current pattern with the hold buffer
s/^.*\(01*\)$/\1/Strip off everything but the 01*$ pattern
y/01/10/Invert the bits in the current line
xExchange the current pattern with the hold buffer (go back to the prefix bits)
GJoin the hold buffer (the end bits that were just inverted) to the current pattern (sed puts it on a separate line)
s/\n//Join the two lines

To test this, I wrote a little script that starts with 0, runs it through the adder, prints the result, and the runs the result through the adder. This is the script (caution, it loops forever):

while true; do
NUM=`echo $NUM | sed -f adder.sed`
echo $NUM

And here is a snippet of the results:


Thursday, March 09, 2006


Beyond XML, part 2

I have come to the realization that the thing I was trying to describe in my previous post is fairly close to ASN.1 (Abstract Syntax Notation 1). The thing about ASN.1 that I never really grasped was that you can have different encoding rules. Thus, you can have an encoding that encodes as XML, or another one that has a more efficient binary protocol. I came to this realization while reading about Sun's Fast Infoset for SOAP requests, during my desperate search for a production-ready Java SOAP implementation that was reasonably fast. Still looking, though..

Sunday, August 14, 2005


Beyond XML

Most people I have worked with will tell you that I am not a fan of XML. It has its places, but people seem to try to use it everywhere they can just because it is popular. I was reflecting this morning on XML Schemas and how ugly they are. It is difficult to look at a schema and visualize the data layout.

At first, I wondered why this is so, and why you can look at a data definition in other languages and get a good idea how everything fits. Some of this is probably the noise of XML -- all those tags just give your brain more information to process. I think another part of it, though, is that XML is so explicit about things. In other languages, you can define things more positionally. That is, you can define a variable declaration as "type varname". In XML, you can do the same thing in your schema, but you still need tags for the type and varname.

As I thought about this, it occurred to me that maybe all we need is a really super schema language. One where we define mappings to and from an abstract data notation. Instead of just mapping to/from XML, it would be great if you could map to/from all kinds of formats -- property files, flat files, comma-delimited, etc. You could couple this with some kind of interface definition language that expresses an interface in terms of the abstract data notation and then you could have something like SOAP but with support for multiple syntactic forms.

For example, suppose I define a remote-call interface in an abstract notation that looks something like this:
remote_call_request = string call_name, any params[];
remote_call_response = any result;

Then I could define multiple syntaxes for the remote call. One might looks like a SOAP request, one might be pipe-delimited text. It seems like the syntax definition might be pretty complicated to write, but it also seems like it could solve some headaches. It would be more friendly to legacy systems, you could get closer to a binary XML kind of format, you could convert easily from one format to another (like you do with XSLT, but with better support for non-XML formats).

I don't know if anyone is working on something like this, or if it is even possible, but it is interesting to think about.

Thursday, July 14, 2005


Lispy or not?

I just started making Lisp versions my cop and robber entries from the ICFP contest. I am aware that because other languages I use don't have anything like Lisp's macros, I tend not to use macros in Lisp. I am trying to change that.

When I was coding my parser in ocaml, I thought it was cool to use the match statement to parse the various lines, like this:

let parse_line line =
match Str.split (Str.regexp "[ \t]+") line with
"wsk\\" :: [] -> handle_wsk_start()
| "name:" :: name :: [] -> handle_name name
| "robber:" :: name :: [] -> handle_robber name
| "cop:" :: name :: [] -> handle_cop name
| "nod\\" :: [] -> handle_nodes_start()
| "nod:" :: name :: tag :: x :: y :: [] -> handle_node name (node_tag_of_string tag)
(int_of_string x) (int_of_string y)

Since I don't have match in Lisp, I wondered what would be a good way to do this in Lisp. I decided to do it with a macro. I keep a global hash table that maps keywords like "wsk\\" and "name:" to functions that handle the data for those keywords. I use a macro named defparse to create a function where instead of the name of the function, I supply the keyword name. Here is how I handle the "nod:" line for example:

(defparse "nod:" (name tag x# y#)
(push (make-instance 'node :name name :tag tag :x x# :y y#) *build-node-list*))

One of the interesting things here is that you are seeing the function that handles the line, not just my definition of what is on the "nod:" line. In the ocaml version, I had all the keywords in one big match statement and made them call separate functions. I could have implemented the functions in the match statements but I thought it might look ugly. I like the Lisp solution here because I define the line contents and the function in one place, but it isn't all cluttered with other line definitions.

There are a couple of things going on in defparse. First, it turns the function body into a lambda expression that is mapped in the global keyword hash table. Next, it looks for parameters whose names end in # and automatically converts them from string to integer. This is what the expansion of my (defparse "nod:" ...) looks like:

(LAMBDA (NAME TAG #:G4171 #:G4172)
(LET ((|X#| (PARSE-INTEGER #:G4171)) (|Y#| (PARSE-INTEGER #:G4172)))

Those #:G4171 and #:G4172 names are unique symbol names generated by the gensym function. A macro in Lisp is basically a function whose return value is code that will be compiled. This means that it blends in well with the language - there are a few syntactic differences having to do with quoting, but otherwise, you don't learn a separate macro language. One of the difficulties is that the macros run at compile time, so you can make use of all the functions you have defined (as far as I can tell, at least).

For the curious, here is my defparse macro:

(defmacro defparse (command param-list &body body)
(let* ((int-params (remove-if-not
#'(lambda (param)
(let ((ps (symbol-name param)))
(char= (elt ps (1- (length ps))) #\#))) param-list))
(param-map (mapcar #'(lambda (x) (cons x (gensym))) int-params))
(param-list-with-syms (mapcar
#'(lambda (param) (if (assoc param param-map) (cdr (assoc param param-map)) param))
(let-list (mapcar #'(lambda (pg) (list (car pg) (list 'parse-integer (cdr pg)))) param-map)))
`(setf (gethash ,command *command-handlers*)
(lambda ,param-list-with-syms
(let ,let-list ,@body)))))

The thing I am a bit curious about is whether this is a Lispy way to approach the problem. Would a Lisp pro nod his or her head at this, or start to gag?

Sunday, July 10, 2005


My 2005 ICFPC Submission

My Ocaml code for the 2005 ICFP Programming Contest is now available at
I'm sorry to say that there is only one barely informative comment. The only other comments are code that has been commented out.

One of the things I learned last night on IRC was that I can optimize some of my match statements. The parser ( has lines like this in its match:

| "nod:" :: name :: tag :: x :: y :: [] -> handle_node name (node_tag_of_string tag)
(int_of_string x) (int_of_string y)

Apparently, I can make this a little shorter with:

| ["nod:";name;tag;x;y] -> handle_node name (node_tag_of_string tag)
(int_of_string x) (int_of_string y)

Although I am happy with my choice of Ocaml for doing this project, I am not happy with my usage of Ocaml. I didn't use the "O" part of it at all - no objects. I did things as modules and it was a little cumbersome. I should have made a generic Bot class and then derived the cop and robber from that. I ended up copying too much code back and forth.


36+7 hours of coding

I submitted my final entry for the 2005 ICFP Programming Contest and the deadline has passed, so now I feel free to share the mini diary I kept while working on the project.

June 24-27, 2005 - It Begins

Friday, 7:08 PM

I'm finally home and eating supper. The contest started 9 hours ago. I had to work today, so the best I have been able to manage is to look through the rules. I am excited to get going. Last night, I did a few tests with Chicken Scheme since I suspected I might have to do some socket code. The LiveCD the contest people put out mentioned something about game servers. Now that I see that I just read from stdin and write to stdout, I am back to Ocaml. The other thing that leads me back to Ocaml is that I will have 5 seconds to make a move each turn, I might need all the speed I can get.

Friday, 7:28 PM
I'm starting on the parser now. I will make a callback interface similar to the way SAX works to handle each message event. For nested messages like wsk, I will have callbacks for each sub-message. The callback interface will allow me to simulate the server in code later without actually having to do the full protocol.

Friday, 9:24 PM
It has been a little bumpy with Ocaml, but the problem is a disconnect between the chair and the keyboard. I am getting the parser written now. I have code to dump out the wsk structure, am trying to get it running now.

Friday, 9:54 PM
Got the basic parser running a little while ago. Now I am adding in support for all the commands the server can send. I have been on #icfpc on freenode for a while now (I am mr_pengy) and it is a blast chatting with these folks. It makes me wish we were all in one big lab.

Friday, 11:48 PM
All the parsing is coded, now I am trying to build the graphs for foot and car paths. After that I think I may go to bed.

Saturday, 12:14 AM
I decided to keep a list of foot edges and car edges in each node, so I essentially have two trees. I am going to bring in my old fibheap code and add dijkstra's algorithm so I can do shortest path computations. I want to see if I can compute the shortest path between all points in each tree. There are < 300 nodes in the graph, so that's just a few hundred thousand distances to compute. I'll have to see how long it takes. If the size of the world increases substantially, though, this will be a bad idea. Time for bed now.

Saturday, 10:02 AM

I probably slept a lot later than I should have. I finally got caught up with the ICFPC mailing list – lots of rule clarifications. Time to code.

Saturday, 11:57 AM

If I were to quit now, it wouldn't be a total loss. I was wrestling with trying to squeeze my fibheap code into this project without making it become really ugly. Then I started thinking about alternate ways to compute the distances. I worked out a way to do Dijkstra's shortest path algorithm without a priority queue. Basically, I start with distance D=0 and I iterate through each node N looking for node Nadj with distance=D. Then, I look at each node Nadj' that is an edge node of Nadj. If the distance from N to Nadj' is greater than D+1, I update the distance from N to Nadj' to be D. When I have done that for all nodes, I then repeat the procedure for all nodes with D=1 and so forth, until I do not find a node to update. On my machine, doing this calculation for all 229 nodes takes about 0.03 seconds. The max distance on foot is 29, the max distance by car is 17. This isn't bad considering the map is roughly a 15x15 grid. This is pretty exciting, even though I have yet to write a complete bot, so I am going to get started on a robber bot.

Saturday, 1:15 PM
Still reorganizing the code and making a bot interface. Still no bot, but this should make it easier to write one. There was a mention on #icfpc to look up the Floyd-Warshall shortest path algorithm, I think that's essentially what I came up with. I am pretty sure mine is O(n^3) like Floyd.

Saturday, 4:16 PM
I'm working on a robber bot. As a first try, I am having it look for banks that it can get to before the cops do, with a certain margin for error. If it can't find one, it picks another place to go. I'm writing lots of functions that should make it easier to add other behavior.

Saturday, 8:53 PM
I have a pretty dumb robber bot. He got caught after robbing one bank. It's not that my cop is smart, the robber ran right into him. The robber targeted banks that were farthest away from the cops, but didn't know how to avoid cops in its path. I have been thinking for a while about what to do about that, and my current thinking is that I need to make a new shortest path algorithm that avoids cops. Basically, before traversing an edge, see if the destination falls within some constraints (no cop within 2 spaces, for example). That would allow me to plot a short path around a cop.

Saturday, 10:13 PM
My robber just made all the way through without getting caught, and having robbed several banks. The cops are still very dumb, and the robber needs some work. His choice of banks is still determined by how far away the cops are, but not how close he is. Also, I need to make some adjustments to make sure it picks valid moves. I had a problem with that and I need to make absolutely sure it validates the move. Still, I feel good about my progress.

Saturday, 10:43 PM
I figured out where the invalid moves were coming from and made several adjustments there. I also changed the algorithm to go for the closest bank now that the distances take cop proximity into account. If it were to get surrounded right now, it wouldn't be able to pick a path because it wouldn't find nodes that are “safe”. In that case, I will make an emergency flee function. I may let it go for a “less safe” path first, then any path where there is no cop. Otherwise, stay put.

Saturday, 11:45 PM
I added the emergency flee function, so it should be able to get out of a jam. Tomorrow I have to make a better cop. I submitted an entry already just in case my machine crashes. Can't be too careful.

Sunday, 9 AM
Over breakfast, I modified the robber's calculations for cop distances to account for cops changing from foot to car. I'm leaving for Meeting for Worship shortly, won't be able to get going again until after noon.

Sunday, 4:04 PM
I have been working on the cop for a while. I tried assigning the cops a beat where they walked from bank to bank. My robber still got a ton of money. Now I am trying to compute the vicinity where the robber should be.

Sunday, 5:29 PM
I have been working on smartening up my cop. The contest rules state that in the initial tournament, your cop must win – not just come in the top 3. Because of the way the evidence collection works, that means if the other cops stumble on evidence and you aren't the one to capture the robber, you lose. This seems a little too random for me. My cop now directs the other cops to the general vicinity of the last known location of the robber, adjusting for time (“Our fugitive has been on the run for 90 minutes...”), but also within a certain minimum distance away. The idea is that my cop comes in to the actual area and snoops around. It doesn't quite work that way, but it isn't too bad. Whenever my cop smells the robber, it computes the possible locations of the robber (based on last-known location) and joins that with the locations that are exactly the smell distance away. Then it picks a random one if there are any. My cop can now occasionally beat my robber. I'm going to see what the ratio is.

Sunday, 5:42 PM
Oops! In computing how far a robber could have moved, I forgot that the robber only moves on every other turn, so the area was much too wide.

Sunday, 5:55 PM
In 20 games, the robber won 12 and my cop won 8. Also, my cop lost to the other cops in 1 out of those 8.

Sunday, 7:12 PM
I am really starting to feel tired. This has been a tough project and I am sure I won't win. I would like to be able to get through the regular season part of it, at least. I have modified the robber a little bit. Before, when it couldn't find a regular move, it would look for a “safe” move, which was a move that maximized the distance from the cops. Now, it looks at the set of those moves and tries to maximize the safe moves from that move as well. I could probably play with the logic there a little, but I will first see if that improves its performance against my current cop.

Sunday, 7:27 PM
The robber won 14 out of 20 this time. It might be just a tiny bit better than before. I started logging its searches and I have seen it get out of a jam when it has picked an emergency move, but usually when it has to do that, it gets caught.

Sunday, 10:03 PM
I submitted my updated cop & robber. My team name is “O Caml, My Caml”. I just recently put in code to look at the informs from the Mcgruff bots again and now I am starting to see how I might use Bayes theorem to predict user location. Someone mentioned Bayes on #icfpc yesterday, but I didn't know what he was getting at. I googled and found a paper on predicting customer location on a wireless network and now I am beginning to understand. Unfortunately, there are less than 12 hours left, and I know I won't be able to last all night. I don't know if I can create a smarter cop in the next few hours or not. I'll give it a try.

Sunday, 11:07 PM
I tried a few other things with probabilities, trying to incorporate Bayes theorem, but it seemed to be a little worse than my previous algorithm. Perhaps the Mcgruff bots aren't so good at estimating probabilities. It looks like I'll be sticking with what I have.

Sunday, 11:32 PM
I just sent in my final submission for the first part of the contest. Time for bed.

Monday, 2:24 AM
I was in bed with the light off when I realized that if a robber never hit a bank, my cop wouldn't find him. I switched the cop to start off in beat mode. Now I am doing some analysis on what bank a robber hits last before getting caught. So far, "60-and-blackstone" and "53-and-the-other-lake-park" lead by far. I tried eliminating 60&B already and cops won 17 out of 30 times. I am trying again with all banks allowed. Then I will try with 60&B and 53&OLP removed.

Monday, 3:44 AM
I tried a number of combinations, including picking random banks, the farthest bank, sticking with a bank until it was robbed. In the end, the best strategy was still to aim for the closest bank, but don't rob the bank at 60-and-blackstone. That should be enough right there to disqualify me from the Judge's Prize. NOW I'm going to bed.

July 9-10, 2005 - And now, The Twist
The interesting thing about this contest was that after the initial 72-hour period, there was almost a two-week lull, then they announced a change in the rules that we had 24 hours to respond to. I can't say that my response was great, but at least I submitted something.

Saturday, 10:10 AM
I am in Blacksburg, VA getting ready to come home after a week at the Friends General Conference Gathering. My dorm room at Va. Tech has an internet connection so I have just logged in to start reading the "twist" in the problem specification. UGH! Cops can be bribed! I don't know how I am going to do this. I spend the next 20 minutes looking over the requirements.

Saturday, 3:20 PM
Ceal is driving right now as I work on my program. I have just adapted my original cop and robber to the new protocol. The robber doesn't offer any bribes and the cop doesn't accept them. At least I have something.

Sunday, 12:45 AM
I have just submitted my final code. I am really tired after waking up at 6, packing the van and driving about 4 of the 7 hours of the return trip. My robber tried to bribe cops, but my cop does not take bribes. I decided to leave the cop alone. This will probably cost me the judges prize, even though I don't think I had a realistic shot at it to begin with. This has been a fun project and I hope to go back and try to add better strategy to my programs later.

Sunday, June 26, 2005


Coding my head off

It has been a really long time since I posted anything here. This weekend I have been participating in The Eighth Annual ICFP Programming Contest.

There were a few accidental clues given before the contest started. The LiveCD they distributed mentioned something about game servers, and the submission instructions mentioned a cop and a robber executable. Thinking that the contest might require some socket programming, I looked at the languages available to see which ones would be best for that. I always had Python and Java to fall back on, but Chicken Scheme seemed like a possibility.

When the contest was announced on friday, however, the cop and robber executables just had use stdin and stdout for all communication, which left the choice of language pretty wide open. I decided to go with OCaml. There are things about OCaml that really annoy me. I get syntax errors in odd places and I have to put parens in places I wouldn't expect. It is probably just because I don't know the language well enough. When the contest is over, I'd love to have someone who knows the language well to look over my code and tell me what stupid things I do.

Despite the minor annoyances, I am really happy that I picked OCaml. Toby Reyelts was arguing with me here about static vs. dynamic typing when I first started this blog. I think that OCaml's static type checking really saved me many times on this project. When I changed something significant, the compiler dutifully told me when I broke an interface.

There will be more about the contest in a future blog entry. I have been keeping a little log during the contest, just updating what I am working with, along with the time, so you get some idea what it might be like. I just don't want to say anything until the contest is over (in 2 weeks, after the requirements change) because I don't want to give away anything to my opponents.

I submitted a cop & robber a little while ago and I think they may end up being my final entries. It is almost 11pm EDT, so there are only 11 hours left in the contest and I'm really tired. Realistically, I don't think I have any shot at winning. I think that my bots have a good chance of making it out of the regular season round, but I expect them to be trounced in the playoffs.

This page is powered by Blogger. Isn't yours?