The Uncarved Block

Set of scala helper classes

Learning scala, I've started to write some generally useful stuff and make it available

permalink Updated: 2020-08-07

The quest to get a mac to do exactly what I want

It all started because I use a weird keyboard. When I don't have this

keyboard plugged in, I want my laptop keyboard map to use dvorak, but since my keyboard is dvorak in hardware, when it's plugged in, the laptop's keyboard map needs to be switched to qwerty otherwise I get a kind of "double dvorak" and everything goes down an extremely strange rabbit hole. Now normally it wouldn't be that much of a hassle to just change it when I'm plugging or unplugging my keyboard, but I found myself in a situation when I kept having to take my laptop to one or other meeting, meaning that I would be plugging and unplugging the keyboard and flipping the setting multiple times a day.

As a nerd, this grates. The computer should just know that if I have that keyboard plugged in I want the keyboard mapping to be qwerty and otherwise I want it to be dvorak. Happily, this is where hammerspoon comes in. Hammerspoon is a framework that allows you to program all kinds of little customizations for mac. Here's how you do the keyboard tweak I wanted:

-- Watch for planck keyboard add/remove and set the keyboard layout
--appropriately
function usbDeviceCallback(data)
    -- Skip internal memory card as it spams an event on suspend/resume
    if (data["productName"] ~= "Internal Memory Card Reader") then
        hs.alert(data["productName"] .. " " .. data["eventType"])
    end

    if (data["productName"] == "Planck") then
        if (data["eventType"] == "added") then
            --when we add the planck, change the hardware layout to
            --dvorak and turn the OS key layout to qwerty
            hs.keycodes.setLayout("U.S.")
        elseif (data["eventType"] == "removed") then
            --when we remove the planck, change the hardware layout to
            --qwerty. It's up to the OS to give me dvorak
            hs.keycodes.setLayout("Dvorak")
        end
    end
end
usbWatcher = hs.usb.watcher.new(usbDeviceCallback)
usbWatcher:start()

...but of course this turned out to be a gateway drug for all kinds of further customizations. How about a thing that starts the annoying WeWork printer client thing whenever I'm at WeWork? (Yup. Done). Automatically tweaks my brightness taking into account my battery percentage and the ambient brightness of the room I'm in? (Yup. Easy peasy). Set a default layout of windows whenever I plug my laptop into an external monitor? (Sure) I've now also got it so I can do almost all window movement and resizing via the keyboard.

Any case, my full config is unavailable right now but I'll try to get it back when my personal mac is brought back to life. Feel free to take from it anything that's usable. The grid resizing thing has a minor bug that I've not been annoyed enough to fix yet, where on certain sizes of monitor it doesn't realise your window is on at the bottom, so you need to move it up to resize. Other than that it's tickety-boo. I highly recommend checking hammerspoon out if you can code a little and want to tweak a Mac to behave exactly the way you want.

permalink Updated: 2017-06-23

Getting started with github

Before we can do our marketmaking system, you need to be able to get the software.

To share the marketmaking software that I talked about in the first marketmaking article , we need to have a repository set up, so I decided to get going with github. You'll need to set up git, and you'll need maven 2 to build everything. I thought I'd put out a small article to get people started with building a small package we're going to depend on for the bigger system. Assuming this goes ok, I'll push the actual software that interfaces with bullionvault.

Once you've got git and maven2 set up, it should be simple enough. On my linux boxes I just do:

git clone git://github.com/huntse/helpers.git
cd helpers
mvn test

...and with any luck, git will get the software, maven2 will download the internet and some time later build and run the tests, and you'll see something like this:

+++----------------------------------------------------
T E S T S
+++----------------------------------------------------
There are no tests to run.

Results :

Tests run: 0, Failures: 0, Errors: 0, Skipped: 0

[INFO]
[INFO] +++ maven-scalatest-plugin:1.1-SNAPSHOT:test (default) @ helpers ---
[INFO] org.scalatest.tools.Runner.run(-p, "/home/sean/tmp/helpers/target/classes
/home/sean/tmp/helpers/target/test-classes", -o, -fNCXEHLOWFD, /home/sean/tmp/h
elpers/target/scalatest-reports/file/constrained.txt, -f, /home/sean/tmp/helpers
/target/scalatest-reports/file/full.txt, -u, /home/sean/tmp/helpers/target/scala
test-reports/xml, -h, /home/sean/tmp/helpers/target/scalatest-reports/html/repor
t.html)
Run starting. Expected test count is: 7
Suite Starting - DiscoverySuite
BasicClientSpec:
An BasicClient object
- should be able to get theflautadors.org
- should be able to reget theflautadors.org using conditional get
- should be able to get HEAD of uncarved.com
- should be able to GET uncarved with parameters
- should be able to get xml
- should be able to handle redirects
- should be able to do a POST with values
Suite Completed - DiscoverySuite
Run completed in 2 seconds, 266 milliseconds.
Total number of tests run: 7
Suites: completed 2, aborted 0
Tests: succeeded 7, failed 0, ignored 0, pending 0
All tests passed.
THAT'S ALL FOLKS!
[INFO] +++---------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] +++---------------------------------------------------------------------
[INFO] Total time: 23.140s
[INFO] Finished at: Wed Jun 01 10:27:36 GMT 2011
[INFO] Final Memory: 6M/11M
[INFO] +++---------------------------------------------------------------------
mvn test  22.79s user 0.74s system 94% cpu 24.793 total

I don't use eclipse or anything like that, but it should be possible too get this working with eclipse too, just don't ask me how.

permalink Updated: 2011-06-02

helper classes for scala

My first opensource scala package

So I've released my helpers code via github. To get Scaladoc you'll need to build it yourself. To build it, and run the tests, do...

mvn test

...after unpacking the tarball somewhere

The motivation behind this package is that I wanted to do some http stuff and wanted to make my client code be as simple as possible while still being polite to servers. So that meant supporting conditional get using Etags, Last-Modified and also supporting gzip encoding. I do all this by using the apache commons httpclient-4.x and httpcore-4.x libraries and wrapping them all up in a class that's convenient and simple to use. Here's a taster: import com.uncarved.helpers.http._

val helper = new BasicClient()
//Get a webpage as a string (if you look at the apache http log4j
//messages you can see it does conditional get and has transparen
//gzip support)
val str = helper.get("http://www.theflautadors.org/")

//Get some XML (with request parameters supplied)
val params = List(("tag"->""), ("limit"->"5"))
val xml = helper.getXML(Request(RequestType.GET, "http://www.uncaJved.com/index.py/rss1.1.xml", params))
val items = xml \\ "item"

permalink Updated: 2011-06-01

Log4j for Scala

Scala is nice. Logging is nice. Scala + logging.....?

I've recently been playing about a bit with Scala. It's a great language in which you can write real programs very easily. As an example, in Java you often want to be able to make use of log4j to log various tracing info. Well, here's a little helper trait you can mix in to scala classes make logging completely trivial:

package com.uncarved.helpers

import org.apache.log4j.Logger;

/**
* LogHelper is a trait you can mix in to provide easy log4j logging
* for your scala classes.
**/
trait LogHelper {
    val loggerName = this.getClass.getName
    lazy val logger = Logger.getLogger(loggerName)
}

You use it like this:

class MyClass extends LogHelper {
    logger.debug("We got ourselves a class")
    def someMethod(temp: Int) = {
        logger.debug("entering someMethod")

        if(temp>25) {
            logger.info("It's mighty hot in here")

            //...do something

        }
        //..... etc

        logger.debug("leaving someMethod")
    }
}

Easy peasy logging. This class is one of several that I have released on github. Enjoy!

permalink Updated: 2011-06-01

Writing an Automated Marketmaking System

The first in a series of articles in which we will construct a simple high frequency trading system.

There has been a lot of press recently about the evils of high frequency trading (HFT), with many commentators saying that HFT may well be the root of the next big financial crisis. The basis for this view is the increasing importance of computerised market-making systems in providing liquidity to markets, and the concern about what happens if these participants stop providing this liquidity in the event of some sort of market panic. Additionally, some commentators maintain that automated market makers are trading in a manner that is to the detriment of other participants.

Rather than attempting to add more commentary to this debate, I'd like to contribute in a practical way by widening the understanding of these systems so that people can decide for themselves. As such, this is going to be the first in a series of articles during which we will build a fully-functional electronic market-making system. It's going to take a little while to develop all the pieces, but since I have the first piece almost ready to go I think it's time I came out with the introductory articles. This series will mostly be aimed at people with some computing background and interest in financial markets but not assume any knowledge of how markets work.

Caveats before we begin

Nothing in these articles is going to constitute any sort of advice as to the merits of investing in a particular product, or making markets using a particular strategy. If you follow the series you will acquire some bits of software which can be used to construct an EMM (Electronic Market-Making) system. I won't make you pay for them, and if you use them, you may lose money. The software may have bugs or unintended features which cause you to lose money. I'm really sorry if that's the case. You need to understand the software and accept the consequences of what it does, because it will be trading on your account. You need to put in place any tests you require to feel happy that it is performing according to your specifications. You also need to understand that in financial markets you are dealing with random processes and as such even well-founded strategies can lead to losses. Additionally, anything you make using this toolkit will need to be able to hold it's own against other market participants who will be aiming to exploit it. In any financial situation you need to do your own research and take responsibility for what happens - this is no different. Obviously you shouldn't put more at risk than you can afford to lose, but it's your money, and you need to decide whether this is right for you.

Background - Marketmakers and liquidity

It would be very cumbersome if every time you wanted to buy something you had to find the person willing to sell exactly that quantity at that time and agree a price, so generally when we buy or sell, a marketmaker actually takes the other side of the trade, hoping to find someone else wanting the opposite bargain later in the day. The marketmaker makes money by charging a small spread (ie they buy for lower than they sell for) in return for assuming the risk of holding the position you have put them into until they are able to unwind it by doing the opposite trade. This risk is a function of the ease of the unwind (how likely they are to be able to find someone to trade with) and the price volatility of the asset. So if you want to trade in a product which has very low price volatility, and very high liquidity, then it would be easy for the marketmaker to find someone to trade out of their position with, and they would not need to worry about the price moving too much while they hold the asset, so you would expect the spread between the bid and ask prices to be very low. Conversely, high volatility and low liquidity assets would normally have high spreads to compensate marketmakers for their higher risk.

Providing liquidity

In the old days, the position of marketmaker was held only by exchange locals who had to pay for the membership that allowed them to earn this spread, but with the advent of limit order books, anyone can provide liquidity to many markets and expect to be compensated for it. Our goal is to write a computer system which will do this for us. In order to do that, we first need to understand how a limit order book works, and by way of an example I'm going to jump right in and introduce the market we will use as the basis for this whole series.

Bullionvault

The market we're going to use for our examples is bullionvault.com, which is in essence a physical gold and silver market, with all the actual metal held in escrow in reserves in New York, London and Zurich, with seperate order books in $, £ and €. If you sign up using that link I will make a small referral fee (at no cost to you) from the commissions you pay to trade your account, and that's I'm going to get for writing these articles. Before you sign up, you should of course, peruse the on-line help so you understand how their system works, and like any other investment, you should think carefully about the risks involved.

Understanding the order book

If you go to the front page, you can see the current sell and buy prices for gold in the three locations in one currency (USD by default).

Bullionvault USD gold touch prices

The touch

These prices are just the top of the order book - the so-called touch prices. You would find more quantity available at different prices to buy or sell further away in the book, but what the prices mean in this example is that if you wanted to buy one troy ounce of gold in NYC it would cost you $1528, but if you wanted to sell you would only get $1524, so the market spread is $4 per TOz or $110 per kg if you live in the modern world. We're going to use metric units for these articles as teaching something as amazing as a modern computer to think in Troy ounces (or any imperial units) is a great evil, like teaching children arithmetic only by using Roman numerals or something.

Bid and Offer

We're also going to use some real market-making terminology, so we're going to refer to bid and ask or bid and offer prices, rather than sell and buy prices. The easy way to remember which way round these are is to think about the fact that as marketmakers we want to make money and so charge people for what they want to do. If they want to sell, we're going to bid to buy those shares from them at a low price. If they want to sell, we will reluctantly offer to sell to them at a high price. Hence bid is low and offer is high. When these prices are reversed, the orderbook is said to be crossed. This happens in equity markets when they are closed for the night, there is then an auction phase where high bids are matched with low offers until the book is uncrossed and normal trading begins. We wouldn't expect the orderbook to be crossed in a continuous trading market like this unless something was wrong and matching was suspended.

Aggressive and Passive

If an order to buy comes in, and it has no limit price, it is matched with the cheapest available sell orders until it is filled. If it has a limit price, it will only be filled up to the limit price on the order. But what happens to the remaining quantity? Under normal circumstances, this quantity stays on the book at the limit price until it can be matched against an incoming sell order within its limit.

We say that an order that provides liquidity by sitting on the book waiting to be filled is passive and that an order which crosses the spread, taking liquidity from the market by crossing off with passive orders is aggressive. We can also refer to the passive touch and aggressive touch. If we have an order to buy, then the passive touch is the bid and the aggressive touch is the offer. This is because if we want to buy passively, we will place our order at the bid or lower, whereas if we want to buy aggressively, we will need to pay the offer or higher. The opposite would be true for a sell order. If we want to sell right now, we will need our limit price to be at, or more aggressive (lower) than, the current bid, whereas if we are prepared to wait, our price can be more passive (ie higher).

Still to come - the software

In the next article, I will introduce a the software we can use to connect to bullionvault. Please feel free to comment below if anything so far is unclear and I'll try to deal with it in the next article.

permalink Updated: 2011-05-11

Mozilla Ubiquity

Fun with this amazing mozilla addon

Ubiquity was an amazing mozilla addon which gives users a command-line interface within the browser which you can use to do various things. Users can easily write their own commands in javascript and share them via the web. In that spirit, I've written my first ubiquity command, which searches hoogle, the haskell type-aware search engine.

CmdUtils.makeSearchCommand({
    homepage: "http://www.uncarved.com/",
    author: { name: "Sean Hunter"},
    license: "MPL",
    name: "hoogle-search",
    url: "http://www.haskell.org/hoogle/?hoogle={QUERY}",
    icon: "http://www.haskell.org/favicon.ico",
    description: "Searches haskell.org for functions matching by name or type sign
    ature.",
});

As you can see, it's virtually all meta-data and that's because there are a bunch of functions around search that know how to do everything you need to do. However, you can write more sophisticated commands that manipulate the browser, the web page you're on, have little built-in previews etc. All very nifty.

When you have the above, you can simply invoke ubiquity and say "hoogle-search Ord a => a -> a" or whatever and it will find you functions matching that type signature. I'll share this command (and any others I write) at www.uncarved.com using the subscription mechanism they recommend.

I was feeling rather proud of the above, however I saw this morning that if you go on a page with a search box, select it and invoke ubiquity and type "create-new-search-command" it writes something very much like the above for you.

Note: Ubiquity functionality has been superceded by the search bar itself.

permalink Updated: 2010-05-05

Using a Linux laptop in text mode

Linux has everything you need to use your laptop even if you choose not to use the pointy-clicky interfaces

So having recently got a netbook, I have been learning how to do everything on it. It's obviously all easy if you use something like xfce or gnomeA, or some netbook-specific spin like the Ubuntu netbook remix , but I like to use xmonad, so it helps to learn how to do all the laptoppy things in text mode.

Here's a list of things it's useful to know:

permalink Updated: 2009-10-17

I've got a new netbook

So finally I give in to temptation...

This week I bought a new netbook (an Acer Aspire One Pro P531). I was determined to reward someone who sells a Linux netbook but in spite of my best efforts noone was selling the sort of configuration I wanted (big SSD drive, 2GB memory) without Windows XP. So I bought one and installed Fedora on it myself.

It's a fantasticlittle device and everything on it just works with Fedora. Apart from the ethernet card which seems not to want to dhcp (although that might be some incompatibility with my home hub/router/adsl thingy).

I'm using it with Xmonad to do random bits of Haskelltinkering when I take the tube to work in the morning. The keyboard is a bit fiddly, but manageable, and the small form factor and light weight are terrific. In this respect it reminds me a little of my old sony vaio which was a sort of expensive predecessor to a netbook that I got second hand.

One thing I would have done differently if I had done more research is to get the other sort of Atom processor. Ones with an "N" or a "Z" at the front are based on the i686 architecture, whereas ones without that are x86_64 based. Since I have an x86_64 desktop pc at home it would have been slightly more convenient not to have to download two seperate fedora versions. The upside of this one is I think it has slightly better battery life. In fact, battery life seems amazing.

permalink Updated: 2009-10-17

n-Bit Gray codes in Haskell

A first step in what will become a combinatorics library

I have been playing around with Gray's reflected binary code (aka Gray codes) and similar things a bit. Before I reveal why I'm doing this lets just dive in and write some code. Gray's algorithm is described well here. The code which follows is in haskell, because it's a really fantastic language and I'm playing around with it at the moment. For scala fans, don't worry. I haven't abandoned scala, this is a parallel effort.

So for starters we need a datatype for representing these things. This is how you define an algebraic datatype in haskell. In what follows, lines beginning "--" are single-line comments

-- | 'Bit' is the datatype for representing a bit in a gray code.
data Bit = Zero | One deriving Show

Alright. So we have a type "Bit" with two constructors Zero and One and a "deriving Show" which means haskell figures out how to turn it into a string. This is useful when you're in ghci (the interactive haskell environment) debugging.

-- prepend a given item onto each of a list of lists (probably something to do t
his in the prelude)
prepend :: a -> [[a]] -> [[a]]
prepend t xs = map (t:) xs

A teeny helper function. Given a list of lists and a thing it sticks the thing on the front of each list in the outer list. This would append the thing on the end of each list:

append :: a -> [[a]] -> [[a]]
append t xs = map (++[t]) xs

Note I'm writing the type signatures explicitly but there's absolutely no problem if you leave them off. So let's generate our Gray codes:

-- | 'gray' generates the gray code sequence of length 'n'
gray :: Int -> [[Bit]]
gray 1 = [ [Zero], [One] ]
gray n = prepend Zero (gray (n-1)) ++ prepend One (descGray (n-1))

-- | 'descgray' generates the reversed gray code sequence of length 'n'
descGray :: Int -> [[Bit]]
descGray 1 = [ [One], [Zero] ]
descGray n = prepend One (gray (n-1)) ++ prepend Zero (descGray (n-1))

So we get an ascending and a descending one for free. Since the descending one is just the ascending one in reverse why (you might say) don't I just define descGray as descGray = reverse.gray ? Indeed, that may be a reasonable thing to do. I'm doing it this way to try to preserve as much laziness as possible, and (although my haskell-fu is still very weak at the moment) I think that if you reverse a list you pretty much have to evaluate each thing in the list. If you read the paper you'll see that this is Gray's (naive) algorithm and there has been an astonishing amount of research in this area leading to more efficient algorithms. I'll give those a crack at some point.

Why am I doing this? You'll see. This is at the heart of building a really cool combinatorics library. I needed something that could enumerate all combinations and permutations of various generic distribution-type things. There are similar but more recent orderings that are comparable to gray codes which I'm also looking into. They'll all be presented here in due course.

permalink Updated: 2009-08-22

Which language would you use?

It depends

I got a mail a few days ago about how I wrote a couple of articles with code for an options pricer in ocaml but all my latest articles were about scala. So which language would I use today if I were to write an options pricer. The answer of course is "it depends".

Now in and of themselves both ocaml and scala are fine choices for writing just about anything. But there are tradeoffs. If I was in charge of development of a brand new pricing and risk infrastructure for a big bank that had to be able to price everything from a stock to a digital multi-asset multi-barrier callable bermudan range-accrual thingummybobber that was going to be worked on by a thousand people ranging from the lowliest intern to the most brilliant genius then I have no hesitation in saying I would use scala.

In fact a friend of mine who is in charge of the development of risk and pricing systems at a major wall st firm told me that if he was building this infrastructure far a bank from scratch today he would use scala. He's the guy who persuaded me to try scala in the first place as a matter of fact.

The reasons this would be a fine choice should be obvious- it's very simple to write serious software in scala, it shouldn't take anyone of reasonable ability much time at all to learn, the syntax is not overly burdensome and tedious, performance is adequate or better for most things, the concurrency paradigm is tractable by normal human beings and there is fantastic library support because you can just use java stuff. The extensible syntax would help for various things and the cross-platform support is always a nice thing to have if you want to have number-crunching on a Linux compute farm and desktop apps on Windows or whatever.

On the other hand if I was setting up a small quant trading shop/hedge fund or doing it for my own benefit then the choices are much more varied. I might use scala still (it would still be an excellent choice), I might do it in ocaml (or even Haskell in fact), particularly if I was going to be all or most of the programming myself or I had access to recruit a decent pipeline of smart functional-programming aware people.

The benefits of doing it in ocaml (or Haskell) would be that I would probably have a more mentally-stimulating time doing it (this is can be an important motivation also if you have a super-bright team), and would probably end up with something more aesthetically pleasing from a pure comp-sci point of view.

On the other hand I would certainly have more frustrations (eg Why has no-one noticed that you can only do one request through the ocaml curl library because there is a memory scribble? But I digress). I wouldn't really want to lead a group of 100 guys and have to keep teaching haskell monad combinators or whatever every time a new person joined. And maintaining/code reviewing etc could become excruciating when you were dealing with people of average ability less one or two standard deviations.

So horses for courses. Ultimately writing good software always requires thought, discipline and some skill. The right language fits the problem domain and suits the group and organization. Good programmers can learn to write good software in any language.

permalink Updated: 2009-08-20

Website Changes

An occasionally-updated summary of stuff I've changed

So when I had a day off work this week I decided to fix a few things on this site that have been bugging me for a while:

So I've had a go at fixing them. I now generate my sitemaps automatically using the code I wrote to generate the atom and rss feeds, transparently gzip stuff if your client can accept that, and there is a new articles link on the side to make all the content easier to get to. Oh, and the pages have proper description tags which should make them more useful on search engines.

Hope this all helps.

permalink Updated: 2009-08-07

Adding comments

So I've thought about doing this for some time but never had the spare time. Finally I add comments.

I have added comments for the first time. I don't expect there will be a great deal of traffic and at the moment just to get started I have moderation on. If things seem to be resonably calm and the wingnuts stay away, I'll turn moderation off.

I'm using the comment system from intense debate, which has been fantastically easy to set up and use so far. Let me know what you think.

Note: I decided to remove comments due to spam and the desire to avoid javascript where possible

permalink Updated: 2009-08-07

Thank goodness for greasemonkey

Google recently added an extremely annoying feature to google reader. Thankfully you can decide for yourself how their website works.

Every morning I drink my coffee and read various news sites using google reader, so I was horrified recently to see a big yellow smiley popping up on a bunch of posts. This is some sort of social-networking-inspired feature which allows people to say which posts they "like", then anyone else reading that post gets a smiley to say "6 people liked this post" or whatever.

Perhaps this is just the ticket if you're desperately insecure about whether or not your taste in news is in tune with the mainstream of the internets, but as someone who does not give a flying... err monkeys about what other people think (by and large) I hate this feature. In fact, clicking through to the google forum on this feature I find that it turns out there are a lot of people who do not "like" reader's new "like" feature.

Thanks to the miracle of greasemonkey I can decide for myself what google's site looks (and works) like in my browser, and in fact Jason Fager has already posted a greasemonkey script that removes the offending smileys.

permalink Updated: 2009-07-20

Foaf in n3

Having got started with n3 I made myself a foaf file using it

So now I've started with notation3 I am itching for places to use it. For this reason I decided to make myself a foaf file. Happily, this is really straightforward using n3. My foaf.n3 looks like this:

@keywords a.
@prefix : <http://xmlns.com/foaf/0.1/>.
@prefix wot: <http://xmlns.com/wot/0.1/>.
@prefix dc: <http://purl.org/dc/elements/1.1/>.
@prefix x: <#>.

x:this a PersonalProfileDocument;
wot:assurance "http://www.uncarved.com/static/foaf.rdf.asc";
maker x:me;
primaryTopic x:me;
dc:title "Sean Hunter's FOAF file";
dc:identifier "http://www.uncarved.com/static/foaf.rdf".


x:me a Person;
name "Sean Hunter";
gender "male";
title "Mr";
givenname "Sean";
family_name "Hunter";
homepage <http://www.uncarved.com/>;
mbox_sha1sum "f5a2eaad7e46af80de0bc48e6db72efebe382da0";
plan """
---  ---
---  ---
---  ---             Darkening of the Light. In adversity
--------             It furthers one to be persevering.
---  ---
--------""".

...which when I do python /usr/bin/cwm.py foaf.n3 --rdf >| foaf.rdf generates a foaf.rdf file like this:

<?xml version="1.0"?>
<!-- Processed by Id: cwm.py,v 1.164 2004/10/28 17:41:59 timbl Exp -->
<!--     using base file:/home/sean/doc/n3/foaf.n3-->
<rdf:RDF xmlns="http://xmlns.com/foaf/0.1/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:wot="http://xmlns.com/wot/0.1/">
  <Person rdf:about="#me">
    <family_name>Hunter</family_name>
    <gender>male</gender>
    <givenname>Sean</givenname>
    <homepage rdf:resource="http://www.uncarved.com/"/>
    <mbox_sha1sum>f5a2eaad7e46af80de0bc48e6db72efebe382da0</mbox_sha1sum>
    <name>Sean Hunter</name>
    <plan>
---  ---
---  ---
---  ---             Darkening of the Light. In adversity
--------             It furthers one to be persevering.
---  ---
--------</plan>
    <title>Mr</title>
  </Person>
  <PersonalProfileDocument rdf:about="#this">
    <dc:identifier>http://www.uncarved.com/static/foaf.rdf</dc:identifier>
    <dc:title>Sean Hunter's FOAF file</dc:title>
    <maker rdf:resource="#me"/>
    <primaryTopic rdf:resource="#me"/>
    <wot:assurance>http://www.uncarved.com/static/foaf.rdf.asc</wot:assurance>
  </PersonalProfileDocument>
</rdf:RDF>

...which I link from my homepage. Easy-peasy. I can validate it using the w3c rdf validator and it works just fine. Now I just need to add some friends. If you know me and want to be added to my foaf file, mail me at [sean@uncarved.com.][10]

permalink Updated: 2009-06-13

Free Software

A Small Collection of Free Software I have written

Note: I think all of this software is now lost

permalink Updated: 2009-06-12

How do unix file permissions work?

Unix provides fine-grained access control for files. It's important to understand it.

What unix file permissions are

At their most basic, unix file permissions are a set of bits which control who has the permission to read from, write to and execute files. They have similar, but slightly different meanings as applied to directories. The mode is normally expressed either as a symbolic string of gibberish or as a numeric mode.

Numeric file permissions

A numeric mode is normally expressed as set of three or four octal digits but can be any length up to four. Any ommitted digits are assumed to be leading zeroes. The first digit is the sum of the "set uid" (4), "set gid" (2), and "sticky" (1) attributes. If you need these you know what they mean. Otherwise, move along please. The second digit sets permissions for the owner of the file to read(4), write(2), and execute(1) the file. The third sets permissions for members of the group specified as the group owner of the file and the fourth for people not in the file's group, with the same values for read, write and execute as the user permission digit. You figure out what permissions you need by adding the bits together. If I want a file to be readable and writeable by me and its group and just readable by others, I might run chmod 664 file to change the mode of the file.

Symbolic file permissions

commands which deal with file permissions often allow you to specify symbolic permissions as follows: u user g group o other r read w write x execute + add specified permission - remove specified permission = make permission exactly equal to this

So in the example above, I could achieve the results I wanted by chmod ug=rw,o=r file or by looking at the current file mode and using "+" or "-" as appropriate. For example, to remove group and world write and execute permissions on a file I might do chmod go-wx file. See man chmod for more details. On linux boxes you may have to do info chmod to get all the details because the GNU project don't like manual pages.

Finding out file permissions

You can see a symbolic representation of the permissions on a file or directory by using ls -l. If you want to understand why files you create get the permissions they do, read about how umasks work.

Special file permissions

I said "move along please", but to reward you for your persistence, these are the meanings of the leftmost digit of four-digit octal file permissions. Note: Do not use these unless you really know what you are doing. The setuid and setgid bits on files in particular have been responsible for many serious security breaches when thoughtlessly applied to unworthy programs.

When set on files:

Numeric | Symbolic |  Name       | Meaning
-----------------------------------------------------------------------
4000    | u+s      | setuid bit  | If the file is executed, set the effective user id of the resultant process to the owner of the file.
2000    | g+s      | setgid bit  | If the file is executed, set the effective group id of t he resultant process to the group owner of the file.
1000    | t        | sticky bit  | No effect. On ancient systems it means "Save the text image of the program to swap to speed up load time".

When applied to directories, the meanings of these bits are subtly different and more system- and filesystem-dependent:

Numeric | Symbolic |  Name                                  |  Meaning
-----------------------------------------------------------------------
4000    | u+s      | setuid bit                             | No effect
2000    | g+s      | setgid bit                             | Set the group owner of files created in this directory to the group of the group owner of the directory instead of the primary group of the file's creator
1000    | t        | sticky bit or "restricted delete flag" | On some systems it means prevent users from removing or renaming files in this directory unless they own the file or directory

permalink Updated: 2009-06-12

How do unix umasks work?

Often file permission problems on unix systems are caused by users who don't understand their umask and set it properly

What is a umask?

A umask is a bitmask by means of which a user can affect the default permissions of files they create on unix systems. You should find out what file permissions are if you are unsure. Its important to fully appreciate that a umask is not a default permission set it is a default permission mask so bits in the mask are the bits which you want not to be in the permissions of the file. A umask of 777 will thus mean that processes create files with permissions of zero whereas one of 0 will create files with full permissions thussly:

% umask 777
% touch foo
% umask 0
% touch bar
% ls -l foo bar
-rw-rw-rw-    1 sean   sean             0 Jul  2 11:15 bar
----------    1 sean   sean             0 Jul  2 11:15 foo

Now you might well wonder why (if my umask is zero) the execute bits aren't also being set on "bar"? Lets see what "touch" is doing under the covers...

% strace touch furb 2>&1 | grep furb
execve("/bin/touch", ["touch", "furb"], [/* 137 vars */]) = 0
open("furb", O_WRONLY|O_NONBLOCK|O_CREAT|O_NOCTTY|O_LARGEFILE, 0666) = 3
utime("furb", NULL)                     = 0

As you can see, the touch process only tries to create the file with permissions of 0666. The permissions requested at open(2) are then masked by the umask and the bits requested in the open(2) call that are not in the mask end up in the permissions of the file created. Processes (such as compilers) which know they are creating executable files will generally use open(2) with the execute bits set as well. Because umask is a mask it can't raise the permissions higher than the process asks for in the open(2) call. The same is true of any other process which opens files.

How do I change my umask?

Now, if you want to change your umask, the command you want is the shell builtin called umask not /usr/bin/umask. From man umask:

/usr/bin/umask

The umask utility sets the file mode creation  mask  of  the
current  shell  execution environment to the value specified
by the mask operand.  This mask affects the initial value of
the  file permission bits of subsequently created files.  If
umask is called in a subshell or separate utility  execution
environment, such as one of the following:
    (umask 002)
    nohup umask ...
    find . -exec umask ...
it does not affect  the  file  mode  creation  mask  of  the
caller's environment.

If the mask operand is  not  specified,  the  umask  utility
writes  the  value of the invoking process's file mode crea-
tion mask to standard output.

Let's just see if that's true, shall we?

% umask                 #check my current umask
022
% /usr/bin/umask 0      #try to change it using /usr/bin/umask
% umask                 #Did it work?  No!
022
% umask 0               #try to change it using the builtin umask
% umask                 #Did it work?  Yes!
000

So it is true. Why? The reason is that the shell does a fork(2) to create a copy of itself to run the /usr/bin/umask process, so then even if that process sets its own umask, those values happen in the child process and don't end up in the calling process when the child quits. That's why you need the shell builtin if you want to change your umask. If you don't understand, just always type umask or builtin umask and never /usr/bin/umask.

Now you know why you need to use chmod +x to set the "execute" bits on the things you want to be executable. In ancient (pre "git") times it was particularly important to get file permissions set correctly before checking them in to version control because all future checkouts of a file have the file permissions which a file had when it was first checked in unless someone tinkers in the repository.

permalink Updated: 2009-06-12

That was painful

Migrating everything to a new server

So a couple of weeks ago I finally became annoyed enough at having to use a virtual linux server that I shelled out the extra and was upgraded to a dedicated box of my very own in co-lo. Mostly this was totally seemless and I even looked forward to the differences given that I was moving to Centos from Debian. One thing, however, was extremely painful. Moving this website.

So uncarved.com uses a python script that I wrote ages ago using an early version of web.py. However, at the time it had pretty whacky database support, didn't have it's own template engine (so it used Cheetah) and had a few wierd bugs which I had hacked around in my local copy. As such, upgrading has been a real pain because I had to change the code to the new api, then hack all my templates to the new syntax (I initially tried to carry on using cheetah, but cheetah has changed and it was even more excruciating to get that working so I took the plunge). Anyway, it seems to have worked.

Note: This has all changed significantly in the last 11 years.

permalink Updated: 2009-02-06

I Ching in ML

In which I write yet another implementation of the book of changes (this time in ocaml)

I decided to rewrite my python I Ching in ocaml. This is interesting to do in some respects as it's not really a traditional functional programming task.

First we define the two basic oracles, the coin oracle and the yarrow stalk oracle. Obviously on a computer we are actually sampling the same probability distribution as the oracle, not simulating the process itself. How I do this is to just make a static array of all the outcomes and then choose from them with uniform probability. This is a lot easier to write (and understand) than trying to make a multifaceted biased coin and is equivalent from a probability point of view.

let (consult_coin, consult_yarrow) =
  (** choose one item from a given array with uniform probability *)
  let choose arr =
    let size = Array.length arr in
    let idx = Random.int(size) in
    arr.(idx)
  in
  (* ...use this to consult the coin oracle... *)
  let consult_coin () =
    let outcomes = [|
      9; 9;               (* Moving Yang --- x --- *)
      7; 7; 7; 7; 7; 7;   (* Stable Yang --------- *)
      8; 8; 8; 8; 8; 8;   (* Stable Yin  ---   --- *)
      6; 6                (* Moving Yin  --- o --- *)
    |]
    in
      choose outcomes
  in
  (* ...and the yarrow oracle also. *)
  let consult_yarrow () =
    let outcomes = [|
      9; 9; 9;             (* Moving Yang --- x --- *)
      7; 7; 7; 7; 7;       (* Stable Yang --------- *)
      8; 8; 8; 8; 8; 8; 8; (* Stable Yin  ---   --- *)
      6;                   (* Moving Yin  --- o --- *)
    |]
    in
      choose outcomes
  in
    (consult_coin, consult_yarrow)

This is an example of how you define let binding which is private, yet shared by more than one function. In this case it's "choose", a helper function which just picks an element from an array. So we just define two public functions:

val consult_coin : unit -> int = <fun>
val consult_yarrow : unit -> int = <fun>

Given those two functions, it's easy to return a hexagram using a given oracle.

let get_hexagram ?(oracle=consult_yarrow) () =
  let rec pick lis =
    let len = List.length lis in
    if(len==6) then lis else pick ((oracle ())::lis)
  in
    pick []

A single divination actually derives two hexagrams, representing a change from one state to another. The original single hexagram contains "moving" lines which are inverted in the second one.

let get_hexagram_pair ?(oracle=consult_yarrow) ?(lines=[]) () =
  let lines = if(lines==[]) then (get_hexagram ~oracle ()) else lines in
  (* Given a hexagram, return a version with no moving lines by
   * simply disregarding their moving statement.  Moving yang becomes
   * yang, moving yin becomes yin *)
  let rec ignore_moving =
    function
        [] -> []
      | 9::res -> 7::(ignore_moving res)
      | 6::res -> 8::(ignore_moving res)
      | hd::res -> hd::(ignore_moving res)
  in
  (* Given a hexagram, return a version with no moving lines by
   * inverting moving lines.  Moving yang becomes yin, moving yin
   * becomes yang *)
  let rec invert_moving =
    function
        [] -> []
      | 9::res -> 8::(invert_moving res)
      | 6::res -> 7::(invert_moving res)
      | hd::res -> hd::(invert_moving res)
  in
    (ignore_moving lines),(invert_moving lines)

Now you have the guts of a working I Ching.

permalink Updated: 2008-12-09

Changing to a kinesis keyboard

I thought I would never find a keyboard that I liked better than Typematrix, but I may have to eat my words...

Until yesterday, I used to use a typematrix dvorak keyboard. Typematrix is a fabulous keyboard with lots of terrific design features but its primary strength is that it has a small desktop footprint, and that it uses straight forward and back movements of the fingers without any of the small left and right movements that are required due to the staggered layouts of conventional keyboards. This is a massive improvement if you do a lot of typing or if you have problems with your hands and wrists.

The one weakness of the typematrix in my opinion is that the modifier keys (ctrl and alt) are not placed in a very convenient spot. This used not to bother me as I am a Vim user and so didn't require lots of control keystrokes. However, recently I have been doing almost all of my programming in a proprietary in-house environment and this has been causing my pinkies to take more and more of the strain. I finally decided to make the switch when I had a marathon coding day where I wrote something like 600 lines of very formulaic code in a day and my fingers were aching from the funny contortions I had been putting them through with all of the control keystrokes that had been required.

I briefly tried out one of my colleagues kinesis "Advantage" keyboards and immediately decided that it was weird enough that I should buy one. I opted for a model that has an inbuilt hardware dvorak mode. I prefer this so that I am able to use dvorak mode right from the login, rather than switching in user preferences or dotfiles.

So, the kinesis keyboard feels like a substantial unit. When you are typing on it, it has a pleasing clackiness, and the general functionality is good. It has a very cool hardware remapping thing that means you can move the keys you don't like around to your heart's content. It's a really nice keyboard.

permalink Updated: 2008-07-29

A Functional Test Harness

Using monads to thread state, we make a purely functional version of the test harness

To paraphrase Crocodile Dundee, this isn't a monad tutorial this is a monad tutorial. However, I decided to do something that would help me to understand monads, and that is to use them in a practical way.

The [object-oriented][6] test framework we developed doesn't feel very idiomatic and as I learn more about FP, and it feels as if we could do something a lot nicer. So I set about thinking so how to make a purely functional test framework. The first thing that strikes you is how useful state is. In a purely functional framework we have to thread that state through our functions, and one elegant way to do that is through monads.

Now if you read the monad tutorial, you will realise that a monad is a magic box and all you can really do with a monad is put something in the box, or apply a function which will return another magic box. This is all very well, but its not great at explaining how (in a practical sense) anything actually gets done.

We start with what it's going to look like when you actually use the functional test api:

Test.test_begin >>=
Test.ok "Something which should be true" true_thing >>=
Test.not_ok "Something which should not be false" false_thing >>=
Test.fail_if "This should raise an exception" (fun () -> raise (Failure "aiee"))
>>=
Test.test_end

The >>= is borrowed from Haskell, and is the "bind" operator, which acts as the glue here, sending the state from one function to the next. So our first function (test_begin) needs to create the monad and bung in the starting state. The rest of the functions accept as their last argument the current state in its native form and return the updated state in the State monad. This means that after the arguments that you see above have been applied, they are candidates for the "bind" function.

So without further ado, our monad:

(** The basic type sig of a monad *)
module type MONAD = sig
    type 'a t
    val return : 'a -> 'a t
    val bind : 'a t -> ('a -> 'b t) -> 'b t
end

(** our state monad which will bind all our tests together *)
module State : MONAD = struct
    type 'a t = 'a
    let return x = x
    let bind m f = f m
end

It would be pretty hard to make anything simpler than that, but it fulfills the requirements to be a monad and it turns out a little goes a long way. Here's our functional test module:

module Test = struct
    (** the actual state which gets threaded through each fn *)
    type test_state = {n:int; ok:int}

    (** helper fns which return the state when it has succeeded or failed *)
    let succeeded s = State.return {n=s.n+1; ok=s.ok+1}
    let failed s = State.return {n=s.n+1; ok=s.ok}

    (** Pass the initial state into the State monad *)
    let test_begin = State.return {n=0; ok=0}

    (** we use this func tos implement all the rest.  It takes a string and a
    predicate, and the state, then succeeds if the predicate returns true. *)
    let pass_if desc pred s =
    let dots = String.make (50-(min 50 (String.length desc))) '.' in
    Printf.printf "%5.5d: %s ....%s" s.n desc dots;
    try
        if pred () then
        begin
            Printf.printf "ok\n" ;
            succeeded s
        end
        else
        begin
            Printf.printf "not ok\n";
            failed s
        end
    with
        _ ->Printf.printf "not ok (threw exception)\n";
        failed s

    (** Runs a predicate function and fails if it throws or
     returns true.  Otherwise it succeeds *)
    let fail_if desc pred s = pass_if desc (fun () -> not pred) s


    (** Takes a bool and marks the test as succeeded if it is true *)
    let ok desc x s = pass_if desc (fun () -> x) s

    (** Takes a bool and marks the test as failed if it is true *)
    let not_ok desc x s = ok desc (not x) s

    let test_end s =
    Printf.printf "End tests: %d of %d tests passed\n" s.ok s.n ;
    State.return s
end

Nifty, no? You'll notice that the functions are all very similar to those in the OO version, except that their final argument is a state record, and instead of updating member data in the object, they simply update this state record and use "State.return" to pass it into the State monad. To make the initial code snippet work, the only thing that remains is:

let ( >>= ) = State.bind

permalink Updated: 2008-06-16

A Simple Ocaml test harness

It's nice to be able to write tests for code as you go along, so you need a test harness...

I usually like to be able to write tests as I write code, and to have them run every time the code builds to make sure I haven't broken anything. To do this, you need a test harness so that adding tests is as painless as possible. So I wrote this one:

(** test.ml - a simple test harness in ocaml

  This is demonstration code only.  You are free to use it under the
  terms of the Creative Commons Attribution 2.5 license.

  @author Sean Hunter &lt;sean\@uncarved.com&gt;

 *)

(** Test harness class.  Tracks numbers of tests run and how many
 succeed. *)
class harness =
object(self)
    (** The number of tests so far *)
    val mutable n = 0

    (** The number of tests so far which have succeeded *)
    val mutable succeeded = 0

    (** A name for the group of tests currenly running *)
    val mutable group = ""

    (** Runs a predicate function and fails if it throws or
     returns false.  Otherwise it succeeds *)
    method pass_if desc pred =
      n &lt;- n + 1;
      let dots = String.make (50-(min 50 (String.length desc))) '.' in
      Printf.printf "%5.5d: %s ....%s" n desc dots;

      try
          if pred () then
            begin
                Printf.printf "ok\n" ;
                succeeded &lt;- succeeded + 1;
                true
            end
          else
            begin
              Printf.printf "not ok\n";
              false
            end
      with
      _ -&gt;Printf.printf "not ok (threw exception)\n"; false

    (** Runs a predicate function and fails if it throws or
     returns true.  Otherwise it succeeds *)
    method fail_if desc pred = self#pass_if desc (fun () -&gt; not pred)

    (** Takes a bool and marks the test as succeeded if it is true *)
    method ok desc x = self#pass_if desc (fun () -&gt; x)

    (** Takes a bool and marks the test as failed if it is true *)
    method not_ok desc x = self#ok desc (not x)

    (** Evaluate a predicate, ignoring its result. Succeed if it throws
     an exception, fail if not *)
    method pass_if_throws desc (pred : unit -&gt; bool) =
        try
            (ignore (pred ()));
            self#not_ok desc true
        with
            exn -&gt; self#ok desc true

    (** Evaluate a predicate, ignoring its result. Fail if it throws
     an exception, succeed if not *)
    method fail_if_throws desc (pred : unit -&gt; bool) =
        try
            (ignore (pred ()));
            self#ok desc true
        with
            exn -&gt; self#not_ok desc true

    (** Check that two floats are within a certain tolerance *)
    method pass_if_close ?(eps=1e-9) desc x y =
        let diff = abs_float (x-.y) in
        self#ok desc (diff &lt;= eps)

    (** mark the beginning of a group of tests *)
    method start_group name =
        n &lt;- 0 ;
        succeeded &lt;- 0;
        group &lt;- name ;
        Printf.printf "Begin test group %s\n" group

    (** mark the end of a group of tests, printing out success count *)
    method end_group  =
        Printf.printf "End test group %s: %d of %d tests passed\n"
            group succeeded n
end;;

(* vim: set syn=ocaml sw=4 ts=4 expandtab: *)

Now it's very easy for me to write tests. For example, tests for all my payoff functions might look like this:

(** payoff_tests.ml - tests for payoff classes

  Written by Sean Hunter &lt;sean@uncarved.com&gt;

  This is demonstration code only.  You are free to use it under the
  terms of the Creative Commons Attribution 2.5 license, but don't
  expect it to accurately price real options.

 *)

let tests = new Test.harness;;

tests#start_group "payoff.ml";;

(* tests for vanilla put and call payoffs *)
tests#ok "Vanilla ITM call" ((Payoff.call 100.0 110.0) = 10.0);;
tests#ok "Vanilla ATM call" ((Payoff.call 100.0 100.0) = 0.0);;
tests#ok "Vanilla OTM call" ((Payoff.call 100.0 90.0) = 0.0);;
tests#ok "Vanilla ITM put" ((Payoff.put 100.0 10.0) = 90.0);;
tests#ok "Vanilla ATM put" ((Payoff.put 100.0 100.0) = 0.0);;
tests#ok "Vanilla OTM put" ((Payoff.put 100.0 190.0) = 0.0);;

(* tests for digital payoffs *)
tests#ok "Digital ITM call" ((Payoff.digital_call 100.0 110.0) = 1.0);;
tests#ok "Digital ATM call" ((Payoff.digital_call 100.0 100.0) = 0.0);;
tests#ok "Digital OTM call" ((Payoff.digital_call 100.0 90.0) = 0.0);;
tests#ok "Digital ITM put" ((Payoff.digital_put 100.0 10.0) = 1.0);;
tests#ok "Digital ATM put" ((Payoff.digital_put 100.0 100.0) = 0.0);;
tests#ok "Digital OTM put" ((Payoff.digital_put 100.0 190.0) = 0.0);;
let dd = Payoff.double_digital ~low:90.0 ~high:110.0;;
tests#ok "Double Digital below the low barrier" ((dd 89.0) = 0.0);;
tests#ok "Double Digital at the low barrier" ((dd 90.0) = 1.0);;
tests#ok "Double Digital inside the low barrier" ((dd 90.1) = 1.0);;
tests#ok "Double Digital inside the high barrier" ((dd 109.9) = 1.0);;
tests#ok "Double Digital at the high barrier" ((dd 110.0) = 1.0);;
tests#ok "Double Digital above the high barrier" ((dd 110.1) = 0.0);;

tests#end_group;;

(* vim: set syn=ocaml sw=4 ts=4 expandtab: *)

...and when I run the tests the output looks like this:

Begin test group payoff.ml
00001: Vanilla ITM call ......................................ok
00002: Vanilla ATM call ......................................ok
00003: Vanilla OTM call ......................................ok
00004: Vanilla ITM put .......................................ok
00005: Vanilla ATM put .......................................ok
00006: Vanilla OTM put .......................................ok
00007: Digital ITM call ......................................ok
00008: Digital ATM call ......................................ok
00009: Digital OTM call ......................................ok
00010: Digital ITM put .......................................ok
00011: Digital ATM put .......................................ok
00012: Digital OTM put .......................................ok
00013: Double Digital below the low barrier ..................ok
00014: Double Digital at the low barrier .....................ok
00015: Double Digital inside the low barrier .................ok
00016: Double Digital inside the high barrier ................ok
00017: Double Digital at the high barrier ....................ok
00018: Double Digital above the high barrier .................ok
End test group payoff.ml: 18 of 18 tests passed

permalink Updated: 2007-06-21

Ocaml objects part 2

Parameter objects make more sense when we pass piecewise functions to the pricer

Unfortunately the class for constant parameters that we developed before is just not that useful. We would be better off just using a float if that was all we wanted. The advantage is, however, that we have defined the interface we need for these functions, and what we need is something that can return us the integral of a function and the integral of the square of a function in an interval:

method integral :       float -> float -> float
method integral_sq :    float -> float -> float

If we can do that, we can use our function in the mc pricer, so let's define a 'parameters' class for linear functions. This will take two coefficients a and b and our parameter will model the function f(x) = ax + b . The integral of f(x) dx is ax^2 + bx because we're trying to find the area under a triangle for the first term and a rectangle for the second. I needed a little help with f(x)^2 dx (thanks Iain!), but here it is:

(** Parameter class that models a linear function f(x) -> ax + b *)
class parameter_linear' a b =
object(self)
    inherit parameter
    method integral t1 t2 =
        let integrate_to x = 0.5 *. x *. a *. x +. x *. b in
        integrate_to t2 -. integrate_to t1
    method integral_sq t1 t2 =
        a ** 2.0 *. (t2 ** 3.0 -. t1 ** 3.0) +.
        a *. b *. (t2 ** 2.0 -. t1 ** 1.0) +.
        b ** 2.0 *. (t2 -. t1)
end;;

class parameter_linear a b = (parameter_linear' a b : parameter_type);;

This works fine, but what we really want is piecewise functions and piecewise constants. These give us the ability to have parameters that we can calibrate to observable data:

(** For piecewise constants and piecewise linear functions a "piece" is a
* parameter that applies between low and high limits. *)
type piece = Piece of float * float * parameter_type;;

This is what we are going to use to make the pieces of our piecewise functions- a tuple of low and high limits, and a parameter object that can give us the integrals in these limits. Integrating the piecewise function then becomes walking the list of pieces and integrating everything that is in the region we are interested in.

(** Class for piecewise parameters.  Takes a list of pieces which each
* specify the limits and the linear or const parameter in each region.
*
* At present, does no checking that the pieces are continuous and not
* overlapping *)
class parameter_piecewise' pieces =
  (* there's probably a better way of doing this, but anyway...
   * Apply a function to each piece, with the parameters being the part of
   * the interval from t1 to t2 that is inside the interval of the piece *)
  let visit_pieces pieces fn t1 t2 =
      let visit_piece piece low high =
        if (low > t2 || high <; t1) then
          0.0
        else
            let range_start = max t1 low in
            let range_end = min t2 high in
            (fn piece) range_start range_end in
      let rec visit_list so_far lis =
        match lis with
            [] ->; so_far (** we're done *)
          | Piece(low, high, p) :: rest -&#62; visit_list (so_far +. (visit_piece p low high)) rest
      in
        visit_list 0.0 pieces
  in
object(self)
    inherit parameter
    method integral t1 t2 = visit_pieces pieces (fun x->;x#integral) t1 t2
    method integral_sq t1 t2 = visit_pieces pieces (fun x->;x#integral_sq) t1 t2
end;;

(** Piecewise parameters class with implementation hidden *)
class parameter_piecewise x = (parameter_piecewise' x : parameter_type);;

As you can see, I have a local visitor function which walks the list of pieces and calls the right method on each one. I do this using a nifty little tail-recursive pattern-matcher. This works but I am a bit dissatisfied with the way I have to trick visit_pieces into calling the integral or integral_sq methods. There is a better way I'm sure- I just don't know what it is yet.

I could leave this implementation and it would be functionally-complete, but it would be a bit of a pain to build the piecewise functions, so I make a couple of little helper functions:

(** helper funcs to make parts of a piecewise function *)
let make_const_piece low high x = Piece(low, high, new parameter_const x);;
let make_linear_piece low high a b = Piece(low, high, new parameter_linear a b);
;

They're still not ideal- I would like to give a list of 2-tuples specifying the upper bound and a point for each piece, and have the function return the piecewise function or constant that "join the dots", but anyway this gives us the ability to make lists of pieces by doing this sort of thing:

(* price one option and print the result  *)
let print_mc ?(num_paths=100000) label payoff =
    let pieces = [
        Param.make_const_piece 0.0 0.25 0.35;
        Param.make_const_piece 0.25 1.0 0.25;
    ] in
    let vol=new Param.parameter_piecewise pieces in
    let mc payoff =
        Mc.sim
            ~payoff:payoff
            ~expiry:0.25
            ~spot:1.613
            ~vol:vol
            ~r:(new Param.parameter_const 0.055)
            ~num_paths:num_paths
         in
    Printf.printf "%s: %f\n" label (mc payoff)
;;

Note that our piecewise const parameter doesn't require the function to be continuous. Given this vol function, we can price as before.

permalink Updated: 2007-06-21

Ocaml objects part 1

Writing some objects in ocaml, we begin to make the pricer more flexible

The next in the somewhat haphazard series that began here in which I learn a bit about ocaml objects, and in so doing, also learn why Joshi proposes a parameter's class and how to calculate simple integrals.

The purpose of this exercise was to learn how to do object-oriented programming in ocaml, so I pick up where Joshi suggests modifying the Monte Carlo simulator to take "parameters" objects instead of floats for vol and rates. The purpose of this modification is that ultimately we will then be able to use functions as parameters to our models, rather than just constants. I begin by defining the interface to these "parameter" classes. This will define what these parameter objects can do.

(** interface for all parameter classes. Users of parameter classes expect
* this interface *)
class type parameter_type =
object
method integral :       float -> float -> float
method integral_sq :    float -> float -> float
method mean :           float -> float -> float
method root_mean_sq :   float -> float -> float
end;;

This is a direct translation of Joshi's interface. I'm not 100% sure we
really need this, but it does mean that differences between various
parameters classes and their private members can be hidden behind this
interface.

So given that, we can adapt the mc:
```Ocaml
(* Price an option with a flexible payoff using Monte Carlo.
*
* Use 'parameters' objects for vol and rates to allow passing flexible
* parameters *)
let sim ~payoff ~expiry ~spot ~vol ~r ~num_paths =
    let variance = vol#integral_sq 0.0 expiry in
    let root_variance = sqrt variance in
    let ito_correction = -0.5 *. variance in
    let integral_r = r#integral 0.0 expiry in
    let moved_spot = spot *. exp (integral_r +. ito_correction) in
    let rec do_path i running_sum =
        if i < num_paths then begin
            let this_gaussian = Gaussian.get_one_gaussian () in
            let this_spot = moved_spot *. (exp (root_variance *. this_gaussian)) in
            let this_payoff = payoff ~spot:this_spot in
            do_path (i+1) (running_sum +. this_payoff)
        end
        else (running_sum /. (float_of_int num_paths)) *. (exp (-1.0 *. integral_r))
    in
    do_path 0 0.0
    ;;

as you can see, the difference here is that we are calling the "integral_sq" method on the vol parameter, and the "integral" method on the rates parameter. If we put this into the toploop the type has changed to:

val sim :
payoff:(spot:float -> float) ->
expiry:'a ->
spot:float ->
vol:< integral_sq : float -> 'a -> float; .. > ->
r:< integral : float -> 'a -> float; .. > -> num_paths:int -> float = <fun>

So vol and r are params which have an integral and integral_sq method, which have two arguments, and return a float in each case. We could get rid of the polymorphic argument by explicitly specifying the type of the expiry argument, but I'm not sure how to do that for named args. ~expiry:float gives a syntax error anyway- I suppose I could write an explicit sig for the function but it works without it.

Now we can write the base parameters class. This is an abstract class- derived classes will specify how to take the integral and the integral of the square. In this mc model we don't use the concrete methods of this class but presumably they come in handy later.

(** base class specifying the mean and root_mean_sq methods in terms of
* the integral and integral_sq, but don't define those.  This allows us
* to define them for any type of parameter *)
class virtual parameter =
object(self)
    method virtual integral :       float -> float -> float
    method virtual integral_sq :    float -> float -> float
    method mean t1 t2 = (self#integral t1 t2) /. (t2 -. t1)
    method root_mean_sq t1 t2 = (self#integral_sq t1 t2) /. (t2 -. t1)
end;;

Now it's simplicity itself to write our first class, which models a constant parameter:

(** class for constant parameters.  This class will be hidden behind the
* parameter_type interface *)
class parameter_const' x =
object(self)
    inherit parameter
    val x_sq = x*.x
    method integral t1 t2 = (t2-.t1) *. x;
    method integral_sq t1 t2 = (t2-.t1) *. x_sq;
end;;

I add an alias to this (without the tick), which hides the implementation behind the parameter_type interface:

(** Constant parameters class with implementation hidden *)
class parameter_const x = (parameter_const' x : parameter_type);;

The type of parameter_const' is

class parameter_const' :
float ->
object
    val x_sq : float
    method integral : float -> float -> float
    method integral_sq : float -> float -> float
    method mean : float -> float -> float
    method root_mean_sq : float -> float -> float
end

whereas the type of parameter_const is

class parameter_const : float -> parameter_type

...which seems to express the idea of the class much better.

At this point we can run our Monte-Carlo using constant parameters by doing this sort of thing:

let mc payoff =
    Mc.sim
        ~payoff:payoff
        ~expiry:0.25
        ~spot:1.613
        ~vol:(new Param.parameter_const 0.35)
        ~r:(new Param.parameter_const 0.055)
        ~num_paths:num_paths
;;

If you're thinking "so what? That's no better than what we had before except probably a bit slower and certainly harder to read", then I am sympathetic to this point of view. Next we look at other parameters classes and suddenly it starts to become useful.

permalink Updated: 2007-06-19

Testing for put/call parity

Now let's check the prices from the ocaml Monte-Carlo pricer against the most obvious arbitrage relationship

Put call parity is an important relationship in options pricing. Let's test our prices to see if they obey this relationship.

To do so I add a simple helper function to payoff.ml to get the npv of some future cash:

(* get the net present value of some amount *)
let npv ~amt ~rate ~time = amt *. exp(-1.0 *. rate *. time);;

...and the tests themselves. Note that the put/call parity relationship for digitals is slightly different:

let assert_nearly ?(tolerance = 0.001) label a b =
    Printf.printf "Test %s - a: %f b: %f\n" label a b;
    assert( abs_float(a-.b) <tolerance );;

let spot = 1.613 in
let expiry = 0.25 in
let r = 0.055 in
let strike = 1.600 in
let discount x = Payoff.npv ~rate:r ~time:expiry ~amt:x in
let mc payoff =
    Mc1c.sim
        ~payoff:payoff
        ~expiry:expiry
        ~spot:spot
        ~vol:0.35
        ~r:r
        ~num_paths:100000 in
let price payoff = mc (payoff ~strike:strike) in
assert_nearly
    "Put call parity"
    ((price Payoff.call) +. (discount strike))
    ((price Payoff.put) +. spot);
assert_nearly
    ~tolerance:0.01
    "Digital put/digital call parity"
    ((price Payoff.digital_put) +. (price Payoff.digital_call))
    (discount 1.0)
;;

Here's the output:

Test Put call parity - a: 1.707748 b: 1.708199
Test Digital put/digital call parity - a: 0.982665 b: 0.986344

Trebles all round! The prices are ok. It's also interesting that ocaml is actually even faster than I thought. I was running bytecode and if you use ocamlopt to create native code my examples run about five times faster.

permalink Updated: 2007-06-13

Productivity in unix

Loving the Unix user experience comes from realising how you make it work the way you want

People who I work with are often bemused by the perculiar setup I have on my machine, which is the product of constant evolution over the last five years or so. My setup is designed to work well for me, not to look pretty or provide distraction. As such, it can be thought of as sort of the opposite of desktop environments like Gnome. The other thing about my setup is that it is designed first and foremost to be fast and as such programs which are slow to start, slow to run or consume a lot of memory are completely out. For this reason, over the years I have ditched the emacs family of editors, fancy window managers and silly meta-environments like Gnome. People who use vim tend to look productive and slightly wired like this:

Bram Moolenaar, the author of Vim looking like he's  just had a double espresso

Whereas emacs users have been known to look like this:

Richard Stallman, dressed as "Saint Ignucius"

I think that proves my point nicely.

I also use a text-mode web browser where I can because it is quick to start up and makes it quick to find information. Time-wasting web browsing of sites entirely devoid of real information I still do with a graphical browser, however. I have found that as computers that I use have become faster and more powerful, my desire for speed has increased so I am even more obsessed with small, fast apps than I used to be. Goals of the setup

These are pretty simple:

Apps I use

The shell and the unix philosophy

One of the key strengths of unix systems is the philosophy of small tools which can be combined in flexible ways. Writing your own scripts, functions and aliases to customise your user experience not only makes your life better and more productive, its also a great way to learn about unix.

This is the antithesis of the mindset that brings about Integrated Development Environments. The great thing about the unix shell as an integrated environment is that it is infinite in potential given that you can always extend the facilities you have by making other tools using scripting or by getting them from the web. The shell itself is extensible in that you can add little scripts or write functions to add features or do things your way.

For example, the cd builtin in ksh has this tremendous feature for dealing with long paths. Say I'm in /foo/bar/nerf/2003-04-26/conf/frazzle and I realise I need to be in /foo/baz/nerf/2003-04-26/conf/frazzle, all I do is type cd bar baz . That's pretty useful.

I also often find myself typing vi /the/path/to/a/file, then wishing I could do cd !$ to get to the directory to check the file in or whatever. So I built a cd function that changes to the parent directory if the target is a file. When a colleague of mine explained the ksh cd behaviour, I added his ksh-like cd function to mine to get this:

cd()
{
    if [ -z "$1" ] ; then
        builtin cd
    else
        if [ -n "$2" ]; then
            TRY="${PWD/$1/$2}"
        else
            TRY="$1"
        fi

        if [ -f "${TRY}" ]; then
            builtin cd "${TRY:h}"
        else
            builtin cd "${TRY}"
        fi
    fi
}

Notice that this includes a few zsh-isms which you'll need to change if you want to use this in bash. ${TRY:h} is the same as $( dirname "${TRY}" ) but I get to skip the backticks and save a fork. I think you can do this in bash using a string substitution something like "${TRY##*/}". If you are using a csh-like shell I pity you and this cd mechanism will not be enough to save you from your folly anyway.

Shell history completion and searching

One of the easiest ways to improve your productivity on unix systems is to become acquainted with your shell's history completion and search facilities. All of these described here are common to zsh and bash, although zsh has some additional features that bash does not.

Firstly, if you ever find yourself tapping the up arrow to scroll through your command history, you are living in a state of sin- either type history 100 or something similar to actually get a history list or use Ctrl-R to do a recursive incremental search through your shell history. Secondly, learn and use some basic history completions:

There are lots more possible history completions but I use those three ten times for any one of them. Also useful are:

You can combine these with the first three constructs by adding a colon, so:

permalink Updated: 2007-06-13

Using Ocaml in Practice

Starting to learn how to make ocaml programs that are not just toys, and extending the derivatives pricer further

My experience learning ocaml has been pretty enjoyable so far. A friend told me about Jason Hickey's pdf book Introduction to the Objective Caml Programming Language which is commendably brief and has really helped as I go on to try ocaml further. The second thing that has helped is that I have discovered rlwrap, so the toploop is no longer such an unfriendly place to be.

Furthermore, I have split the pricer that I began developing yesterday into a number of files and build them into a single executable using ocamlc. First, gaussian.ml, containing the random number functions:

open Random;;

(* initialize the random number generator *)
Random.self_init;;

(* get a random gaussian using a Box-Muller transform, described
 * here http://en.wikipedia.org/wiki/Box-Muller_transform *)
let rec get_one_gaussian_by_box_muller () =
    (* Generate two uniform numbers from -1 to 1 *)
    let x = Random.float 2.0 -. 1.0 in
    let y = Random.float 2.0 -. 1.0 in
    let s = x*.x +. y*.y in
    if s > 1.0 then get_one_gaussian_by_box_muller ()
    else x *. sqrt (-2.0 *. (log s) /. s)
    ;;

(* get a gaussian through oversampling and subtraction *)
let get_one_gaussian_by_summation () =
    let rec add_one limit count so_far =
        if count==limit then so_far
        else add_one limit (count+1) (so_far +. (Random.float 1.0)) in
    (add_one 12 0 0.0) -. 6.0
    ;;

let get_one_gaussian = get_one_gaussian_by_box_muller

I added the summation method because when I first tried the pricer on real data the numbers were hopeless (now they are just somewhat out of line with market observables), and I suspected a bug in my random numbers. I was correct, I did have incorrect random numbers.

Then I have payoff.ml, containing my payoff functions. I have added a few more simple payoffs, and moved to named function arguments:

(** a vanilla option pays off the difference between the spot price
** and the strike, or expires worthless *)
let call ~strike ~spot = max (spot -. strike) 0.0;;
let put ~strike ~spot = max (strike -. spot) 0.0;;

let digital payoff = if payoff> 0.0 then 1.0 else 0.0;;
let digital_call ~strike ~spot = digital (call ~strike:strike ~spot:spot);;
let digital_put ~strike ~spot = digital (put ~strike:strike ~spot:spot);;

(** A double digital pays 1 if spot is between two barriers, zero
** otherwise *)
let double_digital ~low ~high ~spot =
assert (low < high);
if (low <= spot && spot <= high) then 1.0
else 0.0;;

mc1c.ml contains the actual Monte Carlo simulator, and it is unchanged except to use named function arguments, and to qualify the name of the get_one_gaussian function, which is now in a seperate file:

(* Price an option with a flexible payoff using Monte Carlo. *)
let sim ~payoff ~expiry ~spot ~vol ~r ~num_paths =
    let variance = vol *. vol *. expiry in
    let root_variance = sqrt variance in
    let ito_correction = -0.5 *. variance in
    let moved_spot = spot *. exp (r *. expiry +. ito_correction) in
    let rec do_path i running_sum =
        if i < num_paths then begin
            let this_gaussian = Gaussian.get_one_gaussian () in
            let this_spot = moved_spot *. (exp (root_variance *. this_gaussian)) in
            let this_payoff = payoff ~spot:this_spot in
            do_path (i+1) (running_sum +. this_payoff)
        end
        else (running_sum /. (float_of_int num_paths)) *. (exp (-1.0 *. r *. expiry))
    in
    do_path 0 0.0
    ;;

Finally I have my test file which runs the test cases. Now that I use named function args I can partially-apply function args in any order, so I make a test harness that sets up a particular marketdata scenario and runs the pricer:

let print_mc ?(num_paths=100000) label payoff =
    let mc payoff =
        Mc1c.sim
            ~payoff:payoff
            ~expiry:0.2
            ~spot:161.3
            ~vol:0.35
            ~r:0.045
            ~num_paths:num_paths
         in
    Printf.printf "%s: %f\n" label (mc payoff)
;;

print_mc "call" (Payoff.call ~strike:160.0);;
print_mc "digital call" (Payoff.digital_call ~strike:160.0);;
print_mc "put" (Payoff.put ~strike:170.0);;
print_mc "digital put" (Payoff.digital_put ~strike:170.0);;
print_mc "double digital" (Payoff.double_digital ~low:160.0 ~high:170.0) ~num_paths:250000;;

(* price one option, test the payoff against a target price and
 * print the result  *)
let test_mc ?(num_paths=1000) ?(expiry=1.0) ?(r=0.0) label payoff target =
    let mc payoff =
        Mc1c.sim
            ~payoff:payoff
            ~expiry:expiry
            ~spot:161.3
            ~vol:0.35
            ~r:r
            ~num_paths:num_paths
         in
    let price = mc payoff in
    let tolerance = 0.00001 in
    Printf.printf "Test %s - price: %f target: %f\n" label price target;
    assert( abs_float(price-.target) &lt; tolerance )
;;

test_mc "Spot to one" (fun ~spot-&gt;1.0) 1.0;;
test_mc "Spot to zero" (fun ~spot-&gt;0.0) 0.0;;
let r = 0.05 in
let npv =  exp(-1.0 *. r) in
    test_mc "Spot to one with rates" (fun ~spot-&gt;1.0) npv ~r:r;;

(* vim: set sw=4 ts=4 expandtab: *)

This is pretty cool. As you can see, I made the number of mc paths an optional parameter. I have added a few assertions that trivial payoff functions price correctly and that discounting works.

Now I build an object from each ml file using ocamlc and then compile them all to a built object at the end. You need to be careful to do them all in the correct order where you have functions defined in one file, called in another, as we have here.

permalink Updated: 2007-06-13

Derivatives pricing in ocaml Part 2

Extending the basic mc pricer to handle different payoffs, we see how partial function application works

This is the third article in a series on using functional programming for financial applications which started here.

So given our first Monte Carlo simulator, and the 7payoff functions that we started with, it's easy to see how we extend the pricer to handle Joshi's first question at the end of chapter 1 (to price puts):

(* mc1b.ml - A rudimentary option pricer to answer the exercises at the
* end of chapter 1 in Joshi.
*
* Written by Sean Hunter <sean@uncarved.com>
*
* This is demonstration code only.  You are free to use it under the
* terms of the Creative Commons Attribution 2.5 license, but don't
* expect it to accurately price real options.
*)
open Random;;
open Printf;;

(* initialize the random number generator *)
Random.self_init;;

(* get a random gaussian using a Box-Muller transform, described
 * here http://en.wikipedia.org/wiki/Box-Muller_transform *)
let rec get_one_gaussian_by_box_muller () =
    (* Generate two uniform numbers from -1 to 1 *)
    let x = Random.float 2.0 -. 1.0 in
    let y = Random.float 2.0 -. 1.0 in
    let s = x*.x +. y*.y in
    if s > 1.0 then get_one_gaussian_by_box_muller ()
    else x *. sqrt (-2.0 *. (log s) /. s)
    ;;

 (* a vanilla option pays off the difference between the spot price
 * and the strike, or expires worthless *)
let put_payoff strike spot =
    max ( strike -. spot ) 0.0;;

let call_payoff strike spot =
    max (spot -. strike ) 0.0;;

(* Price an option with a flexible payoff using Monte Carlo. *)
let simple_monte_carlo_1a payoff expiry strike spot vol r num_paths =
    let variance = vol *. vol *. expiry in
    let root_variance = sqrt variance in
    let ito_correction = -0.5 *. variance in
    let moved_spot = spot *. exp (r *. expiry +. ito_correction) in
    let rec do_path i running_sum =
        if i < num_paths then begin
            let this_gaussian = get_one_gaussian_by_box_muller () in
            let this_spot = moved_spot *. (exp (root_variance *. this_gaussian)) in
            let this_payoff = payoff strike this_spot in
            do_path (i+1) (running_sum +. this_payoff)
        end
        else (running_sum /. (float_of_int num_paths)) *. (exp (-1.0 *. r *. expiry))
    in
    do_path 0 0.0
    ;;

(* price one put and one call option struck at the money *)
printf "%f\n" (simple_monte_carlo_1a call_payoff 0.025 195.5 195.5 0.20 0.045 100000);;
printf "%f\n" (simple_monte_carlo_1a put_payoff 0.025 195.5 195.5 0.20 0.045 100000);;

So we pass a payoff function into the Monte Carlo simulator and call that function for each path. This isn't quite general enough to handle double digitals though. A double digital is an option that pays 1 if the spot is between two barrier prices at expiry and zero otherwise. However, the payoff function here has to take the strike and the spot. This is where one of the great features of functional programming comes in: partial function application. In ocaml, if you call a function that takes 2 parameters, but give it only one, the return type is a function that takes one parameter. Amazingly partial function application is just there by default and there's no need for tedious binders like there are in the STL in C++ for example. This will explain:

% ocaml

Objective Caml version 3.09.3

# let myadd x y=x+y;;
val myadd : int -> int -> int = <fun>
# let add2=myadd 2;;
val add2 : int -> int = <fun>
# add2 5;;
- : int = 7
# add2 16;;
- : int = 18

So myadd adds two numbers, and by calling it with just one (a 2) we create a function that takes one argument and adds two to it. This is exactly what we need for our flexible payoff function. You can think of the partially applied function args as being all the things that are constant on the termsheet of the trade. Our payoff functions remain the same, except we add one for double digitals:

let double_digital_payoff low high spot =
    if (low <= spot && spot <= high) then 1.0
    else 0.0;;

...and we change the mc pricer to just call the payoff func with the spot on the current mc path. We will partially-apply any other arguments the payoff functions need when we invoke the pricer:

(* Price an option with a flexible payoff using Monte Carlo. *)
let simple_monte_carlo_1b payoff expiry spot vol r num_paths =
    let variance = vol *. vol *. expiry in
    let root_variance = sqrt variance in
    let ito_correction = -0.5 *. variance in
    let moved_spot = spot *. exp (r *. expiry +. ito_correction) in
    let rec do_path i running_sum =
        if i &lt; num_paths then begin
            let this_gaussian = get_one_gaussian_by_box_muller () in
            let this_spot = moved_spot *. (exp (root_variance *. this_gaussian)) in
            let this_payoff = payoff this_spot in
            do_path (i+1) (running_sum +. this_payoff)
        end
        else (running_sum /. (float_of_int num_paths)) *. (exp (-1.0 *. r *. expiry))
    in
    do_path 0 0.0
    ;;

Now see how we call this pricer. It's simplicity itself:

printf "%f\n" (simple_monte_carlo_1b (call_payoff 160.0) 0.2 161.3 0.35 0.045 250000);;
printf "%f\n" (simple_monte_carlo_1b (put_payoff 170.0) 0.2 161.3 0.31 0.045 250000);;
printf "%f\n" (simple_monte_carlo_1b (double_digital_payoff 160.0 170.0) 0.2 161.3 0.29 0.045 250000);;

I find this immensely pleasing. I have hardly started learning the language, and yet generalising this code was simplicity itself due to features in the language. Partial function application is available in imperative languages (C++ notably uses it to provide predicates and adaptable functions in the STL), but there is a lot of nasty additional syntax. The way it works so seemlessly in ocaml is terrific.

Its worth coming back to earth a little bit with the observation that the prices that I am seeing from this thing right now don't match those I can observe in the market so I am sure there is some debugging yet to do. I would also like to make the code into a few modules, but I am not sure how you do that in ocaml yet.

permalink Updated: 2007-06-06

Derivatives pricing in ocaml Part 1

Following on from the 'Learning Functional Programming' article, we write our first call pricer

So from the first article we move on to write the first Monte Carlo pricer. This is a very close translation of Joshi's implementation in part 1.3 (listings 1.1 to 1.3) except that I don't read the values in from stdin, I just hardcode them.

Writing this was fun, but I am extremely short of sleep so it was actually trickier than I expected. It was made harder by the cryptic error messages that ocaml gives you when things go wrong. These were almost always This expression has type int but is here used with type float which means you are passing int arguments to a float operator or function, rather than the other way around.

% ocaml
Objective Caml version 3.09.3

# 1 /. 2;;
This expression has type int but is here used with type float
# 1.0 / 2.0;;
This expression has type float but is here used with type int

I hope that makes it a little clearer. I find it confusing anyway. It's like the error message has been written by someone for whom English is a second language. Perhaps that's the case.

Here's the listing without further ado. On my computer it does a million paths in 13.2 seconds which is pretty decent I think. The ocamlc compiler is fast as lightning too, compiling this to an executable in 0.029 secs on my machine. If I go on learning ocaml I won't be able to go and get coffee while waiting for my code to compile.

(* mc1.ml - A rudimentary European call option pricer designed to mimic
* listings 1.1 to 1.3 in Joshi.
*
* Written by Sean Hunter <sean@uncarved.com>
*
* This is demonstration code only.  You are free to use it under the
* Creative Commons Attribution 2.5 license, but don't expect it to
* accurately price real options.
*)
open Random;;
open Printf;;

(* initialize the random number generator *)
Random.self_init;;

(* get a random gaussian using a Box-Muller transform, described
 * here http://en.wikipedia.org/wiki/Box-Muller_transform *)
let rec get_one_gaussian_by_box_muller () =
    (* Generate two uniform numbers from -1 to 1 *)
    let x = Random.float 2.0 -. 1.0 in
    let y = Random.float 2.0 -. 1.0 in
    let s = x*.x +. y*.y in
    if s > 1.0 then get_one_gaussian_by_box_muller ()
    else x *. sqrt (-2.0 *. (log s) /. s)
    ;;

(* Price a European call using Monte Carlo.
 *
 * We pre-compute as much as possible before the simulation, then the
 * actual mc paths are done as a nested recursive function.  This seems
 * more idiomatically functional even though ocaml has for loops.
 *
 * It's also good because I couldn't get the other way to work.*)
let simple_monte_carlo1 expiry strike spot vol r num_paths =
    let variance = vol *. vol *. expiry in
    let root_variance = sqrt variance in
    let ito_correction = -0.5 *. variance in
    let moved_spot = spot *. exp (r *. expiry +. ito_correction) in
    let rec do_path i running_sum =
        if i < num_paths then begin
            let this_gaussian = get_one_gaussian_by_box_muller () in
            let this_spot = moved_spot *. (exp (root_variance *. this_gaussian)) in
            let this_payoff = max (this_spot -. strike) 0.0 in
            do_path (i+1) (running_sum +. this_payoff)
        end
        else (running_sum /. (float_of_int num_paths)) *. (exp (-1.0 *. r *. expiry))
    in
    do_path 0 0.0
    ;;

(* price a three-month call option near to the money.
 * we are using 35% vol and 4.5% interest rates*)
printf "%f\n" (simple_monte_carlo1 0.2 160.0 161.3 0.35 0.045 250000);;

This, however, is crying out for more flexibility. The first thing to do is to see if we can get it to price other simple payoffs. That's the subject of the next article.

permalink Updated: 2007-06-06

Learning Functional Programming

I decided to teach myself ocaml by writing a derivatives pricer

I have been interested in functional programming for some time, but finally decided to bite the bullet and learn properly, and that (for me anyway) means writing some code to accomplish a practical task. My idea is to reimplement most of the C++ code in Mark Joshi's excellent book C++ Design Patterns and Derivatives Pricing, but in OCaml. Now I'm a newcomer to functional programming and to ocaml, so what I write won't be pretty or idiomatic, especially at first. To begin with I'm learning from the main tutorial. If I find and use another I'll post that too.

I will try to explain some of the financial stuff that's going on as I do so, but for the full lowdown on why derivatives price the way they do, you need to learn some financial maths. You could do a lot worse that picking up Joshi's other book, The Concepts and Practice of Mathematical Finance. A lot of people get Hull, but I prefer Joshi for a really practical introduction with great explanations of concepts. When I was at Goldman, a lot of people were very sarcastic about Hull as a source. I don't really have a credible opinion about that.

So without further ado, here is my first ocaml program, which defines payoffs for vanilla European put and call options. In fact Joshi starts off straight away with a Monte Carlo pricer, but my copy is downstairs so I'm straying off-piste here. It's my intention to follow Joshi step by step, and write up each one here as I go.

open Printf

(* a vanilla option pays off the difference between the spot price
 * and the strike, or expires worthless *)
let put_payoff strike spot=
        max ( strike -. spot ) 0.0;;

let call_payoff strike spot=
        max (spot -. strike ) 0.0;;

let print_payoff payoff strike spot=
        let outcome=payoff strike spot in
        printf "%f\n" outcome;;

print_payoff call_payoff 195.0 190.0;;
print_payoff call_payoff 195.0 200.0;;
print_payoff put_payoff 195.0 190.0;;
print_payoff put_payoff 190.0 195.0;;

Now I'm running and writing this on fedora Linux, and my ocaml is 3.09.3. When I run this I see:

% ocaml tmp/payoff.ml
5.000000
0.000000
0.000000
5.000000

Which is what I would expect. Now this is very cheesy at present, but it's a start and we'll improve it in the next article. It's worth a couple of observations about this because already this demonstrates a few things that strike me as being interesting about ocaml. For one thing, there isn't any default type promotion or operator overloading, so we need to explicitly qualify our constants with .0 to get them to be floats. Secondly, we need to use -. to subtract them. The max function can operate on any type so it works with floats or ints.

permalink Updated: 2007-06-05

My life (and mail) in subversion

Moving my mail into subversion gives me easy backups and all of my homedir in one place. Here's how I did it.

Inspired by 'Keeping your life in subversion' For some time now I have been slowly migrating all of my personal home directories into the subversion version control system and merging back so that I end up with a spartan configuration and everything gets backed up. Fool that I am, however, I want to take things one step further and also keep all of my mail in version control too.

The benefits of keeping mail in a version control system are marginal. Mails are not typically edited after they are received and sent, so a lot of people use a system such as unison and find it more than adequate. However, since everything else in my homedir is in subversion I wanted the mail in there too. Happily for my purposes this is not too hard.

The first thing is that I wanted a simple script that I can run after I have read or filed my mail that indexes the mail and syncs it to subversion. For indexing, I use mairix, which allows me to create virtual folders and cooperates well with mutt, my favourite email client. One possible solution to virtual folders is here. I decided to do things a little differently, and here's the shell script that does it all for me:

#!/bin/sh

set -e

TODAY=$( date -Iseconds )
cd "${HOME}/mail"
echo "Updating mail index..."
mairix -p       #add '-v' for more verbosity
echo "Emptying search folder"
rm -rf "${HOME}/mail/mfolder/"{cur,new,tmp}/*
echo "svn syncup"
svn status | awk '/^\?/{print $2}' | xargs -r svn add
svn status | awk '/^!/{print $2}' | xargs -r svn rm
svn ci -m "Mail sync for ${TODAY}"

Here's my .mairixrc:

base=/home/sean/mail
maildir=incoming...:sentmail...
mfolder=mfolder
database=/home/sean/mail/mairix_database

All that's needed to get it working in Mutt is to add ~/mail/mfolder to the mailboxes my .muttrc, and add

macro index \e\/"<shell-escape>mairix " "Run a mairix search"

at the same time. Now, in mutt to do a search I do Escape-/ and type a regex. All of the hits are in ~/mail/mfolder and I change to that like a regular mailbox.

...and to keep it all up to date I just run my mailsync script every now and again.

permalink Updated: 2006-05-18

Shell Command History and Searching

Time spent learning how the unix shell works is never wasted

One of the easiest ways to improve your productivity on unix systems is to become acquainted with your shell's history completion and search facilities. All of these described here are common to zsh and bash, although zsh has some additional features that bash does not.

Firstly, if you ever find yourself tapping the up arrow to scroll through your command history, you are living in a state of sin- either type history 100 or something similar to actually get a history list or use Ctrl-R to do a recursive incremental search through your shell history. Secondly, learn and use some basic history completions:

There are lots more possible history completions but I use those three ten times for any one of them. Also useful are:

You can combine these with the first three constructs by adding a colon, so:

permalink Updated: 2006-05-17

RSS 1.1 Feed now available

RDF + Syndication = RSS 1

Now that I am getting the hang of rdf, the next logical step was to link metadata with content and provide an rss 1.1 feed. Rss 1.1 is a fantastic specification in many ways because it allows machine-readable content (it is an application of rdf and xml), and has namespaces that allow content syndication, so you can embed metadata in with articles and include the foaf file and the pgp key of the authors if that's the sort of thing you like. In my RSS 1.1 feed I include the full content of the site with metadata and license information for the full feed and metadata for individual items as appropriate. To see the full end result, click on the rss1.1 link at the bottom of the index page.

Note that rss 2 is not a more recent spec that rss 1.1 (or rss 1)- they are seperate specs for doing a similar thing (content syndication), but the specification forked. Rss 2 is considered simpler by it's adherents because it is an application of xml but not of rdf. That, however, is also it's weakness because it's not as extensible and is not seemlessly part of the semantic web. Certainly for any programmer, rss 1 is no more complex than rss 2 to support and implement and is a great deal richer and more extensible.

For what it's worth, my cheetah template looks like this:

<pre><code><?xml version="1.0" encoding="utf-8"?>
<Channel xmlns="http://purl.org/net/rss1.1#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:p="http://purl.org/net/rss1.1/payload#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:cc="http://web.resource.org/cc/"
    rdf:about="http://www.uncarved.com/index.py/rss1.1.xml?limit=1">
    <rdf:Description rdf:about="http://www.uncarved.com/index.py/rss1.1.xml?limit=1">
        <dc:creator>Sean Hunter</dc:creator>
        <dc:title>The Uncarved Block</dc:title>
        <dc:description>A collection of articles and software</dc:description>
        <dc:date>2006-05-11T13:32:57Z</dc:date>
        <dc:type>Text</dc:type>
        <dc:format>text/html</dc:format>
        <dc:identifier>http://www.uncarved.com/index.py/rss1.1.xml?limit=1</dc:identifier>
        <dc:language>en-GB</dc:language>
    </rdf:Description>
    <cc:Work rdf:about="http://www.uncarved.com/index.py/rss1.1.xml?limit=1">
        <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.5/" />
    </cc:Work>
    <cc:License rdf:about="http://creativecommons.org/licenses/by/2.5/">
        <cc:permits rdf:resource="http://web.resource.org/cc/Reproduction"/>
        <cc:permits rdf:resource="http://web.resource.org/cc/Distribution"/>
        <cc:requires rdf:resource="http://web.resource.org/cc/Notice"/>
        <cc:requires rdf:resource="http://web.resource.org/cc/Attribution"/>
        <cc:permits rdf:resource="http://web.resource.org/cc/DerivativeWorks"/>
    </cc:License>
    <title>The Uncarved Block</title>
    <description xml:lang="en-GB">Software, unix tips and sundry other things</description>
    <link>http://www.uncarved.com/index.py/</link>
    <items rdf:parseType="daml:collection">
        <item rdf:about="/articles/rss1">
            <title>RSS 1.1 Feed now available</title>
            <link>/articles/rss1</link>
            <description xml:lang="en-GB">RDF + Syndication = RSS 1</description>
            <dc:date>2006-05-11T13:08:47Z</dc:date>
            <p:payload rdf:parseType="Literal">
<pre><code>#filter Filter
<?xml version="1.0" encoding="utf-8"?>
<Channel xmlns="http://purl.org/net/rss1.1#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:p="http://purl.org/net/rss1.1/payload#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:cc="http://web.resource.org/cc/"
    rdf:about="$self_link">
    <rdf:Description rdf:about="$self_link">
        <dc:creator>Sean Hunter</dc:creator>
        #if $current_tagname
        <dc:title>The Uncarved Block/$current_tagname</dc:title>
        <dc:subject>$current_tagname</dc:subject>
        #else
        <dc:title>The Uncarved Block</dc:title>
        #end if
        <dc:description>A collection of articles and software</dc:description>
        <dc:date>$most_recent_w3c</dc:date>
        <dc:type>Text</dc:type>
        <dc:format>text/html</dc:format>
        <dc:identifier>$self_link</dc:identifier>
        <dc:language>en-GB</dc:language>
    </rdf:Description>
    <cc:Work rdf:about="$self_link">
        <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.5/" />
    </cc:Work>
    <cc:License rdf:about="http://creativecommons.org/licenses/by/2.5/">
        <cc:permits rdf:resource="http://web.resource.org/cc/Reproduction"/>
        <cc:permits rdf:resource="http://web.resource.org/cc/Distribution"/>
        <cc:requires rdf:resource="http://web.resource.org/cc/Notice"/>
        <cc:requires rdf:resource="http://web.resource.org/cc/Attribution"/>
        <cc:permits rdf:resource="http://web.resource.org/cc/DerivativeWorks"/>
    </cc:License>
    #if $current_tagname
    <title>The Uncarved Block/$current_tagname</title>
    #else
    <title>The Uncarved Block</title>
    #end if
    <description xml:lang="en-GB">Software, unix tips and sundry other things</description>
    <link>$home_link</link>
    <items rdf:parseType="daml:collection">
        #for $article in $articles
        <item rdf:about="http://www.uncarved.com/$article.name">
            <title>$article.title</title>
            <link>http://www.uncarved.com/$article.name</link>
            #if $article.precis
            <description xml:lang="en-GB">$article.precis</description>
            #end if
            <dc:date>$article.updated</dc:date>
            <p:payload rdf:parseType="Literal">
                $article.body
            </p:payload>
        </item>
        #end for
    </items>
</Channel>

The feed no longer available and since the sematic web seems to have died a death I don't really see the point in reviving it. No-one subscribed to rss 1.1.

I hope you enjoy creating your own rss 1.1 feeds. You can find out more about all the site syndication variants on the wikipedia.

permalink Updated: 2006-05-11

Getting Things Done in Text with notation3

I want a simple means of tracking and grouping tasks that works in text form and is easily parseable. And I want to learn notation3 too...

Inspired by this article on lifehacker I decided to make my own "getting things done" toolset. It would ideally need to be just as flexible from the commandline as the lifehack "todo.txt" file but have machine readablility too. I have decided to use notation3 as the basic format. n3 is useful because it is a machine-parseable format that is easily converted into xml, but it's not as tedious to write as xml can often be. So far it looks as if I will be able to make a "one line per item" file that meets all of the "getting things done in txt" requirements but will also be able to be used online and via the semantic web toolset. I will be updating this article as I go along, not least because n3 is entirely new to me and I am bound to change things as I go along.

So far, the document looks something like this:

@prefix : <#> .
:tidy_garden :contexts "@home" ; :next_action "Mow lawn" ; :all_actions "Mow law
n" , "Weed borders" , "Turn compost" , "Prune" , "Tidy front garden" ; :priority
2 .
:sort_finances :contexts "@home" , "@work" ; :next_action "Consolidate pensions"
; :all_actions "Consolidate pensions" , "ebay junk", "make budget" ; :priority
1 .

As you can see, for each item, I can list the contexts in which it can take place, the next action and all the actions I have thought of for that item so far. For the next actions at the moment I am just using string literals, but I will soon change those into ":" labels so I can join the tasks together in a graph and have some tasks dependant on others. In n3 you can make rules and draw inferences so I could have it figure out for me what I should be doing next. Since the output is rdf I could make a full ontology for tasks etc and then as well as having a tool for humans, I would have a machine-readable format for this stuff that would fit into the semantic web. I may do that.

To get going with notation3, you need a computer with python and you need to install rdflib and cwm. It's well worth the effort to do this.

permalink Updated: 2006-05-10

Shell Functions that are also executables

Sometimes its handy to have shell functions that can be sourced, but can also be executables in their own right. Here's how.

Sometimes it's useful to have shell functions that can be used both from within my current shell and sourced inside scripts which I write. To achieve this is fairly easy if you arrange things according to a simple convention. What I do is to make a directory called "lib" in my homedir, and put each function into a seperate file in there that has the same name as the function.

For example, I wrote a small function to print the name of the file in a given directory that has most recently been modified. To make this more generic, I got it to print the n-th from last, rather than specifically the last. So I make a small file called ~/lib/printlastfile containing the following:

# vim: set syn=zsh:

#if no args are supplied, print the last filename from $TMP
#if one arg, print the n-th from last filename from $TMP
#if two args, the second arg is the name of a dir to use instead of TMP
printlastfile()
{
    RANK=${1:-1}
    #expand $TMPDIR and append a trailing slash if it doesn't have one
    DIR=$( echo ${2:-${TMPDIR:-'/tmp/'}} | sed 's,[^/]$,&/,' )

    ls -t "${DIR}"* | sed -n "${RANK}p;${RANK}q"
}

So any time I want that function I can just source that file. Now say I want an executable that runs printlastfile. I could make ~/bin/printlastfile as follows:

#!/bin/zsh
. ~/lib/printlastfile
printlastfile "${1}" "${2}"

... but that gets a bit tedious when you have a few of these things so I made one generic executable that for any function, will source the function and then run it. I call it libsquiggle because it turns things which are in squiggle/lib into executables and it looks like this:

#!/usr/bin/env zsh

MYNAME="${0:t}"
. "${HOME}/lib/${MYNAME}"

$MYNAME "$@"

So I make an executable called libsquiggle that looks like that and for any function in ~/lib that I want to work as an executable I make a symlink in ~/bin called the name of the function and pointing to ~/bin/libsquiggle. Thus:

ls -l =printlastfile

lrwxrwxrwx  1 sean sean 11 Oct  5  2005 /home/huntse/bin/printlastfile -> libsquiggle

permalink Updated: 2006-04-30

Building a better cd

Unix shell is very flexible and easy to customise. Here is how I made 'cd' better for me than the builtin one.

One of the key strengths of unix systems is the philosophy of small tools which can be combined in flexible ways. Writing your own scripts, functions and aliases to customise your user experience not only makes your life better and more productive, its also a great way to learn about unix.

This is the antithesis of the mindset that brings about Integrated Development Environments. The great thing about the unix shell as an integrated environment is that it is infinite in potential given that you can always extend the facilities you have by making other tools using scripting or by getting them from the web. The shell itself is extensible in that you can add little scripts or write functions to add features or do things your way.

For example, the cd builtin in ksh has this tremendous feature for dealing with long paths. Say I'm in /foo/bar/nerf/2003-04-26/conf/frazzle and I realise I need to be in /foo/baz/nerf/2003-04-26/conf/frazzle, all I do is type cd bar baz . That's pretty useful.

I also often find myself typing vi /the/path/to/a/file, then wishing I could do cd !$ to get to the directory to check the file in or whatever. So I built a cd function that changes to the parent directory if the target is a file. When a colleague of mine explained the ksh cd behaviour, I added his ksh-like cd function to mine to get this:

cd()
{
    if [ -z "$1" ] ; then
        builtin cd
    else
        if [ -n "$2" ]; then
            TRY="${PWD/$1/$2}"
        else
            TRY="$1"
        fi

        if [ -f "${TRY}" ]; then
            builtin cd "${TRY:h}"
        else
            builtin cd "${TRY}"
        fi
    fi
}

Notice that this includes a few zsh-isms which you'll need to change if you want to use this in bash. ${TRY:h} is the same as $( dirname "${TRY}" ) but I get to skip the backticks and save a fork. I think you can do this in bash using a string substitution something like "${TRY##*/}" but I haven't bothered to try. If you are using a csh-like shell I pity you and this cd mechanism will not be enough to save you from your folly anyway.

permalink Updated: 2006-04-21

Making the switch

Changing over to dvorak typing is tough, but worth it

Some time ago I decided to give the dvorak simplified keyboard layout a go. The major incentive for me to switch was that a colleague of mine introduced me to the brilliant typematrix 2030 keyboard with its innovative and striking "matrix" layout. It also features a hardware switch to configure the keyboard either as qwerty or dvorak. Even on unix-like operating systems where keyboard remapping is easy, this is a great feature because it means that even before login you can type dvorak.

In any case, the change was excruciatingly hard for the first few days as I have always been the sort of typist whose fingers roam erratically over the board and I was being forced to actually learn to stay on the home row. Nonetheless, as the typematrix folks say, if you stick at it and type nothing but dvorak for the first couple of weeks then it soon becomes very easy and switching back to qwerty if you need to use someone else's keyboard or whatever is not difficult at all after that.

I was a bit skeptical before, thinking that a lot of my typing was unix commands and program keywords and that as they are not normal english, the rearrangement of the letters may not be optimal for that use. That doesn't sem to be the case. Until I tried to switch over, I underestimated how much of daily typing was just english text (emails, program comments, documentation, notes to myself etc).

In spite of being an inveterate vim user, I have not found the loss of the vi home row arrow keys (hjkl) particularly hard to bear. It seems that I never used them much anyway, mostly doing searches or word movements rather than line movements. When I do need a line movement, I am happy to use the arrow keys. Certain keystrokes in certain applications, however ("ZZ" in vi to save and quit, for instance) I have obviously learned completely kinesthetically and it took my brain a couple of weeks to stop typing those incorrectly by mistake.

I would recommend switching to dvorak to anybody who types a lot (programmers etc) and is prepared to really give themselves a couple of weeks to do it. Since I switched, four of my colleagues at work have done the same, and they were all up to speed in around the same time as me or less. It seems to especially help if you are a bit of an ad-hoc typist with qwerty, rather than a typing maestro. That way you give yourself the additional benefit of learning to type properly at the same time.

Getting a typematrix keyboard may help because the change in layout is a trigger for my brain to switch into dvorak mode and when I feel a conventional keyboard I naturally think qwerty. I switched briefly on a conventional keyboard too, but since I use the typematrix at work and at home, I didn't do enough dvorak on a staggered keyboard for it to really stick for me.

permalink Updated: 2006-04-21

Being Productive in Text Mode

If you use a Unix system well, the command line can be the ultimate IDE.

People who I work with are often bemused by the perculiar setup I have on my machine, which is the product of constant evolution over the last five years or so. My setup is designed to work well for me, not to look pretty or provide distraction. As such, it can be thought of as sort of the opposite of desktop environments like Gnome. The other thing about my setup is that it is designed first and foremost to be fast and as such programs which are slow to start, slow to run or consume a lot of memory are completely out. For this reason, over the years I have ditched the emacs family of editors, fancy window managers and silly meta-environments like Gnome. People who use vim tend to look productive and slightly wired like this:

Bram Moolenaar, the author of Vim looking like he's  just had a double espresso

Whereas emacs users have been known to look like this:

Richard Stallman, dressed as "Saint Ignucius"

I think that proves my point nicely.

I also use a text-mode web browser where I can because it is quick to start up and makes it quick to find information. Time-wasting web browsing of sites entirely devoid of real information I still do with a graphical browser, however. I have found that as computers that I use have become faster and more powerful, my desire for speed has increased so I am even more obsessed with small, fast apps than I used to be.

The Goals of my setup are straightforward:

Apps I use

permalink Updated: 2006-04-21

Patterns in Perl

An article I wrote for the Perl Journal a long time ago

...about how to use iterators and some other simple design patterns in perl. The idea of iterator chains is still interesting to me, although I prefer python to perl for most tasks these days. Here it is anyway...

permalink Updated: 2006-04-21