May 18, 2020

Trade-Offs in Type Safety

One of the stories people tell about Haskell is that you don't get runtime errors. Once it compiles, there's a good chance the program will run smoothly. However, runtime errors in the form of exceptions do exist in Haskell. Exceptions are a core part of GHC and, whether you like it or not, any code in IO could fail in unforeseen ways.

Some exceptions are avoidable by carrying more proofs at the type level. The natural conclusion, then, is to use Fancy types when writing Haskell.

While that might sound sensible, it's not the whole story. Fancy types have a cost. There are scenarios where exceptions are not as bad as one might think. Type safety is a spectrum – it's worth discussing the trade-offs.

Scotty / Servant

When writing a webserver, we want to parse certain fragments of the URL.

"/user/:user_id/post/:post_id"

This route is only matched if both user_id and post_id are present. With scotty, we could get both parameters with something like this:

Scotty.get "/user/:user_id/post/:post_id" $ do
  user <- Scotty.param "user_id"
  post <- Scotty.param "post_id"
  -- do stuff with user and post

Seems legit, right? The documentation, however, is fairly clear about what happens when the parameter isn't there:

Raises an exception [...] if parameter is not found.

This seems wrong. We know for sure the parameter is there, otherwise the route wouldn't even be matched. So why must we dirty our hands with exceptions and risk blowing everything up?

The reason is simple: we are encoding the route as a string, so there's no way of statically knowing which parameters we're dealing with.
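To make that concrete, here is a toy model of string-keyed parameters. It uses only base and is not scotty's actual implementation; the names Params, captured, lookupParam and param are made up for illustration.

```haskell
-- Hypothetical model: the parameters a matched route captured, by name.
type Params = [(String, String)]

-- What matching "/user/:user_id/post/:post_id" might capture.
captured :: Params
captured = [("user_id", "42"), ("post_id", "7")]

-- The name is an ordinary String, so the compiler cannot check it
-- against the route pattern. A wrong name is only noticed at runtime.
lookupParam :: String -> Params -> Either String String
lookupParam name ps =
  maybe (Left ("param not found: " ++ name)) Right (lookup name ps)

-- Variant that raises an exception when the parameter is missing.
param :: String -> Params -> String
param name = either error id . lookupParam name
```

Both param "user_id" captured and param "usr_id" captured type-check. Only the second one blows up, and only when it is evaluated.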

Now take servant. This problem is non-existent because we encode the route at the type level and our handler gets both parameters as arguments:

type Api =
  "user"
  :> Capture "user_id" Integer
  :> "post"
  :> Capture "post_id" Integer
  :> Get '[JSON] Post
  
getPost :: Integer -> Integer -> Handler Post
getPost user post = do
  -- do stuff with user and post

We're now leveraging the type system to prove to the compiler that our route definitely has two parameters. This is nice because we're guaranteed user and post will have a value. No more exceptions, yay!

PostgreSQL Simple / persistent

There are many libraries that deal with SQL in Haskell. The trade-off is apparent in this context as well.

With postgresql-simple, we get very few guarantees about our queries.

res <- PG.query conn "select title from post where post_id = ?" [post]

There are many things that could go wrong here and result in an exception. Our post value might not match the column's type. The result type we expect might be different from the actual data type stored in the database.
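The second failure mode can be sketched without a database. The following is a toy model using only base, not postgresql-simple's API: SqlValue, decodeText, decodeInt and titleColumn are hypothetical names, and the mismatch is reported with Either instead of an exception so the example stays pure.

```haskell
-- Hypothetical stand-in for a value coming back from the database.
data SqlValue
  = SqlInt Integer
  | SqlText String
  deriving (Eq, Show)

-- The decoder is picked by the type the caller expects, not by the
-- schema, so a mismatch is only discovered when a row is decoded.
decodeText :: SqlValue -> Either String String
decodeText (SqlText t) = Right t
decodeText other       = Left ("expected text, got " ++ show other)

decodeInt :: SqlValue -> Either String Integer
decodeInt (SqlInt n) = Right n
decodeInt other      = Left ("expected integer, got " ++ show other)

-- Pretend result of: select title from post where post_id = ?
titleColumn :: SqlValue
titleColumn = SqlText "Trade-Offs in Type Safety"
```

Asking for the title as text succeeds, while decodeInt titleColumn compiles just as happily and fails once it runs.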

These problems go away when using a more type-heavy library such as persistent.

-- query DSL is esqueleto
select $ from $ \(b, p) -> do
    where_ (b ^. BlogPostAuthorId ==. p ^. PersonId)
    orderBy [asc (b ^. BlogPostTitle)]
    return (b, p)

Now we're good. We can't mess up our inputs and we know for sure what we're getting back from the query. Again we see how giving more information to the compiler results in more guarantees.

With servant we introduced an Api type. In this case, we need to describe our schema with some Template Haskell, so that GHC can derive everything necessary to let the magic happen:

share [ mkPersist sqlSettings ] [persistLowerCase|
  Person
    name String
    age Int Maybe
    deriving Eq Show
  BlogPost
    title String
    authorId PersonId
    deriving Eq Show
|]

If you've never seen code like this before – yes, this is valid Haskell. The block between [persistLowerCase| and |] is a quasi-quotation: the persistLowerCase quasi-quoter parses it at compile time, and mkPersist, a Template Haskell function, generates more code from the result.

Type safety comes at a cost

Most Haskellers will claim that there's no point writing Haskell if you're not going for the most type-safe option.

I argue that the guarantees we get out of these examples aren't always worth it. In other words, I'm OK trading off some type safety for less complexity. I would still pick scotty and postgresql-simple because they don't require opting into advanced GHC features, while not degrading the quality of the software I write.

This is counter-intuitive, so let me elaborate on why.

Let's say I make a typo in the Scotty.param call. An exception is thrown and my program blows up. This error wasn't caught at compile time. So what? I should be ashamed of myself and go back to dynamically typed languages if I love exceptions so much, you might say.

Well, hear me out now.

Would you avoid writing a test just because you went for the most type-safe option? I wouldn't. Any test, even the most basic one, that exercises the code path for the handler with the typo in it would reveal the error.

I want to stress this: you don't even have to assert anything, simply poking the controller would manifest the error.
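Here is a minimal sketch of such a test, using only base. It is a toy, not a scotty application: Params, param, getPost and smokeTest are hypothetical names, and a real project would hit the route through its test framework of choice instead.

```haskell
import Control.Exception (SomeException, evaluate, try)
import Data.Maybe (fromMaybe)

-- Hypothetical model of captured route parameters.
type Params = [(String, String)]

-- Raises an exception if the parameter is not found.
param :: String -> Params -> String
param name ps =
  fromMaybe (error ("param not found: " ++ name)) (lookup name ps)

-- Handler with a typo: the route captures "user_id", the code asks for "usr_id".
getPost :: Params -> IO String
getPost ps = do
  let user = param "usr_id" ps
      post = param "post_id" ps
  pure ("user " ++ user ++ ", post " ++ post)

-- Smoke test: run the handler and force its result. Nothing is asserted
-- about the response; we only report whether the handler threw.
-- (The toy handler is lazy, so the response has to be forced explicitly.)
smokeTest :: IO Bool
smokeTest = do
  result <- try (getPost [("user_id", "42"), ("post_id", "7")] >>= evaluate . length)
  case result of
    Left err -> False <$ putStrLn ("handler threw: " ++ show (err :: SomeException))
    Right _  -> pure True
```

Running smokeTest is enough to surface the typo, even though nothing checks what the handler returns.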

The same argument could be made for the SQL example. Sure, my code might have serious errors and still compile, but would you not write any tests just because you went for a safer library? A very high-level test that executes the query along the way would reveal the error.

I'm not saying every route or query should be tested individually; that'd be a no-go. I'm also not an advocate for 100% test coverage. But type safety is a spectrum and I deliberately make a trade-off. I choose to pick the simpler option and cover my bases with (basic and high-level) testing, which I would have to do anyway.

Even though I'm not leveraging Fancy types, I still get a lot out of writing Haskell. My software is correct where it matters – domain modeling and business logic. Complexity tends to increase once you bring in DataKinds and other extensions that are required for libraries like servant and persistent to work. I often don't see that complexity as necessary and I'm fine trading it for errors that are easily caught by tests.

Sticking to the Haskell98 type system is a core tenet of what I call Simple Haskell. I covered more ideas about the topic in the post My thoughts on Haskell in 2020.
