Saturday, March 15, 2008

Brute force SAT solver in Haskell

Sometimes people get wrong impressions after a Theory of Computation course. A "hard" problem there is not necessarily hard to solve with a program, just hard to solve well in realistic circumstances. Take SAT, for example, the boolean satisfiability problem for propositional logic. It's a classic example of a NP-complete problem, whose naive solution -- generate all possible combinations of truth-values for each atomic proposition and then calculate the value for the whole formula -- is clearly exponential. But this naive solution is very easy to write. For example, in Haskell we get something like the following:

import Data.List ( nub )
import Data.Maybe ( fromJust )

data Formula = Var String | And Formula Formula |
Or Formula Formula | Not Formula

getVarsDup (Var v) = [v]
getVarsDup (And f1 f2) = (getVarsDup f1) ++ (getVarsDup f2)
getVarsDup (Or f1 f2) = (getVarsDup f1) ++ (getVarsDup f2)
getVarsDup (Not f) = getVarsDup f

genValues n = genAux $ [replicate n True]
where genAux (v:vs) = if not $ or $ v then v:vs
else genAux ((next v) : v : vs)
next (True : ts) = (False : ts)
next (False : ts) = (True : (next ts))

solveFormula (Var v) env = fromJust $ lookup v env
solveFormula (And f1 f2) env = (solveFormula f1 env) &&
(solveFormula f2 env)
solveFormula (Or f1 f2) env = (solveFormula f1 env) ||
(solveFormula f2 env)
solveFormula (Not f) env = not $ solveFormula f env

allSolutions f = map (solveFormula f) (map (zip vars) vals)
where vars = nub $ getVarsDup f
vals = genValues (length vars)

-- solve the SAT problem
sat f = or $ allSolutions f

-- pointfree!
sat2 = or . allSolutions

The idea is the same as creating a truth-table for the formula and then looking for a T in the possible values for it. Function genValues generates all possible values for the atomic propositions, and this is fed into solveFormula. It would be very easy to alter the code to output the complete truth-table; I leave it as an exercise to the reader. Another standard exercise in logic textbooks is to prove that if a formula contains n atomic propositions, the truth-table will have 2^n lines. Thus, generating all combinations is easily seen to be exponential in time.

But the fact that it is hard for the computer to solve doesn't mean that it's hard for someone to write a (naive) program to solve it. In practical applications, like circuit design, we may need to solve SAT for thousands of variables, and the naive algorithm would not work. For this we must use a more clever technique, like Binary Decision Diagrams. They're not that hard to code either; the book Expert F# includes an example of BDDs in F#.

Wednesday, March 5, 2008

Tumbalalaika

It's probably too late to be hip because of this now, but I have created a tumblelog. It's easier to post there, so it should be updated more often than this blog. I'll probably post a good number of links in common with anarchaia.