'Why is this Haskell code so slow compared to other languages?
As an exercise in learning Haskell I've written an interpreter for the CESIL language (an old, very basic, educational low level language). It works, but compared to implementations I've written in other compiled languages it's very slow, even when compiled with -O2
, and only a little faster than Python. Timing a large CESIL program with time
gives:
Haskell:
real 0m0.346s
user 0m0.199s
sys 0m0.016s
Go:
real 0m0.243s
user 0m0.003s
sys 0m0.007s
Python:
real 0m0.414s
user 0m0.387s
sys 0m0.004s
Here's the main execution part of the code where most of the time is spent, and which I would like to speed up. I'm new to Haskell, and I'm sure there are better, cleaner ways of writing some of this, but my main concern at the moment is the speed. Hopefully I've included enough to make sense:
-- Define the computer state
data Computer =
Computer
{ program :: Array Integer Instruction
, dataVals :: [Integer]
, ram :: Map.Map String Integer
, acc :: Integer
, pc :: Integer
, steps :: Integer
, halted :: Bool
}
-- Initialise the computer and run the program
runProgram ::
Array Integer Instruction -> [Integer] -> Params -> ExceptT String IO ()
runProgram pr dv pars = do
let comp =
Computer
{ program = pr
, dataVals = dv
, ram = Map.empty
, acc = 0
, pc = 0
, steps = 0
, halted = False
}
comp' <- execute comp pars
if countSteps pars
then liftIO . putStrLn $ "Steps executed: " ++ (show $ steps comp')
else return ()
-- Main execution "loop"
execute :: Computer -> Params -> ExceptT String IO Computer
execute comp pars = do
liftEither $ checkPC comp
(comp', output) <- liftEither $ step comp
liftIO $ putStr output
case () of
_
| halted comp' -> return comp'
| Just (steps comp') == maxSteps pars -> do
liftIO $
putStrLn $
"Execution halted after " ++ (show $ steps comp') ++ " steps."
return comp'
| otherwise -> execute comp' pars
-- Check program counter is in range.
checkPC :: Computer -> Either String ()
checkPC comp =
if pc comp >= (toInteger . length . program $ comp) || pc comp < 0
then Left $ "PC OUT OF RANGE: " ++ (show $ pc comp) ++ "\n"
else Right ()
-- Execute a single step/cycle
step :: Computer -> Either String (Computer, String)
step comp = do
let Instruction lineNo label opCode operand =
program comp ! (fromIntegral . pc $ comp)
comp' = comp {pc = pc comp + 1, steps = steps comp + 1}
case opCode of
IN ->
if null $ dataVals comp
then Left $ "Data exhausted at line " ++ show lineNo ++ "\n"
else let a:dv = dataVals comp
in Right (comp {acc = a, dataVals = dv, pc = pc comp + 1}, "")
OUT -> Right (comp', printf "%8d" $ acc comp)
LINE -> Right (comp', "\n")
PRINT ->
let TextOperand s = operand
in Right (comp', s)
HALT -> Right (comp' {halted = True}, "")
LOAD -> do
n <- getVal operand comp' lineNo
Right (comp' {acc = n}, "")
STORE ->
let SymbolOperand s = operand
ram' = Map.insert s (acc comp') (ram comp')
in Right (comp' {ram = ram'}, "")
ADD -> do
n <- getVal operand comp' lineNo
let a = acc comp' + n
Right (comp' {acc = a}, "")
SUBTRACT -> do
n <- getVal operand comp' lineNo
let a = acc comp' - n
Right (comp' {acc = a}, "")
MULTIPLY -> do
n <- getVal operand comp' lineNo
let a = acc comp' * n
Right (comp' {acc = a}, "")
DIVIDE -> do
n <- getVal operand comp' lineNo
if n == 0
then Left $ "Divide by zero error at line " ++ show lineNo ++ "\n"
else let a = acc comp' `div` n
in Right (comp' {acc = a}, "")
JUMP -> do
let AddrOperand a = operand
Right (comp' {pc = a}, "")
JIZERO -> do
let AddrOperand a = operand
comp'' =
if acc comp' == 0
then comp' {pc = a}
else comp'
Right (comp'', "")
JINEG -> do
let AddrOperand a = operand
comp'' =
if acc comp' < 0
then comp' {pc = a}
else comp'
Right (comp'', "")
NoOp -> Right (comp' {steps = steps comp}, "")
-- Get the value of a numeric operand, which may be a literal constant
-- or a reference to a stored variable.
getVal :: Operand -> Computer -> LineNo -> Either String Integer
getVal (ValueOperand (Left n)) _ _ = Right n
getVal (ValueOperand (Right s)) comp lineNo =
case Map.lookup s $ ram comp of
Just n -> Right n
Nothing ->
Left $ "Unknown variable: '" ++ s ++ "' at line " ++ show lineNo ++ "\n"
Solution 1:[1]
As others have pointed out, you're not actually comparing apples to apples here - you've chosen types which are well known to be inefficient - String
, Integer
, []
. So, in order of things to try:
- Profile your code - we can make guesses about what might be slow but only the program can tell us if that's true. GHC 9.2 has made some nice improvements in profiling optimised code, see https://well-typed.com/blog/2022/05/hasura-supports-haskell-tooling/. Covering how to do profiling is a topic too large to go into here, but there is a lot of documentation available: https://downloads.haskell.org/ghc/latest/docs/html/users_guide/profiling.html
- Try basic improvements to the types you use, stop using String and switch to
ByteString
, useInt
instead ofInteger
(you'll lose the ability to do arbitrary precision calculations but I assume CESIL was never intended to do that). Using a HashMap instead of a Map might see some improvements, but you'll have to benchmark to know. - Be more explicit about strictness - most, if not all, of the fields in the Computer type could be made strict to tell the compiler "Always evaluate this, there's no need for it to be lazy":
data Computer =
Computer
{ program :: Array Int Instruction
, dataVals :: [Int]
, ram :: !HashMap ByteString Int
, acc :: !Int
, pc :: !Int
, steps :: !Int
, halted :: Bool
}
Doing this will also remove many of the conversions you had between Integer
and Int
which are unnecessary (unless you really want your program to work with programs with more than 9,223,372,036,854,775,807 instructions)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Axman6 |