pcwalton 5 years ago

I used STOKE to generate some SIMD code for PNG decoding once. Fun stuff!

I generally found it most useful to have STOKE generate the code, then reverse engineer what it did and rewrite equivalent code by hand in a higher-level language like C or Rust. The nondeterminism of the process and unmaintainability of the resulting code makes it hard to justify shipping STOKE as-is. But I did find it useful as a tool to give me general optimization ideas. Sometimes it finds clever things you would never have thought of on your own.

azhenley 5 years ago

Any chance you can give us a summary since GitHub is currently down?

  • lemcoe9 5 years ago

    A weekend replication of STOKE, a stochastic superoptimiser

    A superoptimizer is a type of compiler that takes a program and finds an optimal sequence of instructions that performs the same task as the original program.

    A stochastic thing X is a thing X that uses randomness.

    STOKE is a superoptimizer that finds optimal instruction sequences for a particular task by using numerical techniques that rely on randomness (Markov chain Monte Carlo, or MCMC) to explore the space of "all possible programs".

    Here, we re-implement STOKE for a tiny stack machine language whose instruction set is Push, Dup, Add, Mul, and And. Sample output:

    * original: (nparams: 0 | [IPush 2,IPush 3,IAdd])* [IPush 5] | score: 2.5 // constant folding: 2 + 3 -> 5

    * original: (nparams: 1 | [IPush 2,IMul])* [IDup,IAdd] | score: 2.25 // strength reduction: 2 * x -> x + x

    * original: (nparams: 1 | [IDup,IAnd])* [] | score: 3.0 // idempotence: x & x -> x

    That is, we automatically discover these program transformations by randomly trying different programs (in a somewhat smart way).
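The three rewrites in the sample output are easy to verify with a few lines of interpreter. Here is a minimal Python sketch of a stack machine like the one described above (the tuple encoding of instructions is my own assumption, not the repo's representation):

```python
def run(prog, params):
    """Interpret a tiny stack program. Instructions are tuples:
    ("push", n), ("dup",), ("add",), ("mul",), ("and",)."""
    stack = list(params)
    for inst in prog:
        if inst[0] == "push":
            stack.append(inst[1])
        elif inst[0] == "dup":
            stack.append(stack[-1])
        else:
            b, a = stack.pop(), stack.pop()
            stack.append({"add": a + b, "mul": a * b, "and": a & b}[inst[0]])
    return stack

# The discovered rewrites agree on concrete inputs:
assert run([("push", 2), ("push", 3), ("add",)], []) == run([("push", 5)], [])  # 2 + 3 -> 5
assert run([("push", 2), ("mul",)], [7]) == run([("dup",), ("add",)], [7])      # 2 * x -> x + x
assert run([("dup",), ("and",)], [7]) == run([], [7])                           # x & x -> x
```

STOKE-style search judges candidate programs by exactly this kind of input/output agreement on test cases, plus a cost term for program length or speed.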

  • jcranmer 5 years ago

    I'm not the OP, but I have a little bit of experience with STOKE.

    STOKE is a superoptimizer, which seeks to solve the problem of optimizing code essentially by means of mathematical optimization (finding the optimal point), in this case over a discrete, highly non-linear input space. These optimizers usually focus on peephole optimizations, which look at short sequences of instructions, because that keeps the search problem tractable.
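To make "peephole optimization" concrete, here is a toy pass over an instruction list (the tuple encoding is an assumption for illustration) that performs the constant-folding rewrite from the sample output above: it scans a short window of instructions and replaces a known pattern with a cheaper equivalent.

```python
def peephole(prog):
    """Rewrite the pattern  push a; push b; add  ->  push (a+b)  in one pass."""
    out, i = [], 0
    while i < len(prog):
        if (i + 2 < len(prog)
                and prog[i][0] == "push"
                and prog[i + 1][0] == "push"
                and prog[i + 2] == ("add",)):
            # Fold the two constants at compile time.
            out.append(("push", prog[i][1] + prog[i + 1][1]))
            i += 3
        else:
            out.append(prog[i])
            i += 1
    return out

assert peephole([("push", 2), ("push", 3), ("add",)]) == [("push", 5)]
```

A hand-written peephole pass like this only fires on patterns its author anticipated; a superoptimizer searches for such rewrites automatically.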

    The classic superoptimizers (say, http://theory.stanford.edu/~sbansal/pubs/asplos06.pdf) solved the problem by precomputing the best assembly variant for all short (~4 instruction) sequences, but that clearly doesn't scale. STOKE instead approaches the problem in a manner more similar to classic hill-climbing algorithms from machine learning: you randomly mutate the program to discover alternatives, accept the mutation under certain conditions, and do a random walk on these mutations to hopefully arrive at the best result. The result is sensitive to how you implement the details, but it does scale somewhat better (tens of instructions).
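The mutate/accept random walk can be sketched as a Metropolis-style MCMC loop. This is a generic sketch of the acceptance rule, not STOKE's actual implementation; `beta`, the mutation operator, and the cost function are all assumptions:

```python
import math
import random

def mcmc_search(initial, mutate, cost, steps=10000, beta=1.0, seed=0):
    """Random walk over candidates: always accept improvements, and accept
    regressions with probability exp(-beta * delta) to escape local minima."""
    rng = random.Random(seed)
    cur, cur_cost = initial, cost(initial)
    best, best_cost = cur, cur_cost
    for _ in range(steps):
        cand = mutate(cur, rng)
        cand_cost = cost(cand)
        delta = cand_cost - cur_cost
        if delta <= 0 or rng.random() < math.exp(-beta * delta):
            cur, cur_cost = cand, cand_cost
        if cur_cost < best_cost:
            best, best_cost = cur, cur_cost
    return best, best_cost

# Toy use: walk the integers toward a minimum-cost point by +/-1 mutations.
best, best_cost = mcmc_search(0, lambda x, r: x + r.choice([-1, 1]),
                              lambda x: abs(x - 42))
```

In STOKE the candidates are instruction sequences rather than integers, mutations swap/insert/delete instructions or operands, and the cost combines correctness on test inputs with estimated performance; the acceptance rule is the same shape.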