A complete Llama2 inference engine that fits in 1356 bytes of x86 assembly

github.com

26 points by monax 20 hours ago