Show HN: Train Your Own Language Model on Your Favourite Programming Language
github.com
As part of the Deep Learning class at Tsinghua University, my teammates and I trained our own (tiny) LLM on Go code, using the tokens produced by `go/scanner` as the tokenizer!
If you ever want to spend a weekend training a language model on a single GPU for your favorite programming language, feel free to use our code and VS Code extension.
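
For anyone wondering what "`go/scanner` as the tokenizer" could look like in practice, here is a minimal sketch. It is an illustration under my own assumptions, not necessarily how the project does it: `Tokenize` is a hypothetical helper name, and it keeps only the token kinds (`token.Token` values), so identifier names and literal values are dropped. A real model would also need some way to encode those.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// Tokenize is a hypothetical helper: it runs the standard go/scanner over
// src and returns the sequence of token kinds. Literal text (identifier
// names, numbers, strings) is discarded in this sketch.
func Tokenize(src []byte) []token.Token {
	fset := token.NewFileSet()
	// The file size passed to AddFile must match len(src), or Init panics.
	file := fset.AddFile("", fset.Base(), len(src))

	var s scanner.Scanner
	// A nil error handler means scan errors are only counted in s.ErrorCount.
	s.Init(file, src, nil, scanner.ScanComments)

	var toks []token.Token
	for {
		_, tok, _ := s.Scan()
		if tok == token.EOF {
			break
		}
		toks = append(toks, tok)
	}
	return toks
}

func main() {
	src := []byte("package main\n\nfunc add(a, b int) int { return a + b }\n")
	for _, tok := range Tokenize(src) {
		// int(tok) is a small integer that could serve as a vocabulary ID.
		fmt.Printf("%d\t%s\n", int(tok), tok)
	}
}
```

One consequence of this approach is a small, fixed vocabulary: the scanner only ever emits the token kinds defined in `go/token`, and it inserts semicolon tokens automatically at line ends, just as the Go compiler sees them.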