Ask HN: How much would it cost to replicate GPT-2's model?
Cost 1: Scrape 8 Million articles from Reddit posts.
http://files.pushshift.io/reddit/
Cost 2: Training a 1.5 B parameter transformer network.
I'm assuming it could be trained with google's TPUs. And the model would just be a bigger version of the model open ai released.
> Their model used 256 of Google's Cloud TPU v3, though I've not seen training durations. The TPU v3 is only available individually outside of @Google (though @OpenAI likely got special dispensation) which means you'd be paying $8 * 256 = $2048 per hour.
https://twitter.com/Smerity/status/1096189352743301120
> Thanks. So then it was 32 TPUv3s, to be more precise, and sticker-price training costs would then be per Smerity 32 * 24 * 7 * 8 = $43k?
https://www.reddit.com/r/MachineLearning/comments/aqlzde/r_o...
It's likely even more since they needed to perform a hyperparameter tuning. Multiply it by 10 or 100 to get a more realistic estimate.