Ask HN: When/where can we get an offline GPT3 type chatbot?

33 points by namrog84 2 months ago

OpenAI's new ChatGPT is super neat and all, but I had lots of issues with it. Sometimes I felt like I was running into arbitrary guard rails because they are concerned about bad PR or something.

There's also the fact that you have no control over the weights or other modifications to help steer it into certain areas.

After having been majorly spoiled by StableDiffusion being offline, and by all the community mods/changes that have been contributed to it, I now want an offline chatbot model.

I think I read that some older GPT-2 models are available offline, but also that most of them are still considered 'inefficient'. What does this mean? Is it the compute needed to run them, or the physical size of the model? Would it be at all possible to split a model into groups or something (e.g. I only care about English and programming languages, not other human languages)?

I am sorry if this is common knowledge to those in the know, but could someone share some details, or tell me if what I am asking is silly (like asking for an offline version of a search engine) or if I am asking the wrong questions?

tripplyons 2 months ago

I think the best publicly available model that can follow instructions right now is https://huggingface.co/bigscience/bloomz.

It has 176 billion trainable parameters, but I think it needs hundreds of gigabytes of memory just to hold the weights (roughly 350 GB at 16-bit precision), so there is a trade-off between model size and the ability of the model.
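
As a rough back-of-the-envelope check, here is a small sketch that estimates the memory needed for the weights alone. It assumes memory is simply parameter count times bytes per parameter and ignores activations, caches, and framework overhead, so real usage will be higher. The function name is made up for illustration.

```python
def weight_memory_gb(n_params: float, bytes_per_param: int) -> float:
    """Memory for the weights alone, in gigabytes (1 GB = 1e9 bytes).

    Assumption: ignores activations, caches, and framework overhead.
    """
    return n_params * bytes_per_param / 1e9


BLOOMZ_PARAMS = 176e9  # parameter count quoted in the comment above

# Common storage precisions and their size in bytes per parameter.
for name, nbytes in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1)]:
    print(f"{name}: {weight_memory_gb(BLOOMZ_PARAMS, nbytes):.0f} GB")
```

Under those assumptions the weights come to about 704 GB in fp32, 352 GB in 16-bit, and 176 GB in int8, which is still far beyond a single consumer GPU.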

On most GPUs, you should be able to run https://huggingface.co/google/flan-t5-large. It is pretty good and is trained to follow instructions.

  • piecerough 2 months ago

    InstructGPT is actually 1.3B; you don't need 175B like the initial GPT-3 model: https://openai.com/blog/instruction-following/

    ChatGPT is likely a much better variant, but it must also still be small and will likely be portable.

    • tripplyons 2 months ago

      ChatGPT is based on GPT-3.5, which is a series of 175B models that includes code-davinci-002, text-davinci-002, and text-davinci-003. I do not think of it as portable.

      • PartiallyTyped 2 months ago

        Network distillation to the rescue! Provided that you have the means to distill in the first place :D

      • ActorNightly 2 months ago

        Would it be possible to run this entire model sequentially, dumping intermediate results to SSD?
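
        The idea in this question can be sketched with a toy model: keep each layer's weights in its own file, load one layer at a time, and write the intermediate result to disk between layers. This is only an illustration under stated assumptions, not how any real library does it. The "layers" are tiny matrices stored as JSON, and every function name here is hypothetical.

```python
import json
import os
import tempfile
from typing import List

Matrix = List[List[float]]


def save_layers(layers: List[Matrix], directory: str) -> List[str]:
    """Write each layer's weights to its own file and return the paths."""
    paths = []
    for i, weights in enumerate(layers):
        path = os.path.join(directory, f"layer_{i}.json")
        with open(path, "w") as f:
            json.dump(weights, f)
        paths.append(path)
    return paths


def apply_layer(weights: Matrix, x: List[float]) -> List[float]:
    """Toy layer: matrix-vector product followed by ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]


def run_streamed(layer_paths: List[str], x: List[float], scratch_path: str) -> List[float]:
    """Run the layers in order with only one layer's weights in memory."""
    for path in layer_paths:
        with open(path) as f:
            weights = json.load(f)  # load just this layer
        x = apply_layer(weights, x)
        del weights  # drop the layer before loading the next one
        # Dump the intermediate result to disk, then read it back,
        # mimicking the "intermediate results to SSD" idea.
        with open(scratch_path, "w") as f:
            json.dump(x, f)
        with open(scratch_path) as f:
            x = json.load(f)
    return x
```

        The cost is speed: a text generator produces one token per forward pass, so every layer would be read back from disk for every token generated.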

  • moffkalast 2 months ago

    I've just tried out a few of the flan-t5s and they're surprisingly coherent. It likes dogs and pine trees, thinks Makerbot is the best 3D printer lol. Can even generate some code, though it's usually wrong. And it can't seem to decide if a flower pot is orange or red.

    Any idea if it's possible to chain-feed a conversation into this one? I've tried a few different Q&A formats, but none seem to really pick up the earlier context.
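
    One common way to attempt this is to fold the earlier turns back into each new prompt. The sketch below shows that pattern only. It assumes a plain "Q: ... A: ..." format and a character budget as a crude stand-in for the model's real token limit. `generate` is a placeholder for whatever function calls the model, and nothing here guarantees that a given model will actually make good use of the extra context.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]  # (question, answer) pairs, oldest first


def build_prompt(history: History, question: str, max_chars: int = 1500) -> str:
    """Fold earlier turns and the new question into a single prompt string.

    The oldest turns are dropped first when the prompt would exceed
    max_chars (an assumed budget, not a real model limit).
    """
    turns = [f"Q: {q}\nA: {a}" for q, a in history]
    tail = f"Q: {question}\nA:"
    while turns and len("\n".join(turns + [tail])) > max_chars:
        turns.pop(0)
    return "\n".join(turns + [tail])


def chat_turn(
    generate: Callable[[str], str],
    history: History,
    question: str,
    max_chars: int = 1500,
) -> str:
    """Ask one question with the prior context, then record the answer."""
    prompt = build_prompt(history, question, max_chars)
    answer = generate(prompt).strip()
    history.append((question, answer))
    return answer
```

    To try it against a real model, `generate` could wrap something like a Hugging Face `text2text-generation` pipeline loaded with google/flan-t5-large, returning the generated text for a given prompt.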