How can I do my research as a GPU poor?

23 points by luussta 3 months ago

I need to train 70B-parameter LLMs for my research on world simulation and self-improving systems. Right now I can only train small models on my 8GB 3050 or for a couple of dollars in the cloud, but I lack the resources to train better, faster models. What's your advice?

protocolture 3 months ago

Have been looking at this myself, and it seems the advice isn't good.

It's basically cost/speed/size: pick 2, or maybe even 1.

Some people have been able to run large LLMs on older, slower CUDA GPUs. A lot of them are truly ancient and have found their way back to eBay simply due to market conditions. They aren't good, but they work in a lot of use cases.
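
The usual way this works is to quantize the model and offload only as many layers as the old card's VRAM can hold, running the rest on the CPU. A minimal sketch with llama-cpp-python; the GGUF file name and layer count below are placeholders, not anything specific from this thread:

```python
# Hedged sketch: partial GPU offload of a quantized model via llama-cpp-python.
# The model path and n_gpu_layers value are hypothetical; tune them to your VRAM.
from llama_cpp import Llama

llm = Llama(
    model_path="llama-2-13b.Q4_K_M.gguf",  # placeholder quantized model file
    n_gpu_layers=20,  # offload only what fits on the old card; the rest runs on CPU
    n_ctx=2048,
)
out = llm("Q: Why offload layers? A:", max_tokens=48)
print(out["choices"][0]["text"])
```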

There are a couple of janky implementations that run on AMD instead, but reviews have been so mixed that I decided not to even test them. Ditto multi-GPU setups. I thought that having access to sixteen 8GB AMD cards from old mining rigs would have stood me in good stead, but apparently it benches roughly the same as a server with heaps of RAM, because of the way the jobs are split up.

The cloud services seem to be the best option at the moment. But if your spend is going to be $1000 rather than $100, it might be worth just forking out for the card.

Also, honestly, I'm hoping someone else in this thread has a better idea, because it will be useful to me too.

  • gperkins978 3 months ago

    A100s are cheap. Are those bad now?

    • protocolture 3 months ago

      Define cheap?

      I am seeing an average cost of $15k+ on feebay.

      I think anyone with $15k could put together a rig with enough VRAM to handle a decent model.

      The question is more focused on a budget that seems to be sub-$2k.

worstspotgain 3 months ago

I'd say the first step is to rule LoRA in or out. If LoRA is an option, it buys you more than just the rig savings (rough sketch after the list):

- You can deploy multiple specialized LoRAs for different tasks

- It massively reduces your train-test latency

- You get upstream LLM updates for "free"; maybe you can even add the training to your CI
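
If LoRA does turn out to fit, the setup itself is small. A rough sketch with Hugging Face peft; the base model name, rank, and target modules are illustrative placeholders, not recommendations:

```python
# Hedged sketch: wrapping a base model in LoRA adapters with peft.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-2-7b-hf"  # placeholder base model
model = AutoModelForCausalLM.from_pretrained(
    base, torch_dtype=torch.float16, device_map="auto"
)

lora_cfg = LoraConfig(
    r=16,                                 # adapter rank; these are the only weights you train
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # attention projections, a common choice
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # typically well under 1% of the base model
```

From there any standard training loop works, and adapters can be swapped per task at inference time, which is what makes deploying multiple specialized LoRAs cheap.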

jebarker 3 months ago

Is there a different angle on the research you can take that doesn't require training models of that size?

When choosing research problems, it's important to follow not only what's interesting but also what's feasible given your resources. I frequently shelve research ideas because they're not feasible given my skills, time, data, resources, etc.

RateMyPE 3 months ago

AWS, Azure, and GCP have programs for startups and for researchers that give free credits for their respective platforms. Try applying for those programs.

They usually give between $1000 and $5000 worth of credits, and they may have other requirements, like being enrolled in college; check each of their respective programs to find out more.

_davide_ 3 months ago

I have a couple of M40s with 24 GB in a desktop computer; I had to tape the PCIe connector to "scale it down to 1x". It's OK for inference and playing around with small training runs, but I can barely run inference on a quantized 70B-parameter model. Training anything bigger than 3B parameters on a human timescale is impossible. Either you scale it down or ask for a sponsor. It's frustrating, because IT has always been approachable, until now...
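
To put rough numbers on why: full fine-tuning with Adam in mixed precision costs about 16 bytes per parameter (fp16 weights and gradients, plus an fp32 master copy and two fp32 optimizer states), before even counting activations. A quick back-of-envelope check:

```python
# Rough VRAM estimate for full fine-tuning with Adam in mixed precision:
# 2 B weights + 2 B grads + 4 B fp32 master copy + 8 B optimizer states = 16 B/param.
def full_finetune_bytes(params: float, bytes_per_param: int = 16) -> float:
    return params * bytes_per_param

print(f"70B model: ~{full_finetune_bytes(70e9) / 1e12:.2f} TB")  # ~1.12 TB, excluding activations
print(f"3B model:  ~{full_finetune_bytes(3e9) / 1e9:.0f} GB")    # ~48 GB, already double a 24 GB card
```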

8organicbits 3 months ago

Are you part of a university? Do they have resources or funding you can access?

  • luussta 3 months ago

    I'm not part of any university

RecycledEle 3 months ago

Look into buying a used Dell C410x GPU chassis (they are $200 on my local Craigslist) and a Dell R720. Then add some cheap GPUs.

For a few thousand dollars you might get some processing power.

But you will never be able to pay your electric bill.

  • thijson 3 months ago

    Is there a cheap GPU you suggest? I was looking at NVIDIA P40s, as they have 24 GB of RAM and cost a few hundred dollars.

    • gperkins978 3 months ago

      I was going to recommend an A100, but they are not available cheaply anymore. I set someone up with two of them that I managed to score used from Amazon for like $400 each. They had 40 GB of RAM, but it was HBM and super fast. There were mining cards based on these as well, but I have no idea what was stripped down to make them cheaper. The only issue is that you need to heatsink them if they are not going in a server, and in a rack-mounted server there needs to be proper cooling (big dog fans).

      But I just looked again, and A100s are not around at reasonable prices anymore. I cannot even find the old mining equivalents (they used to be everywhere, and CHEAP). Perhaps many people are building similar systems now. Right after the Ethereum merge was a great time to build.

    • NBJack 3 months ago

      Yes, just ensure you can keep them cool. I have an old M40 that opened up my options with 24 GB of VRAM, installed in a traditional case with a 3D-printed cooler adapter + fan. While it isn't always fun getting older cards to work, it is certainly viable (and scalable if needed).

Log_out_ 3 months ago

Do an internship at Nvidia in driver development?