Gwern comments on Ilya Sutskever's departure from OpenAI, among other topics

vessenes 1 year ago

Gwern’s final point - that ‘safe’ decisions might lead to a dead end with ultimately not enough creativity to get to AGI - feels contrary to the rest of his essay, so I feel like I must not understand his point.

Is the purge / exodus of safety folks at OAI a ‘safe’ choice?

Yo @gwern, I’d love your thoughts on this. To my mind you can read OAI right now as either ‘acceleration for the win’ or ‘the innovative soul is gone’ or just ‘with liquidity some people want to do some different things / work at a different scale than OAI has got going rn’.

Contrary to your pitch I do think there’s a wide middle road where acceleration + billions + solving engineering challenges and collecting data = something valuable and interesting, esp with Sam running strategy.

Maybe Ilya and co are needed for round 5, but maybe not, maybe from here it is just mostly engineering, data and regulatory capture, and the question of creativity won’t be the defining one. I’m not saying this is likely but I don’t think it’s unlikely either. Actually I’d say it’s more likely than not.

gwern 1 year ago

> maybe from here it is just mostly engineering, data and regulatory capture, and the question of creativity won’t be the defining one. I’m not saying this is likely but I don’t think it’s unlikely either. Actually I’d say it’s more likely than not.

I think there is still a lot of taste involved, and also that there is no reason to think that the current default scaling path is the optimal one. Sure, maybe even a braindead OA can reach AGI solely by coasting on inertia and scaling up on MS's Stargate and just training for long enough and spending enough on data and truly brute-forcing it. But that doesn't mean they'd be the first one there. There are a number of ideas floating around about how to do scaling much better. (There is no reason to think that Chinchilla was the be-all and end-all.) And if you lack creativity or taste, you will neither think of nor pick the right one until after it has been proven out... just like pretty much everyone else ignored the scaling results in DL until well after GPT-3. Look at Baidu: their researchers published the first contemporary scaling law paper. Where are they now?
- vessenes 1 year ago
  
  Yeah, interesting and fair points. With the investor hat on, it makes sense to buy in to all of them, of course — doesn’t matter who wins from a portfolio returns point of view.
  From a predict-the-future point of view, I’ll say that Anthropic’s progress this year precisely backs your point that taste might in fact matter. Last year I would have said safety concerns had irrevocably ruined their product, but I would have been wrong.
  The nature of the company / group that does get to some scale-out breakthrough probably has a big impact here. It’s easier for me to imagine NVIDIA getting a sustained lead with some sort of exotic layout / rapid chip iteration AGI than a newcomer that’s going to be competing with an app. But, to your point, I guess it depends on how much of a lead taste gets you - a year? Ten years? A few months?
  If it’s a California company, meaning no non-compete, I would guess that taste gets you max two years, maybe more like one. You’ll instantly have every member of the engineering team in the know able to raise $1bn+ for their own startup, and some will do so. So, you’ll want to be deeply embedded, really rich, and part of global infrastructure before you get there in order to capitalize on your lead. Which, to me, leads back to OA/MS as the current winner.
  See you in a couple years — looking forward to finding out what happens.
  
  gwern 1 year ago
  
  > From a predict-the-future point of view, I’ll say that Anthropic’s progress this year precisely backs your point that taste might in fact matter. Last year I would have said safety concerns had irrevocably ruined their product, but I would have been wrong.
  
  Yes. Claude-2 underperformed ChatGPT... But you could still see that it was much more pleasant to use for creative writing and nonfiction, while not seeming to be much worse in terms of jailbreaking/safety, and so their RLAIF seemed like the right approach compared to RLHF.
  But meanwhile, OA appears to be in complete denial as an organization that there is anything wrong whatsoever with ChatGPT, and disinclined to 'delve' into what might be the issue with their approach. This is despite the peculiar and alarming side-effects of their RLHF, like the rhyming verse, and our general inability to understand the full implications and all side-effects of decisions like BPE or RLHF. It's not hard to imagine that these sorts of hacks might eventually limit the potential of OA's models - maybe you don't care about non-rhyming poetry, but what else is being handicapped by it? And if no benchmark told you about the rhyming thing, why do you think that they will tell you about other things? How would you know if your new models are incapable of making genuine scientific breakthroughs (which always require a lot of creativity and daring) because your RLHF was wrong and you've built up too much data & infrastructure and now have sunk-cost bias ruling everything?
  It's one of those Feynman things: RLHF was not supposed to destroy poetry, so the fact that it did tells you that something has gone alarmingly wrong in your understanding of what you are doing. And when an organization decides it's not interested in how its rubber seals perform in unusual conditions... And if an organization can't be honest about these sorts of things, that bodes poorly for the long run.
  As Karpathy says, 'neural networks want to work', even if your code has severe flaws; they will just fall perpetually well short of where they could if you fixed those hidden, silent, invisible flaws. OA could wind up puzzling over its frontier models and wondering why the $10b run didn't yield all that impressive a result and not willing to risk the $100b AGI run, while Anthropic just smoothly scales up with a much better bang for buck, and the tortoise passes the hare.

JSDevOps 1 year ago

Cool read. Makes sense.