hammyhavoc 2 years ago

Like any of these concepts, this one will ultimately live or die by its userbase and thus the value of the content presented. Having the software itself is only a small part of the battle ahead.

https://join-lemmy.org/instances says there are only 2.6k monthly active users in the entire "lemmyverse". That wouldn't get even a single niche subreddit off the starting blocks.

Getting people to even humour the idea of going elsewhere is difficult when there's genuine value in pre-existing content, especially troubleshooting, guides, advice, etc., which Reddit is unfortunately brimming with.

Windows Phone was a pretty good example of the "no users" problem: a solid alternative to iOS and Android, but without many big apps because it lacked users, while the lack of users primarily stemmed from the lack of big apps. A chicken-and-egg scenario ad infinitum.

geysersam 2 years ago

Would it be possible to "replicate" all (or most) existing reddit content to bootstrap a new system?

Not sure if all data is available anywhere? (Common crawl? Those big internet scrapes used to train language models?)

  • hammyhavoc 2 years ago

    It's certainly possible, but there would be no one on the other end of any interaction with it. On some Reddit threads, I can chime in years later and ask how x panned out for someone in terms of advice taken, or if they ever found a solution, or what they are doing now that x solution is no longer viable.

    Most of the value that I derive from older content is being able to recontextualize it for the present day, especially with extremely niche stuff.

    However, Reddit's hostility towards its own users is likely going to lead to an exodus of power users, just like with Twitter post-acquisition. It might not be huge numbers of people quitting, but they're meaningful users, and losing them is a sad loss for the community. That's especially true if you assume that far more people just view a platform than actually post on it ("lurkers"), which is apparently how most platforms skew.

  • jakabia 2 years ago

    The whole of Reddit (posts and comments separately) from 2005-06 until 2022-12 is available via this torrent [1], and it's very easy to download, extract and use the data [2]. I'm writing my thesis on the connection between a Reddit post's type and its comment structure, and I've been working with this data for a few months. It's amazing.

    [1] https://academictorrents.com/details/7c0645c94321311bb05bd87...

    [2] https://github.com/Watchful1/PushshiftDumps
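
For anyone wanting to try the dumps mentioned above, here is a minimal sketch of what "using the data" might look like. It assumes the files have already been decompressed into newline-delimited JSON (one record per line) and that each record carries a `subreddit` field; both are assumptions about the dump format, not something stated in this thread. The function name `count_by_subreddit` is hypothetical, and the decompression step is deliberately left out.

```python
import json
from collections import Counter
from typing import Iterable


def count_by_subreddit(lines: Iterable[str]) -> Counter:
    """Count records per subreddit from newline-delimited JSON text.

    Assumes each line is one JSON object with a "subreddit" key
    (an assumption about the dump format). Blank lines, malformed
    lines, and records without that key are skipped.
    """
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore lines that are not valid JSON
        if not isinstance(record, dict):
            continue  # ignore JSON values that are not objects
        subreddit = record.get("subreddit")
        if subreddit:
            counts[subreddit] += 1
    return counts


# Hypothetical sample records standing in for lines read from a dump file.
sample_lines = [
    '{"subreddit": "selfhosted", "body": "first comment"}',
    '{"subreddit": "selfhosted", "body": "second comment"}',
    '{"subreddit": "linux", "body": "third comment"}',
]
for name, total in count_by_subreddit(sample_lines).most_common():
    print(name, total)
```

Because the function accepts any iterable of strings, it can be fed an open text file directly, so a large dump would be processed line by line rather than loaded into memory at once.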

    • hammyhavoc 2 years ago

      If that thesis is ever publicly available, I would love to read it: me@hammyhavoc.com

      • jakabia 2 years ago

        Sure, let me get back to you in 1.5 months

        • hammyhavoc 2 years ago

          Wishing you happy research and writing! Looking forward to it. :-)

    • geysersam 2 years ago

      That's really cool! But a lot happened on Reddit between 2012 and 2023. I think the idea of using old content to bootstrap a new community site won't be feasible unless more recent content can be included.

      (I also imagine something like >95% of all Reddit content was produced between 2012 and 2023.)

      • glasslyrata 2 years ago

        It's from 2005 to 2022, so only this year is missing.