Lots of talk about must‑have features and backups here...
BUT there's another piece that makes or breaks these tools... whether they can build a community around them and stick around for years...
Open‑source cloud storage projects come and go when maintainers burn out... a sustainable business model or strong contributor base matters as much as technical checklists...
ALSO interoperability is underrated... if your drive can speak WebDAV or S3 and plug into existing identity systems, teams are more likely to try it...
In the end people want something that won't vanish after the honeymoon... that's harder than adding a progress bar...
Is that a weakness of the tool's organizational model?
I don't want to be part of a community around my cloud storage. I want it to work and I want to think about it as little as possible.
I use Syncthing and it does a pretty great job at this, no one ever insisted I need to join a Syncthing community, yet it keeps on working.
I don't pay a dime for Syncthing but I'm vaguely aware that they're linked to a company called Kastelo which provides enterprise support for Syncthing deployments. Probably a lot of Syncthing development is paid for that way.
Incidentally I founded an open source consulting company that's totally unrelated to cloud storage. We have enterprise as well as smaller contracts. We develop some addons in-house and the bigger enterprise contracts tend to subsidize most of the work that goes into them. We haven't asked anyone to be part of a community and I don't think we need to.
Communities are nice, but if you want your software to last I think a good business model and a good marketing strategy are a better bet. Bonus, you can quit your day job.
For a business headed open source project, it's still about the community. In this case, the business tends to take a defining and often controlling role in the community. This has plusses and minuses. On the plus side, if a business has a vested financial interest in the project, there is financial incentive for continuity. On the minus, when the company's financial interest no longer aligns with the community, many of us retain scars from rug pulling and switcheroos.
So understanding the long term stability of a community is more than just checking whether there is a company backing the project. It is important to analyze the nature and diversity of interest. I think it's just as important that there exists a larger community that the business depends on for extra feature + bugfixing work which is capable of forking. When this is possible, it is much less likely to be necessary.
Indeed. "S3 compatible" is the state of the art for object storage imho. As long as you can talk to a storage system that supports the basic S3 primitives, longevity is improved and there is no lock-in. You can use S3 proper, Wasabi, Backblaze B2, local storage exposing an S3 API, etc. Any replacement is mostly drop-in, assuming it can read, scan, and index existing objects.
- Cryptomator (https://cryptomator.org/) to encrypt/decrypt sensitive directories
(that are synchronized through Syncthing)
Cryptomator also lets me access the directories via WebDAV
- MaterialFiles on Android to access the files on the server
I access my mini server from outside with a Wireguard VPN created on my Fritz!Box router.
Between home and office I created a site-to-site Wireguard VPN between the two Fritz!Box routers.
Seafile is the only good enough thing I've found so far for self-hosted file sync. But it is still a pain to upgrade the server version. Nextcloud and friends is a complete disaster in my opinion.
> Nextcloud and friends is a complete disaster in my opinion.
Why is that?
Have been using Nextcloud in our company and for myself, and I couldn't be happier: no issues in 3 years, all the tools and plugins I need, sync running perfectly, hassle-free and performant. I thought it was generally liked up until now. I didn't try any of the alternatives though, so they might indeed be better. Though I don't have any reason to try them tbh, as NC works almost too well.
Using Nextcloud on the web feels like a state of the art 2015 PHP web UI. It is... fine. But compare it to immich for example and they're just not playing in the same league imo
I've been using Nextcloud for years, but I've never used the web UI. Windows desktop app for syncing my Documents folder, and Android app for synchronizing a few folders on my phone, as well as the "append only" upload of my photo reel that something like SyncThing doesn't support. Works great, never had any issues with Nextcloud. The real value is in the companion apps.
I use a cron job to back up Nextcloud to B2 and S3 Glacier.
I use S3 glacier on Scaleway (European cloud provider). Storing about 3TB costs me about 7€ per month including VAT. I've had to restore 1TB in the past and it cost me around 10€. Not bad for a worst case scenario. They also do mini VPSes for 3€ per month with unlimited traffic. Really nice provider, I've used them for many years.
I've only tested partial restores from Glacier since it is expensive. I've got a raidz2 array locally as insurance against having to restore from a backup.
> A 18 Tb NAS harddrive is about 320 USD. A 2 bay unifi unas2 199. It pays off in one year. Restoring data from it is free.
That single 18TB HD is hardly safe from a disaster or even plain old hardware death, and it's a single point of failure. You need at least 3 times as many HDs to start to have something you can actually rely on to keep your data for 3-5 years.
100%. Though their UI has been updated a little with the last major release.
> But compare it to immich for example and they're just not playing in the same league imo
I mean, this doesn't make sense at all, tbf. They're literally not in the same league, as they're targeting different use cases. Nextcloud offers a MUCH broader experience, while Immich has a very clear-cut focus and does nothing outside of that. Comparing them doesn't make any sense. Except if you're actually talking about the UI exclusively. Then, yes, Immich feels much more modern and smooth.
There's a lot of weird setup often required on the backend in my experience, but when it works, it works well. But until you get everything dialed in it can have weird issues that don't have a clear path to fix them.
It might be better in their weird AIO solution? But I don't like the idea of giving a docker container the ability to spawn more containers. I just use one of their normal docker containers and have had to manually change a lot to make it work as they actually suggest. Like just recently I set up their notify_push plugin as it improves performance, but the provided setup instructions didn't work in my setup and I had to manually tweak several things.
It took a while for me to fully set up Nextcloud with STUN/TURN, Office server, etc. in a properly containerised setup. It clearly felt like it was built before containers and modern devops approaches were a best practice.
And while community is great, I don't think Nextcloud developer community is that big and active. Their plugin system is basic, archaic, lots of things there are begging for rework.
So while Nextcloud is decent once set up, I am happy to see some fresh OSS projects solving similar issues appear. Maybe their approach will be better.
Resilio is also pretty good, depending on your use case. (Syncthing is great too, but Resilio seems faster and better at NAT traversal in my experience.)
I recently got into self-hosting Seafile and successfully set it up on my dedicated server. Had to think about backup and security strategies quite a bit and ultimately I set up a bulletproof backup mechanism. Tested it pretty rigorously.
Seafile took me by surprise in terms of how quick it was at picking up new files and changes - syncing works incredibly well too. I moved all my files from my Google Drive into my Seafile instance and I'm now using it on all my devices as my main cloud storage solution.
Nextcloud suffers from its flexibility. It's got a lot to offer but requires dialling in to your specific use case. The mistake most admins make is to assume you can just run it without tuning; it has too many differing options to do that smoothly out of the box.
The ability to just run it in a snap has really contributed to this imho, Nextcloud is enterprise software you just happen to be able to run in your homelab.
I desperately want to be a fan of ownCloud, because it offers clients natively across Mac/Linux/mobile, but it’s such a mess. Every platform has small bugs and reliability problems that make the whole thing useless.
If you just need a web interface to your filesystem, there’s this single Go executable (https://github.com/filebrowser/filebrowser) that supports sharing and minimal user management.
Snap isn't the best experience for Nextcloud in my experience, fine for a demo or a single user instance that isn't mission critical. Users who expect more out of it will often bump up against its limitations.
Anyone who wants to seriously use Nextcloud should look into the AIO docker containers or rolling the individual containers themselves. Nextcloud has expanded into a full groupware stack and it's expected you have an actual admin managing the system like with any real deployment of enterprise software
It includes most of the essential features, and I’d say it’s excellent for professional use. I’ve been running an instance for many years on a VPS for work collaboration, and it’s been perfect. It’s now hosted behind Cloudflare Tunnels, with group members whitelisted by email.
If you need more advanced or fancy/niche features, AIO might be a better, though heavier, fit (I run an instance of AIO at home, mostly for testing). Snap is lightweight and a bit opinionated (in reasonable ways in my view), and the documentation used to mention some of its limitations. In exchange, you get a snappier, more robust installation.
He complained about the difficulty of installing an application. He didn’t complain about establishing a personal data center.
That one line will give you Nextcloud. Exactly one more line in snap will give you a self-signed cert. Alternatively, the line below will give you remote access, a domain, and a valid certificate for your application:
It seems reasonable that someone would want to go beyond just installing software; they are presumably doing so in order to use it for its purpose. Being pedantic about the nature of the complaint (i.e. "He complained about the difficulty of installing an application. He didn’t complain about...") seems to miss the point. All of the additional steps you lay out also have their own steps to get done or decisions to be made, and when it is all said and done, it seems reasonable to imagine that things could get quite complicated.
I mean if you want a working Nextcloud instance, available through VPN with backups, then no, it doesn't get more complicated than that, actually. It is incredibly easy.
I've run a self-hosted Nextcloud instance for many years and Docker is by far the easiest. I started off with a native installation and that can be a pain when upgrading the OS (Ubuntu in my case). I tried the snap version when that became available and was impressed by how easy it was, though administering it required a bit of learning as the file locations were all different.
Running it in Docker made it so much easier to administer (maybe add in the missing db indexes if there's a major version change).
If you want, I can paste my docker-compose.yml for reference as it's relatively complex.
Open source drive tools live or die on three things.
1) Simple sync that never surprises.
2) Clean conflict handling you can explain to a non tech friend.
3) And zero drama upgrades.
If Twake nails those and keeps a sane on prem story with S3 and LDAP, it has a shot. The harder part is trust and docs. Clear threat model. Crisp migration guides from Drive and Dropbox. And a tiny CLI that just works on a headless box. Do these and teams will try it for real work, not just weekend tests.
I'd add a fourth; "Make it easy to do backups and verify they're correct".
I don't think I've ever considered a data store without that being one of my top concerns. This anxiety comes from real-life experience where the business I worked at had backups enabled for the primary data store for years, but when something finally happened and we lost some production data, we quickly discovered that the backups weren't actually possible to restore from, and had been corrupted this whole time.
Heh - I once made a little chunk of change, because a former client from 10 years earlier discovered the shiny "DVD/CD" backups had succumbed to "bit-rot" and needed some source code.
I grabbed the hard-drive off the shelf, put it in an enclosure and handed them the source-code... (At the time, every time I upgraded my system, I would just keep my old drives, so... had a stack of them - buy a new external enclosure, slot it and park it.)
Depends. Even something basic like "Check if the produced artifact is a valid .zip/.tar.gz" can be enough in the beginning, probably would have prevented the issue I shared before.
Then once you grow/need higher reliability, you can start adding more advanced checks, like it has the tables/data structures you expect and so on.
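To make the "is it even a valid .zip/.tar.gz" check concrete, here is a minimal sketch using only the Python standard library. The function name `archive_is_readable` is made up for illustration, and it only proves the archive can be read end to end, not that it holds the data you expect (that is the "more advanced checks" stage).

```python
import tarfile
import zipfile
import zlib
from pathlib import Path


def archive_is_readable(path: Path) -> bool:
    """Return True if `path` is a zip or tar archive that reads end to end.

    Hypothetical helper: a first-line sanity check, not a restore test.
    """
    try:
        if zipfile.is_zipfile(path):
            with zipfile.ZipFile(path) as zf:
                # testzip() re-reads every member and checks its CRC; it
                # returns the first bad member name, or None if all pass.
                return zf.testzip() is None
        if tarfile.is_tarfile(path):
            # "r:*" lets tarfile detect gzip/bz2/xz compression itself.
            with tarfile.open(path, "r:*") as tf:
                for member in tf:
                    if member.isfile():
                        # Reading each member fully forces decompression,
                        # so a truncated or corrupt stream raises here.
                        with tf.extractfile(member) as fh:
                            while fh.read(1 << 20):
                                pass
            return True
    except (OSError, EOFError, tarfile.TarError, zipfile.BadZipFile, zlib.error):
        return False
    # Neither a zip nor a tar archive.
    return False
```

Something like this could run right after the backup job writes its artifact, with a non-zero exit code or an alert when it returns False.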
I had a funny one where I somewhat regularly tested an SQL backup, then one day it didn't work. It worked the second time, the 3rd and the 4th. I have no idea why it didn't work. It turned into a permanent background process in the back of my head. The endless what-if loop.
I’m not sure what your point is. Business continuity requires a disaster recovery plan that must be tested regularly. It might be considered slog work, but like taking out the garbage, it’s non negotiable and must be done.
"Great, first you wanted more money to buy compute and storage for dev and staging separate from production, and now you want even more for 'testing backups'?!"
I'd like a manual "sync now" option. Sometimes I put stuff in google drive using windows explorer and it's not immediately obvious if it is syncing, why it is or isn't, or what I need to do to make it.
I've got a theory that progress bars for main functionality tasks and the associated manual triggers in modern software are out of favor, as it creates a stage for an error to be displayed and creates expectations the customer can lean on. Less detail in errors displayed to the customer removes their ability to identify a software problem as unique or shared among others.
I think you're right and I think I insufficiently considered malice as the reason for a lot of this type of minimalism. This "SWW" message is great as it doesn't even give a hint as to whether the problem is with the server (all vendor's fault), the network (not vendor's fault), or a client fault (maybe vendor's fault, maybe customer just needs to update it). Users can just do brute force things like "Swipe up the app and open it up again" and eventually just give up.
Syncing should be in the control of users. Users should be able to trigger or abort the sync. It should also provide some sort of progress indicator.
58.9% TypeScript and 32.6% JavaScript wouldn't be my first choice for implementing such a performance- and throughput-demanding application. Why was that chosen?
A bit off-topic, but is there a way I can convince various apps (Viber, WhatsApp) to use some replacement instead of Google Drive for backup? They do not offer such an option, but maybe by rooting the phone and faking the interface, or ...?
Where I see the open source platforms lack is selective sync. That thing that dropbox and OneDrive etc do where they have a folder with only placeholders for your files, and only when you actually access one it gets downloaded and then keeps being synced.
The ones I've tried could only download once off via the web, or sync whole folders but not do the placeholder thing. That doesn't really work for me.
Given how integrated Drive and Docs are, if this doesn't have docs-like collaborative realtime document editing, for many people this is like "30% of Google Drive"
For people whose UX is dragging and dropping stuff to browser, and/or using a desktop sync client only, sure why not, the UI looks clean and familiar. But as someone who has used and still uses like 3 different similar things concurrently, the only real reason I use drive is because of the seamless zero-dependency office-like web software being part of the product.
(yes I know it's a curse too, I ended up writing a piece of software just to migrate company drive stuff to my personal drive when a company I was a cofounder in went bust to have a record ... those google docs can really only exist in Drive natively, any export is an immediate downgrade)
Do you really need a database for this? On a unix system, you should be able to: CRUD users, CRUD files and directories, grant permissions to files or directories
Is there a decade-old software that provides a UI or an API wrapper around these features for a "Google Drive" alternative? Maybe over the SAMBA protocol?
How would you implement things like version history or shareable URLs to files without a database?
Another issue would be permissions: if I wanted to restrict access to a file to a subset of users, I’d have to make a group for that subset. On Linux a user can belong to at most 65536 supplementary groups, a limit that could quickly be exhausted for a nontrivial number of users.
There is no support for writing multiple xattrs in one transaction.
There is no support for writing multiple xattrs and file contents in one transaction.
Journaled filesystems that immediately flush xattrs to the journal do have atomic writes of single xattrs, so you'd need to stuff all data in one xattr value and serialize/deserialize it (with e.g. JSON, or potentially Arrow IPC with Feather ~mmap'd from xattrs). Edit: but getxattr() doesn't support mmap, and there are xattr storage limits: ext4: 4K, XFS: 64K, Btrfs: 16K.
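A rough sketch of the "one xattr value, serialized as JSON" idea, in Python. The attribute name `user.drive.meta` and the function names are invented for illustration, the size limits are simply the figures quoted above taken at face value, and `os.setxattr`/`os.getxattr` are Linux-only.

```python
import json
import os

# Per-value limits as quoted in the comment above (assumed, not verified here).
XATTR_VALUE_LIMITS = {"ext4": 4 * 1024, "btrfs": 16 * 1024, "xfs": 64 * 1024}

# Hypothetical attribute name; the "user." namespace is the unprivileged one.
META_ATTR = "user.drive.meta"


def pack_metadata(meta: dict, fs: str = "ext4") -> bytes:
    """Serialize all metadata into one compact JSON blob that fits one xattr."""
    blob = json.dumps(meta, sort_keys=True, separators=(",", ":")).encode("utf-8")
    limit = XATTR_VALUE_LIMITS[fs]
    if len(blob) > limit:
        raise ValueError(f"{len(blob)} bytes exceeds the {limit}-byte limit for {fs}")
    return blob


def write_metadata(path: str, meta: dict, fs: str = "ext4") -> None:
    # A single setxattr call replaces the whole value, so readers see either
    # the old blob or the new one, never a mix of separately written fields.
    os.setxattr(path, META_ATTR, pack_metadata(meta, fs))


def read_metadata(path: str) -> dict:
    return json.loads(os.getxattr(path, META_ATTR))
```

This still does nothing for the second problem listed above: the file contents and the metadata blob are written by separate calls, so they cannot change together in one transaction.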
Well sure there’s a bevy of features you’re missing out on, but it would work. Object store and file metadata solves both of those though feels like cheating.
> Filesystem or LVM snapshots immediately come to mind
I use ZFS snapshots and like them a lot for many reasons. But I don’t have any way to quickly see individual versions of a file without having to wade through a lot of snapshots where the file is the same because snapshots are at filesystem level (or more specifically in ZFS, at “dataset” level which is somewhat like a partition).
And also, because I snapshot at set intervals, there might be a version of a file that I wanted to go back to but which I don’t have a snapshot of at that exact moment. So I only have history of what the file was a bit earlier or a bit later than some specific moment.
I used to have snapshots automatically trigger every 2 minutes and snapshot cleanup automatically trigger hourly, daily, weekly and monthly. In that setup there was a fairly high chance that if I made some mistake with an edit to a file, I also had a version of it that kept the edits from right before, as long as I discovered the mistake right away.
These days I snapshot automatically a couple of times per day and cleanup every few months with a few keystrokes. Mainly because at the moment the files I store on the servers don’t need that fine-grained snapshots.
Anyway, the point is that even if you snapshot frequently it’s not going to be particularly ergonomic to find the version you want. So maybe the “Google Drive” UI would also have to check each revision to see if they were actually modified and only show those that were. And even then it might not be the greatest experience.
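The "only show revisions that were actually modified" step could look roughly like this in Python. It assumes you can pass in an ordered list of snapshot root directories (on ZFS these live under `.zfs/snapshot/<name>/` in the dataset's mountpoint); the function names are made up, and hashing every copy is the simplest approach, not the cheapest.

```python
import hashlib
from pathlib import Path
from typing import Iterable, List, Optional


def file_digest(path: Path) -> str:
    """SHA-256 of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def distinct_versions(snapshot_roots: Iterable[Path], relative_path: str) -> List[Path]:
    """Return one path per change of content, given roots ordered oldest first.

    A snapshot is listed only if the file differs from the previous snapshot
    that contained it, so long runs of identical copies collapse to one entry.
    """
    versions: List[Path] = []
    previous: Optional[str] = None
    for root in snapshot_roots:
        candidate = root / relative_path
        if not candidate.is_file():
            # File missing in this snapshot: treat a later reappearance
            # as a new version even if the content matches.
            previous = None
            continue
        digest = file_digest(candidate)
        if digest != previous:
            versions.append(candidate)
        previous = digest
    return versions
```

Comparing size and mtime before hashing would cut the I/O a lot, at the cost of trusting timestamps.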
If you are on windows with a Samba share hooked up to zfs you can actually use the "previous versions" in file explorer for a given folder and your snapshots will show up :) there are some guides online on setting it up
With no command line use needed, you can:
Navigate the entire filesystem,
Create, delete, and rename files,
Edit file contents,
Edit file ownership and permissions,
Create symbolic links to files and directories,
Reorganize files through cut, copy, and paste,
Upload files by dragging and dropping,
Download files and directories.
I have no idea how this project was designed, but a) it's to be expected that disk operations can and should be cached, b) syncing file shares across multiple nodes can easily involve storing metadata.
For either case, once you realize you need to persist data then you'd be hard pressed to justify not using a database.
I don't know of one. I have thought about this before, but with Python and fsspec. Having a Google Drive style interface that can run on local files, or any filesystem of your choice (SSH, S3, etc.), would be really great.
I'm unironically convinced that a basic Samba share with Active Directory ACLs is actually probably the best possible storage system...but the UI for managing permissions sucks, and most people don't have enough access to set it up the way they want.
Like broadly, for all configuration Hashicorp Vault makes you do, you can achieve a much more useful set of permissions with a Samba fileshare and ACLs (certainly it makes it easy to grant targeted access to specific resources - and with IIS and Kerberos you even have an HTTP API).
I need to remind you that the days when a service's tenant (be it for files, email, or whatever else) automatically had an OS user account are also decades behind us.
I do, password-protected of course. It is the only "native" way I found to get server files access to my iPhone without downloading a third party app (via Files).
I really hope you lock it down to something like Tailscale so that you have a private area network and your Samba share isn’t open to the entire world.
Samba is a complicated piece of software built around protocols from the 90s. It’s designed around the old idea of physical network security where it’s isolated on a LAN and has a long long history of serious critical security vulnerabilities (eg here’s an RCE from this month https://cybersecuritynews.com/critical-samba-rce-vulnerabili...).
It seems like every network filesystem is irredeemably terrible. SMB and NFS are the stuff of security nightmares, chatty performance issues, and awkward user ID mapping. WebDAV is a joke. SSHFS is slow. You can get really crazy with CephFS or GlusterFS, and for all that complexity, you don't get much farther away from the SMB/NFS issues with those either.
Well one problem is that filesystem in general is a terrible abstraction both in terms of usability and in terms of not fitting well with how you design network applications.
I’d say Dropbox et al. are closer to a good design, but their backend is proprietary and insanely optimized to make it work. There’s an added challenge that everything these days is behind a NAT, so you usually end up needing a central rendezvous server where nodes can find each other.
Since you’re looking at rsync where you want something closer to Dropbox, I’d say look at syncthing. It’s designed in a way to make personal file sharing secure.
... well, it makes sense to be able to do a "join" with the `users` and `documents` collections, use the full expressive range of an aggregation pipeline (and it's easy to add additional indices to MongoDB collections, and have transactions, and even add replication - not easy with a generic filesystem)
put all kinds of versioned metadata on docs without coming up with strange encodings, and even though POSIX (and NodeJS) offers a lot of FS related features it probably makes sense to keep things reeeeally simple
With SAMBA you just get boring old authentication, but with SCP you need to file a Form-72B with Site Command, ensure all new users pass a Class-3 memetic hazard screening, and then hope that the account doesn't escape containment and start replicating across subnets.
Sure, it's more overhead, but you can't put a price on preventing your NAS from developing sentience.
I would say that basically all these software options use a database for things like preferences and user management.
Using a database isn’t some kind of heavy-handed horrendous thing depending on the implementation (e.g., as long as it leaves your content files alone).
+1 for Syncthing. I've been running it for years, after my student discount for Dropbox expired (Google drive and OneDrive were just getting traction at the time).
The mobile experience last I tried was pretty rough though. I don't really need my files on my phone and I have a web interface on my home server I can use to grab them in a pinch, but it's something to keep in mind.
Syncthing is great, but no good for mobile devices if you want to store and access a lot of large files - it syncs everything, and last I checked, the features to prevent that were deprecated.
I built something similar years ago. These are terribly hard to build, so I did a bit of digging.
1: This appears to be backed by a French company called Linagora. I don't know much about the company, but they've been around for a bit.
2: I experimented with MongoDB for the similar product, and it turned out to be very unreliable. A lot may have changed since I used MongoDB, but in general, I'm wary of any product that uses it unless there's an expectation that data is lossy.
(Which was the problem MongoDB had at the time: Their CTO only wanted to target lossy data use cases, but the people interested in using MongoDB wanted a database that was easier to use than SQL.)
I’ve had similar warnings from multiple very senior devs to never go near mongo. So better explain that choice if you’re wanting adoption. Reliability was the concern.
At the time (2010), MongoDB was intended (from the creators) for handling high volumes of data where some loss was tolerable.
What happened was that its document model, and flexible index model, made it very attractive as an easy-to-use database. I used to call it the "Visual Basic" of databases.
I think the less technical people in marketing latched on to how a lot of people found MongoDB easier to work with, and there was a lot of selling to people who it shouldn't have been sold to.
The problem was that the lossy nature of MongoDB didn't rear its ugly head until deep in a project, and the assumptions made when writing documents led to situations where operations required changing multiple documents, or to other corner cases that triggered loss in larger schemas.
Of course, if you used MongoDB as intended, which was for ingesting lots of data with some tolerance of loss, you were totally fine.
I thought the same once, but apparently some of my friends literally do not own a PC. Only tablets or phones, no USB-A in the house except maybe in TV. Oh well, time for USB-C pendrives.
Surely you jest. I love USB sticks. But they are not a proper alternative to cloud storage. For example, how do I share select files/folders with select people, in other countries?
Until you lose it, break it, damage it accidentally (via high humidity, high heat, etc). Arguably, if you run Twake on some VPS, you have additional layers of redundancy by default.
With so much surveillance I think there's a real need for E2E on anything. I just bought the basic Tutanota package - but maybe that's just my OCD acting out.
I know this probably goes against hn ethos, but one of my most important features is the search. I store TB of data and it could be hard to find a picture. I want the cloud software to analyze the image so that I can search "2 people on Nothing street" and find it.
so far google is amazing at search. hopefully others will be better, but it's really hard to evaluate cloud software based on that
> Looks like there is a single commit where a majority of the code came from.
I do this all the time, right before open sourcing a project. Basically while it's private, commit quality can be a bit rough, and if I want to open source it, I'll remove .git, make a new init commit then open source it. No one needs to see what I do in my private abode :)
The history of the development since its beginning can help a lot in studying the code, so I encourage people to avoid the single commit as much as possible.
It's much better to refactor (rebase) the messy commits, removing the personal or embarrassing stuff; although that might result in a "false" history, a series of smaller-sized commits will usually be much easier to follow than reading a whole code base all at once.
Really, I see a ton of open-source projects that do this, and it results in a lot more opacity and friction than necessary.
It results in fewer people being able to check the code and contribute to the project.
I promise you're not missing much, except some commits that are implementing something, reverting it, implementing it again slightly differently, fixing typos, replacing 80% of the codebase in one swoop and similar stupid and un-needed stuff.
If the project is from the get-go supposed to be a long-lived project (like professional development for a business) then I agree, don't smoke the entire history no matter how embarrassing it is.
But for my personal projects, I can let you know that having access to the git history before I made it FOSS will make you dumber rather than being helpful for anything, compared to one clean starting commit.
Why do you think it's embarrassing? The result is what reasonable people judge. And if you get to it through trial and error, well, that's how it's done almost every time. It's normal.
I don't? I said I remove it because it isn't useful to anyone, might even be adding more confusion than it solves, not because I'm embarrassed over anything.
If it really isn't useful, which I imagine means you committed somewhat haphazardly, ok, of course.
If there might be some usefulness hidden there (for example, trying something and then reverting it shows that you did explore it), it's also possible to place the old stuff in another repository or another branch (better the latter, unless it increases the repository's size too much)
> for example, trying something and then reverting it shows that you did explore it
True, those things tend to go into the documentation itself, checked into the codebase itself instead of being somewhat hidden inside the git history. Usually I end up having both a "Open Problems" (things yet to solve) and a "Tried X, this is why it didn't work" section somewhere in the documentation.
> it's also possible to place the old stuff in another repository
Yes, before the process I initially described, I usually leave a copy intact with the full-full history, but that's not what I published, just kept as an archive.
> > for example, trying something and then reverting it shows that you did explore it
>
> True, those things tend to go into the documentation itself, checked into the codebase itself instead of being somewhat hidden inside the git history. Usually I end up having both a "Open Problems" (things yet to solve) and a "Tried X, this is why it didn't work" section somewhere in the documentation.
That's good, and yes, if that repository history really wouldn't add anything it's fine to squash everything
> > it's also possible to place the old stuff in another repository
>
> Yes, before the process I initially described, I usually leave a copy intact with the full-full history, but that's not what I published, just kept as an archive.
They were originally working on a MS teams replacement, with a bunch of things in one app like teams. (I tried it back then, it was pretty green). Now it looks like they are focused on drive, chat and email. The old app seems deprecated, so I presume they forked it into some of this new stuff.
> If you have a US startup called X and you don't have x.com, you should probably change your name.
But they do own https://twake-drive.com/ already? What exactly is your point here? Either you misunderstand the linked article, or I do. But it seems people would be able to find it just fine if they search for it, as twake-drive.com comes up as the first result when I search for "Twake Drive".
Besides, Graham's articles are almost always geared towards startups in one way or another. This doesn't seem to be that, so not sure I'd even try to read it if I was the owner of Twake Drive.
The name is hard to convey. Try telling someone verbally how to find it without error: "Twake. No, not take - like Wake with a T, Twayke. T double you ay kay ee. Oh, and there's a hyphen in the domain. T-Wake hyphen Drive dot com."
Re: should they read it? Either you want your product to spread, or you don't.
If you're posting it on HN, you want to share it, and for it to be shared. A tough name makes it harder to share, so you have to decide if you really want your product to spread or not.
Yeah, Twake is a terrible name. Though, tbf, I wonder what the use case is for an open source cloud drive outside of pretty niche situations, especially when much of the cost is for the infrastructure anyway.
It's really not clear: they seem to show a mobile app (https://static.tildacdn.com/tild3536-3661-4363-b433-35353561...) but there are no links to app stores anywhere. Seems like they ended up on HN too early; maybe we should give them some time to get their stuff together.
In European countries where English isn't the first language, it's quite widespread among university students and people who have outgrown WhatsApp. It isn't very different from using a Discord group chat (and you lose less important stuff in Telegram). Admittedly part of it is the network effect around "grindset" jocks, but it isn't very different from using Discord, Meta Messenger, or Slack: just a freemium SaaS that the project doesn't host itself, so if "our server" is down, "theirs" may not be.
I'd say they are all the same: although Telegram's insecurity is proven, they're still equivalent overall for a FOSS project.
It's not so much about security, as FOSS conversation groups would be open to anyone anyway, but it's not a good look for a project to use a tool that is known to be quite shady while there are FOSS tools, or simply tools with a better reputation. Also, the project group seems to be French, not English-first-language, and Telegram is absolutely not well regarded in France, where it's used by no more than a few percent of people.
It beats WA on UI in most cases (especially on desktop), has an open source client, and has much better groups/channels for one-to-many and many-to-many communication. It has bot support like I've never seen on WA.
Russians is a bit narrow, but it is mostly the popular choice in eastern Europe. But when it comes to these messengers it differs based on location. There is KakaoChat in Korea, WeChat in China, Whatsapp in central Europe and South America, iMessage in the US, etc.
I use it over discord pretty frequently. The app UI is much simpler than discords and I've been able to get family to stick with using it because of that. Signal is my main way of communication, then telegram, then discord.
Nah, it is used by the likes of my 87yo mother who wouldn't recognise a crypto if it landed in a tree nearby and doesn't speak a word of Russian (she's Dutch). It is used by those who shun anything MetaFacebook and as such won't install Whatsapp. As such I have used it in the past but now mostly use my own XMPP server although I still have it installed on several devices to keep in touch with those who remain on the platform. I do know a few words of Russian but that is unrelated to my (mostly former) use of Telegram.
It doesn't make sense for an open-source group making open-source projects and strongly promoting open-source to use such a tool for group chat. There are plenty of open-source alternatives or alternatives with no such negative association.
edit: it looks even worse knowing that they have their own chat project: https://twake-chat.com that is powered by the Matrix protocol
Wow, they sure turned things around since their CEO allegedly last locked employees in their office so they'd be forced to keep working through the night.
since it's I/O heavy an async web-oriented stack (ie. NodeJS) makes sense, and then TS is an obvious improvement over raw JS, and if the frontend is also JS/TS then at least there's some chance that expertise can be shared
The problem is such systems are also CPU heavy, with extensive hashing, encryption, and really quite a lot of general paperwork, and as such, a system that can efficiently use multiple CPUs is really important. I guarantee that plenty of Twake installs are absolutely spending a ton of time blocked on CPU, both because of the lack of multithreading and the general 10x-slower-than-C you can expect from JavaScript on general code.
Javascript was a poor choice that will hold the project back just as choosing PHP for the base has done and continues to do a lot of damage to NextCloud/OwnCloud. This is not a task for a scripting language, because they're disqualified on performance. It's also not a task for dynamic typing, and using Typescript can help with that, but it doesn't change the fact that Javascript is just generally slow and does not play well on multiple CPUs.
This soundbite really needs to go away. It and its counterexamples don't apply in any significant measure. You can pay and still be the product, and that is often the case.
Lots of talk about must‑have features and backups here...
BUT there's another piece that makes or breaks these tools... whether they can build a community around them and stick around for years...
Open‑source cloud storage projects come and go when maintainers burn out... a sustainable business model or strong contributor base matters as much as technical checklists...
ALSO interoperability is underrated... if your drive can speak WebDAV or S3 and plug into existing identity systems, teams are more likely to try it...
In the end people want something that won't vanish after the honeymoon... that's harder than adding a progress bar...
Is that a weakness of the tool's organizational model?
I don't want to be part of a community around my cloud storage. I want it to work and I want to think about it as little as possible.
I use Syncthing and it does a pretty great job at this, no one ever insisted I need to join a Syncthing community, yet it keeps on working.
I don't pay a dime for Syncthing but I'm vaguely aware that they're linked to a company called Kastelo which provides enterprise support for Syncthing deployments. Probably a lot of Syncthing development is paid for that way.
Incidentally I founded an open source consulting company that's totally unrelated to cloud storage. We have enterprise as well as smaller contracts. We develop some addons in-house and the bigger enterprise contracts tend to subsidize most of the work that goes into them. We haven't asked anyone to be part of a community and I don't think we need to.
Communities are nice, but if you want your software to last I think a good business model and a good marketing strategy are a better bet. Bonus, you can quit your day job.
For a business headed open source project, it's still about the community. In this case, the business tends to take a defining and often controlling role in the community. This has plusses and minuses. On the plus side, if a business has a vested financial interest in the project, there is financial incentive for continuity. On the minus, when the company's financial interest no longer aligns with the community, many of us retain scars from rug pulling and switcheroos.
So understanding the long term stability of a community is more than just checking whether there is a company backing the project. It is important to analyze the nature and diversity of interest. I think it's just as important that there exists a larger community that the business depends on for extra feature + bugfixing work which is capable of forking. When this is possible, it is much less likely to be necessary.
Indeed. "S3 compatible" is the state of the art for object storage imho. As long as you can talk to a storage system that supports the basic S3 primitives, longevity is improved and there is no lock-in. You can use S3 proper, Backblaze B2, Wasabi, local storage exposing an S3 API, etc. Any replacement is mostly drop-in, assuming it can read, scan, and index existing objects.
Edit: @n3t heard wrt to the turn of phrase
https://en.wikipedia.org/wiki/State_of_the_art
Ya, something like "table stakes" would be closer to what they mean.
> BUT there's another piece that makes or breaks these tools... whether they can build a community around them and stick around for years...
Why? Who cares? If the tool solves the problem, you need a community to maintain it. And that's it.
I have a miniPC (Minisforum) with Debian as server.
I use:
- Syncthing (https://syncthing.net/) to keep files synchronized between desktop and laptop computers
- WebDAV (https://github.com/hacdias/webdav) to access the files on the server from other applications
- Cryptomator (https://cryptomator.org/) to encrypt/decrypt sensitive directories (which are synchronized through Syncthing); Cryptomator also lets me access the directories via WebDAV
- MaterialFiles on Android to access the files on the server
I access my mini server from outside with a Wireguard VPN created on my Fritz!Box router.
Between home and office I created a site-to-site Wireguard VPN between the two Fritz!Box routers.
Forgot to mention also SFTPGo : https://sftpgo.com/
Seafile is the only good-enough thing I've found so far for self-hosted file sync. But it is still a pain to upgrade the server version. Nextcloud and friends are a complete disaster in my opinion.
> Nextcloud and friends are a complete disaster in my opinion.
Why is that? I have been using Nextcloud at our company and for myself, and I couldn't be happier: no issues in 3 years, all the tools and plugins I need, and sync running perfectly, hassle-free, and performant. I thought it was generally liked up until now. I haven't tried any of the alternatives though, so they might indeed be better. I don't have any reason to try them tbh, as NC works almost too well.
Using Nextcloud on the web feels like a state of the art 2015 PHP web UI. It is... fine. But compare it to immich for example and they're just not playing in the same league imo
I've been using Nextcloud for years, but I've never used the web UI. Windows desktop app for syncing my Documents folder, and Android app for synchronizing a few folders on my phone, as well as the "append only" upload of my photo reel that something like SyncThing doesn't support. Works great, never had any issues with Nextcloud. The real value is in the companion apps.
I use a cron job to back up Nextcloud to B2 and S3 Glacier.
Couple of questions on your backups:
How much storage do you use and how much does it cost?
Have you ever tried restoring from Glacier?
I use S3 glacier on Scaleway (European cloud provider). Storing about 3TB costs me about 7€ per month including VAT. I've had to restore 1TB in the past and it cost me around 10€. Not bad for a worst case scenario. They also do mini VPSes for 3€ per month with unlimited traffic. Really nice provider, I've used them for many years.
I've got 18TB on Glacier and it costs $2.80/day
I've only tested partial restores from Glacier since it is expensive. I've got a raidz2 array locally as insurance against having to restore from a backup.
An 18 TB NAS hard drive is about 320 USD. A 2-bay UniFi UNAS 2 is 199. It pays off in one year. Restoring data from it is free.
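For what it's worth, the break-even claim can be checked with the figures quoted in this thread ($2.80/day for Glacier, $320 for the drive, $199 for the enclosure). This is only a rough sketch: it deliberately ignores electricity, redundancy, off-site copies, and restore fees, which are my simplifying assumptions rather than anything the posters stated.

```python
import math

# Figures quoted in the thread, held in cents to avoid float rounding.
GLACIER_CENTS_PER_DAY = 280   # $2.80/day for 18 TB on Glacier
DRIVE_CENTS = 320_00          # one 18 TB NAS drive
ENCLOSURE_CENTS = 199_00      # 2-bay enclosure


def break_even_days(hardware_cents: int, cloud_cents_per_day: int) -> int:
    """Days of cloud spend needed to match the one-off hardware cost."""
    return math.ceil(hardware_cents / cloud_cents_per_day)


hardware = DRIVE_CENTS + ENCLOSURE_CENTS
days = break_even_days(hardware, GLACIER_CENTS_PER_DAY)
yearly_cloud = GLACIER_CENTS_PER_DAY * 365

print(f"hardware: ${hardware / 100:.2f}")
print(f"cloud per year: ${yearly_cloud / 100:.2f}")
print(f"break-even after {days} days")
```

With a single drive the hardware is paid off in roughly half a year; adding a second drive for a mirror (another $320) pushes that to about 300 days, which is still inside the "one year" estimate.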
> An 18 TB NAS hard drive is about 320 USD. A 2-bay UniFi UNAS 2 is 199. It pays off in one year. Restoring data from it is free.
That single 18TB HD is hardly safe from a disaster or even plain old hardware death, and it's a single point of failure. You need at least 3 times as many HDs to start to have something you can actually rely on to keep your data for 3-5years.
It's potentially not off-site though, a house fire and it's gone
Trusted friend or family in another disaster zone is the model here.
100%. Though their UI has been updated a little with the last major release.
> But compare it to immich for example and they're just not playing in the same league imo
I mean, this doesn't make sense at all, tbf. They're literally not in the same league, as they're targeting different use cases. Nextcloud offers a MUCH broader experience, while Immich has a very clear-cut focus and does nothing outside of that. Comparing them doesn't make any sense, unless you're talking about the UI exclusively. Then, yes, Immich feels much more modern and smooth.
So what's the immich equivalent for file sharing
You might be interested in Peergos (https://peergos.org) - creator here.
An old write up is here: https://itsfoss.com/peergos/
There's a lot of weird setup often required on the backend in my experience, but when it works, it works well. Until you get everything dialed in, though, it can have weird issues that don't have a clear path to fixing them.
It might be better in their weird AIO solution? But I don't like the idea of giving a Docker container the ability to spawn more containers. I just use one of their normal Docker containers and have had to manually change a lot to make it work as they actually suggest. Just recently I set up their notify_push plugin as it improves performance, but the provided setup instructions didn't work in my setup and I had to manually tweak several things.
It took a while for me to fully set up Nextcloud with STUN/TURN, Office server, etc. in a properly containerised setup. It clearly felt like it was built before containers and modern devops approaches were a best practice.
And while community is great, I don't think Nextcloud developer community is that big and active. Their plugin system is basic, archaic, lots of things there are begging for rework.
So while Nextcloud is decent once set up, I am happy to see some fresh OSS projects solving similar issues appear. Maybe their approach will be better.
Resilio is also pretty good, depending on your use case. (Syncthing is great too, but Resilio seems faster and better at NAT traversal in my experience.)
I recently got into self-hosting Seafile and successfully set it up on my dedicated server. Had to think backup and security strategies quite a bit and ultimately I set up a bulletproof backup mechanism. Tested it pretty rigorously.
Seafile took me by surprise in terms of how quick it was at picking up new files and changes - syncing works incredibly well too. I moved all my files from my Google Drive into my Seafile instance and I'm now using it on all my devices as my main cloud storage solution.
Nextcloud suffers from its flexibility: it's got a lot to offer but requires dialling in to your specific use case. The mistake most admins make is to assume you can just run it without tuning; it has too many differing options to do that smoothly out of the box.
The ability to just run it in a snap has really contributed to this imho, Nextcloud is enterprise software you just happen to be able to run in your homelab.
Running the docker variant, it is trivially easy, you just bump the tag version.
Been a user for around a decade. It is really great. Nextcloud was choking on large repos back in the day and it requires a beefier machine.
I first ran Seafile on a cheap ARM board with 2GB RAM and a 2-core CPU.
Same journey. I’ve been using seafile with seadrive and a subst to S: for years with very good effect.
running Nextcloud AIO has been reliable for me for a couple of years now.
As others have asked, how does it compare with Nextcloud/ownCloud? And does it have native clients for the usual suspects? Windows/Mac/Mobile...
I desperately want to be a fan of ownCloud, because it offers clients natively across Mac/Linux/mobile, but it’s such a mess. Every platform has small bugs and reliability problems that make the whole thing useless.
I tried to install nextcloud once, and it was an exercise in misery.
If you just need a web interface to your filesystem, there’s this single Go executable (https://github.com/filebrowser/filebrowser) that supports sharing and minimal user management.
+1 have deployed thousands of instances of filebrowser without any issues, hidden behind an oauth-proxy.
thousands...?
I couldn't get past installing required PHP extensions, as my hosting provider doesn't allow for that.
Overall it's no WordPress instance that works everywhere.
sudo snap install nextcloud
That’s all!
Auto updates and I can bet it will not break.
Snap isn't the best experience for Nextcloud in my experience, fine for a demo or a single user instance that isn't mission critical. Users who expect more out of it will often bump up against its limitations.
Anyone who wants to seriously use Nextcloud should look into the AIO docker containers or rolling the individual containers themselves. Nextcloud has expanded into a full groupware stack and it's expected you have an actual admin managing the system like with any real deployment of enterprise software
It includes most of the essential features, and I’d say it’s excellent for professional use. I’ve been running an instance for many years on a VPS for work collaboration, and it’s been perfect. It’s now hosted behind Cloudflare Tunnels, with group members whitelisted by email.
If you need more advanced or fancy/niche features, AIO might be a better though heavier fit (I run an instance of AIO at home, mostly for testing). Snap is lightweight and a bit opinionated (in reasonable ways in my view), and the documentation used to mention some of its limitations. In exchange, you get snappier, more robust installation.
Is this a joke?
There's lots more to hosting your own file share/sync tool than just standing it up.
No, it was serious!
He complained about the difficulty of installing an application. He didn’t complain about establishing a personal data center.
That one line will give you Nextcloud. Exactly one more line in snap will give you a self-signed cert. Alternatively, the line below will give you remote access, a domain, and a valid certificate for your application:
curl -fsSL https://tailscale.com/install.sh | sh
You will have a functioning personal Drive on a VPS or a computer at this point!
Toggle snapshots on VPS for backups.
Setting up services with public clouds also takes some steps.
It seems reasonable that someone would want to go beyond just installing software; they are presumably doing so in order to use it for its purpose. Being pedantic about the nature of the complaint (i.e. "He complained about the difficulty of installing an application. He didn’t complain about...") seems to miss the point. All of the additional steps you lay out also have their own steps to get done or decisions to be made, and when it is all said and done, it seems reasonable to imagine that things could get quite complicated.
I mean if you want a working Nextcloud instance, available through VPN with backups, then no, it doesn't get more complicated than that, actually. It is incredibly easy.
When hand-waving away complexity, then yes, everything looks easy. :)
I've run a self-hosted Nextcloud instance for many years and Docker is by far the easiest. I started off with a native installation and that can be a pain when upgrading the OS (Ubuntu in my case). I tried the snap version when that became available and was impressed by how easy it was, though administering it required a bit of learning as the file locations were all different.
Running it in Docker made it so much easier to administer (maybe add in the missing db indexes if there's a major version change).
If you want, I can paste my docker-compose.yml for reference as it's relatively complex.
IME NextCloud is a bloated PHP monster with poor performance. Twake seems to be leaner and have a narrower scope.
Open source drive tools live or die on three things. 1) Simple sync that never surprises. 2) Clean conflict handling you can explain to a non tech friend. 3) And zero drama upgrades.
If Twake nails those and keeps a sane on prem story with S3 and LDAP, it has a shot. The harder part is trust and docs. Clear threat model. Crisp migration guides from Drive and Dropbox. And a tiny CLI that just works on a headless box. Do these and teams will try it for real work, not just weekend tests.
I'd add a fourth; "Make it easy to do backups and verify they're correct".
I don't think I've ever considered a data store without that being one of my top concerns. This anxiety comes from real-life experience where the business I worked at had backups enabled for the primary data store for years, but when something finally happened and we lost some production data, we quickly discovered that the backups weren't actually possible to restore from, and had been corrupted this whole time.
Heh - I once made a little chunk of change, because a former client from 10-years previous discovered the shiny "DVD/CD" backups had succumbed to "bit-rot" and needed some source code.
I grabbed the hard-drive off the shelf, put it in an enclosure and handed them the source-code... (At the time, every time I upgraded my system, I would just keep my old drives, so... had a stack of them - buy a new external enclosure, slot it and park it.)
Schrödinger's backup. Testing the backup works involves even more engineering and non creative work.
Depends. Even something basic like "Check if the produced artifact is a valid .zip/.tar.gz" can be enough in the beginning, probably would have prevented the issue I shared before.
Then once you grow/need higher reliability, you can start adding more advanced checks, like it has the tables/data structures you expect and so on.
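To make that first basic check concrete, here is a minimal sketch using only the Python standard library. The function name `archive_is_readable` is made up for illustration, and it only proves the archive can be read end to end, not that the contents are what you expect.

```python
import tarfile
import zipfile
import zlib
from pathlib import Path


def archive_is_readable(path: Path) -> bool:
    """Return True if the .zip or .tar(.gz) opens and every member reads fully."""
    try:
        if zipfile.is_zipfile(path):
            with zipfile.ZipFile(path) as zf:
                # testzip() reads every member and returns the first bad
                # file name, or None when all CRCs match.
                return zf.testzip() is None
        if tarfile.is_tarfile(path):
            with tarfile.open(path) as tf:
                for member in tf:
                    if member.isfile():
                        stream = tf.extractfile(member)
                        # Drain the member so truncation or corruption surfaces.
                        while stream.read(1 << 20):
                            pass
            return True
    except (tarfile.TarError, zipfile.BadZipFile, zlib.error, EOFError, OSError):
        return False
    return False  # Neither a zip nor a tar archive.
```

Running something like this right after the backup job, and alerting when it returns False, would be the cheapest version of "verify the backup"; an actual restore test still tells you more.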
I had a funny one where I somewhat regularly tested an SQL backup, then one day it didn't work. It worked the second time, the 3rd and the 4th. I have no idea why it didn't work. It turned into a permanent background process in the back of my head. The endless what-if loop.
I’m not sure what your point is. Business continuity requires a disaster recovery plan that must be tested regularly. It might be considered slog work, but like taking out the garbage, it’s non negotiable and must be done.
"Great, first you wanted more money to buy compute and storage for dev and staging separate from production, and now you want even more for 'testing backups'?!"
I'd like a manual "sync now" option. Sometimes I put stuff in google drive using windows explorer and it's not immediately obvious if it is syncing, why it is or isn't, or what I need to do to make it.
I've got a theory that progress bars for main functionality tasks and the associated manual triggers in modern software are out of favor, as it creates a stage for an error to be displayed and creates expectations the customer can lean on. Less detail in errors displayed to the customer removes their ability to identify a software problem as unique or shared among others.
"Something went wrong!"
I think you're right and I think I insufficiently considered malice as the reason for a lot of this type of minimalism. This "SWW" message is great as it doesn't even give a hint as to whether the problem is with the server (all vendor's fault), the network (not vendor's fault), or a client fault (maybe vendor's fault, maybe customer just needs to update it). Users can just do brute force things like "Swipe up the app and open it up again" and eventually just give up.
Syncing should be in the control of users. user should be able to trigger or abort the sync. Also it should provide some sort of indicator of progress.
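A rough sketch of what that could look like in a client: a sync loop the user starts explicitly, that reports progress after each file and checks an abort flag between files. All names here (`run_sync`, `transfer`, `on_progress`) are hypothetical and not taken from any real sync client.

```python
import threading
from typing import Callable, Iterable


def run_sync(
    files: Iterable[str],
    transfer: Callable[[str], None],
    on_progress: Callable[[int, int, str], None],
    cancel: threading.Event,
) -> str:
    """Transfer files one by one, reporting progress and honouring cancel."""
    pending = list(files)
    for done, name in enumerate(pending, start=1):
        if cancel.is_set():
            return "aborted"  # The user hit "stop" between two files.
        try:
            transfer(name)
        except OSError as exc:
            # Say which file failed and why, instead of "Something went wrong!".
            return f"failed on {name}: {exc}"
        on_progress(done, len(pending), name)
    return "complete"
```

A "sync now" button would call `run_sync` on a worker thread, and an "abort" button would call `cancel.set()`. The returned string is the minimum a user needs to tell a finished sync from a stopped or broken one.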
58.9% TypeScript and 32.6% JavaScript wouldn't be my first preference to implement such a high performance and throughput demanding application? Why is that?
> 58.9% TypeScript and 32.6% JavaScript
Isn't that just 91.5% JavaScript?
TypeScript is not real.
Almost, but not entirely, unlike birds
This app is I/O bound, and in that case it doesn't matter that it runs on TS/JS.
Maybe ask all the startups looking to scale their TS/JS microservices "stack" using event driven architecture.
It appears that the backend is written in TS, while the frontend in JS.
Personally I separate church and state by writing tests in JS and application code in TS.
If you're asking why these languages at all when this and that other language is faster, most likely it's less of a bottleneck than estimated.
Zero percent chance I will ever trust my critical data to a mongo-backed service, personally.
With clients some of them have already made this bad decision; with my own personal files I get to avoid it.
Mongodb used to suck. We use it at work for critical systems, it’s been rock solid for 3+ years.
You can always use FerretDB instead.
Isn't Mongo source available too? So it sort of seems to contradict the mission of this organization to use it.
My first inclination too tbh.
And then I saw Npm references and thought “in JavaScript?!” But at least it’s typescript.
You lose JS but at least you get to keep the supply chain risks.
why? since WiredTiger is the default storage engine it works
Someone else also shared a similar experience, so it seems true that we should avoid storing critical data in MongoDB
https://news.ycombinator.com/item?id=45694376
A bit off-topic, but is there a way I can convince various apps (Viber, WhatsApp) to use some replacement instead of Google Drive for backup? They do not offer such an option, but maybe by rooting the phone and faking the interface, or ...?
On Android isn't it "just" a share-target? You can make a PWA that's a share-target pretty easily.
Message backup is a more complete integration with google drive / iCloud
Where I see the open source platforms lack is selective sync. That thing that dropbox and OneDrive etc do where they have a folder with only placeholders for your files, and only when you actually access one it gets downloaded and then keeps being synced.
The ones I've tried could only download once off via the web, or sync whole folders but not do the placeholder thing. That doesn't really work for me.
Given how integrated Drive and Docs are, if this doesn't have docs-like collaborative realtime document editing, for many people this is like "30% of Google Drive"
For people whose UX is dragging and dropping stuff to browser, and/or using a desktop sync client only, sure why not, the UI looks clean and familiar. But as someone who has used and still uses like 3 different similar things concurrently, the only real reason I use drive is because of the seamless zero-dependency office-like web software being part of the product.
(yes I know it's a curse too, I ended up writing a piece of software just to migrate company drive stuff to my personal drive when a company I was a cofounder in went bust to have a record ... those google docs can really only exist in Drive natively, any export is an immediate downgrade)
Twake includes OnlyOffice, which has collaborative realtime document editing for Docs, Sheets, Slides, etc.
Do you really need a database for this? On a unix system, you should be able to: CRUD users, CRUD files and directories, grant permissions to files or directories
Is there a decade-old software that provides a UI or an API wrapper around these features for a "Google Drive" alternative? Maybe over the SAMBA protocol?
How would you implement things like version history or shareable URLs to files without a database?
Another issue would be permissions: if I wanted to restrict access to a file to a subset of users, I’d have to make a group for that subset. Linux supports a maximum of 65536 groups, which could quickly be exhausted for a nontrivial number of users.
As for the permissions, using ACLs would work better here. Then you don't need a separate group for every grouping.
TIL about ACLs! I think that would nicely solve the group permission issue.
The final project for my senior year filesystems class thirty years ago was to implement ACLs on top of a SunOS 4 filesystem. That was a fun project.
Write up? Code? :D
Then let me also introduce you to extended attributes, aka xattrs. That's how the data for SELinux is stored.
There is no support for writing multiple xattrs in one transaction.
There is no support for writing multiple xattrs and file contents in one transaction.
Journaled filesystems that immediately flush xattrs to the journal do have atomic writes of single xattrs, so you'd need to stuff all data in one xattr value and serialize/deserialize it (with e.g. JSON, or potentially Arrow IPC with Feather ~mmap'd from xattrs). Edit: but getxattr() doesn't support mmap. And there are xattr storage limits: ext4: 4K, XFS: 64K, Btrfs: 16K.
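The "everything in one xattr value" idea could be sketched like this in Python. The attribute name `user.drive.meta` is made up, the 4096-byte cap is just the ext4 figure quoted above used as a conservative default, and `os.setxattr`/`os.getxattr` are Linux-only. Whether the single write is really atomic depends on the filesystem, as discussed.

```python
import json
import os

XATTR_NAME = "user.drive.meta"  # hypothetical attribute name
MAX_VALUE_BYTES = 4096          # assumption: the smallest limit quoted above


def pack_metadata(meta: dict) -> bytes:
    """Serialize all metadata into one compact JSON blob."""
    blob = json.dumps(meta, sort_keys=True, separators=(",", ":")).encode("utf-8")
    if len(blob) > MAX_VALUE_BYTES:
        raise ValueError(f"metadata is {len(blob)} bytes, limit is {MAX_VALUE_BYTES}")
    return blob


def unpack_metadata(blob: bytes) -> dict:
    """Inverse of pack_metadata."""
    return json.loads(blob.decode("utf-8"))


def write_metadata(path: str, meta: dict) -> None:
    # One setxattr call replaces the whole value, so readers never see
    # a mix of old and new fields within this attribute.
    os.setxattr(path, XATTR_NAME, pack_metadata(meta))


def read_metadata(path: str) -> dict:
    return unpack_metadata(os.getxattr(path, XATTR_NAME))
```

This still doesn't give you a transaction across the xattr and the file contents, which is the second limitation listed above.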
Atomicity (database systems) https://en.wikipedia.org/wiki/Atomicity_(database_systems)
Backup files the way Emacs, Vim,... do it: Consistent scheme for naming the copies. As for sharable URLs, they could be links.
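As a sketch of that naming scheme, here is the Emacs-style numbered backup convention (`file.~1~`, `file.~2~`, ...) in Python. The helper names are mine, and this is just one way to do it, with no locking against concurrent writers.

```python
import re
import shutil
from pathlib import Path
from typing import Optional


def next_backup_path(path: Path) -> Path:
    """Return path.~N~ where N is one more than the highest existing backup."""
    pattern = re.compile(re.escape(path.name) + r"\.~(\d+)~")
    numbers = [
        int(m.group(1))
        for sibling in path.parent.iterdir()
        if (m := pattern.fullmatch(sibling.name))
    ]
    return path.with_name(f"{path.name}.~{max(numbers, default=0) + 1}~")


def save_with_backup(path: Path, new_content: bytes) -> Optional[Path]:
    """Copy the current file to a numbered backup, then write the new content."""
    backup = None
    if path.exists():
        backup = next_backup_path(path)
        shutil.copy2(path, backup)  # copy2 also keeps the modification time
    path.write_bytes(new_content)
    return backup
```

Listing the `.~N~` siblings in numeric order then gives you the version history without any database, at the cost of one full copy per version.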
The file system is already a database.
Ok, this product will be for projects with fewer than 65k users.
For naming, just name the directory the same way on your file system.
Shareable urls can be a hash of the path with some kind of hmac to prevent scraping.
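A minimal sketch of the HMAC idea, assuming a server-side secret and a made-up `/share` route. I've also signed an expiry timestamp along with the path, which is my addition and not part of the suggestion above.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import urlencode

SECRET_KEY = b"replace-with-a-random-server-side-secret"  # hypothetical


def sign(path: str, expires_at: int, key: bytes = SECRET_KEY) -> str:
    """HMAC-SHA256 over the expiry and the path, as a hex string."""
    message = f"{expires_at}:{path}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def make_share_url(path: str, ttl_seconds: int, now: Optional[int] = None) -> str:
    """Build a share link that is valid for ttl_seconds."""
    issued = int(time.time()) if now is None else now
    expires_at = issued + ttl_seconds
    query = urlencode({"path": path, "expires": expires_at, "sig": sign(path, expires_at)})
    return f"/share?{query}"


def verify(path: str, expires_at: int, sig: str, now: Optional[int] = None) -> bool:
    """Accept the link only if it has not expired and the signature matches."""
    current = int(time.time()) if now is None else now
    if current > expires_at:
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(sign(path, expires_at), sig)
```

Nobody without the secret can mint a valid link for another path, so scraping by guessing paths doesn't work. A link signed this way still breaks if the file is moved, since the path is part of the signed message.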
Yes if you move a file, you can create a symlink to preserve it.
Encode paths by algorithm/encryption?
This wouldn’t be robust to moving/renaming files. It also would preclude features like having an expiration date for the URL.
Well sure there’s a bevy of features you’re missing out on, but it would work. Object store and file metadata solves both of those though feels like cheating.
Use a symlink in that case to keep the redirect.
> How would you implement things like version history
Filesystem or LVM snapshots immediately come to mind
> or shareable URLs to files without a database?
Uh... is the path to the file not already an URL? URLs are literally an abstraction of a filesystem hierarchy already.
> Filesystem or LVM snapshots immediately come to mind
I use ZFS snapshots and like them a lot for many reasons. But I don’t have any way to quickly see individual versions of a file without having to wade through a lot of snapshots where the file is the same because snapshots are at filesystem level (or more specifically in ZFS, at “dataset” level which is somewhat like a partition).
And also, because I snapshot at set intervals, there might be a version of a file that I wanted to go back to but which I don’t have a snapshot of at that exact moment. So I only have history of what the file was a bit earlier or a bit later than some specific moment.
I used to have snapshots automatically trigger every 2 minutes and snapshot clean up automatically trigger hourly, daily, weekly and monthly. In that setup it was fairly high chance that if I make some mistake with an edit to a file I also had a version of it that kept the edits from right before as long as I discover the mistake right away.
These days I snapshot automatically a couple of times per day and cleanup every few months with a few keystrokes. Mainly because at the moment the files I store on the servers don’t need that fine-grained snapshots.
Anyway, the point is that even if you snapshot frequently it’s not going to be particularly ergonomic to find the version you want. So maybe the “Google Drive” UI would also have to check each revision to see if they were actually modified and only show those that were. And even then it might not be the greatest experience.
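A sketch of that "only show revisions that actually changed" step, in Python. It assumes you already have the path of the same file inside each snapshot, oldest first (on ZFS these live under `.zfs/snapshot/<name>/` in the dataset's mountpoint); the function names are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Iterable, List, Optional


def file_digest(path: Path) -> str:
    """SHA-256 of the file contents, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def distinct_versions(snapshot_copies: Iterable[Path]) -> List[Path]:
    """Keep only the copies whose content differs from the previous kept one.

    snapshot_copies must be ordered oldest snapshot first.
    """
    kept: List[Path] = []
    last: Optional[str] = None
    for copy in snapshot_copies:
        if not copy.exists():
            last = None  # The file was absent in this snapshot.
            continue
        digest = file_digest(copy)
        if digest != last:
            kept.append(copy)
            last = digest
    return kept
```

Hashing every copy is slow for big files; comparing size and modification time first and only hashing when those differ would be a cheaper variant. It also doesn't solve the other problem mentioned above: a version that was never captured by a snapshot simply isn't there.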
If you are on windows with a Samba share hooked up to zfs you can actually use the "previous versions" in file explorer for a given folder and your snapshots will show up :) there are some guides online on setting it up
Take a look at "cockpit", because if there were, that's where it "should" be.
https://cockpit-project.org/applications
--
> Do you really need a database for this?
I have no idea how this project was designed, but a) it's expectable that disk operations can and should be cached, b) syncing file shares across multiple nodes can easily involve storing metadata.
For either case, once you realize you need to persist data then you'd be hard pressed to justify not using a database.
I don't know of one. I've thought about this before, but with Python and fsspec. Having a Google Drive style interface that can run on local files, or any filesystem of your choice (SSH, S3, etc.), would be really great.
I'm unironically convinced that a basic Samba share with Active Directory ACLs is actually probably the best possible storage system...but the UI for managing permissions sucks, and most people don't have enough access to set it up the way they want.
Like broadly, for all configuration Hashicorp Vault makes you do, you can achieve a much more useful set of permissions with a Samba fileshare and ACLs (certainly it makes it easy to grant targeted access to specific resources - and with IIS and Kerberos you even have an HTTP API).
I need to point out that the days when a tenant of a service (be it file storage, email, or whatever else) automatically had an OS user account are also decades behind us.
Perhaps they are using MongoDB GridFS instead of storing files on disk.
You expose SAMBA shares outside your home network?
I do, password-protected of course. It is the only "native" way I found to access server files from my iPhone without downloading a third party app (via Files).
I really hope you lock it down to something like Tailscale so that you have a private area network and your Samba share isn’t open to the entire world.
Samba is a complicated piece of software built around protocols from the 90s. It’s designed around the old idea of physical network security where it’s isolated on a LAN and has a long long history of serious critical security vulnerabilities (eg here’s an RCE from this month https://cybersecuritynews.com/critical-samba-rce-vulnerabili...).
It seems like every network filesystem is irredeemably terrible. SMB and NFS are the stuff of security nightmares, chatty performance issues, and awkward user ID mapping. WebDAV is a joke. SSHFS is slow. You can get really crazy with CephFS or GlusterFS, and for all that complexity, you don't get much farther away from SMB/NFS issues with those either.
My solution: Share nothing and use rsync.
Well one problem is that filesystem in general is a terrible abstraction both in terms of usability and in terms of not fitting well with how you design network applications.
I’d say Dropbox et al. is closer to a good design but their backend is insanely crazy optimized to make it work and proprietary. There’s an added challenge that everything these days is behind a NAT so you usually end up needing to have a central rendezvous server where nodes can find each other.
Since you’re looking at rsync where you want something closer to Dropbox, I’d say look at syncthing. It’s designed in a way to make personal file sharing secure.
I think you should figure out how to quit while you're ahead. I wouldn't expose Samba to most of the devices on my LAN, never mind the internet.
Search for wannacry. You may rethink your setup.
... well, it makes sense to be able to do a "join" with the `users` and `documents` collections, use the full expressive range of an aggregation pipeline (and it's easy to add additional indices to MongoDB collections, and have transactions, and even add replication - not easy with a generic filesystem)
put all kinds of versioned metadata on docs without coming up with strange encodings, and even though POSIX (and NodeJS) offers a lot of FS related features it probably makes sense to keep things reeeeally simple
and it's easy to hack on this even on Windows
An SCP or FTP client maybe?
Definitely. Though SAMBA supports authentication natively. With SCP and SFTP you'll need another admin server to create users.
With SAMBA you just get boring old authentication, but with SCP you need to file a Form-72B with Site Command, ensure all new users pass a Class-3 memetic hazard screening, and then hope that the account doesn't escape containment and start replicating across subnets.
Sure, it's more overhead, but you can't put a price on preventing your NAS from developing sentience.
Can you name a single Google Drive clone that doesn’t use a database?
Would love to see your source code for your take on this product.
The Synology Drive version mirrors the filesystem, though I’m sure it has a database for sharing metadata. Is that what they mean?
Nextcloud too.
There is a database in most if not all useful cases, but there could also be the actual files separately.
I would say that basically all these software options use a database for things like preferences and user management.
Using a database isn’t some kind of heavy-handed horrendous thing depending on the implementation (e.g., as long as it leaves your content files alone).
Give syncthing a go.
+1 for Syncthing. I've been running it for years, after my student discount for Dropbox expired (Google drive and OneDrive were just getting traction at the time).
The mobile experience last I tried was pretty rough though. I don't really need my files on my phone and I have a web interface on my home server I can use to grab them in a pinch, but it's something to keep in mind.
If you’re on iOS, try my (FOSS) app for Syncthing: https://github.com/pixelspark/sushitrain
Why did you restrict it by country? I’m in the EU and can’t install it.
Unfortunately not available in my country (Malaysia)
Android unfortunately.
Cool app though!
Syncthing is easily the most effective FOSS I actively use. It just works and runs on everything.
Syncthing is great, but no good for mobile devices if you want to store and access a lot of large files: it syncs everything, and last I checked, the features to prevent that were deprecated.
I built something similar years ago. These are terribly hard to build, so I did a bit of digging.
1: This appears to be backed by a French company called Linagora. I don't know much about the company, but they've been around for a bit.
2: I experimented with MongoDB for the similar product, and it turned out to be very unreliable. A lot may have changed since I used MongoDB, but in general, I'm wary of any product that uses it unless there's an expectation that data is lossy.
(Which was the problem MongoDB had at the time: Their CTO only wanted to target lossy data use cases, but the people interested in using MongoDB wanted a database that was easier to use than SQL.)
I’ve had similar warnings from multiple very senior devs to never go near mongo. So better explain that choice if you’re wanting adoption. Reliability was the concern.
At the time (2010), MongoDB was intended (from the creators) for handling high volumes of data where some loss was tolerable.
What happened was that its document model, and flexible index model, made it very attractive as an easy-to-use database. I used to call it the "Visual Basic" of databases.
I think the less technical people in marketing latched on to how a lot of people found MongoDB easier to work with, and there was a lot of selling to people who it shouldn't have been sold to.
The problem was that the lossy nature of MongoDB didn't rear its ugly head until deep in a project, and the assumptions made when writing documents led to situations where operations required changing multiple documents, or to other corner cases that triggered loss in larger schemas.
Of course, if you used MongoDB as intended, which was for ingesting lots of data with some tolerance of loss, you were totally fine.
It seems that Twake is the result of Cozy Cloud joining Linagora: https://blog.cozy.io/en/from-7-july-your-cozy-cloud-begins-i...
USB sticks, the alternative to the cloud.
USB sticks can fulfill part of the "2" in the 3-2-1 rule.
https://en.wikipedia.org/wiki/Backup#3-2-1_Backup_Rule
I thought the same once, but apparently some of my friends literally do not own a PC. Only tablets or phones, no USB-A in the house except maybe in TV. Oh well, time for USB-C pendrives.
Not sure how I can collaboratively edit documents thanks to a USB stick.
Surely you jest. I love USB sticks. But they are not a proper alternative to cloud storage. For example, how do I share select files/folders with select people, in other countries?
Copy selected files to another pendrive. Use postal service to ship pendrive to other person. We used to call that sneakernet :D
But yeah there's a reason people don't do this anymore
Until you lose it, break it, damage it accidentally (via high humidity, high heat, etc). Arguably, if you run twake on some VPS, you have additional layers of redundancy by default.
You mean, like the dns of AWS in us-east-1? #OhWait
There's also https://cryptpad.fr/ - https://cryptpad.org/ - https://github.com/cryptpad/cryptpad
That looks great, thanks for sharing.
I would add to that list something like a splitwise alternative.
And open source too? Seems too good to be true.
I think you're looking for https://spliit.app/
I don't think that's end to end encrypted.
With so much surveillance I think there's a real need for E2E on anything. I just bought the basic Tutanota package - but maybe that's just my OCD acting out.
EDIT: This is closer, and you can self-host
https://github.com/cryptoboid/splitio
But it's in JavaScript <throw up> can't win them all.
Do you feel you need E2E even when you're self hosting?
https://github.com/spliit-app/spliit
I don't want to self host though. That's like giving myself a job.
I always use https://ihatemoney.org/
I know this probably goes against the HN ethos, but one of my most important features is search. I store TBs of data and it can be hard to find a picture. I want the cloud software to analyze the image so that I can search "2 people on Nothing street" and find it.
So far Google is amazing at search. Hopefully others will get better, but it's really hard to evaluate cloud software based on that.
Immich can do that
Is this a fork of something? Or recently open sourced? Looks like there is a single commit where a majority of the code came from.
> Looks like there is a single commit where a majority of the code came from.
I do this all the time, right before open sourcing a project. Basically while it's private, commit quality can be a bit rough, and if I want to open source it, I'll remove .git, make a new init commit then open source it. No one needs to see what I do in my private abode :)
The history of the development since its beginning can help a lot in studying the code, so I encourage people to avoid the single commit as much as possible.
It's much better to refactor (rebase) the messy commits, removing the personal or embarrassing stuff; although that might result in a "false" history, a series of smaller-sized commits will usually be much easier to follow than reading a whole code base all at once.
Really, I see a ton of open-source projects that do this, and it results in a lot more opacity and friction than necessary.
It results in fewer people being able to check the code and contribute to the project.
I promise you're not missing much, except some commits that are implementing something, reverting it, implementing it again slightly differently, fixing typos, replacing 80% of the codebase in one swoop and similar stupid and un-needed stuff.
If the project is from the get-go supposed to be a long-lived project (like professional development for a business) then I agree, don't smoke the entire history no matter how embarrassing it is.
But for my personal projects, I can let you know that having access to the git history before I made it FOSS will make you dumber rather than being helpful for anything, compared to one clean starting commit.
Why do you think it's embarrassing? The result is what reasonable people judge. And if you get to it through trial and error, well, that's how it's done almost every time. It's normal
> Why do you think it's embarrassing?
I don't? I said I remove it because it isn't useful to anyone, might even be adding more confusion than it solves, not because I'm embarrassed over anything.
If it really isn't useful, which I imagine means you committed somewhat haphazardly, ok, of course.
If there might be some usefulness hidden there (for example, trying something and then reverting it shows that you did explore it), it's also possible to place the old stuff in another repository or another branch (better the latter, unless it increases the repository's size too much)
> for example, trying something and then reverting it shows that you did explore it
True, those things tend to go into the documentation itself, checked into the codebase itself instead of being somewhat hidden inside the git history. Usually I end up having both a "Open Problems" (things yet to solve) and a "Tried X, this is why it didn't work" section somewhere in the documentation.
> it's also possible to place the old stuff in another repository
Yes, before the process I initially described, I usually leave a copy intact with the full-full history, but that's not what I published, just kept as an archive.
> > for example, trying something and then reverting it shows that you did explore it
> > True, those things tend to go into the documentation itself, checked into the codebase itself instead of being somewhat hidden inside the git history. Usually I end up having both a "Open Problems" (things yet to solve) and a "Tried X, this is why it didn't work" section somewhere in the documentation.
That's good, and yes, if that repository history really wouldn't add anything it's fine to squash everything
> > it's also possible to place the old stuff in another repository
> > Yes, before the process I initially described, I usually leave a copy intact with the full-full history, but that's not what I published, just kept as an archive.
Ok, I meant a public repository though
Ha! 100% agree! Lots of my commits have personal info even. Months or years of changes, I'd rather squash and then push publicly.
+1
They were originally working on a MS teams replacement, with a bunch of things in one app like teams. (I tried it back then, it was pretty green). Now it looks like they are focused on drive, chat and email. The old app seems deprecated, so I presume they forked it into some of this new stuff.
If you want to increase adoption, change the name: https://www.paulgraham.com/name.html
TDrive would work
> If you want to increase adoption, change the name: https://www.paulgraham.com/name.html
> If you have a US startup called X and you don't have x.com, you should probably change your name.
But they do own https://twake-drive.com/ already? What exactly is your point here? Either you misunderstand the linked article, or I do. But it seems people would be able to find that just fine if they search for it, as twake-drive.com comes up as the first result when I search for "Twake Drive".
Besides, Graham's articles are almost always geared towards startups in one way or another. This doesn't seem to be that, so not sure I'd even try to read it if I was the owner of Twake Drive.
The name is hard to convey. Try telling someone verbally how to find it without error: "Twake. No, not take - like Wake with a T, Twayke. T double you ay kay ee. Oh, and there's a hyphen in the domain. T-Wake hyphen Drive dot com."
Re: should they read it? Either you want your product to spread, or you don't.
If you're posting it on HN, you want to share it, and for it to be shared. A tough name makes it harder to share, so you have to decide if you really want your product to spread or not.
It can go wrong too.
You search that in Google with file sharing keywords and the AI will helpfully correct it to 'do you mean GDrive?'
They would've lost a prospective user to a competitor while sounding like a knockoff of some other product.
search engine "correction" to GDrive is a good point. Both Brave & Duck correct to GDrive, but Google finds a local "t-drive" product in ZA.
Where was this guy when Mr Newell was setting up store.steampowered.com? Imagine how much more successful they'd be if they went with steam.com instead
I don't think that advice has been relevant for a while now.
It's still relevant.
Yeah, Twake is a terrible name. Though, tbf, I wonder what the use case is for an open source cloud drive outside of pretty niche situations, especially when the cost, in many cases, is partly the infrastructure.
Why not use Deno instead of Node.js for the backend? For a product like this could the extra security that Deno's sandbox provides help?
You could also just run the node.js process via a `systemd` service and sandbox it that way using hardening directives.
Google safe browsing violation in 3...2...
Cool, who's the audience?
Does it have mobile clients?
It's really not clear: they seem to show a mobile app (https://static.tildacdn.com/tild3536-3661-4363-b433-35353561...) but there are no links to app stores anywhere. Seems like they ended up on HN too early; maybe we should give them some time to get their stuff together.
Missed an opportunity with the name Twake Dwive.
versus nextCloud ownCloud ?
yes :)
Dreaming of (and working on) making the ATProto PDS capable as a backend for authn/z and storage for ideas like this.
I've definitely been more motivated to de-cloud as the tech bros capitulate and push their AI way too hard
Why do we need another file sharing platform?
it is not a new one, it used to be called Cozy drive before.
[flagged]
Plenty of people throughout the world use Telegram. Their platform handles large groups quite well.
[flagged]
In European countries where English isn't the first language, it's quite widespread among university students and people who have outgrown WhatsApp; it isn't very different from using a Discord group chat (and you lose less important stuff in Telegram). Admittedly some of that is network effect among "grindset" jocks, but it isn't very different from using Discord, Meta Messenger or Slack: just a freemium SaaS that the project doesn't host itself, so if "our server" is down, "theirs" may not be. Even though Telegram's insecurity is proven, I'd say they are all much the same for a FOSS project.
It's not so much about security, as FOSS conversation groups would be open to anyone anyway, but it's not a good look for a project to use a tool that is known to be quite shady while there are FOSS tools, or simply tools with a better reputation. Also the project group seems to be French, not English-first-language, and Telegram is absolutely not well regarded in France, where no more than a few percent of people use it
How is it shady?
It beats WA on UI in most cases (especially on desktop), has an open source client, and much better groups/channels for one-to-many and many-to-many communications. It has bot support like I've never seen on WA.
"Russians" is a bit narrow, but it is mostly the popular choice in eastern Europe. But when it comes to these messengers it differs based on location. There is KakaoTalk in Korea, WeChat in China, WhatsApp in central Europe and South America, iMessage in the US, etc.
That definitely depends on where you are. In Germany it's quite common, and in other countries too.
I use it over Discord pretty frequently. The app UI is much simpler than Discord's and I've been able to get family to stick with using it because of that. Signal is my main way of communication, then Telegram, then Discord.
We used it as our family chat for many (10?) years. Only recently we moved to signal.
Discord is unprofessional and tiresome if you ask me; I regret that many projects I admire don't see it that way.
Nah, it is used by the likes of my 87yo mother who wouldn't recognise a crypto if it landed in a tree nearby and doesn't speak a word of Russian (she's Dutch). It is used by those who shun anything MetaFacebook and as such won't install Whatsapp. As such I have used it in the past but now mostly use my own XMPP server although I still have it installed on several devices to keep in touch with those who remain on the platform. I do know a few words of Russian but that is unrelated to my (mostly former) use of Telegram.
Furries use it too
It doesn't make sense for an open-source group making open-source projects and strongly promoting open-source to use such a tool for group chat. There are plenty of open-source alternatives or alternatives with no such negative association.
edit: it looks even worse knowing that they have their own chat project: https://twake-chat.com that is powered by the Matrix protocol
Projects like Valetudo, CoMaps, and Pixel Experience OS (to name a few) also rely on a Telegram group. Not everyone uses Matrix to build a community.
> LINAGORA is a French open-source software publisher, a pioneer in digital sovereignty, and a leader of ethical, user-centered solutions
> LINAGORA > ethical
Wow, they sure turned things around since their CEO allegedly last locked employees in their office so they'd be forced to keep working through the night.
Can you please provide a source, I am curious about this. I've exhausted my googling capabilities here...
I asked ChatGPT and it said the story was absolutely correct
[flagged]
In TypeScript, interesting. Not the obvious choice IMO but trying to keep an open mind.
Was that because of team expertise or particular aspects of TS you thought suited the domain?
since it's I/O heavy an async web-oriented stack (ie. NodeJS) makes sense, and then TS is an obvious improvement over raw JS, and if the frontend is also JS/TS then at least there's some chance that expertise can be shared
The problem is such systems are also CPU heavy, with extensive hashing, encryption, and really quite a lot of general paperwork, and as such, a system that can efficiently use multiple CPUs is really important. I guarantee that plenty of Twake installs are absolutely spending a ton of time blocked on CPU, both because of the lack of multithreading, and the general 10x-slower-than-C you can expect from JavaScript on general code.
Javascript was a poor choice that will hold the project back just as choosing PHP for the base has done and continues to do a lot of damage to NextCloud/OwnCloud. This is not a task for a scripting language, because they're disqualified on performance. It's also not a task for dynamic typing, and using Typescript can help with that, but it doesn't change the fact that Javascript is just generally slow and does not play well on multiple CPUs.
How does it make money? Couldn't find any "about us" page or explanation. As we all know, if it's free, you're the product.
This soundbite really needs to go away. It and its counterexamples don't apply in any significant measure. You can pay and still be the product, and that is often the case.
Open Source != Free, feels like the typical HN user should know this better than the average user.
FWIW, the people working on this project have Mission and Vision pages on their website: https://linagora.com/en/mission https://linagora.com/en/vision
Took me a whopping 17 seconds to find those two.
Damn bro, I didn’t know gcc had been exploiting me for all these years.
GCC was a psyop to destabilize the private compiler industry.
-Someone, surely
I'm pretty sure it reads your code, bro! Sus...
I’m not sure, but if major companies start using it, they’ll definitely find a way to make money from it.