> This has massive implications. SEC means low latency, because nodes don't need to coordinate to handle reads and writes. It means incredible fault tolerance - every single node in the system bar one could simultaneously crash, and reads and writes could still happen normally. And it means nodes still function properly if they're offline or split from the network for arbitrary time periods.
Well, this all depends on the definition of "function properly". Convergence ensures that everyone observes the same state, not that it's a useful state. For instance, the Imploding Hashmap is a very easy CRDT to implement. The rule is that when there are concurrent changes to the same key, the final value becomes null. This gives Strong Eventual Consistency, but it isn't really a very useful data structure. All the data would just disappear!
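A sketch of that Imploding Hashmap, just to make the pathology concrete. This toy version treats any disagreement between replicas as a concurrent write (a real implementation would carry version metadata to distinguish concurrent from sequential edits); the function name is illustrative, not from any library.

```python
def merge_imploding(a, b):
    """Merge two replica states; keys edited concurrently 'implode' to None."""
    merged = {}
    for key in a.keys() | b.keys():
        if key in a and key in b and a[key] != b[key]:
            merged[key] = None  # concurrent divergent edit: the data disappears
        else:
            # Agreed value, or a key present on only one side.
            merged[key] = a.get(key, b.get(key))
    return merged
```

The merge is commutative and idempotent, so it genuinely converges; it's just that what it converges to is useless.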
So yes, strong eventual consistency is a massively useful property which we should strive for, but it's not going to magically solve all the end-user problems.
Yeah; this has been a known thing for at least the 15 years I’ve been working in the collaborative editing space. Strong eventual consistency isn’t enough for a system to be any good. We also need systems to “preserve user intent” - whatever that means.
One simple answer to this problem that works almost all the time is to just have a "conflict" state. If two peers concurrently overwrite the same field with different values, they can converge by marking the field as having two conflicting values. The next time a read happens, that's what the application gets. And the user can decide how the conflict should be resolved.
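That "conflict state" idea is essentially a multi-value (MV) register. A minimal sketch, with the version-vector bookkeeping that real implementations use to discard superseded writes left out for brevity (so this toy keeps every candidate until a local write resolves them):

```python
class MVRegister:
    """Toy multi-value register: concurrent writes are kept side by side."""

    def __init__(self):
        self.candidates = set()

    def write(self, value):
        # A local write supersedes everything this replica has seen.
        self.candidates = {value}

    def merge(self, other):
        # Concurrent candidates accumulate; readers see the conflict.
        self.candidates |= other.candidates

    def read(self):
        # More than one value means there's a conflict to resolve.
        return sorted(self.candidates)
```

A subsequent `write` acts as the user's resolution: it collapses the conflict back to a single value.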
In live, realtime collaborative editing situations, I think the system just picking something is often fine. The users will see it and fix it if need be. It’s really just when merging long running branches that you can get in hot water. But again, I think a lot of the time, punting to the user is a fine fallback for most applications.
So the entire point of the (short) article I wrote was to get people to think outside of the little box people put CRDTs in: JavaScript libraries and collaborative editing.
Yet here we are, circling back to collaborative editing...
At this point I think the term "CRDT" has too much baggage and I should probably stop using it, or at least not put it in blog post titles.
Good point. The reality is that conflicts should often be handled in the business logic, not in the consensus logic, though not universally. For the former, having the conflict state be the consensus state is ideal, but you risk polluting your upstream application with a bunch of unnecessary conflict handling for trivial state diffs.
With CRDT, you have local consistency and strong convergence, but no guarantee of semantic convergence (i.e. user intent). I would still hire OP, but I would definitely keep him in the backend and away from UX
My point is a good crdt should let you tune that on a per field / per instance basis. Sometimes you want automatic “good enough” merging. Sometimes you want user intervention. When you want each is not obvious at the moment. We haven’t really explored the UX state space yet.
In general the automatic merging works pretty well most of the time. Where things go wrong is - for example - when people think they can put JSON data into a text crdt and have the system behave well. Instead the automatic merging breaks the rules of JSON syntax and the system falls over.
I've prototyped something attempting to solve this problem of preserving user intent and maintaining application semantics. See comment here https://news.ycombinator.com/item?id=45180325
I've replied elsewhere, but on the face of it I can't see how this solves the problem of conflicts in any way. If you disagree, say more about how it solves this?
If two users concurrently edit the same word in a text document, how does your system help?
For a text document a normal CRDT is perfect. They're very good for that specific case. What I tried to solve is eventual consistency that _also_ preserves application semantics. For example a task tracker:
* first update sets task cancelled_at and cancellation_reason
* second update wants the task to be in progress, so sets started_at
CRDTs operate only at the column/field level. In this situation you'd have a task with cancelled_at, cancellation_reason, status in progress, and started_at. That makes no sense semantically: a task can't be both cancelled and in progress. CRDTs do nothing to solve this. My solution is aimed at exactly this kind of thing. Since it replicates _intentions_ instead of just data, it would work like this:
When reconciling total order of actions using logical clocks the app logic for setCancelled runs first then setInProgress runs second on every client once they see these actions. The app logic dictates what should happen, which depends on the application. You could have it discard action2. You could also have it remove the cancellation status and set in_progress. It depends on the needs of the application but the application invariants / semantics are preserved and user intentions are preserved maximally in a way that plain CRDTs cannot do.
Yes; I get all that from the readme. You pick an arbitrary order for operations to happen in. What I don't understand is how that helps when dealing with conflicts.
For example, let's say we have a state machine for a task. The task is currently in the IN_PROGRESS state, and from here it can transition to either CANCELLED or COMPLETE. Either of those states should be terminal. That is to say, once a task has been completed it can't be cancelled, and vice versa.
The problem I see with your system is this: let's say we have a task in the IN_PROGRESS state. One peer cancels the task and another tries to mark it complete. Let's say a peer sees the COMPLETE message first, so we have this:
IN_PROGRESS -> COMPLETE
But then a peer sees the CANCEL message, and decides (unambiguously) that it must be applied before the completion event. Now we have this:
IN_PROGRESS -> CANCELLED (-> COMPLETE ignored)
But this results in the state of the task visibly moving from the COMPLETE to CANCELLED state - which we said above the system should never do. If the task was complete, it can't be cancelled. There are other solutions to this problem, but it seems like the sort of thing your system cannot help with.
In general, CRDTs never had a problem arbitrarily picking a winner. One of the earliest documented CRDTs was the "last-writer wins (LWW) register", which is a register (ie a variable) which stores a value. When concurrent changes happen, the register chooses a winner somewhat arbitrarily. But the criticism is that this is sometimes not the application behaviour we actually want.
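An LWW register is only a few lines. In this sketch each write carries a (timestamp, replica_id) tag, and merge keeps the value with the greatest tag, so ties on timestamp are broken arbitrarily but deterministically by replica id; names are illustrative:

```python
class LWWRegister:
    """Toy last-writer-wins register: greatest (timestamp, replica_id) wins."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.value = None
        self.tag = (0, "")  # (timestamp, replica_id)

    def write(self, value, timestamp):
        self.value = value
        self.tag = (timestamp, self.replica_id)

    def merge(self, other):
        # Lexicographic tuple comparison: timestamp first, replica id breaks ties.
        if other.tag > self.tag:
            self.value, self.tag = other.value, other.tag
```

The "arbitrary" part is visible in the tie-break: with equal timestamps, whichever replica id sorts higher wins, regardless of what the users meant.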
You might be able to model a multi-value (MV) register using your system too. (Actually I'm not sure. Can you?) But I guess I don't understand why I would use it compared to just using an MV register directly. Specifically when it comes to conflicts.
It does not pick an arbitrary order for operations. They happen in a total order (known at the time, eventually converging) across all clients thanks to hybrid logical clocks. If events arrive that happened before events a client already has locally, it will roll back to that point in time and replay all of the actions forward in total order.
As for the specific scenario, if a client sets a task as COMPLETE and another sets it as CANCELLED before seeing the COMPLETE from the other client here's what would happen.
Client2: Replay action { id: 2, action: cancelTask, taskId: 123, clock: ...} <-- This is running exactly the same application logic as the first cancelTask. It can do whatever you want per app semantics. In this case we'll no-op since transition from completed -> cancelled is not valid.
Client2: SYNC -> no newer actions in remote, accepted
At this point client1, client2, and the central DB all have the same consistent state. The task is COMPLETE. Data is consistent and application semantics are preserved.
There's a little more to it than that to handle corner cases and prevent data growth, but that's the gist of it. More details in the repo.
The great thing is that state is reconciled by actually running your business logic functions -- that means that your app always ends up in a valid state. It ends up in the same state it would have ended up in if the app was entirely online and centralized with traditional API calls. Same outcome but works totally offline.
Does that clarify the idea?
You could argue that this would be confusing for Client2 since they set the task to cancelled but it ended up as complete. This isn't any different than a traditional backend api where two users take incompatible actions. The solution is the same, if necessary show an indicator in the UI that some action was not applied as expected because it was no longer valid.
edit: I think I should improve the readme with a written out example like this since it's a bit hard to explain the advantages of this system (or I'm just not thinking of a better way)
LLMs could be good at this, but the default should be suggestions rather than automatic resolution. Users can turn on YOLO mode if their domain is non-critical or they trust the LLM to get it right.
The issue is that to preserve the CRDT property, the LLM has to resolve the conflicts in a deterministic and associative way. We can get the first property (although most popular LLMs do not uphold it in practice), but we can hardly get the second one.
I read the comment you're responding to as suggesting a way to resolve the conflicts layered atop the CRDT, not as a component of the CRDT itself. You're very right that LLMs are the wrong tool for CRDT implementation, but using them to generate conflict resolutions seems worth exploring.
He very much leans toward them being hard to use in a sensible way. He has some interesting points about using threshold functions over a CRDT to get deterministic reads (i.e. once you observe the value it doesn't randomly change out from under you). It feels a bit theoretical though, I wish there were examples of using this approach in a practical application.
Why do we even need CRDTs? Why can't we have multi-user editors work like multiplayer video games?
The server has the authoritative state; users submit edits, which are then rejected or applied, and the changes pushed to others. The user is always assumed to be online for multiplayer editing. No attempt is made to reconcile independent edits or long periods of offline behavior.
To prevent data loss, when the user is offline and desyncs, he gets to keep his changes and manually merge them back.
I'm sure this isn't a Google-genius-worthy implementation and fails in the incredibly realistic scenario where thousands of people are editing the same spreadsheet at the same time, but it's simple and fails in predictable ways.
Sure, I mean that was how early group editing worked, but generally you want to preserve state from both sides (if we both start typing in the same spot, we both add stuff). It also prevents any offline editing, or really any high-lag editing. Unlike gaming, which needs to be realtime, this is much softer.
This needs to be as realtime as WhatsApp. If your internet connection gets bad often enough to have trouble supporting WhatsApp, then my heart goes out to you, but thankfully this is clearly not normal for most of us most of the time.
And if this happens, your experience is going to be terrible anyway.
The big problem with CRDTs IMO is that they make it incredibly easy to break application semantics.
Just a basic example for a task tracker:
* first update sets task cancelled_at and cancellation_reason
* second update wants the task to be in progress, so sets started_at
If code just uses the timestamps to determine the task state, it would now assume the task is cancelled, which is unexpected since the later update set it to in progress.
Easy fix, we just add a state field 'PENDING|INPROGRESS|CANCELLED|...'.
Okay, but now you have a task that is in progress, but also has a cancellation timestamp, which seems inconsistent.
The point is:
With CRDTs you have to consider how partial out of order merges affect the state, and make sure your logic is always written in a way so these are handled properly.
That is *not easy*!
I'd love it if someone came up with a framework that allows defining application semantics on top of CRDTs, and have the framework ensure types remain consistent.
Do not separate the state field from its time stamp(s). Use a sum type (“tagged union”) where the time stamps are the payload for a selected state. Make invalid states unrepresentable.
If you want invalid states unrepresentable, and time as a primary key... How do you deal with time regularly becoming non-linear within the realm of computing?
The general answer is to accept that time isn’t linear. In a collaborative editing environment, every event happens after some set of other events based on what has been observed locally on that peer. This creates a directed acyclic graph of events (like git).
It requires a different primary key than an autoincrementing integer. One popular choice is to use a tuple of (peer_guid, incrementing integer). Or a randomly generated GUID, or a hash of the associated data.
Then each event is associated with zero or more "parent events".
- An event has 0 parents if it is the first change
- An event has 1 parent if it simply came after that event in sequence
- And if an event merges 2 or more branches in history, it says it comes after all of those events
You can also think about it like a set. If I know about events {A, B, C} and generate event D, then D happens-after {A, B, C}. (Written {A,B,C} -> D). But if A->B, then I only need to explicitly record that {B,C} -> D because the relationship is transitive. A -> B -> D implies A -> D.
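The DAG-of-events model above can be sketched in a few lines. Each event records only its direct parents (the local frontier at creation time); ancestry is implied transitively, as in the A -> B -> D example. Names here are illustrative:

```python
class Event:
    """An event in the happens-after DAG; parents are direct predecessors only."""

    def __init__(self, eid, parents=()):
        self.id = eid
        self.parents = frozenset(parents)


def frontier(events):
    """Heads of the DAG: events that no other known event lists as a parent."""
    referenced = set()
    for e in events:
        referenced |= e.parents
    return {e.id for e in events} - referenced
```

Two heads in the frontier means there are concurrent branches; a new event created after seeing both lists both as parents, merging the history exactly like a git merge commit.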
You mean, like attempting to merge contradictory states? You will need some resolution stategy then, but in general that would be application-specific, and sometimes it may not exist.
Yes, sort of like you have to think about your transaction boundaries in server-side code for every single task.
The difference is that coming up with a correct CRDT solution for application specific consistency requirements can be a research project. In many cases, no CRDT solution can exist.
In my experience, 95% of applications are handled just fine by the sort of JSON types built into Yjs or Automerge. The problems I hear people complain about are things like performance, size on disk and library ergonomics. And the long tail of features, like ephemeral data support and binary assets.
But data mapping seems mostly fine?
I know of a couple of exceptions. Arbitrary nested tree reparenting can be a nightmare. And there aren’t many good rich text implementations out there.
One large class of problems I'm thinking of is simply outside the scope of CRDTs. The whole idea of _eventual_ consistency doesn't really work for things like payment systems or booking systems. A lot of OLTP applications have to be consistent at all times (hence the O). Money must not be double spent. Rooms or seats must not be double booked.
The other class of problems is more debatable. CRDTs can guarantee that collaborative text editing results in the same sequence of letters on all nodes. They cannot guarantee that this sequence makes sense. Authors can step on each other's toes.
Whether or not this is a problem depends on the specific workflow and I think it could be mitigated by choosing better units of storage/work (such as paragraphs rather than letters).
> One large class of problems I'm thinking of is simply outside the scope of CRDTs. The whole idea of _eventual_ consistency doesn't really work for things like payment systems or booking systems.
Yes! I think of it as owned data and shared data. Owned data is data that is owned by one process or node. Eg my bank balance, the position of my mouse cursor, the temperature of my CPU. For this stuff, you don’t want a crdt. Use a database. Or a variable in memory or a file on disk. Broadcast updates if you want, but route all write requests through the data’s owner.
Then there’s shared data - like the source code for a project or an apple note. There, CRDTs might make sense - especially if you get branching and merging support along for the ride.
> Authors can step on each other's toes.
Yeah when merging long lived branches, the workflow most people want is what git provides - of humans manually resolving conflicts. There’s no reason a crdt couldn’t provide this. CRDTs have a superset of the information available to git. It’s weird nobody has coded a system like that up yet.
I think you have the right idea, but possibly the wrong perspective. You want your _source of truth_, which is the "owned data", to be strongly consistent. Your shared data is a "view of truth" which may be incomplete or in disagreement with the source of truth. For example, the color of the sky "right now" depends on where on earth you are standing, but we can all agree that air is 'just barely blue' and that its color depends on the light shining into it and how much of it there is.
The _source of truth_ are these facts (like "the air is blue" or "the user inserted the letter A at position X" or "the CPU is 40 degrees"). The view of this source is what we see, and can be seen through a CRDT or any other lens.
The way I’m defining it, my shared state is the data we store in a crdt. And CRDTs have strong eventual consistency. That’s what makes them great. So we can have a data structure which shows all users an identical view of the world.
Normally we do that by storing something totally different under the hood. Eg, git actually stores a commit graph. But the system makes a determinism guarantee: we promise that all users who have the same version checked out will see exactly the same thing. At one level, we’re storing “a list of facts” (the commit graph). But at another level of abstraction, we’re just storing application data. It’s just also replicated between many peers. And editable locally without network access.
> So we can have a data structure which shows all users an identical view of the world.
This is never true. You can prove that at some time now()-T where T > 0 you had the same view of the universe, but you cannot prove that you currently have the exact same view because even with the attempt of checking, T becomes greater than 0. Sometimes, this doesn't matter (T can be arbitrarily large and still effectively be zero -- like asking your friend if he is still married to that person. They can answer you days later, and it'll still be true), but sometimes even very small values of T cannot be assumed to be zero.
Well yeah obviously you never know for sure that a remote peer doesn’t have some changes that they haven’t told you about yet. That’s also true with lots of platforms - like google docs and Notion and multiplayer video games. Seems fine though? I don’t understand why this matters for collaborative editing?
It works like you describe, with humans manually resolving conflicts. The conflicts are represented in the data model, so the data model itself converges without conflicts...if that makes sense.
Interesting idea. As I understand it though, this wouldn’t give you the kind of conflict semantics I’m talking about out of the box. What I want is - if two users concurrently edit the same line of text, the system can “merge” those changes by storing the conflict. Subsequent readers of the document see a merge conflict and can resolve the conflict manually.
Your system looks like it just enforces a global order on the actions. This will give you SEC - but how do you preserve the information that these edits were concurrent - and thus conflict with one another?
You're right, it's not the same as conflict/merge semantics, but you probably could implement those semantics on top of it. My idea was more about being able to merge offline states for arbitrary data without user intervention while also ensuring that application invariants / semantics are preserved. Preserving app semantics while as much as possible preserving user intentions.
>The classical paper-ledger bookkeeping is pretty much eventually consistent. They did not have the Internet when they invented it.
Absolutely. Bookkeeping is an offline activity (I'm only doing it once a year in my company, ha ha). You just have to make sure not to record the same transaction more than once, which could be non-trivial but shouldn't be impossible to do with CRDTs.
>Flight booking is often statistically consistent only. Overbooking, etc.
That may be acceptable in some cases but you still can't use CRDTs for it, because you need a way to limit the extent of overbooking. That requires a centralised count of bookings.
Most complex crdts are built on top of the simple crdt of a grow only set. Ie, what we actually synchronise over the network is a big bag of commits / operations / something such that the network protocol makes sure everyone ends up with all of the operations known to any peer. Then the crdt takes that big set and produces some sort of sensible projection from it.
> You just have to make sure not to record the same transaction more than once
So this should be pretty easy. Have a grow only set of transactions. Give each one a globally unique ID at the point of creation. Order by date and do bookkeeping. One thing you can’t guarantee is that the balance is always positive. But otherwise - yeah.
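The bookkeeping scheme just described is a grow-only set keyed by transaction id: recording or re-syncing the same transaction twice is a no-op, so nothing is double-counted. A hedged sketch (class and field names are made up for illustration; and, as noted, nothing here can keep the balance positive):

```python
class TransactionLog:
    """Grow-only map of transactions keyed by a globally unique id."""

    def __init__(self):
        self.txns = {}  # txn_id -> amount

    def record(self, txn_id, amount):
        # setdefault makes re-recording the same transaction idempotent.
        self.txns.setdefault(txn_id, amount)

    def merge(self, other):
        # Union of two replicas' logs; overlapping ids are deduplicated.
        for txn_id, amount in other.txns.items():
            self.txns.setdefault(txn_id, amount)

    def balance(self):
        return sum(self.txns.values())
```

Merge is a set union, so it's commutative, associative, and idempotent, which is all a grow-only set needs to converge.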
I prototyped exactly such a framework! It's designed to solve exactly the problem you mentioned. It’s a super interesting problem. https://github.com/evelant/synchrotron
The gist is:
* Replicating intentions (actions, immutable function call definitions that advance state) instead of just replicating state.
* Hybrid logical clocks for total ordering.
* Some client side db magic to make action functions deterministic.
This ensures application semantics are always preserved with no special conflict resolution considerations while still having strong eventual consistency. Check out the readme for more info. I haven’t gotten to take it much further beyond an experiment but the approach seems promising.
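The core of the replay idea can be sketched quite compactly. In this hedged version (a simplification of what the repo describes, with a plain sortable tuple standing in for a real hybrid logical clock plus client id), state is always recomputed by folding the business-logic function over the full action log in tag order, so arrival order cannot affect the outcome:

```python
def replay(actions, initial_state, apply_fn):
    """Deterministically rebuild state by applying actions in total order."""
    state = initial_state
    for action in sorted(actions, key=lambda a: a["tag"]):
        # Business logic decides validity; it may no-op an invalid transition.
        state = apply_fn(state, action)
    return state
```

With an `apply_fn` that treats COMPLETE and CANCELLED as terminal, the earlier task example resolves the same way on every client regardless of which action arrived first.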
I've had similar thoughts, but my concern was: if you have idempotent actions, then why not just encode them as actions in a log. Which just brings you to event sourcing, a quite well-known pattern.
If you go that route, then what do you need CRDTs for?
Doesn't event-sourcing imply that there's a single source-of-truth data store you can source them from? I'm not sure event sourcing says anything about resolving conflicts or consistency.
The pattern I came up with is similar to event sourcing but with some CRDT and offline-first concepts mixed in. By using logical clocks and a client-side Postgres (PGlite), it doesn't have to keep the entire event history for all time, and the server side doesn't have to process actions/events at all beyond storing them. The clients do the resolution of state, not the server. Clients can operate offline as long as they like and the system still arrives at a consistent state. AFAIK this is different from most event sourcing patterns.
At least in my thinking/prototyping on the problem so far I think this solution offers some unique properties. It lets clients operate offline as long as they like. It delegates the heavy lifting of resolving state from actions/events to clients, requiring minimal server logic. It prevents unbounded growth of action logs by doing a sort of "rebase" for clients beyond a cutoff. It seems to me like it maximally preserves intentions without requiring specific conflict resolution logic. IMO worth exploring further.
A CRDT is any data structure that meets the definition (associative, commutative, idempotent, etc...)
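Those algebraic properties are easy to see on a tiny example. Here's a grow-only counter map whose merge is element-wise max; because the merge is commutative, associative, and idempotent, replicas converge no matter how merges are ordered or repeated (function name is illustrative):

```python
def merge_max(a, b):
    """Element-wise max of two counter maps: a join on the semilattice."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}
```

Event Sourcing, by contrast, constrains how state is derived (a fold over events) but says nothing about whether concurrent appends to the log merge in an order-independent way.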
Event Sourcing is not strictly designed to achieve eventual consistency in the face of concurrent writes though. But that doesn't mean it can't be!
I've also been considering an intent based CRDT system for a while now (looking forward to checking out GPs link) and agree that it looks/sounds very much like Event Sourcing. It's worth while being clear on the definition/difference between the two though!
I wonder how does this handle a modify-rename conflict? e.g. there's a file identified by its name `a` and one client renames it to `b` while another client tries to modify the contents of `a`. Once you replay it in this order does the intent of modifying the contents of what was once `a` remain?
I know you can use some unique persistent ids instead of names, but then you get into issues that two clients create two files with the same name: do you allow both or not? What if they initially create it equal? What if they do so but then they modify it to be different?
It would be up to application logic. This prototype essentially offers the same behavior you would get with a traditional backend API except it works offline. The results would be the same as if clients made those calls to a backend api, that is, up to application logic. My idea was that it's essentially impossible to have generic "conflict resolution" that follows arbitrary business rules so I made the business rules _be_ the conflict resolution. For any given situation the answer to "how would it handle a then b then c" is "the same as any normal backend api, per regular business logic, except it works offline".
Don’t you also have to consider this just as much without CRDT? Not saying it isn’t a real issue, but this example could easily be a problem with a more traditional style app - maybe users open the record on their web browser at same time and make different updates, or they update the different timestamp fields directly in a list of tasks.
The big idea behind CRDTs is that a data structure can have replicas synchronizing on a best-effort basis. That is much closer to the physical reality: server here, client there, phones all over the place.
The basic CRDT ideas are actually pretty easy to implement: add some metadata here, keep some history there. The difficulty, for the past 20 years or so, is making the overheads low, and the APIs understandable.
Many projects revolve around some JSON-ish data format that is also a CRDT:
- Automerge https://automerge.org (the most tested one, but feels like legacy at times, the design is ~10yrs old, there are more interesting new ways)
Others are trying to retrofit CRDTs into SQLite or Postgres. IMO, those end up using last-write-wins in most cases. Relational logic steers you that way.
Could you explain more about Automerge being legacy? They recently released a new major version and revamped their algorithm I believe. What is better about your version?
Automerge is based on design decisions from 2014-2017. I remember that epoch and what everybody thought back then. The views have evolved since, but Automerge is a large project and many things have been built on top. It is unrealistic to change any of the basic assumptions at this point.
I may go into the technical details, assuming my hourly rate is respected.
- RDX has a spec, so it can have compatible implementations. The result of a merge is specified to a bit. Automerge works the way Orion coded it (this time).
- There are equivalent text and binary formats, JDR and RDX.
- RDX's palette of types is richer; Automerge's is narrower than JSON.
- RDX can work in any commodity LSM db, natively, in the core (and it does).
Do people really distinguish "Strong Eventual Consistency" from "Eventual Consistency"? To me, when I say "Eventual Consistency" I always mean "Strong Eventual Consistency".
(Non-Strong) Eventual Consistency does not guarantee that two replicas which have received the same set of updates are in the same state.
In an eventually consistent system, replicas can temporarily diverge. A "last write" system can be eventually consistent, yet at a given point in time different replicas can read differently.
Eg: operations
1) Add "AA" to end of string
2) Split string in middle
Replicas R1 and R2 both have the string "ZZZZ"
If R1 sees operations (1) then (2) it will get
"ZZZZAA", then "ZZZ", "ZAA"
If R2 sees (2) then (1) it will get:
"ZZ", "ZZ", then "ZZAA", "ZZ".
Strong Eventual Consistency doesn't have this problem because the operations carry a vector clock, so the replicas know what order to apply them in.
I’m not sure I follow. How would this be eventually consistent at all? It looks like the two peers in your example simply have divergent state and will never converge.
You're not describing an eventually consistent system, you're describing a system that diverges. By definition, eventually consistent means that, after some time, all readers across the entire system are guaranteed to find the same values, even if before that time they may see different values.
Any eventually consistent system has to have a strategy for ensuring that all nodes eventually agree on a final value. R1 and R2 need to communicate their respective states and agree on a single one of them - maybe using timestamps (if R2's value is newer, R1 will replace its own value when they communicate), maybe using a quorum (say there is also an R3 which agrees with R1; then R2 will change its value to match the other two), maybe using an explicit priority list (say, R1's value is assumed better than R2's).
I think there are a lot of systems that have a separate node syncing feature, where two nodes can receive updates, apply them to their local replica right away, and only communicate them and reconcile with the other backend nodes at a later time.
Interesting that neither the article nor the comments mention the CALM theorem [0], which gives a framework to explain when coordination-free consistency is possible, and is arguably the big idea behind SEC.
Would this be a suitable data structure to distribute node state for caching indices? Let's say two nodes have a set of N (possibly overlapping) keys and I want both to know all keys of each other for request routing (a request for n \in N preferably goes to the node with n in its local cache).
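One hedged way this could be modeled: each node gossips its key set, and every node keeps a merged map of node -> keys. This assumes keys are only ever added; cache eviction would need a removal-capable CRDT (an OR-set, say) or timestamped entries instead. All names below are made up for illustration:

```python
def merge_index(local, remote):
    """Merge two gossip snapshots of the routing index (per-node G-Sets)."""
    merged = {}
    for node in local.keys() | remote.keys():
        merged[node] = local.get(node, set()) | remote.get(node, set())
    return merged


def route(index, key, fallback):
    """Prefer a node already caching `key`; otherwise use the fallback node."""
    for node, keys in sorted(index.items()):  # sorted for deterministic choice
        if key in keys:
            return node
    return fallback
```

Since the merge is a per-node set union, snapshots can arrive in any order, any number of times, and all nodes converge on the same index.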
At first it made no sense, but then I realised what the author is saying: in a distributed system, when you make local changes, you do not wait for the changes to propagate to all participants and back before your local state is considered consistent with the global state. Rather, it is considered consistent with the global state immediately, even before your local changes leave your system. In other words, every change committed into the distributed system is immediately consistent with the global state even if there are undelivered changes, because eventually all the changes produce the same outcome.
In a specific use case that might apply. For example, if two people edit the same document and fix the same typo, the visual outcome is the same, no matter who made the change first or last.
But that is very niche. If we take programming code, someone can change a line of code that someone else is changing as well, and those changes might be the same, but then there are other lines of code that might not be, and you end up with code that won't compile. In other words, if we focus on a singular change in isolation, this makes sense. But that is essentially never the case in distributed environments; we have to look at the broader picture, where multiple changes made by someone are related or tied to each other and do not live in isolation.
Either way, I see nothing useful here. You can "render" your local changes immediately vs. wait for them to propagate through the system and return back to you. There is very little difference, and in the end it is mostly just about a proper diffing approach and has little to do with the distributed system itself.
PS: the problem here is not really the order of applied changes for the local consumer, as in editing a shared Word document. The problem is that if we have a database and commit a change locally, but someone else commits a different change elsewhere, like "update users set email = foo@bar where id = 5", then before we receive that other, later change we serve clients invalid data. That is the main issue of eventual consistency here.

As I am running a system like this, I have to use "waiters" to ensure I get the correct data. For example, when a user creates some content via the web UI and is redirected back to the list of all content, this happens so fast that the distributed system has not had enough time to propagate the changes, so the user would not see his new content in the list yet. For this scenario, I use a correlation id that I receive when content is created and put it into the redirect. When the user moves to the page that lists all the content, this correlation id is detected and a network call is made to the appropriate server, whose sole purpose is to keep the connection open until that server's state has caught up to the provided correlation id. Then I refresh the list to present the user the correct information, all of this while a loading indicator is shown on the page. There is simply no way around this in distributed systems, and so I find this article of no value (at least to me).
> This has massive implications. SEC means low latency, because nodes don't need to coordinate to handle reads and writes. It means incredible fault tolerance - every single node in the system bar one could simultaneously crash, and reads and writes could still happen normally. And it means nodes still function properly if they're offline or split from the network for arbitrary time periods.
Well, this all depends on the definition of "function properly". Convergence ensures that everyone observes the same state, not that it's a useful state. For instance, the Imploding Hashmap is a very easy CRDT to implement. The rule is that when there are concurrent changes to the same key, the final value becomes null. This gives Strong Eventual Consistency, but it isn't really a very useful data structure. All the data would just disappear!
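A minimal sketch of such an imploding hashmap in Python (the merge function and example states are invented for illustration) shows how a structure can converge on every replica while still destroying data:

```python
# Hypothetical "imploding hashmap": a toy CRDT where concurrent writes
# to the same key merge to None. Not a real library, just an illustration.

def merge(a: dict, b: dict) -> dict:
    """Merge two replica states; keys with conflicting values 'implode' to None."""
    out = dict(a)
    for key, value in b.items():
        if key in out and out[key] != value:
            out[key] = None  # concurrent conflicting writes lose the data
        else:
            out[key] = value
    return out

r1 = {"title": "draft", "owner": "alice"}
r2 = {"title": "final", "owner": "alice"}

# Merge is commutative, so all replicas converge (SEC)...
assert merge(r1, r2) == merge(r2, r1)
# ...but the converged state threw the conflicting value away.
print(merge(r1, r2))  # {'title': None, 'owner': 'alice'}
```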
So yes, the CRDT property is massively useful and something we should strive for, but it's not going to magically solve all the end-user problems.
Yeah; this has been a known thing for at least the 15 years I’ve been working in the collaborative editing space. Strong eventual consistency isn’t enough for a system to be any good. We also need systems to “preserve user intent” - whatever that means.
One simple answer to this problem that works almost all the time is to just have a "conflict" state. If two peers concurrently overwrite the same field with different values, they can converge by marking the field as having two conflicting values. The next time a read happens, that's what the application gets. And the user can decide how the conflict should be resolved.
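A rough sketch of this conflict-state idea, ignoring the causality tracking a real multi-value register would need to distinguish overwrites from concurrent writes (all names here are invented):

```python
# Toy conflict-keeping register: each replica holds a set of candidate values,
# and merging takes the union instead of picking a winner.

def merge_register(local: set, remote: set) -> set:
    """Union of candidate values; a singleton set means no conflict."""
    return local | remote

a = {"blue"}   # peer A overwrote the field with "blue"
b = {"green"}  # peer B concurrently overwrote it with "green"

merged = merge_register(a, b)
if len(merged) > 1:
    # Surface the conflict to the application instead of guessing:
    print("conflict:", sorted(merged))  # conflict: ['blue', 'green']
```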
In live, realtime collaborative editing situations, I think the system just picking something is often fine. The users will see it and fix it if need be. It’s really just when merging long running branches that you can get in hot water. But again, I think a lot of the time, punting to the user is a fine fallback for most applications.
So the entire point of the (short) article I wrote was to get people to think outside of the little box people put CRDTs in: javascript libraries and collaborative editing.
Yet here we are, circling back to collaborative editing...
At this point I think the term "CRDT" has too much baggage and I should probably stop using it, or at least not put it in blog post titles.
Good point. The reality is that conflicts should often be handled in the business logic, not in the consensus logic, though not universally. For the former, having the conflict state be the consensus state is ideal, but you do risk polluting your upstream application with a bunch of unnecessary conflict handling for trivial state diffs.
With CRDT, you have local consistency and strong convergence, but no guarantee of semantic convergence (i.e. user intent). I would still hire OP, but I would definitely keep him in the backend and away from UX
My point is a good CRDT should let you tune that on a per-field / per-instance basis. Sometimes you want automatic "good enough" merging. Sometimes you want user intervention. When you want each is not obvious at the moment. We haven't really explored the UX state space yet.
In general the automatic merging works pretty well most of the time. Where things go wrong is - for example - when people think they can put JSON data into a text crdt and have the system behave well. Instead the automatic merging breaks the rules of JSON syntax and the system falls over.
We have LLMs now, couldn't they be used to merge conflicts in a more sensible way? It might get a little expensive I imagine.
I've prototyped something attempting to solve this problem of preserving user intent and maintaining application semantics. See comment here https://news.ycombinator.com/item?id=45180325
I've replied elsewhere, but on the face of it I can't see how this solves the problem of conflicts in any way. If you disagree, say more about how it solves this?
If two users concurrently edit the same word in a text document, how does your system help?
For a text document a normal CRDT is perfect. They're very good for that specific case. What I tried to solve is eventual consistency that _also_ preserves application semantics. For example a task tracker:
* first update sets task cancelled_at and cancellation_reason
* second update wants the task to be in progress, so sets started_at
CRDTs operate only at the column/field level. In this situation you'd end up with a task that has cancelled_at, cancellation_reason, a status of in progress, and started_at. That makes no sense semantically: a task can't be both cancelled and in progress. CRDTs do nothing to solve this. My solution is aimed at exactly this kind of thing. Since it replicates _intentions_ instead of just data, it would work like this:
action1: setCancelled(reason)
action2: setInProgress
When reconciling the total order of actions using logical clocks, the app logic for setCancelled runs first and setInProgress runs second on every client once they see these actions. The app logic dictates what should happen, which depends on the application. You could have it discard action2. You could also have it remove the cancellation status and set in_progress. It depends on the needs of the application, but application invariants/semantics are preserved and user intentions are preserved maximally, in a way that plain CRDTs cannot match.
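A stripped-down sketch of that replay loop in Python (the action shapes and guard logic are invented for illustration; the real project's types will differ):

```python
# Actions carry a logical timestamp; every client sorts them into the same
# total order and replays them through business logic that enforces invariants.

def apply_action(task: dict, action: dict) -> dict:
    kind = action["kind"]
    if kind == "setCancelled":
        if task["status"] in ("pending", "in_progress"):
            task = {**task, "status": "cancelled", "reason": action["reason"]}
        # otherwise no-op: cancelling a finished task violates app semantics
    elif kind == "setInProgress":
        if task["status"] == "pending":
            task = {**task, "status": "in_progress"}
        # cancelled/complete tasks stay put: the invariant lives in the replay
    return task

actions = [
    {"kind": "setCancelled", "reason": "dup", "clock": (1, "peerA")},
    {"kind": "setInProgress", "clock": (2, "peerB")},
]

task = {"status": "pending"}
for action in sorted(actions, key=lambda a: a["clock"]):
    task = apply_action(task, action)

print(task)  # {'status': 'cancelled', 'reason': 'dup'}
```

Every client that eventually sees both actions replays them in the same clock order, so all replicas land on the same, semantically valid state.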
Yes; I get all that from the readme. You pick an arbitrary order for operations to happen in. What I don't understand is how that helps when dealing with conflicts.
For example, lets say we have a state machine for a task. The task is currently in the IN_PROGRESS state - and from here it can transition to either CANCELLED or COMPLETE. Either of those states should be terminal. That is to say, once a task has been completed it can't be cancelled and vice versa.
The problem I see with your system is - lets say we have a task in the IN_PROGRESS state. One peer cancels a task and another tries to mark it complete. Lets say a peer sees the COMPLETE message first, so we have this:
But then a peer sees the CANCEL message, and decides (unambiguously) that it must be applied before the completion event. Now we have this:

But this results in the state of the task visibly moving from COMPLETE to CANCELLED, which we said above the system should never do. If the task was complete, it can't be cancelled. There are other solutions to this problem, but it seems like the sort of thing your system cannot help with.

In general, CRDTs never had a problem arbitrarily picking a winner. One of the earliest documented CRDTs was the "last-writer wins" (LWW) register, which is a register (i.e. a variable) that stores a value. When concurrent changes happen, the register chooses a winner somewhat arbitrarily. But the criticism is that this is sometimes not the application behaviour we actually want.
You might be able to model a multi-value (MV) register using your system too. (Actually I'm not sure. Can you?) But I guess I don't understand why I would use it compared to just using an MV register directly. Specifically when it comes to conflicts.
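For comparison, the LWW register described above fits in a few lines. This is a generic illustration, not any particular library's code:

```python
# Minimal last-writer-wins (LWW) register: each state is a
# (timestamp, peer_id, value) tuple, and the merge picks the larger tuple,
# breaking timestamp ties deterministically by peer id.

def lww_merge(a: tuple, b: tuple) -> tuple:
    return max(a, b)

r1 = (5, "peerA", "hello")
r2 = (5, "peerB", "world")  # same timestamp: peer id breaks the tie

# Commutative and deterministic, but the losing write simply vanishes,
# which is exactly the "arbitrary winner" criticism discussed above.
assert lww_merge(r1, r2) == lww_merge(r2, r1) == (5, "peerB", "world")
```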
It does not pick an arbitrary order for operations. They happen in a total order (known at the time, eventually converging) across all clients thanks to hybrid logical clocks. If events arrive that happened before events a client already has locally, it will roll back to that point in time and replay all of the actions forward in total order.
As for the specific scenario, if a client sets a task as COMPLETE and another sets it as CANCELLED before seeing the COMPLETE from the other client here's what would happen.
Client1: { id: 1, action: completeTask, taskId: 123, clock: ...}
Client1: SYNC -> No newer events, accepted by server
Client2: { id: 2, action: cancelTask, taskId: 123, clock: ...}
Client2: SYNC -> Newer events detected.
Client2: Fetch latest events
Client2: action id: 1 is older than most recent local action, reconcile
Client2: rollback to action just before id: 1 per total logical clock ordering
Client2: Replay action { id: 1, action: completeTask, taskId: 123, clock: ...}
Client2: Replay action { id: 2, action: cancelTask, taskId: 123, clock: ...} <-- This is running exactly the same application logic as the first cancelTask. It can do whatever you want per app semantics. In this case we'll no-op since transition from completed -> cancelled is not valid.
Client2: SYNC -> no newer actions in remote, accepted
Client1: SYNC -> newer actions in remote, none local, fetch newer actions, apply action { id: 2, action: cancelTask, ...}
At this point client1, client2, and the central DB all have the same consistent state. The task is COMPLETE. Data is consistent and application semantics are preserved.
There's a little more to it than that to handle corner cases and prevent data growth, but that's the gist of it. More details in the repo.
The great thing is that state is reconciled by actually running your business logic functions -- that means that your app always ends up in a valid state. It ends up in the same state it would have ended up in if the app was entirely online and centralized with traditional API calls. Same outcome but works totally offline.
Does that clarify the idea?
You could argue that this would be confusing for Client2 since they set the task to cancelled but it ended up as complete. This isn't any different than a traditional backend api where two users take incompatible actions. The solution is the same, if necessary show an indicator in the UI that some action was not applied as expected because it was no longer valid.
edit: I think I should improve the readme with a written out example like this since it's a bit hard to explain the advantages of this system (or I'm just not thinking of a better way)
LLMs might be able to use context to auto-resolve conflicts, often matching the correct user intent automatically.
LLMs could be good at this, but the default should be suggestions rather than automatic resolution. Users can turn on YOLO mode if their domain is non-critical or they trust the LLM to get it right.
The issue is that to preserve the CRDT property, the LLM has to resolve the conflicts in a deterministic and associative way. We can get the first property (although most popular LLMs do not uphold it), but we can hardly get the second one.
I read the comment you're responding to as suggesting a way to resolve the conflicts layered atop the CRDT, not as a component of the CRDT itself. You're very right that LLMs are the wrong tool for CRDT implementation, but using them to generate conflict resolutions seems worth exploring.
Joseph Hellerstein has a series of posts on CRDTs: https://jhellerstein.github.io/blog/crdt-intro/
He very much leans toward them being hard to use in a sensible way. He has some interesting points about using threshold functions over a CRDT to get deterministic reads (i.e. once you observe the value it doesn't randomly change out from under you). It feels a bit theoretical though, I wish there were examples of using this approach in a practical application.
It's a bit like how a static type system provides useful guarantees, but you can still do:
```
fn add(x: num, y: num) = x * y
```
Why do we even need CRDTs? Why can't we have multi-user editors work like multiplayer video games?
The server has the authoritative state; users submit edits, which are then rejected or applied, and the changes pushed to others. The user is always assumed to be online for multiplayer editing. No attempt is made to reconcile independent edits or long periods of offline behavior.
To prevent data loss, when the user is offline and desyncs, he gets to keep his changes and manually merge them back.
I'm sure this isn't a Google-genius-worthy implementation and fails in the incredibly realistic scenario where thousands of people are editing the same spreadsheet at the same time, but it's simple and fails in predictable ways.
Once I was using Slack on a bad WiFi and it was an adventure. What I saw as "sent" others never saw.
Yeah, it's a common optimization technique I've seen from both backend and frontend devs: hide errors and lie about the actual status.
Sure, I mean, that was how early group editing worked, but generally you want to preserve state from both sides (if we both start typing in the same spot, we both add stuff). It also prevents any offline editing, or high-lag editing really. Unlike gaming, which needs to be realtime, this is much softer.
But no, you don't need it.
This needs to be as realtime as WhatsApp. If your internet connection gets bad often enough to have trouble supporting WhatsApp, then my heart goes out to you, but thankfully this is clearly not normal for the most of us most of the time.
And if this happens, your experience is going to be terrible anyway.
The big problem with CRDTs IMO is that they make it incredibly easy to break application semantics.
Just a basic example for a task tracker:
* first update sets task cancelled_at and cancellation_reason
* second update wants the task to be in progress, so sets started_at
If code just uses the timestamps to derive the task state, it would now assume the task is cancelled, which is unexpected since the later user update set it to in progress.
Easy fix, we just add a state field 'PENDING|INPROGRESS|CANCELLED|...'.
Okay, but now you have a task that is in progress, but also has a cancellation timestamp, which seems inconsistent.
The point is:
With CRDTs you have to consider how partial out of order merges affect the state, and make sure your logic is always written in a way so these are handled properly. That is *not easy*!
I'd love it if someone came up with a framework that allows defining application semantics on top of CRDTs, and have the framework ensure types remain consistent.
Do not separate the state field from its time stamp(s). Use a sum type (“tagged union”) where the time stamps are the payload for a selected state. Make invalid states unrepresentable.
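In Python, which has no native sum types, the same idea can be approximated with a tagged union of dataclasses. The field names below just mirror the task example and are otherwise illustrative:

```python
# Each task state carries only the timestamps valid for it, so
# "in progress with a cancellation timestamp" cannot even be constructed.

from dataclasses import dataclass
from typing import Union

@dataclass
class Pending:
    pass

@dataclass
class InProgress:
    started_at: float

@dataclass
class Cancelled:
    cancelled_at: float
    cancellation_reason: str

TaskState = Union[Pending, InProgress, Cancelled]

def describe(state: TaskState) -> str:
    if isinstance(state, Cancelled):
        return f"cancelled: {state.cancellation_reason}"
    if isinstance(state, InProgress):
        return "in progress"
    return "pending"

print(describe(Cancelled(cancelled_at=1700000000.0, cancellation_reason="dup")))
# cancelled: dup
```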
If you want invalid states unrepresentable, and time as a primary key... How do you deal with time regularly becoming non-linear within the realm of computing?
The general answer is to accept that time isn’t linear. In a collaborative editing environment, every event happens after some set of other events based on what has been observed locally on that peer. This creates a directed acyclic graph of events (like git).
That requires a different primary key than the time then, no?
It requires a different primary key than an autoincrementing integer. One popular choice is to use a tuple of (peer_guid, incrementing integer). Or a randomly generated GUID, or a hash of the associated data.
Then each event is associated with zero or more "parent events".
- An event has 0 parents if it is the first change
- An event has 1 parent if it simply came after that event in sequence
- And if an event merges 2 or more branches in history, it says it comes after all of those events
You can also think about it like a set. If I know about events {A, B, C} and generate event D, then D happens-after {A, B, C}. (Written {A,B,C} -> D). But if A->B, then I only need to explicitly record that {B,C} -> D because the relationship is transitive. A -> B -> D implies A -> D.
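The parent-pointer scheme above can be sketched in a few lines; the event ids and graph here are just the {A, B, C, D} example restated:

```python
# Each event records the frontier of events it was created after;
# the full happens-after set is the transitive closure of those parents.

events = {
    "A": [],         # first change: no parents
    "B": ["A"],      # B simply came after A
    "C": [],         # C was created concurrently, without seeing A or B
    "D": ["B", "C"], # D merges both branches
}

def ancestors(event_id: str) -> set:
    seen, stack = set(), list(events[event_id])
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(events[parent])
    return seen

# D happens-after {A, B, C}, but only {B, C} had to be stored explicitly:
assert ancestors("D") == {"A", "B", "C"}
# A and C are concurrent: neither is an ancestor of the other.
assert "C" not in ancestors("A") and "A" not in ancestors("C")
```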
And the moment you need to merge, unrepresentable states become possible.
There are techniques to make it less painful, but it will still be possible.
You mean, like attempting to merge contradictory states? You will need some resolution strategy then, but in general that would be application-specific, and sometimes it may not exist.
Okay... But we're now back to invalid states being possible. Tagging with time isn't enough.
It isn’t enough for what? What are you trying to do?
There may be a way to solve whatever problem you have, but without specifics it’s impossible to tell.
It might be nice if our universe conformed to our intuitions about time steadily marching forward at the same rate everywhere.
Einstein just had to come along and screw everything up.
Causality is the key.
logical clocks
Even with a hybrid logical clock they are not that useful in case of a network partition AND clock drift, no?
There are many ways to solve each individual problem.
The point is that you always have to think about merging behaviour for every piece of state.
Yes, sort of like you have to think about your transaction boundaries in server-side code for every single task.
The difference is that coming up with a correct CRDT solution for application specific consistency requirements can be a research project. In many cases, no CRDT solution can exist.
Can you give some examples?
In my experience, 95% of applications are handled just fine by the sort of JSON types built in to Yjs or automerge. The problems I hear people complain about are things like performance, size on disk and library ergonomics. And the long tail of features - like ephemeral data support and binary assets.
But data mapping seems mostly fine?
I know of a couple of exceptions. Arbitrary nested tree reparenting can be a nightmare. And there aren’t many good rich text implementations out there.
What problems are you actually running into?
There are two rather different issues.
One large class of problems I'm thinking of is simply outside the scope of CRDTs. The whole idea of _eventual_ consistency doesn't really work for things like payment systems or booking systems. A lot of OLTP applications have to be consistent at all times (hence the O). Money must not be double spent. Rooms or seats must not be double booked.
The other class of problems is more debatable. CRDTs can guarantee that collaborative text editing results in the same sequence of letters on all nodes. They cannot guarantee that this sequence makes sense. Authors can step on each other's toes.
Whether or not this is a problem depends on the specific workflow and I think it could be mitigated by choosing better units of storage/work (such as paragraphs rather than letters).
> One large class of problems I'm thinking of is simply outside the scope of CRDTs. The whole idea of _eventual_ consistency doesn't really work for things like payment systems or booking systems.
Yes! I think of it as owned data and shared data. Owned data is data that is owned by one process or node. Eg my bank balance, the position of my mouse cursor, the temperature of my CPU. For this stuff, you don’t want a crdt. Use a database. Or a variable in memory or a file on disk. Broadcast updates if you want, but route all write requests through the data’s owner.
Then there’s shared data - like the source code for a project or an apple note. There, CRDTs might make sense - especially if you get branching and merging support along for the ride.
> Authors can step on each other's toes.
Yeah when merging long lived branches, the workflow most people want is what git provides - of humans manually resolving conflicts. There’s no reason a crdt couldn’t provide this. CRDTs have a superset of the information available to git. It’s weird nobody has coded a system like that up yet.
I think you have the right idea, but possibly the wrong perspective. You want your _source of truth_, which is the "owned data", to be strongly consistent. Your shared data is a "view of truth" which may be incomplete or in disagreement with the source of truth. For example, the color of the sky "right now" depends on where on earth you are standing, but we can all agree that air is 'just barely blue', and that what you see depends on the light shining into it and how much of it there is.
The _source of truth_ are these facts (like "the air is blue" or "the user inserted the letter A at position X" or "the CPU is 40 degrees"). The view of this source is what we see, and can be seen through a CRDT or any other lens.
The way I’m defining it, my shared state is the data we store in a crdt. And CRDTs have strong eventual consistency. That’s what makes them great. So we can have a data structure which shows all users an identical view of the world.
Normally we do that by storing something totally different under the hood. Eg, git actually stores a commit graph. But the system makes a determinism guarantee: we promise that all users who have the same version checked out will see exactly the same thing. At one level, we’re storing “a list of facts” (the commit graph). But at another level of abstraction, we’re just storing application data. It’s just also replicated between many peers. And editable locally without network access.
> So we can have a data structure which shows all users an identical view of the world.
This is never true. You can prove that at some time now()-T where T > 0 you had the same view of the universe, but you cannot prove that you currently have the exact same view because even with the attempt of checking, T becomes greater than 0. Sometimes, this doesn't matter (T can be arbitrarily large and still effectively be zero -- like asking your friend if he is still married to that person. They can answer you days later, and it'll still be true), but sometimes even very small values of T cannot be assumed to be zero.
Well yeah obviously you never know for sure that a remote peer doesn’t have some changes that they haven’t told you about yet. That’s also true with lots of platforms - like google docs and Notion and multiplayer video games. Seems fine though? I don’t understand why this matters for collaborative editing?
Have you ever worked on the same repo with >500 devs? 99% of the time, it doesn’t matter. People talk to people.
Yes; but I have no idea how that connects to anything else we’ve been discussing here.
Pijul is a version control system based on a CRDT: https://pijul.org/manual/theory.html#conflicts-and-crdts
It works like you describe, with humans manually resolving conflicts. The conflicts are represented in the data model, so the data model itself converges without conflicts...if that makes sense.
Conflict-free is right in the name, layering conflicts on top of it would be blasphemy :p
See my comment below, I prototyped something like this. https://news.ycombinator.com/item?id=45180325
Interesting idea. As I understand it though, this wouldn’t give you the kind of conflict semantics I’m talking about out of the box. What I want is - if two users concurrently edit the same line of text, the system can “merge” those changes by storing the conflict. Subsequent readers of the document see a merge conflict and can resolve the conflict manually.
Your system looks like it just enforces a global order on the actions. This will give you SEC - but how do you preserve the information that these edits were concurrent - and thus conflict with one another?
You're right, it's not the same as conflict/merge semantics, but you probably could implement those semantics on top of it. My idea was more about being able to merge offline states for arbitrary data without user intervention while also ensuring that application invariants / semantics are preserved. Preserving app semantics while as much as possible preserving user intentions.
>CRDTs have a superset of the information available to git. It’s weird nobody has coded a system like that up yet.
That's an interesting idea. I have to think about this.
The classical paper-ledger bookkeeping is pretty much eventually consistent. They did not have the Internet when they invented it.
Flight booking is often statistically consistent only. Overbooking, etc.
>The classical paper-ledger bookkeeping is pretty much eventually consistent. They did not have the Internet when they invented it.
Absolutely. Bookkeeping is an offline activity (I'm only doing it once a year in my company, ha ha). You just have to make sure not to record the same transaction more than once, which could be non-trivial but shouldn't be impossible to do with CRDTs.
>Flight booking is often statistically consistent only. Overbooking, etc.
That may be acceptable in some cases but you still can't use CRDTs for it, because you need a way to limit the extent of overbooking. That requires a centralised count of bookings.
Most complex CRDTs are built on top of the simplest CRDT: a grow-only set. I.e., what we actually synchronise over the network is a big bag of commits / operations / whatever, such that the network protocol makes sure everyone ends up with all of the operations known to any peer. Then the CRDT takes that big set and produces some sort of sensible projection from it.
> You just have to make sure not to record the same transaction more than once
So this should be pretty easy. Have a grow only set of transactions. Give each one a globally unique ID at the point of creation. Order by date and do bookkeeping. One thing you can’t guarantee is that the balance is always positive. But otherwise - yeah.
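As a sketch (with made-up transaction ids and amounts), the grow-only set makes the merge trivially safe against recording a transaction twice:

```python
# Each transaction gets a globally unique id at creation; replicas merge by
# set union, which is commutative, associative and idempotent, so duplicates
# collapse and every replica converges on the same ledger.

ledger_a = {("tx-1", 100.0), ("tx-2", -30.0)}
ledger_b = {("tx-2", -30.0), ("tx-3", -50.0)}  # tx-2 was seen by both replicas

merged = ledger_a | ledger_b
balance = sum(amount for _tx_id, amount in merged)

assert len(merged) == 3   # tx-2 is counted exactly once
print(balance)            # 20.0
```

As noted above, the one thing this cannot guarantee is that the balance stays positive: that is a global invariant, not a property of the merge.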
> The point is that you always have to think about merging behaviour for every piece of state.
CRDTs can't eliminate the requirement to think about what the consistent states are.
I prototyped exactly such a framework! It's designed to solve exactly the problem you mentioned. It’s a super interesting problem. https://github.com/evelant/synchrotron
The gist is:
* Replicating intentions (actions, immutable function call definitions that advance state) instead of just replicating state.
* Hybrid logical clocks for total ordering.
* Some client side db magic to make action functions deterministic.
This ensures application semantics are always preserved with no special conflict resolution considerations while still having strong eventual consistency. Check out the readme for more info. I haven’t gotten to take it much further beyond an experiment but the approach seems promising.
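For the curious, here is a rough illustration of how a hybrid logical clock keeps causal order despite physical clock drift. This is the generic HLC algorithm, not Synchrotron's actual implementation:

```python
# Hybrid logical clock sketch: timestamps are (logical_time, counter, node_id)
# tuples; tuple comparison gives a total order that respects causality even
# when a node's physical clock lags behind.

def hlc_send(local: tuple, physical_now: int, node_id: str) -> tuple:
    """Advance the clock for a locally generated event."""
    l, c, _ = local
    if physical_now > l:
        return (physical_now, 0, node_id)
    return (l, c + 1, node_id)

def hlc_recv(local: tuple, remote: tuple, physical_now: int, node_id: str) -> tuple:
    """Merge a remote timestamp so the new stamp dominates both."""
    l, c, _ = local
    rl, rc, _ = remote
    new_l = max(l, rl, physical_now)
    if new_l == l == rl:
        new_c = max(c, rc) + 1
    elif new_l == l:
        new_c = c + 1
    elif new_l == rl:
        new_c = rc + 1
    else:
        new_c = 0
    return (new_l, new_c, node_id)

a = hlc_send((0, 0, "A"), physical_now=10, node_id="A")    # (10, 0, 'A')
b = hlc_recv((0, 0, "B"), a, physical_now=5, node_id="B")  # B's clock is behind
assert b > a  # the receive still orders after the send, despite the drift
```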
Nice, will have a look!
I've had similar thoughts, but my concern was: if you have idempotent actions, then why not just encode them as actions in a log. Which just brings you to event sourcing, a quite well-known pattern.
If you go that route, then what do you need CRDTs for?
Doesn't event-sourcing imply that there's a single source-of-truth data store you can source them from? I'm not sure event sourcing says anything about resolving conflicts or consistency.
The pattern I came up with is similar to event sourcing but with some CRDT and offline-first concepts mixed in. By using logical clocks and a client side postgres (pglite) it doesn't have to keep the entire event history for all time and the server side doesn't have to process actions/events at all beyond storing them. The clients do the resolution of state, not the server. Clients can operate offline as long as they like and the system still arrives at a consistent state. AFAIK this is different than most event sourcing patterns.
At least in my thinking/prototyping on the problem so far I think this solution offers some unique properties. It lets clients operate offline as long as they like. It delegates the heavy lifting of resolving state from actions/events to clients, requiring minimal server logic. It prevents unbounded growth of action logs by doing a sort of "rebase" for clients beyond a cutoff. It seems to me like it maximally preserves intentions without requiring specific conflict resolution logic. IMO worth exploring further.
A CRDT is any data structure that meets the definition (associative, commutative, idempotent, etc...)
Event Sourcing is not strictly designed to achieve eventual consistency in the face of concurrent writes though. But that doesn't mean it can't be!
I've also been considering an intent based CRDT system for a while now (looking forward to checking out GPs link) and agree that it looks/sounds very much like Event Sourcing. It's worth while being clear on the definition/difference between the two though!
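To make that definition concrete, here is a grow-only counter (G-Counter), one of the simplest structures meeting those properties:

```python
# G-Counter: per-replica counts merged with element-wise max. The merge is
# commutative, associative and idempotent, which is what makes this a CRDT
# rather than a replayed event log.

def merge(a: dict, b: dict) -> dict:
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}

def value(state: dict) -> int:
    return sum(state.values())

r1 = {"peerA": 3, "peerB": 1}
r2 = {"peerB": 2, "peerC": 4}

assert merge(r1, r2) == merge(r2, r1)                        # commutative
assert merge(r1, r1) == r1                                   # idempotent
assert merge(merge(r1, r2), r1) == merge(r1, merge(r2, r1))  # associative
print(value(merge(r1, r2)))  # 3 + 2 + 4 = 9
```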
I wonder how does this handle a modify-rename conflict? e.g. there's a file identified by its name `a` and one client renames it to `b` while another client tries to modify the contents of `a`. Once you replay it in this order does the intent of modifying the contents of what was once `a` remain?
I know you can use some unique persistent ids instead of names, but then you get into issues that two clients create two files with the same name: do you allow both or not? What if they initially create it equal? What if they do so but then they modify it to be different?
It would be up to application logic. This prototype essentially offers the same behavior you would get with a traditional backend API except it works offline. The results would be the same as if clients made those calls to a backend api, that is, up to application logic. My idea was that it's essentially impossible to have generic "conflict resolution" that follows arbitrary business rules so I made the business rules _be_ the conflict resolution. For any given situation the answer to "how would it handle a then b then c" is "the same as any normal backend api, per regular business logic, except it works offline".
Don’t you also have to consider this just as much without CRDT? Not saying it isn’t a real issue, but this example could easily be a problem with a more traditional style app - maybe users open the record on their web browser at same time and make different updates, or they update the different timestamp fields directly in a list of tasks.
Sure, but you can usually rely on database transactions to handle the hard part.
Yes!
And many CRDT implementations have already solved this for the styled-text domain (e.g. bold and italic can be additive, but color cannot, etc.).
But something user definable would be really useful
The big idea behind CRDTs is that a data structure can have replicas synchronizing on a best-effort basis. That is much closer to the physical reality: server here, client there, phones all over the place.
The basic CRDT ideas are actually pretty easy to implement: add some metadata here, keep some history there. The difficulty, for the past 20 years or so, is making the overheads low, and the APIs understandable.
Many projects revolve around some JSON-ish data format that is also a CRDT:
- Automerge https://automerge.org (the most tested one, but feels like legacy at times, the design is ~10yrs old, there are more interesting new ways)
- JsonJoy https://jsonjoy.com/
- RDX (mine) https://replicated.wiki/ https://github.com/gritzko/go-rdx/
- Y.js https://yjs.dev/
Others are trying to retrofit CRDTs into SQLite or Postgres. IMO, those end up using last-write-wins in most cases. Relational logic steers you that way.
Could you explain more about Automerge being legacy? They recently released a new major version and revamped their algorithm I believe. What is better about your version?
Automerge is based on design decisions from 2014-2017. I remember that epoch and what everybody thought back then. The views have evolved since, but Automerge is a large project, and many things have been built on top. It is unrealistic to change any of the basic assumptions at this point.
I may go into the technical details, assuming my hourly rate is respected.
Go on. I respect your hourly rate.
How is your version better?
That's a long list.
- RDX has a spec, so it can have compatible implementations. The result of a merge is specified to a bit. Automerge works the way Orion coded it (this time).
- There are equivalent text and binary formats, JDR and RDX.
- The RDX palette of types is richer; Automerge's is narrower than JSON's.
- RDX can work in any commodity LSM db, natively, in the core (and it does).
- and so on...
TIP: define acronyms the first time you use them, and don't put them in headings.
Conflict-free replicated data types (CRDTs) https://en.wikipedia.org/wiki/Conflict-free_replicated_data_...
The majority of the content reminds me of Bigtable (https://static.googleusercontent.com/media/research.google.c...).
Do people really distinguish "Strong Eventual Consistency" from "Eventual Consistency"? To me, when I say "Eventual Consistency" I always mean "Strong Eventual Consistency".
(Non-Strong) Eventual Consistency does not guarantee that all replicas converge in a specific time period.
In an eventually consistent system, replicas can temporarily diverge. A "last write wins" system can be eventually consistent, but at a given point in time different replicas can read differently.
E.g., two operations:
1) Add "AA" to the end of the string
2) Split the string in the middle
Replicas R1 and R2 both start with the string "ZZZZ".
If R1 sees (1) then (2), it gets "ZZZZAA", then "ZZZ", "ZAA".
If R2 sees (2) then (1), it gets "ZZ", "ZZ", then "ZZAA", "ZZ".
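The divergence can be sketched in a few lines, modeling the state as a list of string fragments with each operation applied to the first fragment, as in the example:

```python
def add_aa(parts):
    # Op (1): add "AA" to the end of the (first) string
    return [parts[0] + "AA"] + parts[1:]

def split_middle(parts):
    # Op (2): split the (first) string in the middle
    s = parts[0]
    mid = len(s) // 2
    return [s[:mid], s[mid:]] + parts[1:]

r1 = split_middle(add_aa(["ZZZZ"]))  # R1 sees (1) then (2)
r2 = add_aa(split_middle(["ZZZZ"]))  # R2 sees (2) then (1)

print(r1)  # ['ZZZ', 'ZAA']
print(r2)  # ['ZZAA', 'ZZ']
```

The two operations don't commute, so the replicas end up in different states and stay there: that is divergence, not eventual consistency.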
Strong Eventual Consistency doesn't have this problem because the operations carry a time vector, so the replicas know what order to apply them in.
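A sketch of the "time vector" idea: a version vector (a map of replica id to counter, the textbook representation rather than any specific library's) lets a replica tell whether two operations are ordered or concurrent.

```python
def compare(vc_a, vc_b):
    # Compare two version vectors (replica id -> counter).
    keys = set(vc_a) | set(vc_b)
    a_le_b = all(vc_a.get(k, 0) <= vc_b.get(k, 0) for k in keys)
    b_le_a = all(vc_b.get(k, 0) <= vc_a.get(k, 0) for k in keys)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "happened-before"
    if b_le_a:
        return "happened-after"
    return "concurrent"  # neither dominates: needs a commutative merge

print(compare({"r1": 1}, {"r1": 1, "r2": 1}))  # happened-before
print(compare({"r1": 2}, {"r2": 1}))           # concurrent
```

Note that the vector only detects concurrency; the data type itself still has to merge concurrent operations in a way that commutes.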
I’m not sure I follow. How would this be eventually consistent at all? It looks like the two peers in your example simply have divergent state and will never converge.
That precludes us from having side effects such as idempotent triggers, right?
You're not describing an eventually consistent system, you're describing a system that diverges. By definition, eventually consistent means that, after some time, all readers across the entire system are guaranteed to find the same values, even if before that time they may see different values.
Any eventually consistent system has to have a strategy for ensuring that all nodes eventually agree on a final value. R1 and R2 need to communicate their respective states and agree on a single one of them - maybe using timestamps (if R2's value is newer, R1 will replace its own value when they communicate), maybe using a quorum (say there is also an R3 which agrees with R1; then R2 will change its value to match the other two), maybe using an explicit priority list (say, R1's value is assumed better than R2's).
I think there are a lot of systems that have a separate node syncing feature, where two nodes can receive updates, apply them to their local replica right away, and only communicate them and reconcile with the other backend nodes at a later time.
Interesting that neither the article nor the comments mention the CALM theorem [0], which gives a framework to explain when coordination-free consistency is possible, and is arguably the big idea behind SEC.
[0] https://arxiv.org/abs/1901.01930
Would this be a suitable data structure to distribute node state for caching indices? Let's say two nodes each hold a set of N (possibly overlapping) keys, and I want both to know all of each other's keys for request routing (a request for n ∈ N goes preferably to the node with n in its local cache).
Yes. I think G-Sets (3.3.1) are what you are looking for.
https://dsf.berkeley.edu/cs286/papers/crdt-tr2011.pdf
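A G-Set is tiny to sketch: merge is set union, which is commutative, associative, and idempotent, so replicas converge regardless of gossip order (class and method names here are just illustrative):

```python
class GSet:
    # Grow-only set CRDT: elements can only be added, never removed.
    def __init__(self, items=()):
        self.items = set(items)

    def add(self, x):
        self.items.add(x)

    def merge(self, other):
        # union is commutative, associative, and idempotent
        self.items |= other.items

# two cache nodes advertising which keys they hold
a = GSet({"key1", "key2"})
b = GSet({"key2", "key3"})
a.merge(b)
b.merge(a)
assert a.items == b.items == {"key1", "key2", "key3"}
```

One caveat for the cache-routing use case: a G-Set cannot represent evictions. If keys can leave a node's cache, you'd want a removal-capable variant such as a 2P-Set or OR-Set (also covered in the linked paper).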
Thank you for the link, that looks very interesting
happy to help!
CAP applies here.
If you ask your cache for a value, it could choose to reply now, with the information that it has - favouring A.
Or it could wait and hope for more accurate information to return to you later, favouring C.
'Cache' seems to imply that it's built for availability purposes.
I'd like to point out the following open-source SQLite project: https://github.com/sqliteai/sqlite-sync with a built-in cross-platform network layer.
P.S. I am the author of the project.
At first it made no sense, but then I realised what the author is saying: in a distributed system, when you make local changes, you do not wait for them to propagate to all participants and back to you before your local state is considered consistent with the global state. Rather, it is considered consistent with the global state immediately, even before your local changes leave your system. In other words, every change committed into the distributed system is immediately consistent with the global state, even if there are undelivered changes, because eventually all the changes produce the same outcome.
In a specific use case that might apply. For example, if two people edit the same document and fix the same typo, the visual outcome is the same, no matter who made the change first or last.
But that is very niche. If we take programming code, someone can change a line of code that someone else is changing as well, and the two changes might be identical, but there are other lines of code that might not be, and then you end up with code that won't compile. In other words, if we focus on a singular change in isolation, this makes sense. But that is essentially never the case in distributed environments; we have to look at the broader picture, where multiple changes made by someone are related or tied to each other and do not live in isolation.
Either way, I see nothing useful here. You can "render" your local changes immediately vs. waiting for them to propagate through the system and return to you. There is very little difference, and in the end it is mostly about a proper diffing approach and has little to do with the distributed system itself.
PS: the problem here is not really the order of applied changes for the local consumer, as in editing a shared Word document. The problem is that if we have a database and commit a change locally, but then someone else commits a different change elsewhere, like "update users set email = foo@bar where id = 5", then until we receive that other, later change we serve clients stale data. That is the main issue of eventual consistency here.

As I am running a system like this, I have to use "waiters" to ensure I get the correct data. For example, when a user creates some content via the web UI and is redirected back to the list of all content, this happens so fast that the distributed system has not had enough time to propagate the changes, so the user will not see his new content in the list - yet.

For this scenario, I use a correlation id that I receive when the content is created and put into the redirect. When the user moves to the page that lists all the content, this correlation id is detected and a network call is made to the appropriate server, whose sole purpose is to keep the connection open until that server's state has caught up to the provided correlation id. Then I refresh the list of content to present the user the correct information - all of this while a loading indicator is shown on the page. There is simply no way around this in distributed systems, and so I find this article of no value (at least to me).
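The "waiter" described above can be sketched as a condition variable keyed on correlation ids (all names here are hypothetical; a real implementation would hang this off an HTTP handler and the replication stream):

```python
import threading

class Waiter:
    # Blocks readers until the local replica has applied a given
    # correlation id. Illustrative sketch, not from a real system.
    def __init__(self):
        self.applied = set()
        self.cond = threading.Condition()

    def mark_applied(self, correlation_id):
        # called by the replication stream as changes arrive
        with self.cond:
            self.applied.add(correlation_id)
            self.cond.notify_all()

    def wait_for(self, correlation_id, timeout=5.0):
        # the "keep the connection open" part: block until caught up,
        # or give up after the timeout
        with self.cond:
            return self.cond.wait_for(
                lambda: correlation_id in self.applied, timeout=timeout)

w = Waiter()
# simulate the change arriving over replication 100 ms later
threading.Timer(0.1, w.mark_applied, args=("c-42",)).start()
assert w.wait_for("c-42")  # True once the replica has caught up
```

In effect this trades read latency for read-your-writes consistency on exactly the requests that need it, which is the point the commenter is making.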
This blogger has a lot of great takes. I bet he'd make a great addition to any team.