dragonwriter 4 years ago

This looks good but some of the background has the air of the kind of artificial problems presented in late night infomercials:

> It’s impossible to split a single payment into multiple payouts, since there is a many-to-one relationship of payments to payouts.

So your model used a many-to-one relationship when you really wanted a many-to-many?

> Since this is just a SQL database, there’s nothing preventing the payouts from becoming inconsistent.

Preventing inconsistencies by enforcing constraints is a key point of an RDBMS. This is literally why people use SQL DBs.

> The payout_id can be ensured to be a valid foreign key, but nothing is stopping it from being nulled out.

You mean, like a NOT NULL constraint? Or, if you mean zeroed out, an appropriate constraint and/or before update trigger protecting the amount?

  • inopinatus 4 years ago

    I think this mocking of their intention and problem statements is misplaced.

    > So your model used a many-to-one relationship when you really wanted a many-to-many?

    In many schema discussions you’d be right, but in the case of payments systems it’s not the case. Payments and payouts are separate transactions and only loosely related. The allocation of credit from one leads to an increment in the settlement of the other, but there’s no structural relationship in the journal itself.

    > Preventing inconsistencies by enforcing constraints is a key point of an RDBMS. This is literally why people use SQL DBs

    The inconsistencies in question relate to the domain model of accounting, not referential integrity or other (normally) easily enforced database invariants.

    > You mean, like a NOT NULL constraint? Or, if you mean zeroed out, an appropriate constraint and/or before update trigger protecting the amount?

    When you want to archive off aged records, or constrain across db shards, this is a serious issue. Taking the piss doesn’t make it go away. But in that particular paragraph I believe they’re referring to the payout itself become void e.g. due to a reversal, and having designed around similar issues I understand their concern. It’s exactly where single-column ledgers become a huge pain in the audit.

    • jimktrains2 4 years ago

      > The inconsistencies in question relate to the domain model of accounting, not referential integrity or other (normally) easily enforced database invariants.

      Constraints can be more than referential integrity. When PostgreSQL released transition tables, I built a very small proof-of-concept using them to constrain a double-entry (-like) system. https://github.com/jimktrains/pgdea allows any number of debits and credits in a transaction, but the whole transaction must have a 0 balance.

    • WorldMaker 4 years ago

      >> You mean, like a NOT NULL constraint? Or, if you mean zeroed out, an appropriate constraint and/or before update trigger protecting the amount?

      > When you want to archive off aged records, or constrain across db shards, this is a serious issue. Taking the piss doesn’t make it go away. But in that particular paragraph I believe they’re referring to the payout itself become void e.g. due to a reversal, and having designed around similar issues I understand their concern. It’s exactly where single-column ledgers become a huge pain in the audit.

      My impression from the article was more the "baby record" problem: not everything has been paid yet. They were using NULL to indicate "not paid" and thus couldn't put a NOT NULL constraint on that column because it was a valid value for new rows. There are ugly workarounds that could have allowed them to add NOT NULL such as using a special/magic "guard" payment that indicates not paid instead. (Going down that route you'd probably want a "guard" payment per client and then you're getting close to a backwards form of double-entry accounting.)

  • nabilhat 4 years ago

    It wouldn't be too remarkable if they stopped there. It's not unusual for enterprise-targeted application databases to return nothing when one queries sys.foreign_keys. However, they've done the extra credit work to redefine double entry bookkeeping for some indiscernable reason.

    > In fact, for the ease of modeling we relaxed how we reason about it and we don’t stick to standard Double Entry Accounting which has debit-normal and credit normal books which determine the sign as we prefer to consistently treating debits as positive and credits and negative.

    Surely some of their customers have competent accountants maintaining the business's books. This unasked-for effort will make their job harder.

    • dragonwriter 4 years ago

      The replacement of the debit-normal and credit-normal account structure with simple use of positive and negative numbers is not unprecedented in computerized double-entry accounting systems (ledger-cli and it's ports and offshoots do this, too); it's quite natural since the credit-debit system was itself a hack around the medieval European resistance to negative numbers and a way to reduce subtraction operations by grouping everything into addition of columns of positive numbers and then one substraction of the two column totals.

      > Surely some of their customers have competent accountants maintaining the business's books. This unasked-for effort will make their job harder.

      Uh, why? It's trivial to present information tracked as positive debits and negative credits in two positive columns; keeping the internal architecture of the database this way should be irrelevant to the end user, it's just avoiding using a hack designed to optimize the experience of people doing a manual process with medieval European technology and attitudes towards negative numbers in an automated backend where none of the problems it mitigates exist in the first place.

    • inopinatus 4 years ago

      That’s actually pretty common for payment systems where you care about the journal balancing but every account is basically a liability because it’s all someone else’s money.

      You just end up with an apparently negative balance in the income account that represents your own cut due to fees etc. From the GL point of view it looks like a subledger with a contra account but that’s nothing to faze a competent accountant. Back in the 13th century it was probably tricky to represent on an abacus, not so much now.

  • elvinyung 4 years ago

    (Disclaimer: I work at Square, but not on Books.)

    >Preventing inconsistencies by enforcing constraints is a key point of an RDBMS. This is literally why people use SQL DBs.

    This comment is not specific or related to the work at hand, but note that this is only true for a single-machine database or a sharded setup in which you never have to perform transactions across shards. That is, the ACID guarantees are only enforceable in most RDBMS if all your data is on the same machine.

    • josephg 4 years ago

      Huh? Am I missing something obvious here? Plenty of databases support ACID guarantees across multiple machines in a cluster. FoundationDB and CockroachDB come to mind, but I’m sure there are others. What you’ve built is cool, but I’m still confused why existing tools aren’t good enough.

      From the Cockroach documentation (source: https://www.cockroachlabs.com/docs/stable/transactions.html )

      > CockroachDB supports bundling multiple SQL statements into a single all-or-nothing transaction. Each transaction guarantees ACID semantics spanning arbitrary tables and rows, even when data is distributed.

      • elvinyung 4 years ago

        Right -- that's why I said most RDBMS. To get ACID and CAP-consistent (which, as an aside, are not the same thing) cross-shard transactions requires some form of migrating your production system onto such a distributed-capable RDBMS that you describe (of which, Spanner arguably fits the description of).

        • josephg 4 years ago

          The thing I'm confused about is why migrating your production system onto a custom in-house data store (that you've written and you maintain) is easier / less work than migrating onto something you can use off the shelf which solves the problem you have.

          I hear the argument that migrating from (say), postgres -> postgres + books is easier than moving everything into cockroachdb. But why is postgres + books easier than postgres + cockroach? The latter doesn't require you to write your own database from scratch. (And writing your own database is fun, so if you're doing it for that reason I understand, but thats not what the post says.)

          Edit: Ah, I see - my confusion was because I missed the part which mentioned that this is just built on top of spanner. That makes sense!

          • elvinyung 4 years ago

            Yep :P

            For a fun fact, CockroachDB was actually started by ex-Squares, so we're definitely very familiar with this exact same argument internally, back in the day when neither CockroachDB nor Spanner existed as a viable option :)

      • justinjlynn 4 years ago

        Not everyone has the luxury of using databases like those in high-stakes production.

  • rtpg 4 years ago

    Yeah this is really confusing to me. Especially given that tenants are unlikely to mix books, just out of the box multitenant Citus + Postgres would probably get them where they needed with super good perf characteristics.

    Hindsight and all that, and every team has inhouse stuff that could probably be gotten rid of, but this seems like a major operational danger compared to relying on more battle-tested stuff. They should probably get the Jepsen guy on this and see if he can get race conditions

  • throwawayjava 4 years ago

    Infomercials work.

    And not just for overpriced knives.

achiang 4 years ago

Nice article, I'd be curious to know if/how their engine abstracts the complexity behind multi-currency transactions, or whether they rely on the accounting model to handle multi-currency.

We built our own double-entry accounting engine at my previous company, and while the engine was not as fancy as what Square describes, the real challenge was building out the accounting models that manipulated the engine's primitives.

To this day, I have yet to find another resource on multi-currency that is as solid as this one:

https://www.mathstat.dal.ca/~selinger/accounting/tutorial.ht...

  • Exuma 4 years ago

    Damn that link is amazing for accounting

inopinatus 4 years ago

I’ve been involved with several two-sided marketplaces and payment services, and a robust double-entry book-keeping subsystem has always been there from the start. It’s easy to validate the need but/so I also find it hard to believe that Square only started keeping 2E transaction books in the last year. It’s the basis of many high quality controls and their internal audits and reconciliation to GL must’ve been a horrorshow without it.

It’s also the original event-sourced conflict-free replicated data type and the joke about CRDT for your debits is a classic thigh-slapper in the exciting world of accounting software backend implementation

Dowwie 4 years ago

I found it a lot of fun writing a double entry system, but the fun stops when implementation is finished and the "real work" begins. Next comes a costly and risky endeavor of defining the rules governing accounting events transacted by the AIS. This is where real difficulty and liability lies. An event triggers a series of transactions that must be financially sound and sufficient for meeting the expectations of an audit. Competent, qualified people must define these rules. These rules aren't universal truths. Accounting rules aren't black and white. This lead me to imagine that it is difficult, if not impossible, to define universally-applicable accounting events across all kinds of business. With that given, Square might be able to achieve these rules exclusively for the mom and pop retail customers whose types of business transactions are very similar, but the truth of this can only be revealed by third party auditors.

So, approach this space with a healthy dose of skepticism until its been appropriately vetted and certified by third party auditors.

j88439h84 4 years ago

For personal use, Beancount[1] looks like a good double-entry bookkeeping tool.

[1] https://docs.google.com/document/d/1RaondTJCS_IUPBHFNdT8oqFK...

jasim 4 years ago

Ask the same programmer to write an invoicing & payments system, then coupon creation and track its claims, then something where users can recharge their phone, and their usage is tracked and balance is computed -- and each one of them will be a set of ad-hoc tables and custom code.

But most transactional systems where things move from one entity to another and has the notion of "balance value" is best represented by the double-entry system.

When a customer purchases pre-paid balance, the phone company can record it as a "credit", and every call they make becomes a "debit". This is an immutable log - one of the earliest application of immutable data structures in human history.

In this post authors used positive and negative values to represent debit and credit - we could call it yin and yang for all it mattered. The core principle is just that every transaction has a source and destination and the ledger is an append-only table.

  • chris_wot 4 years ago

    What it really shows is the flow of money through the system. That’s really the crux of double-entry bookkeeping.

UtahDave 4 years ago

Isn't this just Accounting 101?

I'm a little surprised they didn't use double entry accounting from the beginning.

adamcharnock 4 years ago

In case it is of use to anyone on this thread, I’ve developed ‘django-hordak’, the core of double entry accounting system based on Django.

It provides models with Postgres constraints to ensure the double entry accounting rules are not broken.

It isn’t the same scale as Books, but I imagine that won’t matter for a lot of use cases.

https://github.com/adamcharnock/django-hordak

nevi-me 4 years ago

Shouldn't an accounting database service (specifically, the ledger) be immutable and double-entry in any case?

> To address consistency, we picked a well-established, public-domain, battle-tested approach to modeling financials that enables all of our properties ...

I think doing anything other than double-entry leads to reinventing the wheel. When a ledger system is implemented correctly, the main remaining issue becomes scalability.

wolfspider 4 years ago

Excellent work and a great write-up! I’ve been implementing double blind accounting systems the past decade or so and what they teach in accounting classes is spot on there just isn’t much need for more complexity. There is the added benefit of staying PCI compliant and reconciling things the right way. Doing splits like this is hard work but I gather there are lots of powerful tools out there to help nowadays. Having worked with NACHA and Wells Fargo getting those ACHs done in one fell swoop is extremely innovative actually the system doesn’t lend itself to this very well.

andriosr 4 years ago

Great write up, but I don’t see how it could scale with the pessimistic lock of Spanner.

Updating the current balance of big merchants, which are receiving multiple payments per second seems to create a lot of lost transactions, with the locking of the balance.

That would generate deadlocks in RDBMSs, curious to know if Spanner is able to scale it.

I built a similar system using DynamoDB and optimistic locking, and for that I had to remove concurrency of updates in the same document, meaning you can get ˜30 updates of the same document per second, with each update taking ˜30 milliseconds.

  • WorldMaker 4 years ago

    That is the kind of content I was hoping for in this article. The review of double-entry accounting is nice and all, but the scale details would have been nice beyond hand-waving "we left it to Spanner, and so we followed Spanner's rules".

    Similarly a quibble about the "immutable" in the headline given that they are still using a mutable field at the book level for balances to make them easier to query. I guess I was hoping they'd have solved a nicer merkle tree balancer or something for balances. Not that I was hoping for a "blockchain" solution, but something smarter than blockchain for "centralized" trust environments like Square.

z3t4 4 years ago

When making a double entry accounting database, the users should only have INSERT and SELECT privileges to the table, so in order to correct a mistake you make another insert. There is really no scaling problem as the data is immutable. And at the end of the year you can achieve the ledger - so you don't have to map over it to check the account status - just put the sum of the last year as the first insert. A database service would still be useful though, in order to make sure you do not edit the values.

joshuakelly 4 years ago

I actually submitted a YC application for the upcoming batch to work, more-or-less, on exactly this idea - a double entry accounting service for developers.

  • jen_h 4 years ago

    This sounds like something internal...I don't think developers or typical small businesses need (or would want to!) to store their double-entry accounting database in shards across Google Cloud Spanner & Kubernetes clusters...

    Good luck with your application -- I, for one, think there's a need for it. I've wished for years that someone would put something like a secure, web-based Gnucash out into the world.

    • sudhirj 4 years ago

      Doesn't necessarily need to be hosted, although given that people are fine with running Quickbooks or any other SaaS it's hard to see why they wouldn't be fine with ledger SaaS.

      Can still make a case for licensing, though, depending on the features.

      We built this for our internal accounting, it's a ledger service that enforced immutable double entry rules and can run as a container with a Postgres DB configured. https://github.com/realimage/qledger

      • jen_h 4 years ago

        Yeah, people have been using things like Mint for ages, too, and it's not like professional accountants hosting Quickbooks on their ancient Windows boxen is any less terrifying from a security perspective than a ledger SaaS startup...

        Personally, I'm likely to stick with Gnucash until or unless it becomes completely unsupported, but Intuit needs some real competition and it would be great to have more choice available.

        gledger looks like a neat project, will check it out!

      • painted 4 years ago

        do you also have some kind of frontend for it? how do you see what's the flow of money and reporting? does it work with kibana out of the box?

        • sudhirj 4 years ago

          Each reporting system / products does its own reporting, and no plans of adding ELK integration right now, although we are moving to the ELK stack so it might happen.

          We think of it more as a database++ with financial primitives baked in, so we want to keep it very simple and dumb. It’s not a complete solution - it’s a book to write in, and you need another service that decides what and how to write and writes it (we call it the bank or teller service, and it’s all business logic for our business). Each product has its own bank to teller service, but they all use the same ledger underneath.

    • joshuakelly 4 years ago

      Got an interview email tonight, wish me luck!

markdown 4 years ago

The headline makes it sound like they'd released a new SAAS app, but it looks like they're just talking about internal infrastructure.

stdbrouw 4 years ago

I don't see how this schema enforces double-entry accounting. You can specify an entry with a debit, and another with a credit, but they are not linked, their amounts are not necessarily equal and you're not required to simultaneously submit both entries. So this is really a single-entry accounting system that they happen to use to do double-entry accounting. Or am I missing anything?

For those of you reading this who wish to implement a simple system for double-entry accounting, do it like this instead: specify `debit` and `credit` columns with a foreign key to a particular account (what Square calls a "book"), and a field for the `amount`. When, later, you want to calculate the balance of an account, take the sum of all transaction `amount` where `debit = <account>` and subtract the sum of all transaction `amount` where `credit = <account>`.

tdhoot 4 years ago

Weird how it still mentions Caviar, even though Square sold it. I guess the blog post was written some time ago.

limeman 4 years ago

So other accounting software provides mutable ? Square are you serious? Just adding a authentication layer over it and calling immutable

  • chris_wot 4 years ago

    I don’t think you quite understand what they are doing. Once the entry is in the ledger, it can’t be changed. To correct it you must add in a compensating transaction that corrects the transaction - almost certainly a reversal.

    Basically, it’s just a list of transactions you can only append to. You can’t delete any of the entries or modify them - they really are immutable.

codedokode 4 years ago

What's good in immutability? If you have found that you have entered too much income and want to lower it, what do you do?

  • whb07 4 years ago

    You generate a new item altogether ( I’m talking about immutability). The positives are you’ll have an ability to keep proper records and changes done to your table which will show a different snapshots as it contained different values.

    Another plus is that you won’t run into an issue like race conditions where two threads are attempting to change the same thing.

  • nevi-me 4 years ago

    You reverse it with an adjusting entry.

    What happens if you've transferred too much from your savings account into your current/cheque account? You make another transaction to adjust the initial one.

    • codedokode 4 years ago

      But then the taxation service can find out that you have lowered your profit to pay less taxes. So what's good in immutability?

CDokolas 4 years ago

Looks like they've discovered double-entry accounting...

...at last ;)

tantalor 4 years ago

Exactly what does "immutable" mean here?

  • dragonwriter 4 years ago

    The natural interpretation would be that entries once posted could not be changed.

    Reading the page, that's exactly it; journal entries are an append-only log.

  • timemachine 4 years ago

    Rows in the database are never updated or deleted. Only new rows are added.

    • Animats 4 years ago

      A blockchain could be used to enforce that immutability. Which might be an actual, useful application for blockchains.

      • Scarblac 4 years ago

        By itself a block chain just stores hash with each block, they could be recomputed after a change. Just like how Git history can be changed.

        Unless you're talking about a distributed public no-trust ledger, yes that can be solved with block chain as it's what Bitcoin is. But that's not a problem they're having.

      • chris_wot 4 years ago

        Sure, but that’s only needed in a purely distributed environment. There is absolutely no need for a blockchain in a centralised private database you have full control over.

      • mateuszf 4 years ago

        > A blockchain could be used to enforce that immutability. Which might be an actual, useful application for blockchains.

        Yes, and surprisingly the implementation is called Bitcoin ;)

jhoechtl 4 years ago

Where is the link, how can I use it?

banq 4 years ago

this is event sourcing

  • dragonwriter 4 years ago

    In that event sourcing is largely inspired by accounting record keeping and how various accounting statements are traditionally implemented as computed functions of the ledger entries.