pscanf 7 days ago

The problem with using Markdown for this is that it's unstructured, so when the LLM does calculations on it, extracting data is error-prone: you're never sure exactly what gets extracted, and it's then a hassle to verify (as the author had to do).

In my app Superego (https://github.com/superegodev/superego, shameless plug) I use structured JSON documents precisely for this reason, because with a well-defined schema the LLM can write TypeScript functions that must compile. This doesn't guarantee correctness, of course, but it actually goes a long way.
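
To illustrate the point about structure, here is a minimal sketch in Python rather than Superego's actual JSON/TypeScript setup. The `Expense` schema, its field names, and the sample documents are all hypothetical. Python has no compile step, so the check here happens when the data is loaded: a calculation reads named, typed fields instead of parsing free text, and a malformed record fails loudly rather than being silently misread.

```python
import json
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Expense:
    """Hypothetical schema for one expense document."""
    day: date
    category: str
    amount: Decimal

    @staticmethod
    def from_json(raw: dict) -> "Expense":
        # Fail loudly on missing or malformed fields instead of guessing.
        return Expense(
            day=date.fromisoformat(raw["date"]),
            category=str(raw["category"]),
            amount=Decimal(raw["amount"]),  # amounts are stored as strings
        )


def total_by_category(expenses: list[Expense], category: str) -> Decimal:
    """Sum the amounts of all expenses in the given category."""
    return sum((e.amount for e in expenses if e.category == category), Decimal("0"))


# Made-up sample documents.
documents = json.loads("""[
  {"date": "2025-01-15", "category": "office", "amount": "120.50"},
  {"date": "2025-02-03", "category": "travel", "amount": "89.99"},
  {"date": "2025-03-20", "category": "office", "amount": "45.25"}
]""")
expenses = [Expense.from_json(d) for d in documents]
office_total = total_by_category(expenses, "office")
print(office_total)
```

A record missing its `amount` field raises a `KeyError` in `from_json`, which is the kind of early failure that free-text extraction doesn't give you.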

But doing my taxes was a use case I hadn't considered, and it's actually pretty neat! I'll be trying it myself next month (though I'm not looking forward to it).

  • iamspoilt 6 days ago

    > I use structured JSON documents precisely for this reason, because with a well-defined schema the LLM can write TypeScript functions that must compile.

    The intention of this is to reduce hallucination on information extraction, right?

    Also, how do you convert your docs / information into JSON documents?

    • pscanf 6 days ago

      > The intention of this is to reduce hallucination on information extraction, right?

      Correct.

      > Also, how do you convert your docs / information into JSON documents?

      Right now you have to add it yourself to the database. The idea is that you use Superego as the software in which you record your expenses / income / whatever, so the data is naturally there already.

      But I'm also working on an "import from csv/json/etc" feature, where you drop in a file and the AI maps the info it contains to the collections you have in the database.

      • iamspoilt 6 days ago

        Have you looked at qmd by any chance? Thoughts on that for sqlite vs qmd in this case? https://github.com/tobi/qmd

        • pscanf 5 days ago

          Oh, interesting, I didn't know about this project, thanks for sharing!

          I tried to implement something like this for the search functionality, but ended up going with "old school" lexical search instead.

          Mostly because, in my experimentation, vector search didn't perform significantly better, and in some cases it performed worse, all while being much more expensive on several fronts: indexing time (which also requires either an API or a ~big local model), storage, search time, and implementation complexity.

          And Superego's agent actually does quite well with the lexical search tool. The model usually tries a few different queries in parallel, which roughly approximates semantic search.
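
          A rough sketch of the multi-query idea, not Superego's actual search tool: run several phrasings of the same intent through a plain keyword matcher and merge the hits. The token-overlap scoring, the function names, the documents, and the query variants are all made up for illustration, and the queries run one after another here rather than in parallel.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def lexical_search(query: str, docs: dict[str, str]) -> dict[str, int]:
    """Score each document by how many distinct query tokens it contains."""
    query_tokens = tokenize(query)
    scores = {}
    for doc_id, text in docs.items():
        overlap = len(query_tokens & tokenize(text))
        if overlap > 0:
            scores[doc_id] = overlap
    return scores


def multi_query_search(queries: list[str], docs: dict[str, str]) -> list[str]:
    """Merge the hits of several queries, keeping each document's best score."""
    best: dict[str, int] = {}
    for query in queries:
        for doc_id, score in lexical_search(query, docs).items():
            best[doc_id] = max(score, best.get(doc_id, 0))
    # Highest score first; ties broken by document id for a stable order.
    return sorted(best, key=lambda doc_id: (-best[doc_id], doc_id))


docs = {
    "a": "Train ticket to Berlin for the conference",
    "b": "Monthly gym membership fee",
    "c": "Taxi from the airport to the hotel",
}
# A single phrasing misses documents that use different words...
single = multi_query_search(["travel costs"], docs)
# ...while several phrasings of the same intent recover them.
several = multi_query_search(["travel costs", "train ticket", "taxi airport"], docs)
print(single, several)
```

          The single query finds nothing because no document contains "travel" or "costs", while the three variants together surface both travel-related documents.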

jerkstate 7 days ago

I bought a copy of Office this year to do my taxes with the excel1040.com template and Claude for Excel. I dropped my 1099s and other documents into the chat window, Claude transferred the numbers to the correct cells, and Excel did the calculations. It was super easy (and also easy to check, because my tax picture didn’t change much from last year). It got some things right that TurboTax always got wrong (like cost basis for ESPPs). The only difficult part was getting Claude to transfer the data to the IRS fillable PDFs: I probably spent longer iterating on that than it would have taken to copy-paste the data from Excel. Other than that, it worked great, highly recommend.

  • iamspoilt 6 days ago

    I couldn't find an equivalent of this for Canadians but thanks for the tip.

    Copy-pasting data from my Markdown files into the tax software was the hardest part for me as well.

Mic92 7 days ago

Just make sure you never let Claude do any calculation without using a programming language. It can be somewhat trusted to make fewer mistakes than a human when it comes to filling in the right numbers, but it's still not great at math.

  • iamspoilt 6 days ago

    100% this! This is why I explicitly instructed Claude in my CLAUDE files to generously use Python, via the virtualenv, for any calculations. It was also my first time using Claude for tax-related calculations, so I cross-checked the results against an online ACB calculator: https://www.adjustedcostbase.ca
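
    For a sense of what such a script can look like, here is a minimal sketch of an adjusted cost base (ACB) ledger using the average-cost method. It is not the script from my workflow: the function name, the transaction format, and the trades are made up, and it ignores the superficial loss rule and currency conversion. `Decimal` is used instead of floats so that cents don't drift.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def run_ledger(transactions):
    """Track adjusted cost base (ACB) and realized capital gains.

    Each transaction is (kind, shares, price_per_share, commission),
    with kind being "buy" or "sell". Uses the average-cost method and
    ignores the superficial loss rule and currency conversion.
    """
    shares_held = Decimal("0")
    total_acb = Decimal("0")
    gains = []
    for kind, shares, price, commission in transactions:
        shares, price, commission = Decimal(shares), Decimal(price), Decimal(commission)
        if kind == "buy":
            # Purchase cost plus commission is added to the ACB.
            total_acb += shares * price + commission
            shares_held += shares
        elif kind == "sell":
            if shares_held == 0 or shares > shares_held:
                raise ValueError("cannot sell more shares than are held")
            # Cost of the sold shares is their proportional slice of the ACB.
            cost_of_sold = total_acb * shares / shares_held
            gain = shares * price - commission - cost_of_sold
            gains.append(gain.quantize(CENT, rounding=ROUND_HALF_UP))
            total_acb -= cost_of_sold
            shares_held -= shares
        else:
            raise ValueError(f"unknown transaction kind: {kind}")
    return shares_held, total_acb.quantize(CENT, rounding=ROUND_HALF_UP), gains


# Made-up example trades, passed as strings to avoid float rounding.
shares_held, acb, gains = run_ledger([
    ("buy", "100", "50.00", "10.00"),
    ("buy", "50", "60.00", "10.00"),
    ("sell", "80", "70.00", "10.00"),
])
print(shares_held, acb, gains)
```

    With these trades, 70 shares remain with an ACB of 3742.67, and the sale realizes a gain of 1312.67, which is the sort of figure worth cross-checking against an independent calculator.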

iamspoilt 7 days ago

An opinionated workflow showcasing how I used Claude and Obsidian together to help with my personal tax filing in Canada