Ask HN: Best stack for real time data intensive apps

5 points by warthog 16 days ago

Question to HN: I have been overwhelmed by the variety of new database and analytics solutions out there. This question is meant to gather the experiences of everyone who has built data intensive apps like a financial information app or a CRM. The purpose is simply to make data loading fast — whether that comes from raw query speed, sockets, or GraphQL does not matter. I am looking for opinionated views.

So far seen:

- DuckDB (guess it makes things faster, but no experience with it)
- Tinybird (real time data APIs?)
- GraphQL (learning curve?)
- and many more

I am relatively new to this stuff but a long time user of Postgres. Looking to learn, but wildly confused and overwhelmed about what I should implement for the best UX[1]

[1] Best UX defined as fast loading times for a large number of rows, the ability to load data dynamically (sockets?), and real-time data communication

austin-cheney 15 days ago

If you want to output to a browser here is the guide to achieve the best possible performance according to the numbers:

https://github.com/prettydiff/wisdom/blob/master/performance...

Warning: every time I post this people claim to want superior performance but then whine when they realize they have to actually write code (as opposed to letting NPM or React or jQuery do 99% of everything).

pjacotg 15 days ago

If you want to display analytics on a dashboard, I would generate materialized views from your underlying data first and then display those. Try to precalculate as much as possible to avoid having to do it on your front end.
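A minimal sketch of the precompute pattern, using Python's stdlib sqlite3 so it runs anywhere (SQLite has no materialized views, so a summary table stands in; in Postgres you would use CREATE MATERIALIZED VIEW and REFRESH MATERIALIZED VIEW instead — the table and column names here are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trades (symbol TEXT, price REAL, qty INTEGER);
    INSERT INTO trades VALUES
        ('AAPL', 190.0, 10), ('AAPL', 191.0, 5), ('MSFT', 410.0, 2);

    -- Precomputed summary table standing in for a materialized view.
    -- Postgres equivalent: CREATE MATERIALIZED VIEW trade_summary AS SELECT ...
    CREATE TABLE trade_summary AS
        SELECT symbol, COUNT(*) AS n, AVG(price) AS avg_price
        FROM trades GROUP BY symbol;
""")

# The dashboard reads the cheap precomputed table, not the raw data.
rows = conn.execute(
    "SELECT symbol, n, avg_price FROM trade_summary ORDER BY symbol"
).fetchall()
print(rows)  # [('AAPL', 2, 190.5), ('MSFT', 1, 410.0)]

def refresh_summary(conn):
    """Rough equivalent of REFRESH MATERIALIZED VIEW: rebuild after writes."""
    conn.executescript("""
        DELETE FROM trade_summary;
        INSERT INTO trade_summary
            SELECT symbol, COUNT(*), AVG(price) FROM trades GROUP BY symbol;
    """)

conn.execute("INSERT INTO trades VALUES ('AAPL', 189.0, 1)")
refresh_summary(conn)
```

The front end then only ever pays for a lookup over a few summary rows instead of an aggregate over the full table.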

pants2 15 days ago

Sounds like you might enjoy QuestDB. It's Postgres-compatible but oriented toward time series and streaming data. It's easy to install and play around with and comes with a built-in explorer UI.

zer00eyz 16 days ago

It might not matter which platform you choose at all. If you know Postgres, pick that.

What matters more than your storage engine: your provider, your architecture and design, and your plans for scaling (read: plans, not actions).

There are lots of companies that can't seem to grasp that multi-tenancy would be a win for them. You remove a bunch of pain from customer-facing code; your internal admin and reporting tools take that weight instead. It means that scaling problems are likely to impact fewer customers (and will be easier to address). It means that your costs map more directly to your (hopeful) income.
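One common way to do this is a shared schema where every customer-facing table carries a tenant_id and every customer-facing query filters on it, while cross-tenant aggregates live only in admin tooling. A sketch with stdlib sqlite3 (the accounts table and its columns are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Shared schema: every customer-facing table carries a tenant_id.
    CREATE TABLE accounts (tenant_id INTEGER NOT NULL, name TEXT, balance REAL);
    CREATE INDEX idx_accounts_tenant ON accounts (tenant_id);
    INSERT INTO accounts VALUES
        (1, 'alice', 100.0), (1, 'bob', 50.0), (2, 'carol', 75.0);
""")

def accounts_for(conn, tenant_id):
    """Customer-facing code only ever sees its own tenant's rows."""
    return conn.execute(
        "SELECT name, balance FROM accounts WHERE tenant_id = ?",
        (tenant_id,),
    ).fetchall()

print(accounts_for(conn, 1))  # [('alice', 100.0), ('bob', 50.0)]

# Internal admin/reporting tools take the cross-tenant weight instead:
totals = conn.execute(
    "SELECT tenant_id, SUM(balance) FROM accounts GROUP BY tenant_id"
).fetchall()
```

Because the tenant_id index keeps per-tenant queries cheap, a hot tenant's load problems stay mostly contained to that tenant, which is the scaling property described above.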

None of this should stop you from playing with other data engines. If you have the capital, just build a Linux box (a few hundred bucks), throw Proxmox on it with an extra VM, limit the cores and memory, and see what some of these DBs can do. Can you install them? What are the interfaces like? Can you build a cluster of them? What is their network IO vs their data IO? Is this a perfect test? No. If one is painful to write good code against and another is easier, where do you take that into account?

If you're going to have REAL high bandwidth applications then you're going to end up out of the cloud and on your own hardware. If you can't install these things then there is zero point in pursuing them.

joshxyz 13 days ago

clickhouse + react + websockets for me