asavinov a year ago

Bistro Streams [1] is similar to Kafka Streams in how it is meant to be used: it is implemented as a library which can be part of an application (including IoT analytics) or run at the edge, for example, on gateways or devices. However, Bistro Streams is based on different principles and has the following major distinguishing features:

o [Column-oriented logical model] Bistro Streams describes its data processing logic using mainly column operations as opposed to set operations [2]. It is a unique feature of this system: no joins, no group-by, no map-reduce.

o [Column-oriented physical model] Data within the system is represented in columns (in-memory). This is not new in itself and is widely used in column stores, but it is new for stream processing. In the case of long histories (which are needed for complex analysis) and complex analytic workflows, it can provide higher performance. It is also important for running on edge devices with limited resources.

o [Separate injection, processing, retention] Bistro Streams separates the logic of (1) triggering the processes for appending, evaluating, and deleting data from the logic of (2) what to do during evaluation (the data processing itself). In particular, the frequency and conditions for starting evaluations are specified using a separate API. The same holds for the retention policy: deletion time is determined not by the windows used during processing (say, for a moving average) but separately.
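To make the three points above more concrete, here is a minimal sketch in plain Python. It is not the Bistro API: the `Table` class and its `append`, `calculate`, and `retain` methods are hypothetical names made up for this illustration. It only shows the general idea of defining derived columns instead of running joins or group-by, and of keeping appending, evaluation, and retention as separate steps.

```python
# Illustration only: all names here are hypothetical, not the Bistro API.

class Table:
    """A table is a set of equally long in-memory columns (plain lists)."""

    def __init__(self, *names):
        self.columns = {name: [] for name in names}

    def __len__(self):
        return len(next(iter(self.columns.values())))

    def append(self, **record):
        # Appending only adds raw values; no processing happens here.
        for name, col in self.columns.items():
            col.append(record.get(name))

    def calculate(self, name, func, *inputs):
        # Evaluation: a derived column is a function of other columns.
        cols = [self.columns[i] for i in inputs]
        self.columns[name] = [func(*vals) for vals in zip(*cols)]

    def retain(self, max_rows):
        # Retention: drops the oldest rows, independent of any window size.
        excess = len(self) - max_rows
        if excess > 0:
            for col in self.columns.values():
                del col[:excess]


devices = Table("name")
devices.append(name="a")
devices.append(name="b")

events = Table("device", "celsius")
for dev, temp in [("a", 20.0), ("b", 30.0), ("a", 25.0)]:
    events.append(device=dev, celsius=temp)

# A calculated column: a new column computed from an existing one.
events.calculate("fahrenheit", lambda c: c * 9 / 5 + 32, "celsius")

# Instead of a join: a column whose values are row numbers in another table.
row_of = {name: i for i, name in enumerate(devices.columns["name"])}
events.calculate("device_row", lambda d: row_of[d], "device")

# Instead of group-by: accumulate event values into a column of `devices`.
totals = [0.0] * len(devices)
for row, temp in zip(events.columns["device_row"], events.columns["celsius"]):
    totals[row] += temp
devices.columns["total_celsius"] = totals

# Retention is triggered separately and does not touch evaluated results.
events.retain(2)
```

In this sketch the caller decides when to call `calculate` and `retain`, which mirrors the idea that the triggering logic lives apart from the processing logic.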

I am the author of Bistro Streams [1] and the underlying Bistro Engine [2]. I will be glad to answer questions, and any feedback is welcome. In particular, I would like to know what the possible application areas for this system are.

[1] Bistro Streams:

[2] Bistro Engine:

  • tixocloud a year ago

    Thanks - I would love to know the possible application areas. Specifically where you think Bistro Streams makes sense and where it doesn’t.

    • asavinov a year ago

      It is based on a quite general (and novel) technology so it can be applied to many different data processing tasks like data integration, data migration, extract-transform-load (ETL), big data processing or stream processing.

      But I would like to narrow this list down in order to increase the probability of success (degree of adoption). Therefore, I currently focus on edge computing and IoT. But maybe I am wrong, and therefore I would like to ask the community about potential uses of this kind of technology – fail early, fail often :)

      For simplicity, it can be viewed as Kafka Streams but implemented on completely different principles at all levels of the architecture.

      • tixocloud a year ago

        There could be many use cases within my industry of financial services/retail banking but a fantasy one of mine would be to automatically optimize capital risk levels :)

noir-york a year ago

I get the impression that Bistro has a lot of thought put into it, but after a quick skim through the docs and examples I am still not sure what Bistro is except "Like Kafka Streams but different".

The example code is spread under core and server and very sparsely documented. It is not clear what is going on.

If you had a walkthrough of the examples, that would go a long way toward conveying what Bistro is and how to use it. If there already is such a document, apologies, I must have missed it.