NortySpock 7 days ago

I've been using Benthos for some one-off bulk ETL needs and it's been very easy to write a config to shovel data from here-to-there.

I'll keep an eye on Benthos Studio, though I confess I'd prefer a self-hostable version at some point.

  • Scotrix 7 days ago

    Didn’t look too much into Benthos but what’s the difference to Nifi?

    • NortySpock 7 days ago

      I don't know much about Nifi, but from re-reviewing the documentation, it looks like Nifi is a stateful application with orchestration and multi-user access, etc.

      Benthos requires something else to handle orchestration, source control, multi-user access, and the conditions that trigger an ETL run.

      On the plus side, the stateless nature of Benthos means there isn't any big setup process or cluster of containers to log in to or anything. You just call benthos from the CLI and pass in a config file.

      Last week I needed to get 120k rows of data from an Oracle view that would crash on a few bad rows, and shove rows that worked into a SQL Server table.

      Writing a Benthos config to tick through every id, fetch the relevant rows one at a time, annotate with batch information, and drop error rows and error messages in a log and healthy rows into SQL Server was, all told, about 120 lines of YAML. Roughly 3 hours to write while consulting the extensive documentation, and then trying performance tweaks until I could saturate my network connection (increase threads to 256) only took another 30 minutes.
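
      For anyone who wants to see the shape of that flow, here is a rough sketch in plain Go: walk every id, fetch one row at a time across a pool of workers, tag good rows with batch information, and keep failures separate. This is not Benthos and not the config described above. `Row`, `Failure`, `fetchRow`, and the "every 7th id fails" rule are hypothetical stand-ins for the real view and its bad rows.

      ```go
      package main

      import (
      	"errors"
      	"fmt"
      	"sort"
      	"sync"
      )

      // Row is a hypothetical record shape; the real columns are not given in the thread.
      type Row struct {
      	ID    int
      	Batch string
      	Value string
      }

      // Failure pairs a source id with the error raised while fetching it.
      type Failure struct {
      	ID  int
      	Err error
      }

      // fetchRow stands in for a one-row query against the source view.
      // Assumption for illustration: every 7th id simulates a bad row.
      func fetchRow(id int) (Row, error) {
      	if id%7 == 0 {
      		return Row{}, errors.New("bad row")
      	}
      	return Row{ID: id, Value: fmt.Sprintf("value-%d", id)}, nil
      }

      // runETL fetches ids 1..n using the given number of workers, tags each
      // good row with the batch label, and returns good rows and failures
      // separately, both sorted by id.
      func runETL(n, workers int, batch string) ([]Row, []Failure) {
      	ids := make(chan int)
      	var (
      		mu   sync.Mutex
      		good []Row
      		bad  []Failure
      		wg   sync.WaitGroup
      	)

      	for w := 0; w < workers; w++ {
      		wg.Add(1)
      		go func() {
      			defer wg.Done()
      			for id := range ids {
      				row, err := fetchRow(id)
      				mu.Lock()
      				if err != nil {
      					// Error rows go to the failure list (the "log").
      					bad = append(bad, Failure{ID: id, Err: err})
      				} else {
      					// Healthy rows are annotated and kept for loading.
      					row.Batch = batch
      					good = append(good, row)
      				}
      				mu.Unlock()
      			}
      		}()
      	}

      	for id := 1; id <= n; id++ {
      		ids <- id
      	}
      	close(ids)
      	wg.Wait()

      	sort.Slice(good, func(i, j int) bool { return good[i].ID < good[j].ID })
      	sort.Slice(bad, func(i, j int) bool { return bad[i].ID < bad[j].ID })
      	return good, bad
      }

      func main() {
      	good, bad := runETL(20, 4, "batch-001")
      	fmt.Printf("loaded=%d failed=%d\n", len(good), len(bad))
      }
      ```

      With 20 ids and the made-up failure rule, ids 7 and 14 fail, so this prints `loaded=18 failed=2`. Raising the worker count plays the same role as the thread setting mentioned above.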

      Tweaking that YAML to cover a similar second scenario as a new config then was only another 15 minutes.

      The week before that, I was trying Benthos at home to stress test MQTT on a Raspberry Pi with synthetic data to see which messages got dropped or mis-ordered.

      (Depends on your QoS and in-flight settings, obviously)
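
      A minimal sketch of the checking side of such a test, assuming each synthetic message carries a sequence number from 1 to the number sent. It is plain Go with no MQTT client involved; `Report` and `checkSequence` are hypothetical names. Duplicates are tracked too, since QoS 1 is at-least-once delivery.

      ```go
      package main

      import "fmt"

      // Report summarises delivery problems in a stream of sequence numbers.
      type Report struct {
      	Missing    []int // numbers in 1..sent that never arrived
      	OutOfOrder []int // numbers that arrived after a higher number
      	Duplicates []int // numbers that arrived more than once
      }

      // checkSequence compares the received sequence numbers, in arrival
      // order, against the expected range 1..sent.
      func checkSequence(sent int, received []int) Report {
      	var r Report
      	seen := make(map[int]bool, len(received))
      	highest := 0
      	for _, seq := range received {
      		if seen[seq] {
      			r.Duplicates = append(r.Duplicates, seq)
      			continue
      		}
      		seen[seq] = true
      		if seq < highest {
      			// A lower number after a higher one means it was reordered.
      			r.OutOfOrder = append(r.OutOfOrder, seq)
      		} else {
      			highest = seq
      		}
      	}
      	for seq := 1; seq <= sent; seq++ {
      		if !seen[seq] {
      			r.Missing = append(r.Missing, seq)
      		}
      	}
      	return r
      }

      func main() {
      	r := checkSequence(6, []int{1, 2, 4, 3, 6, 6})
      	fmt.Printf("missing=%v out-of-order=%v duplicates=%v\n",
      		r.Missing, r.OutOfOrder, r.Duplicates)
      }
      ```

      For the sample input this prints `missing=[5] out-of-order=[3] duplicates=[6]`.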

      I've never had "basic" performance testing be so simple.

      If you are stuck on a Windows platform, it even works there, which is nice for one-off runs for dev work, troubleshooting, or break-fix work.

      • Scotrix 7 days ago

        Nice, thanks for sharing this. I’ll have a deeper look into Benthos, sounds interesting and I certainly like the simple/easy part :-)

      • mihaitodor 7 days ago

        Really glad to hear the Oracle driver I added to it is proving useful! <3

        • NortySpock 7 days ago

          Let me tell you Benthos is a real breath of fresh air after fighting a few different ETL tools that are more GUI, chrome and administrivia than function.

          Thanks for contributing to Benthos!

      • jeffail 7 days ago

        Thanks for sharing, this is awesome to hear!

    • mihaitodor 7 days ago

      Benthos is much simpler, since it's stateless and it's a single static binary written in Go. I don't really know much about NiFi, but if you need to use a messages bus with Benthos, such as Kafka, you can. However, you don't have to.

    • throwthere 7 days ago

      Benthos looks like a cool project but thanks for turning me on to Nifi. Nifi has a lot more processors out of the box.

      Edit: I’m speechless but see below

      • mihaitodor 7 days ago

        Which ones would you need? Happy to add more to Benthos. Feel free to open issues here:

        • throwthere 7 days ago

          Google Sheets, Drive and Slack. I'm not comfortable asking for it because I can't contribute.

          • mihaitodor 7 days ago

            OK, that's not a problem. There are a bunch of other channels that you can use to propose new features. Feel free to reach out via

            LE: I took note of those 3 and they should be quite straightforward to add. Thanks!

musicale 7 days ago

I liked Yahoo! Pipes and the idea of being able to glue things together on the internet, but unfortunately many web sites simply don't have good APIs to enable this. In my experience Pipes workflows were also brittle since any site could easily break them.

The lack of stable API support on (many/most) web sites (and no incentive to provide it), the likely low user/developer base, the apparent lack of killer apps, and Yahoo! itself probably all combined to prevent Pipes from becoming a big thing.

  • ehnto 7 days ago

    That was my takeaway after building API Blocks. The goal was to display arbitrary data from APIs on a dashboard.

    APIs just aren't standardized or stable enough, and it took too much effort to maintain the API library. The UX just wasn't up to scratch because almost every week a block on your dashboard would be broken because the API changed, or your auth expired or broke. The cost to maintain was not worth it.

    An interconnected public web as data would be incredible but we just haven't built that.

    Not to detract from the concept here with Benthos Studio, the use case is a bit different. But it is something to keep in mind, the end user might not be technical and APIs constantly changing might not be something they expected.

    • dcsan 7 days ago

      Smart contracts do provide a stable (immutable!) API which is designed for composable functionality, although it’s not used across different people's contracts that much. A Pipes-like UI that binds things together could be interesting. If the connective contracts are also on chain you could in theory develop flash bots that execute everything in the same block.

  • mihaitodor 7 days ago

    At least some sites provide somewhat-stable and documented APIs nowadays... Definitely a far cry from what was being promised 15 years ago, but at least most of them moved away from SOAP. I guess it's still used in some places, unfortunately.

dchuk 7 days ago

I like the playful vibe of the site, but it’s incredibly unclear what this actually is. Needs a more specific description of the problems being solved.

  • mihaitodor 7 days ago

    Benthos Studio is an application that provides visual editing for the Benthos stream processor. It lets you craft and test YAML-based configurations that you can then run using Benthos.

    Benthos itself is a stateless command line (CLI) app written in Go. It supports quite a few types of "macro" building blocks (aka components) which are various flavours of inputs, outputs, processors, caches, rate limits, buffers, metrics, tracers and loggers. The most important processor is the `mapping` one which lets you execute Bloblang code against each message which passes through it. Bloblang is a functional programming language embedded in Benthos as a DSL for manipulating structured data. You can read more about it over here: Also, if you'd like to use it outside of Benthos, you can import it as a library:

    Since I mentioned importing Bloblang as a library, you can import the entire Benthos framework as a library and inject your own custom plugins to create a custom Benthos build with whatever components and extra functionality you need. It's also a great way to slim down the existing distribution and only import the components that you require. See some examples here:

rmorey 7 days ago

I believe I've seen this before, but did this use to be called Blobfish? Or am I misremembering because of the mascot (which is excellent)?

  • mihaitodor 7 days ago

    It was called Benthos from the very first commit (Aug 25th, 2015):

    LE: I should've mentioned that the official mascot is called The Benthos Blobfish:

    • gyulai 7 days ago

      > I should've mentioned that the official mascot is called The Benthos Blobfish.

      The mascot really really creeps me out.

      This is entirely subjective, of course, and I really don't want to be mean about it, but I thought it might be a helpful data point for you to collate in case others feel the same way: If I were forced to use benthos, and had that ugly thing on my screen all day long due to working with benthos documentation etc. I really don't know whether I could handle it.

      • jeffail 7 days ago

        Sorry to hear that, unfortunately you can't please everyone whilst also having fun, and my open source work is unapologetically fueled by fun.

POPOSYS 7 days ago

What is the difference to ?

  • mihaitodor 6 days ago

    It's really hard to tell without doing a proper deep dive into Dagster, but even if there is a lot of overlap, there's a lot of reading that one must do before even starting with basic workflows. Is there a one-click demo UI that I can run which produces a valid config that I can just copy / paste and then do smth like `dagster -c config.yaml` to run it?