Show HN: Basis – a production focused robotics framework
github.com

Hey HN! My cofounder and I are excited to actually launch the product we've been working on for the past six months. It's a robotics framework with a focus on testing and production. The current industry standard (ROS) is great for prototyping, but suffers from performance and testing problems as the robot gets more complex. I would have loved to work on this for another six months or a year to really polish it up, but I know it's better to launch a bit before you think you're ready.
It's a C++ pub-sub framework (a lot like ROS, in that way), but rather than declaring C++ publishers/subscribers directly, you declare the topics and types your code cares about in a configuration file, along with conditions on those inputs. Doing this allows for deep knowledge about the code running, which unlocks the possibility of deterministic simulation and testing (along with making it easy to generate bindings for other languages, create alternate schedulers, more easily swap out internal concepts, etc). It also enables easy static analysis of the codebase - given a launch file and arguments to the launch file, one can analyze the topic network and find missing publishers or publishers that publish to nothing.
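To make the static-analysis idea concrete, here is a minimal sketch in Python. It is not Basis's config format or API: the `Unit` class and `analyze_topic_graph` function are made up for illustration, and it assumes each unit's declared inputs and outputs have already been read out of a launch file.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Unit:
    """Hypothetical stand-in for one unit's declared inputs and outputs."""
    name: str
    subscribes: set[str] = field(default_factory=set)
    publishes: set[str] = field(default_factory=set)


def analyze_topic_graph(units):
    """Return (topics nobody publishes, topics nobody subscribes to)."""
    published = set().union(*(u.publishes for u in units))
    subscribed = set().union(*(u.subscribes for u in units))
    missing_publishers = subscribed - published
    unused_publications = published - subscribed
    return sorted(missing_publishers), sorted(unused_publications)


units = [
    Unit("camera_driver", publishes={"/camera/image"}),
    Unit(
        "detector",
        subscribes={"/camera/image", "/lidar/points"},
        publishes={"/detections"},
    ),
]

missing, unused = analyze_topic_graph(units)
print(missing)  # ['/lidar/points']
print(unused)   # ['/detections']
```

Because the topic list comes from declarations rather than from running code, a check like this can run before anything is launched.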
It’s interesting to me that every robotics framework essentially boils down to a specialised pub-sub system with some flavour of serialisation and some kind of startup/launcher to wrangle running separate processes.
Props to this team for picking Protobuf, given its wider adoption compared to ROS's custom format.
Is there a reason an existing pub-sub system like NATS isn’t suitable for a framework like this?
Also unfortunate that this project is written in C++ and offers Rust bindings rather than the other way around.
I always wonder about this too. Half the ROS projects I've seen are just single-board applications with topics.
ROS bags are a useful facility for data capture and testing, and there are some ROS packages that are useful (against the backdrop of an awful packaging/build system). But even all that is just because the community decided this particular pub-sub system was the robotics one and started building packages on it.
So why ROS, robotics community, why?
I suspect it just happened to be at the right place at the right time, worked well for a set of users in the robotics community and built up strong network effects there.
This is a pretty common story. Python became very successful in a similar way. I have seen things like flight critical software written in Python not because Python is a good language for it (if I had heard this 15 years ago I would have laughed), but because Python became a de facto standard for prototyping in the domain due to its libraries, tools and bindings, and solving a realtime challenge for an existing prototype is often an easier lift than rebuilding it from scratch. My 2c.
One story not mentioned here is the real history.
ROS was developed as the framework for the PR1 robot, as a "Linux for robotics" idea. But it really took off when Willow Garage GAVE out a bunch of PR2 robots to academic institutes around the world.
It then stuck because those labs, even independently, developed really useful tools in a very backwards compatible language (C++) or with support for old languages (Lisp). You can find many repos such as this one which basically have EVERYTHING, handed down from PhD student to PhD student over decades:
https://github.com/jsk-ros-pkg
Basically a lot of the headache-inducing grunt work exists there as a library; you only need to glue it together.
You want to calibrate your camera and get its position relative to the root of a robot arm? You can hack it together with OpenCV and then end up debugging for hours because a coordinate transform convention was wrong. Or just install this library, publish your sensor data on the specified topic, and it does it for you.
Imo it's easy to miss the usefulness of ROS if you don't consider the tf package, rosbag and rviz together with ROS as a bundle.
Every time I see a pub/sub "new ROS" framework, sooner or later I realize I need to reinvent tf/rosbag/rviz, because those standard packages are almost never available (or available only for the developers' specific needs).
We support mcap (the data container format that ROS2 uses) and Foxglove for viz. Shoutout to Foxglove, it's a much nicer experience than rviz. Something like tf is on the todo list - I recognize it's important.
That's cool to hear! I'll follow the project and fingers crossed it works out. Since I still kind of rely on legacy packages I probably won't/can't move to it in the near future, but it would definitely be great if "prototyping" and production could be merged into one (without headache).
IMO, the real value of ROS is the data logging (bag/mcap files) and visualization (Foxglove). Even then, I'm not sure it's worth the overhead and brittleness of building and running it. There is just so much complexity deploying and developing for it.
Yep. ROS is great if you are an academic or a garage startup who needs to get anything at all up and running ASAP because it is extremely flexible and has a huge volume of community modules.
But, when it comes time to deploy serious business (very often Safety Critical) robots in the field, you don’t want flexibility. You want certainty. You don’t even want everyone’s contributions. You want specialized, carefully vetted code.
Thus the effort to make it easy to transition off of ROS to something simpler and more reliable.
This, so much.
Can you point to specific examples or discussions supporting the claim that ROS (focusing on ROS 2, given Noetic is almost EOL) suffers from performance and testing problems?
Testing: this is mostly my experience, but ROS doesn't have any form of built-in integration testing. I should be able to take a launch file and a bag, run it, and validate the outputs easily. And that's not even getting into test determinism - the lack of it is a killer for larger robots, and IMO determinism is nearly required for doing integration testing/full sensor simulation.
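A rough sketch of what "take a bag, run it, validate the outputs" could look like, in Python. Everything here is hypothetical and is not ROS or Basis API: the `replay` function, the `(timestamp, topic, payload)` tuple layout, and the `accel_magnitude` handler are assumptions made only to show the shape of a deterministic replay test.

```python
def replay(bag, handlers):
    """Feed recorded (timestamp, topic, payload) messages to handlers in a fixed order."""
    outputs = []
    # Sort by timestamp, then topic, so the processing order is the same on every run.
    for stamp, topic, payload in sorted(bag, key=lambda m: (m[0], m[1])):
        handler = handlers.get(topic)
        if handler is not None:
            outputs.extend(handler(stamp, payload))
    return outputs


def accel_magnitude(stamp, payload):
    """Toy handler: turn a signed acceleration sample into its magnitude."""
    return [("/accel_magnitude", stamp, abs(payload))]


# A tiny "bag", deliberately stored out of order.
bag = [(2, "/imu", -9.8), (1, "/imu", 0.5), (3, "/gps", 47.6)]
handlers = {"/imu": accel_magnitude}

first = replay(bag, handlers)
second = replay(bag, handlers)
print(first)  # [('/accel_magnitude', 1, 0.5), ('/accel_magnitude', 2, 9.8)]
print(first == second)  # True
```

The point of the fixed ordering is that two runs over the same recording produce identical outputs, which is what makes the result safe to assert on.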
Perf: Again, mostly my experience, with ROS1, in self driving. Two issues:
1. High rate topics were previously a bit dicey - I remember someone hooking up an IMU driver directly to the topic graph with no batching (600 Hz) and causing some perf degradation.
2. There are no built-in controls for core pinning, etc, which hurts on high core count systems. In fairness, we haven't implemented these either, but they're coming.
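For the first point, batching just means grouping samples before they hit the topic graph. A minimal sketch in Python, assuming a hypothetical `BatchingPublisher` wrapper around whatever publish callback the framework provides (not a ROS or Basis class):

```python
class BatchingPublisher:
    """Collect samples and hand them to `publish` in groups of `batch_size`."""

    def __init__(self, publish, batch_size):
        self._publish = publish
        self._batch_size = batch_size
        self._pending = []

    def push(self, sample):
        self._pending.append(sample)
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self):
        """Publish whatever is pending, even if the batch is not full."""
        if self._pending:
            self._publish(self._pending)
            self._pending = []


batches = []
publisher = BatchingPublisher(batches.append, batch_size=10)

# One second of a 600 Hz IMU stream.
for sample in range(600):
    publisher.push(sample)
publisher.flush()

print(len(batches))  # 60
```

With a batch size of 10, one second of 600 Hz data becomes 60 messages instead of 600, at the cost of up to one batch of added latency.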
There are some ROS2 examples here: https://discourse.ros.org/t/ros2-speed/20162/2 but the topic is a little old, and it honestly looks a little dependent on the DDS backend. They are swapping to Zenoh anyhow (mostly for config complexity reasons, I believe), so I'm not sure what the perf will be there.
Getting real perf benchmarks is top of mind for me. I understand if my explanation is not so convincing.
I'm always looking for a ROS replacement.
The limitations section at the bottom of the readme is excellent. Good level of detail and self-awareness. More projects need that.
Sadly, while there's a lot of stuff I like, ultimately I don't think this is the ROS replacement the world needs. Specifically because of step 1: clone the repo (or add as submodule).
It's a terrible move, IMO, to not have the pub/sub system be a binary executable or shared object file. ROS's fatal flaw (among many flaws IMO) is requiring every library user to be able to build from source. I'm glad it doesn't dictate a folder structure, but I'm really sad to see this project continue the trend of needing to build from source.
We will eventually have a binary distribution. I chose not to for this release because it's not the workflow I want to encourage, and due to the management overhead of having to create packages for every combination of hardware/OS we'd like to support with a team of two (also, I need to learn the proper way to install headers with CMake...). I don't want to lock users to my OS of choice, I want to ensure it's easy to do things like turn on "-march=native" for the whole codebase, and I want it to be easy to fork and patch/customize.
I don't get your comment about ROS - in my experience with ROS, most users don't build the distribution from source, they use apt or similar, up until they need to fork it to fix some bug or another (and then cry because you have to muck with apt's repository version syntax to use _your_ version of the package and not upstream). Regardless, I think it is important that users are able to build from source if needed - see above. Why would you like to restrict that?
To not be all negative though: I love the work on making it deterministic, and having first class docker support. ROS is such a crap-shoot when it comes to reproducibility. I mean bag files are great and all, but when it's demo time on stage something always behaves inconsistently.
I'm actually not a fan of the "bring your own serializer" design. I think you should just use one serializer that's fast (like protobuf) and then let other serializers be built on top of that (e.g. JSON). I could be wrong, but that's my hunch of a good system.
I'd like to very much encourage protobuf as the serializer, but supporting others both gives users the freedom to pick their own tradeoffs, and makes it 10x easier to transition from ROS or another framework. We do support JSON output from the serializer layer - I'd rather not do other conversions if possible.
Looks super cool. Would be nice to have a "try it out" section. I wanna plug it into a simulator and write a node (is that what you call it?), to make it do something. That would make things super clear.
We do have https://docs.basisrobotics.tech/guide-getting-started/enviro... and https://github.com/basis-robotics/basis-examples/tree/main/c... but I want to improve our first time user experience. (we call them units)
Oh very cool.
Would it be crazy to try and connect it to Gazebo?
Only a little crazy - you'd have to use ros1msg or write a ROS2 message plugin. And then link together basis+Gazebo or write a bridge (bridges aren't too hard - see https://github.com/basis-robotics/basis/blob/main/cpp/plugin...).
What functionality does it provide? Can it do collision avoidance planning? Physics simulation?
How does one get started with robotics? What's a good onboarding project?
Hmm, why not Cap'n Proto? Also this seems to be multi-process like ROS. I wanna see something adopt the Isaac model of nodes and codelets.
We're single process! Multi-process is opt-in (it's useful to toss your telemetry and other non-critical code in a separate process so a crash doesn't take down everything); our units are libraries by default. When you make a launch file you're actually composing together libraries at runtime. We also support passing CUDA handles (or any runtime-only struct) within a process. See https://github.com/basis-robotics/basis_test_robot/blob/main... - this is a struct containing a CUDA handle. It isn't copied to CPU or serialized unless you request it over the network (i.e. with Foxglove) or serialize it to disk.
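The "not serialized unless requested" behaviour can be sketched in a few lines of Python. This is only an illustration of the idea, not how Basis implements it: the `Topic` and `Frame` classes, the `subscribe_local`/`subscribe_network` split, and the `serialize_calls` counter are all hypothetical.

```python
import json


class Frame:
    """Stand-in for a runtime-only struct, e.g. one wrapping a GPU handle."""

    def __init__(self, handle, width):
        self.handle = handle  # pretend this is a CUDA handle
        self.width = width


class Topic:
    def __init__(self, serialize):
        self._serialize = serialize
        self._local_subscribers = []
        self._network_subscribers = []
        self.serialize_calls = 0  # instrumentation for the example only

    def subscribe_local(self, callback):
        self._local_subscribers.append(callback)

    def subscribe_network(self, callback):
        self._network_subscribers.append(callback)

    def publish(self, message):
        # In-process subscribers get the very same object: no copy, no serialization.
        for callback in self._local_subscribers:
            callback(message)
        # Serialize once, and only if something outside the process wants the data.
        if self._network_subscribers:
            self.serialize_calls += 1
            data = self._serialize(message)
            for callback in self._network_subscribers:
                callback(data)


topic = Topic(lambda frame: json.dumps({"width": frame.width}).encode())
received = []
topic.subscribe_local(received.append)

frame = Frame(handle=object(), width=640)
topic.publish(frame)
print(received[0] is frame)   # True
print(topic.serialize_calls)  # 0

sent = []
topic.subscribe_network(sent.append)
topic.publish(frame)
print(topic.serialize_calls)  # 1
print(sent[0])                # b'{"width": 640}'
```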
As for why not capnproto - I'm more experienced with protobuf. No good reason, in this case, other than popularity.
There's at least one other good reason to pick protobuf over capnproto: capnproto doesn't have real polyglot support, and most folks are using at least two languages in their stack these days (https://capnproto.org/otherlang.html)
Also, if you do embedded, there's a lack of something akin to nanopb for capnproto.
See https://docs.basisrobotics.tech/guide-tools/launch-files#pro... - our launch file documentation needs some work, and I think it's probably good if I put the nodelet/codelet/process stuff in the main README, you aren't the first to make that assumption.
Have you presented this on any ROS forums? If so, how was it received?
I have not, it honestly feels a little rude to do so - but maybe I should.
Sounds like a speedrun for angry feedback. Maybe find threads discussing problems with ROS, then post in those threads
Cool project. Thanks for sharing!
I’ve worked with ROS in commercial fleets of tens to hundreds of robots for, gosh I guess over a decade now. The main issue from my POV as a web person is how poor a fit ROS comms are across unreliable networks (basically anything outside localhost). ROS2 tries to do better with choose your own DDS, but there are still pains around some of the basics found in other realms: compression, encryption, authentication/authorization, proper schemas and API definition/versioning.
Does Basis intend to target any of this?
Compression, encryption, auth: not immediately. What's the use case here? Multi-machine robots are _tricky_ to get right. Running remote controls and telemetry via the framework would be really cool, but is also tricky to tune (need really good qos controls), and needs some work to be exposed safely to the outside world.
Proper schemas/API versioning: I'm not sure what you mean by "proper schemas", but schema/API versioning is also an area I'd like to get right. I've been bit before by tests using old ros bags, before some message type was changed or topic name was moved.
Use case: Imagine a location with hundreds of autonomous mobile robots that want to share state and plans with each other to behave more optimally than if they only saw each other through their sensors. They also want to share state and telemetry with monitoring systems so that humans can maintain a fleet's healthiness and address issues proactively.
After years of working with tens of thousands of deployed robots, one of the most painful things with ROS, from my perspective, is that the message definitions are very limiting. You cannot have `null` types. In research and shorter term contexts, it doesn't really jump out as an issue, but the story for how to upgrade and deploy to large fleets without downtime or hidden compatibility pains (MD5sum anyone?) is paramount when I think about "production" ready robotics.
Swarm robots: yeah, this is a really cool but, for now, somewhat specialized use case. Would love to support it.
Message definitions: ros1msg does make this a little painful. Protobuf at least lets you have differing schema versions, and will do best effort to try and do forwards compatibility. The downside - this has bit me a few times in development where I've forgotten to restart a process - it will happily deserialize with the new fields being empty.
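A toy sketch of that failure mode, in Python rather than real protobuf. The `decode` function and the `{field_number: value}` "wire" mapping are invented for illustration and only mimic two protobuf behaviours: fields missing from the data read as their default value, and field numbers the reader doesn't know about don't cause an error.

```python
def decode(wire, schema):
    """Decode {field_number: value} using schema {field_number: (name, default)}.

    Field numbers missing from `wire` take the default; field numbers that are
    not in `schema` are silently skipped.
    """
    return {
        name: wire.get(number, default)
        for number, (name, default) in schema.items()
    }


SCHEMA_V1 = {1: ("x", 0.0), 2: ("y", 0.0)}
SCHEMA_V2 = {1: ("x", 0.0), 2: ("y", 0.0), 3: ("heading", 0.0)}

old_message = {1: 1.5, 2: -2.0}          # written by a process still on V1
new_message = {1: 1.5, 2: -2.0, 3: 0.7}  # written by a process on V2

# A V2 reader accepts the stale message without complaint; heading is just 0.0.
print(decode(old_message, SCHEMA_V2))  # {'x': 1.5, 'y': -2.0, 'heading': 0.0}

# A V1 reader accepts the newer message and drops the field it doesn't know.
print(decode(new_message, SCHEMA_V1))  # {'x': 1.5, 'y': -2.0}
```

Neither direction raises an error, which is exactly why a process that was never restarted can go unnoticed.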
The problems with multitasking OSes, drivers, and networked links are well documented:
https://en.wikipedia.org/wiki/Clock_domain_crossing
The trick is usually to stratify the navigation, path-planner, guidance, manual-overrides, and safe-machine-state system definitions into different problem domains.
YMMV =3