Try threading a nut onto a bolt. Pay attention to how your fingers feel when the threads engage properly and you aren't cross-threading it.
Next, insert a standard screwdriver into a screw head, set the screw in place, and drive it in. To make it work, you have to push and torque at the same time, without letting the blade slip out of the slot or damage the screw head.
If you think this is easy, try to teach a kid to do it. Watch them struggle to control the nut and the screwdriver.
Our hands are really, really good at both gross motor control and very fine motor control.
For some nuts and bolts, I myself struggle with it, almost to the point of giving up.
But what the article misses: we can just rearrange our environment to make it easy for robots to interact with. There might be only standardized nuts and bolts, with IDs imprinted so the robots know exactly how to apply them. Dishes might come in certified robot-known dimensions, or with invisible marks on where to best grip them. Matchsticks might be replaced by standardized gas lighters. Maybe robot companies will even sell those themselves.
An example that came to my mind: squeezing fruit juice requires a lot of dexterity. But if we sold pre-chopped fruit bits in a standardized sachet, then robots could easily squeeze the delicious and healthy fruit juice from those! And health-conscious people drink fruit juice every day, so this could easily be made into a subscription-based service! A perfect business model right there. You could call it iJuice or juice.ai or even Juicero.
This is exactly the kinda shit Nietzsche was referring to when he talked about humanity giving up our humanity to better serve machines instead of making machines that better serve humanity.
Nietzsche and his whole philosophy are stupid, wrong, and ontologically evil. His whole philosophy is a poor reaction to the great ideas of Philipp Mainländer's “The Philosophy of Redemption”. Philipp Mainländer and Arthur Schopenhauer were correct in their philosophical pessimism and actual nihilism, and Nietzschean radical optimism is what motivates nearly all modern totalitarian, fascist, and authoritarian movements.
The only good thing Nietzsche produced was the “Wall-E” movie from Pixar, which is a radically Nietzschean film.
There's more to Nietzsche than "master and slave morality". I'm a big Nietzsche hater too but you're wrong to dismiss the rest of his work because of one part of it. He's incredibly influential in philosophy to this day, for good reason.
E.g. you can look into Deleuze's reading of his work, which focuses on the continuity from Spinoza and the analysis of ethics from a perspective of capabilities rather than obligations (as in the Kantian framing).
Or more directly, read about the idea of eternal recurrence, which I find to be an incredibly pro-, not anti-, human concept.
I dismissed him over his radical optimism. You claimed I dismissed him over master-slave morality. We are not the same. Also, I've read all of his work cover to cover. Almost all of it is trash and ontologically bad.
You try to quote fashionable-nonsense charlatan grifters like Deleuze as though they are worth reading a single word of. Their works, the people who read them, and the entire field of critical theory are direct reasons for the rise of Trump and right-wing authoritarianism worldwide.
Kill critical theory or its hateful children will kill us all
But instead you patronize me by acting like I don’t read. This mentality is why the world collectively hates leftists right now.
Haha you're going to be in strange company claiming to hate ""leftists"" and Nietzsche at the same time. Who, might I ask, do you consider "ontologically good"? Do you believe in a God that could provide such content, or is this just your own definition of what that entails? Ironically enough, between that and your seeming rejection of all prior moral philosophies, it sounds like you're operating quite in line with Nietzsche's actual recommendations -- which is far more than I could say about myself, given my rather more religious tendencies.
> But instead you patronize me by acting like I don’t read.
Rather quaint to be upset about this given the bucket of assumptions you yourself made in the comment above.
That's exactly what we've been doing since the industrial revolution.
Step into a car factory: plenty of robots, but none of them are humanoids. We redesigned the whole factory around specialized robots, rather than have humanoids on a Ford-style moving assembly line.
So you want to turn your home into the equivalent of a car factory, where everything is designed to be handled by robots? I don't think many people would want to live in such a home.
But the reengineering of assembly lines has targeted speed and cost of assembly, not so much automation. Robots do have a role in manufacturing, but I think it's a relatively small fraction of the whole. AFAIK, most part makers don't rely on automation, and even though final assembly has had greater success, it's still far from as adaptable as humans are.
GM's Saturn was relatively early in that space but it didn't scale up anywhere as well as they had hoped. Likewise, Tesla went there 30 years later, but IIRC, they too experienced myriad difficulties building reliable automated manufacturing processes.
If automation among makers were ready for prime time, the work would have migrated to countries with the cheapest power and fastest mobility while ignoring labor costs. AFAIK, that still hasn't happened.
I mean, think about the batshit idea of railway transport in Europe. Trains sound nice in principle, sure; they work on the scale of a mine or a shipyard to move things around. But using them to travel between all major cities (and even most villages) and countries all over? It would require laying thousands and thousands of kilometers of train tracks.
Or introducing electricity and phone lines, public lighting, and adopting various standards, metrification, putting road signage everywhere, etc. etc.
We've done a lot of large scale transformations. But to kickstart the process, robots need to be "good enough" without these infrastructure changes, and then people will see it and want the change. You can't start speculatively. It has to work first, and offer to work better if there is more infra standardization.
Combustion cars were already usable even on the roads built for horse drawn carriages - they were, in fact, adapted to the existing world.
They even ran on things like firewood, coal, or, for the first ICEs, relatively common liquid fuels that could be sourced in large cities.
Cars rely on gas stations today - but gas stations only became a thing after cars already became a thing.
Nowadays, Tesla had to make Superchargers happen all by themselves before EVs could really happen - despite EVs already having the crushing advantage of being able to charge overnight in any garage that has power.
Can you see a robot company branching out to be a competitor to McDonalds to prove that their kitchen robot is viable if you design the entire kitchen for it? Well, it's not entirely impossible, but I think it unlikely.
Yes, I can see restaurants easily adopting an entire robot-friendly kitchen if it means robots can handle dish-handling and repetitive cooking tasks.
To get from that to every manufacturer adopting the standard on every product, independently of the client, you just need some competition in their market. I dunno if there is any, but it doesn't take much.
I have no idea how you came from my comment to that idea. But nothing is stopping the chef from just throwing your food into the microwave today, so I don't see what change you are complaining about either.
There's a huge practicality issue for the chef. They don't have the food in microwavable format for many dishes.
But to me that's the end state of this conversation. Let's take shipping as an example: we came up with pallets and containers not because they're useful for a person to move but because they're helpful for robots (and analogs) to move. People aren't born with pallet jacks for hands. So it seems to me that as you add more robotics into the kitchen, you're going to slowly change your supplies to arrive in more robot-friendly form.
Your comment is actually lagging behind reality. There is a manufacturer of kitchen robots that opened a fully functional demo fast food joint: https://misorobotics.com/CaliExpress/
Typically the world changes when a new market is discovered. Making the earth more traversable by car opened up enough of a new market at the time that it was done posthaste, for better or for worse. The only real way I see self-driving cars opening up markets at a scale that would justify the overhead is if they created self-driving-only lanes, with infrastructure built tightly around them to be quick and easily accessible for passengers.
Which at that point is really just the Japanese train system and surrounding infrastructure, which many places (at least in the US) don't seem capable or willing to make happen.
The 50’s called and wants all its mechanical lever-actuated extendo arm-clasps back!
Joking aside, the present always has a tangential future that never comes to be. Right now the current zeerust is “AI and robots doing everything”. Continuing to have humans do it is good enough.
Imagine an IKEA robot, they could redesign their kitchens to fit it, as well as all of their other products. I'd never step into my kitchen again so why would it need to be made for me anyways?
(They could give the robot instructions on how to set up their furniture as well, the business plan really writes itself)
I've thought about this with roads and automated driving--today automated driving seems somewhat insane because, even if the system can do the right thing on the overwhelming majority of roads, there is an enormously long tail of roadways that have surprises that these systems will not have encountered and automated systems can't easily handle. In the future, it may seem insane that we would allow roads to be built that aren't accommodating of automated driving systems (or maybe we will just develop AI that is not simply "pattern recognition" but which can actively solve somewhat "complex" problems?).
That plan sounds viable until you consider how much noncompliant legacy hardware is out there that can't all be replaced but must be repaired, like cars, roofs, appliances, HVAC, electrical, plumbing, etc. If robots can't accommodate the huge fraction of old infrastructure already in place, they'll have limited value indeed -- basically just working on assembly lines in factories.
The article doesn't miss it -- this is exactly the point. We can and will rearrange our environment to support robots, but that means they will never learn dexterity.
Fully autonomous vehicles will never reach maximum reliability, speed, and efficiency either until we eliminate human drivers, pedestrians, stoplights, buildings, and pave the entire surface of the planet.
And for maximum safety, we'll need physical barriers that prevent the vehicles from leaving their designated path. The easiest way to do this, in my opinion, would be to put a special type of wheel on the vehicle. The wheels would have a flange on the perimeter, and the road surface could have a groove that the flange fits into, thus preventing the vehicle from veering off outside its prescribed lane. This would actually provide such tight control on the vehicles' lateral movement, that it would become possible to connect several vehicles front to back, in a sort of autonomous convoy, which is pretty cool IMO!
Yeah. Also, please refrain from optimizing for maximum speed and efficiency of the transportation system alone.
That's not an important goal. The important goal is to optimize the life of the people that use the lines, not artificial measures taken from just looking at the machines running in them.
I imagine things will just be very unified and boring because the same shapes will be recognizable everywhere. But it would be the same things we already have. Just make the robot weak and light enough to not even be able to harm someone. Lighting the kitchen on fire is always a risk though, I guess.
Once the humans in question have paid off the 7 year loan to remodel their kitchen, the real question is what further value do they add to the proposition?
I wasn't thinking kitchen so much as data centre, a lot of which are already unfriendly to human life; adding 300 kg robots and parts designed for robot interaction is just going to make them even more so.
I have thought up a HOLY GRAIL test for physical AI. "Open a door given a keyring". It involves vision, dexterity, understanding of physical constraints etc. I find it insane that our hands do this so casually.
While I have no opinion on the "holy grail" part, I think them varying massively is the point.
A lot of locks require a bit of finesse as well, like pulling the door towards you or pulling the key out just the right amount, which would be an interesting challenge. Especially if the technique isn't known ahead of time. Given enough time (and frustration), people can generally figure the "finesse" aspect out for a lock.
We actually have notes taped on two of our doors, with instructions of how to get the locks to line up depending on the season. Another door requires a hard shove during the summer, and a slight pull back during the winter. Someday we'll replace that door with a metal door and get it framed nicely. But we've been saying that for 12 years!
I have to rattle the key around a bit in the lock to get the lock to turn. Good luck designing a robot to figure out it needs to rattle the key in a certain way. Or to realize that the door needs to be pulled or pushed a bit while turning the key, so the bolt doesn't jam in the door jamb.
One of my doors needs to be pulled upwards in order to open/close it. (Due to slowly pulling the doorframe out of alignment over time.)
Both those tasks require very fine force-feedback perception, instantly mapped to multiple possible scenarios that could be happening inside those metal parts. E.g. does it feel like I have crossed the threads, or is there a bit of rust? Let me twist just a bit more, but gently, to find out.
We said the same thing about language to be honest- the nuances of words and concepts are too hard for a word generator to correctly put together and now we have LLMs. We said the same thing about video generation where the nuances of light and shadow and micro expressions would be hard to replicate and LLMs are doing a pretty good job with that. We’re just waiting for physical LLMs, it will happen at some point.
I don’t think it needs to be all that complicated for the vast majority of things. It might not be able to twirl a pen around its fingers but it should be able to hold one and write something.
Yeah, tactile stuff could be described as a dark data stream that only animals have access to. I'm talking about nerve endings. You can't get that wealth of data from any mix of sensors. Lidar and accelerometers can't tell cold from hot, lumpy from smooth.
Meanwhile, I watched a cat today jump off a 4 meter high trellis and onto the top of a 2 meter high fence rail no wider than my hand, and I thought, how can we not marvel at something that's obviously so much more advanced at navigating its environment than we are?
> lidar and accelerometers can't tell cold from hot, lumpy from smooth.
Not those. There are other sensors. Tactile sensors exist. Lumpy and smooth can be distinguished. They are still rudimentary, but there is nothing fundamental blocking progress here. Roughness, squishiness, temperature, all well measurable.
Brooks describes how speech is preprocessed by chopping it up into short time segments and converting the segments to the frequency domain. He then bemoans the fact that there's no similar preprocessing for touch data. OK.
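(Concretely, that speech front-end is basically a short-time Fourier transform. A minimal Python/numpy sketch, with typical but arbitrary window and hop sizes:)

    import numpy as np

    def spectrogram(signal, sample_rate=16000, win_ms=25, hop_ms=10):
        # Chop audio into short overlapping windows and convert each to the
        # frequency domain -- the classic speech preprocessing Brooks means.
        win = int(sample_rate * win_ms / 1000)   # 400 samples per window
        hop = int(sample_rate * hop_ms / 1000)   # slide 160 samples each step
        frames = []
        for start in range(0, len(signal) - win, hop):
            frame = signal[start:start + win] * np.hanning(win)  # taper edges
            frames.append(np.abs(np.fft.rfft(frame)))            # magnitude spectrum
        return np.array(frames)  # shape: (time_steps, freq_bins)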
But then he goes on to vision, where the form that goes into vision processing today is an array of pixels. That's not much preprocessing. That's pretty much what existed at the image sensor. Older approaches to vision processing had feature extractors, with various human-defined feature sets. That was a dead end. Today's neural nets find their own features to extract.
Touch sensing suffers from sensor problems. A few high-detail skin-like sensors have been built. Ruggedness and wear are a big problem.
Consider, though, a rigid tool such as an end wrench. Humans can feel out the position of a bolt with an end wrench, get the wrench around the bolt, and apply pressure to tighten or loosen a nut. Yet the total information available is position plus six degrees of freedom of force. If the business end of your tool is rigid, the amount of info you can get from it is quite limited. That doesn't mean you can't get a lot done. (I fooled around with this idea pre-LLM era, but didn't get very far.) That's at least a way to get warmed up on the problem.
Here's a video of a surgeon practicing by folding paper cranes with small surgical tools.[1] These are rigid tools, so the amount of touch information available is limited. That's a good problem to work on.
As you tighten a bolt, the angle at which you need to apply force changes. So it's not just a fixed position plus force in six directions; it's force in six directions at each position. You can learn quite a bit about an object from such interactions, such as its weight, center of mass, etc.
Further, robots generally have more than a single rigid manipulator.
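(To make that concrete: holding an object still, a wrist-mounted 6-axis force/torque sensor already yields the mass and most of the center of mass from pure statics, F = m*g and tau = r x F. A hedged sketch, assuming a static hold and a gravity-aligned sensor frame:)

    import numpy as np

    def mass_and_com(force, torque):
        # force (N) and torque (N*m) are one static wrench reading.
        # Statics: force = m*g, torque = r x force.
        mass = np.linalg.norm(force) / 9.81
        # Minimum-norm solution of torque = r x F; the component of r
        # along F is unobservable from a single static pose.
        r = np.cross(force, torque) / np.dot(force, force)
        return mass, r  # kg, meters in the sensor frame

Rotate the object to a second pose and the missing component of r becomes observable too.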
Not sure which lab (I think google?) it was, but there was a recent demo of a ML-model driven robot that folded paper in that style as one of the tasks.
> Artificial Intelligence researchers have been trying to get [X] to [Y] for over 65 years
For 10,000 different problems. A great many of which have been solved in recent years.
Robotics is improving at a very fast clip, relative to most tech. I am unaware of any barrier, or any reason to infer there is one, for dextrous robots.
I think the primary difference between AI software models and services, and robotic AI, is economics.
The cost per task for AI software is .... very small. And the cost per task for a robot with AI is ... many orders of magnitude over that.
The marginal costs of serving one more customer are completely incomparable.
It's just a push of a button to replace the "fleet" of chatbots a million customers are using. Something unthinkable in the hardware world.
The seemingly lower level of effort and progress is because hardware that could operate in our real world with the same dexterity that ChatGPT/Claude can converse online, will be extremely expensive at first.
Robotics companies are not just focused on dexterity. They are focused on improvements to dexterity that stay within a very tight economic envelope. Inexpensive dexterity is going to take a while.
One very important task to solve is the ability to select a box from a shelf and set it neatly on a pallet, as well as the reverse. People have been working very hard on this problem for a long time, there are impressive demos out there, yet still nobody is ready to set their best box manipulating robots loose in a real warehouse environment.
How hard can it be to consistently pick up boxes and set them down again in a different location? Pretty hard, apparently.
I mean with rigid plastic containers robots are 'pretty consistent' at it now.
The problem with things like cardboard boxes, especially at any size, is internal weight distribution and deformation of the box. If you take someone who is pretty new to stacking boxes at a warehouse and give them sloppy boxes (ones that bend or otherwise shift), they'll be pretty slow at it for the first hour or so, then they'll internalize the play in the materials and start speeding up considerably while getting a nice result.
It's pretty amazing how evolution has optimized us for feedback sensing like this.
> I am unaware of any barrier, or any reason to infer there is one, for dextrous robots.
I don't think there's a fundamental barrier to building a humanoid robot but the cost will be an extremely high barrier to adoption.
A human is nature's ultimate robot: hundreds of servos, millions of sensors, self-assembling from a bag of rice, self-repairing for minor damage. You just can't beat that, not for a very long time.
First stage: either digitally generate (synthetic) basic movements, or record basic movements from a human model. The former is probably better and can generate endless variation.
But the model is only trying to control joint angles, positions, etc.; no worries about controlling power. The simulated system has no complications like friction.
Then you train with friction, joint viscosity, power deviating from demand (spin-up and spin-down times, fade), etc.
Then train in a complex simulated environment.
Then train for control.
Etc.
The point being, robotic control is easily broken down into small steps of capability.
That massively improves training speed and efficiency, even potentially smaller models.
It is also a far simpler task, by many orders of magnitude, than learning the corpus of the written internet.
Comparable to that would be training an AI to operate any land, sea, or air device -- which nobody today is trying, AFAIK.
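(A toy sketch of that staging, with everything a placeholder -- a real version would be RL or imitation learning inside a physics simulator such as MuJoCo:)

    import random

    def train(policy, physics, steps):
        # Stand-in for a learning loop; only the curriculum structure matters.
        for _ in range(steps):
            disturbance = physics["motor_noise"] * random.gauss(0, 1)
            policy["skill"] += 1e-6 * (1 - abs(disturbance))  # fake progress

    policy = {"skill": 0.0}  # stand-in for network weights, kept across stages

    # Same policy, progressively harsher simulated physics:
    for physics in [
        {"friction": 0.0, "viscosity": 0.0, "motor_noise": 0.00},  # ideal joints
        {"friction": 0.6, "viscosity": 0.1, "motor_noise": 0.00},  # realistic joints
        {"friction": 0.6, "viscosity": 0.1, "motor_noise": 0.05},  # imperfect power
    ]:
        train(policy, physics, steps=10_000)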
It's so easy! I hope all the leading robotics researchers come to find this comment and finally deliver us the dexterous humanoid robots we've all been waiting for
Well, in fairness, the kind of deep neural architectures needed to do this stuff have only been available for a relatively short period. The robotics researchers in my institution are basically racing to put all this new capability to work.
Do you know of success stories here? Success of transferring models learned in physics simulation to the real world.
When we (ZenRobotics) tried this 15 years ago a big problem was the creation of sufficiently high-fidelity simulated worlds. Gathering statistics and modelling the geometry, brittleness, flexibility, surface texture, friction, variable density etc of a sufficiently large variety of objects was harder than gathering data from the real world.
We have massively better physics simulations today than 15 years ago, so the limitations you found back then don't apply today. It might still not be enough, but 15 years is such a long time with Moore's law, and we already know all the physics; we just needed more computation.
I like that this has serious arguments and not "humans are magical" stuff which one sees in discussions of possible limitations of AI reasoning. (The author specifies that it's about humanoids of today and not humanoids in principle.)
This in particular shocked me a bit:
> No sense of touch. Human hands are packed absolutely full of sensors. Getting anywhere near that kind of sensing out of robot hands and usable by a human puppeteer is not currently possible.
This stands in contrast with e.g. self-driving cars, which are already superhuman at the sensor level.
Touch sensors are a thing. A spin-off[1] of a company I used to work for uses pressure sensors embedded in rubber to get a pretty robust and sensitive sense of touch, though the granularity is low and it doesn't do lateral force.
Yes, quickly skimming your linked site, I saw Pressure, Force, and Vibration sensors, but as you mention no mention of lateral force, nor of temperature.
Thinking briefly about the sensations used in manipulating tools or checking a surface, both lateral force and temperature are hugely important.
And, of course, there's the insane density of sensors built into human fingers relative to anything built by humans.
I don't know if it's playing sports my whole life and watching a lot of nature documentaries, but it really seems to me that certain types who spend 95% of their waking hours sitting at a desk don't appreciate just how capable, and frankly maybe even "magical," animal bodies really are. Someone is in these comments talking about how their muscle car can accelerate faster than a cheetah. Sure, but can it turn like one, jump onto and off of rocks, stop on a dime, time its acceleration to match the exact moment it needs to pounce on prey to knock it off its own feet while avoiding getting crushed or kicked itself?
Humans have swum across the English Channel, swum from Cuba to Florida, from Alaska to Russia. Humans have run across the Sahara desert. We've scaled 2000-foot vertical rock walls with our hands. We've walked to the summit of every 8k mountain on the planet. We can go places like the top of Everest, the middle of Death Valley in the summer, deep wilderness in thick forest or jungle -- places that are so dangerous in part because rescue vehicles like ATVs, snowmobiles, and helicopters can't get there, but humans on foot can. And yeah, we can also thread nuts onto bolts, handle locked doors, do tre flips and impossibles on skateboards.
Ultimately, we and all other animals are still just machines, and there is no reason in principle that machines built from engineered plans rather than grown from evolved plans can't do all of the exact same things and more, but it's a harder problem than many seem to appreciate. Even when I was in a tank brigade in the late 2000s, we always went to Afghanistan as infantry, because our trucks and tanks were completely useless in the rocky, steep mountains, but there is no terrain anywhere that a sufficiently well-trained human can't traverse.
I'm a bit surprised that Brooks is focusing on sensory and electro-mechanical issues as being what's holding back humanoid robotics...
Just being the shape of a human means nothing if it doesn't also have the brain of a human, which is needed if it's meant to be general purpose, or at least adaptable to a useful variety of tasks.
How exactly are these robots meant to be trained for new tasks (or just same task, different factory, or different work station)? It seems that to be useful they'd need to be able to learn on the job. Even if they've been pre-trained in simulation to do a wide variety of things, and take natural language or imitation instruction, they are still going to need to bridge the sim-2-real gap, and book smart to proficient gap, to be able to perform, and that'd at least require runtime learning.
Tesla/Musk seem to think that they're already in the robotics business with self-driving cars trained in simulation, but a car only has two degrees of freedom (speed and steering angle), and only performs one task - driving. A general-purpose humanoid robot is a whole other level of complexity.
>How exactly are these robots meant to be trained for new tasks (or just same task, different factory, or different work station)? It seems that to be useful they'd need to be able to learn on the job.
The plan for raw intelligence is to push transformers as far as we can. What's the full extent of ICL for robotics ? We don't know the answer to that yet.
Very interesting point that while we've figured out how to digitize images, text and sounds we haven't digitized touch. At best we can describe in words what a touch sensation was like. Smell is in a similar situation. We haven't digitized it at all.
Touch is a 2D field of 3D vectors. Easily stored and transmitted as images, and easily processed by neural nets. You could add temperature and pain/damage channels if you want, though they don't seem essential for most manipulation tasks. (Actually I don't believe touch is as essential as he argues anyway. Of course someone who learned a task with touch will struggle without it, but they can still do it and would quickly change strategies and improve.)
The problem with touch is making sensors that are cheap and durable and light and thin and repairable and sensitive and shape-conforming. Representation is trivial in comparison.
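(The representation really is mundane -- a sketch, with the grid resolution made up:)

    import numpy as np

    # One tactile "frame": a 16x16 grid of sensor sites, each reporting a
    # 3D force vector -- structurally just a 3-channel image.
    tactile = np.zeros((16, 16, 3), dtype=np.float32)
    normal_pressure = tactile[..., 0]   # channel 0: force into the skin
    shear = tactile[..., 1:]            # channels 1-2: lateral forces

    # Stack frames over time and any image/video network can consume it:
    clip = np.stack([tactile] * 8)      # shape (8, 16, 16, 3), a short "video"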
I'm not sure describing it in words is very helpful, and there's probably a good amount of such data available already.
I would think the way to do it is build the touch sensors first (and it seems they're getting pretty close) then just tele-operate some robots and collect a ton of data. Either that, or put gloves on humans that can record. Pay people to live their normal lives but with the gloves on.
I think he meant to write "Prologue" instead of "PROLOG".
I spent a few minutes excitedly trying to figure out how one of my favourite declarative programming languages was used to solve modern robotic sensing problems, only to realise it was probably just a misspelling ... :(
If you asked someone 300 years ago what an automated dishwashing machine would look like, it would be a lot more like a person than the wet cupboard we have now. I'm assuming many tasks will be like that -- it's more a lack of imagination when we say we need a humanoid robot to solve a task. I'm assuming it'll be the minority of tasks where a humanoid form actually makes sense.
A robot that isn't stationary, in a home or in a factory, wants legs. Wheels are fine for cars but not great for stepping over things (like on cluttered floors) or climbing stairs. So legs, assuming we've got the compute and algorithms to get them to work well, make sense.

The rest allows for creativity. As a human, I have a head; my brain is in it, as are my eyes. A humanoid robot doesn't need a head: it can have cameras in its chest and on its back, and keep its brain in the chest too. Depending on what's useful, it doesn't need to be limited to two arms. It could have one centrally mounted in its chest, with two cheaper ones on the sides. Or four, two on each side. I've wished for three hands before.

The problem, though, is that they look weird. Any non-traditional design is going to fall into the uncanny valley, so no matter how much better your non-traditionally-armed robot is technically, it's just not gonna sell to the mass market. We only have to look at weird cars/vehicles, which have a history of being boondoggles. So it's not a failure of imagination, more a matter of practicality.
This needs to be some sort of maxim: “The most useful robots are the ones that don’t look like (or try to be) humanoids.”
For some reason (judging by Fritz Lang, Gundam, etc.) humanity has some deep desire or curiosity for robots to look like humans. I wonder if cats want robot cats?
> humanity has some deep desire or curiosity for robots to look like humans.
I don't think you can draw that conclusion. Most people find humanoid robots creepy. I think we have a desire for "Universal Robotics". As awesome as my dishwasher is, it's disappointing that it's nearly useless for any other task. Yeah, it washes my dishes, but it doesn't wash my clothes, or put away the dishes. Our desire for a humanoid robot, I think, largely grows out of our desire for a single machine capable of doing anything.
The vast majority of “universal robots” are portrayed as humanoids in science fiction. Perhaps part of the reason is that true “universality” includes socializing, companionship, human emotions, and of course love.
Or, alternatively, general-purpose robots tend to be human-shaped because the world as it already is has already been fully designed for humans. Single doors are tall because that's the size of a human. Tools are designed to be held in something like a hand, sometimes two of them. Stairs are designed to be walked on, and basically any other traversal method just falls apart.
Of course, there is also the thing where authors and artists tend to draw anything with human intelligence as humans, from robots to aliens. Maybe it's the social reason you mention, or they just unconsciously have assumed humans to be the greatest design to ever exist. But even despite this, in a human world, I expect the first true general-purpose robots to be "standing" upright, with one or several arm-like limbs.
If you want to make a more general-purpose robot, then approximating a human form is rational, because our spaces and systems are designed for human interaction. At the moment, though, no-one has really succeeded at that, and all the successful robots are much more specialised.
Right, but that's very task-specific, and what many people want is a single robot which can do many different tasks, and do so without modifying the existing environment. I would love a robot which could cook and clean and do laundry (including folding), but I still need to live in the same space it would use. The most obvious way to do that is a humanoid robot, which is why many companies are working on it, and here he's arguing that's not going to work.
The other obvious way to do it is to centralize it, have a lift in your house that brings up meals and clean laundry to order, where you can put your dirty dishes in when you're done, and a central space where staff and robots take care of things.
I'm actually surprised or interested that this isn't more of a thing, it doesn't take any high tech either. I suppose people like having their own stuff, or people can't be trusted, or it's prohibitively expensive to outsource food / laundry (even if especially in the US ordering food or eating out is very common).
> Before too long (and we already start to see this) humanoid robots will get wheels for feet, at first two, and later maybe more, with nothing that any longer really resembles human legs in gross form. But they will still be called humanoid robots.
Totally agree. Wheels are cheaper, more durable and more effective than legs.
Humans would have wheels if there were an evolutionary pathway to wheels.
The world is full of curbs, stairs, lips, rugs, vehicles, etc. If you're a human-scale robot, your wheels need a really wide base to not tip over all the time, so you are extremely awkward in any kind of moderately constrained space. I wouldn't exchange my legs for wheels. Wheelchair users have to fight all the time for oversights to be corrected. I can maybe see a wheel-based humanoid robot, but only as a compromise.
On the other hand, there is not much reason to constrain ourselves to the unstable and tricky bipedal platform, or to insist on having a really top-heavy human-like torso. You could have 3-4 little legs on a dog-scale body with several extra-long upward-reaching arms, for example.
It's hard to see one. Even taking a nice flat world, ample incentive, and good "bearings" for granted, how can you evolve a wheel-organ that maintains a biological connection while being able to rotate an indefinite number of times?
A few difficult and grotesque endpoints:
* The wheel only rotates a fixed number of times before the creature must pivot and "unwind" in the opposite direction. This one seems most plausible, but it's not a real wheel.
* The main body builds replacement wheels internally (like tooth enamel) and periodically ejects a "dead" wheel which can be placed onto a spoke. This option would make it easier to generate very tough rim materials though.
* A biological quick-release/quick-connect system, where the wheel-organ disconnects to move, but then reconnects to flush waste and get more nutrients.
* A communal organism, where wheel-creatures are alive and semi-autonomous, with their own way to acquire nutrients. Perhaps they would, er... suckle. Eeugh.
In one of Philip Pullman's His Dark Materials novels there is a race of creatures that have a symbiosis with a tree whose huge, perfectly round nut can be grasped by their fore and hind limbs, and they roll around that way.
Wheels are not balls. Balls are common in nature. Wheels are not. The difference is that wheels need roads, which are not common in nature and are large-scale artificial objects.
Wheels are great until the robot encounters uneven surfaces, such as a stairway, or a curb. So some kind of stepping functionality would still be necessary.
I sometimes imagine wheeled creatures evolving in a location with a big flat hard surface like the Utah salt flats or the Nazca desert, but I guess there's not much reward for being able to roll around since those places are empty as well as flat. Tumbleweed found some success that way though, maybe?
The golden wheel spider lives in the sand dunes of the Namib Desert. When confronted by a spider-hunting wasp, it can perform a "cartwheeling" maneuver to escape. By tucking in its legs and turning onto its side, it can roll down a sand dune.
Are there any biological examples of freely rotating power systems? We have nice rotating joints with muscles to provide power, but I can't think of any joint that would allow the sort of free rotation, while also producing torque, that a wheeled animal would require.
Something internal to some shellfish, I believe, a kind of mixing rod that rotates. Hold on, I'll check if it's powered. (Also rotifers but they're tiny.)
Hmm, no, it sounds like it's externally powered:
> The style consists of a transparent glycoprotein rod which is continuously formed in a cilia-lined sac and extends into the stomach. The cilia rotate the rod, so that it becomes wrapped in strands of mucus.
Or maybe the cilia ( = wiggly hairs) could be seen as a kind of motor. Depends how you count it and exactly what the set-up is, I can't tell from this.
I think I would count internal power created by the rotating component itself. I hadn't thought of that possibility, since human-made machinery usually has the power-producing component located in the main body, and transferring that power to a freely rotating component is quite hard. Biological systems wouldn't necessarily look like that, and could feasibly be powered by the wheels themselves deforming, as if the wheels were a separate, but connected, biological system.
In any case, it seems like a "simple" problem to solve. An accelerometer chip costs a few cents, and the data rates can be handled by a very light wiring harness, e.g. I2C.
So embedding such a sensor in every rigid component, wiring a single data line to all of them (using the chassis as electrical ground) and feeding the data back to the model seems a trivial way to work around this problem without any kind of real pressure sensitivity. The model knows the inputs it gives to the actuators/servos, so it will quickly learn to predict the free mechanical behavior of the body, and use any deviation to derive data equivalent to pressure and force feedback.
Another possible source of data is the driving current of the motors/actuators which is proportional to the mechanical resistance the limb encounters. All sorts of garbage sources of data that were almost useless noise in the classical approach become valuable with a model large enough.
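(A sketch of the deviation idea -- names hypothetical, but the point is just "predicted free-body motion minus measured motion = external contact":)

    import numpy as np

    def contact_flags(predicted_accel, measured_accel, threshold=0.5):
        # Both inputs: (n_segments, 3) accelerations in m/s^2. The model
        # predicts how each rigid segment should move given the commands it
        # sent; any large residual means an external force is acting on that
        # segment -- a poor man's sense of touch.
        residual = measured_accel - predicted_accel
        return np.linalg.norm(residual, axis=1) > threshold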
> they will likely have to collect both the right data, and learn the right thing.
The "bitter lesson" says to stop trying to find simple rules for how to do things - stop trying to understand - and instead to use massive data and massive search to deal with all the incredibly fussy and intractable details magically.
But the article here is saying that the lesson is false at its root, because in fact lots of understanding is applied at the point of choosing and sanitising the data. So just throwing noise at the model won't do.
This doesn't seem to match experience, where information can be gleaned from noise and "garbage sources of data ... become valuable with a model large enough", but maybe there's something illusory about that experience, IDK.
Natural language wasn't solved by brute force until we started using trillion parameter models and using the whole internet, every book and every article ever published as training data.
I don't know of anyone spending tens of billions on this problem like Microsoft did for OpenAI. First you'd have to build up a dataset of trillions of token-equivalents for motion; what that even looks like is largely guesswork. Then you'd need to build a supercomputer to scale up the current SOTA motion model to 100 times the size of the biggest model today. Then you'd have to pretrain and finetune the models.
If after all that dexterity still isn't solved all we can say is that we need more data and bigger models.
People seriously don't understand how big big data for AI is and what a moonshot GPT3 and 4 were.
Tesla's approach is "start with motion captured data, move on to first person view video demonstrations, then move on to any video demonstrations - i.e. feed YouTube into the system and hope it learns something from that".
Meanwhile, the companies that have cars moving around safely all used a very diverse mix of human-engineered and ML models -- flying completely in the face of the Bitter Lesson.
Isn’t one of the main functions of the brain / nervous system to “filter” noisy sensory input data to provide a coherent “signal” to perception? Perhaps a smaller or more specialized model could do that if we ended up packing the “skin” of the humanoid with various sensors.
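(Even a classical filter gets part of the way there before any learned model -- e.g. a one-line exponential smoother per sensor channel, as a sketch:)

    def smooth(readings, alpha=0.2):
        # Exponential moving average: a crude stand-in for the denoising a
        # specialized front-end model would do on raw skin-sensor streams.
        state = readings[0]
        out = []
        for r in readings:
            state = alpha * r + (1 - alpha) * state
            out.append(state)
        return out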
In theory yes, a small(er) model to do almost anything exists in the problem space of all models.
The problem is getting to that model state. Evolution found these models by running a ridiculously huge number of experiments, with the cutoff function being 'can it breed before it dies on a limited number of calories'.
At least at this point it doesn't seem likely we can find a shortcut beyond that necessary computation. Evolution did it with time and parallelism. We do it differently with scale and rapid energy usage.
> Another possible source of data is the driving current of the motors/actuators which is proportional to the mechanical resistance the limb encounters.
The problem is precisely the actuators. A lot of a human's muscles actually come in pairs - agonist and antagonist muscles [1], and it's hard to match the way human muscles work and their relatively tiny size in a non-biological actuator.
Just take your elbow and angle it to 90 degrees, then rapidly close it so your upper and lower arm are (almost) parallel. An absolutely easy, trivial task for the pair of muscles controlling the tendons. But now try to replicate even this small feat with a motor-based actuator: you either use some worm gear to prevent the limb from going in the wrong direction but lose speed, or you use some sort of stepper motor that's very hard to control and takes up a lot of space.
> Just take your elbow and angle it to 90 degrees, then rapidly close it so your upper and lower arm are (almost) in parallel.
That's trivial with modern flat motors and position feedback. In fact, motors can do it faster and with more precision than we can.
The only reason it was ever hard was because motors didn't have a lot of torque/volume.
The reason our muscles come in pairs is because they can only really apply force in one direction. Motors don't have this limitation, and don't need to be paired.
Anyway, motors still don't have enough torque density for making fine manipulators, and the lack of sensorial data will still stop you from interacting well with the outside world.
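(For reference, the "position feedback" part is typically just a PD loop around a joint encoder -- a sketch, with the gains made up:)

    def pd_torque(target_angle, angle, velocity, kp=40.0, kd=2.0):
        # Command torque proportional to position error, damped by joint
        # velocity. One back-drivable motor per joint does this -- no
        # agonist/antagonist pair needed, since torque can go either way.
        return kp * (target_angle - angle) - kd * velocity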
From the article:
a human hand has about 17,000 low-threshold mechanoreceptors in the glabrous skin (where hair doesn’t grow) of the hand, with about 1,000 of them right at the tip of each finger, but with much lower density over the rest of each finger and over the palm. These receptors come in four varieties (slow vs fast adapting, and a very localized area of sensitivity vs a much larger area) and fire when they sense pressure applied or released.
Naturalistic fallacies will only carry you so far. For example, my 12-year-old car has none of the incredibly adapted limbs and muscles of a cheetah, but can still easily exceed any animal's land speed.
The article makes a compelling case that a certain kind of sensory input and learning is necessary to crack robotic movement in general. It remains to be seen whether an array of sensors as fine as the human hand's is useful outside very specific use cases. A robot that can stock shelves reliably would still be immensely useful and very generalizable, even if it can't thread a needle due to limited fine sensory abilities.
Title of the article you're commenting:
Why Today’s Humanoids Won’t Learn Dexterity
Thesis the article is contradicting:
The idea is that humanoid robots will share the same body plan as humans, and will work like humans in our built for human environment. This belief requires that instead of building different special purpose robots we will have humanoid robots that do everything humans can do.
You are now arguing that a specialized robot lacking dexterity would still be immensely useful. Nobody is disputing that. It's just not what the article is about.
Because you have learnt it already and you can make predictions. And you don't lose pressure sensitivity; you still feel the pressure of your hand against the glove. A better example would be using an exoskeleton or robotic arm, or deactivating certain nerves. Still, you risk breaking it more, imo, and you have to be more careful in the beginning until you learn again.
Edit: and you're probably not going to be as fast doing it.
You don't lose pressure or touch sensitivity from wearing even thick welding gloves. You can still feel how hard you are gripping the rod quite easily.
Depends heavily on the use case. Indeed many tasks humans carry out are done without touch feedback - but many also require it.
An example of feed-forward manipulation is lifting a medium-sized object. The classic example is lifting a coffee cup: if you misjudge a full cup as empty, you may spill the contents before your brain manages to replan the action based on sensory input. It takes around 300 ms for that feedback loop to happen. We do many things faster than that would allow.
The linked article has a great example of a task where a human needs feedback control: picking up and lighting a match.
Sibling comments also make a good point on that touch may well be necessary to learn the task. Babies do a lot of trial-and-error manipulation and even adults will do new tasks slower first.
The industry's approach to "trial and error to learn the task" is to have warehouses of robots perform various tasks until they get good at them. I imagine that you'd rely on warehouses less once you have a real fleet of robots performing real tasks in real world environments (and, at first, failing in many dumb and amusing ways).
Robots can also react much faster than 300ms. Sure, that massive transformer you put in charge of high level planning and reasoning probably isn't going to run at 200 tokens a second. But a dozen smaller control-oriented networks that are directly in charge of executing the planned motions can clock at 200 Hz or more. They can adjust fast if motor controllers, which know the position and current draw of any given motor at any given time, report data that indicates the grip is slipping.
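(E.g. a toy inner loop like this runs comfortably at 200 Hz, well inside the human 300 ms feedback latency -- the slip heuristic here is made up:)

    def adjust_grip(commanded_pos, measured_pos, grip_force):
        # Called every 5 ms per finger joint, no transformer in the loop.
        # If the joint is being back-driven away from its commanded position,
        # the object is probably slipping -- squeeze a little harder.
        if abs(commanded_pos - measured_pos) > 0.01:  # radians of unexpected give
            grip_force *= 1.05
        return grip_force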
This is a good point, but I’m not convinced it negates the author’s argument.
Consider whether you could pick up that same fragile glass with your eyes closed. I'd wager you could, as you'd still receive (diminished) tactile feedback despite the thick gloves.
I don't follow (possibly through my own limitations) the main argument.
> The center piece of my argument is that the brute force learning approaches that everyone rightfully touts as great achievements relied on case-specific very carefully engineered front-ends to extract the right data from the cacophony of raw signals that the real-world presents.
In nearly each of the preceding examples, isn't the argument really about the boundaries that define the learning machine? Just because data preparation / formatting / sampling / serialization is more cost-effective to do externally from the learning machine, doesn't mean that boundary is necessary. One could build all of this directly inside the boundary of the learning machine and feed it the raw, messy, real world signals.
Also, humans have plentiful learning aids doing the "tokenization", as anyone who has helped a child learn to count has experienced first-hand.
If you understand "cost-effective" to mean the same thing as "feasible with today's tech", maybe. As in, if we feed it all the raw data, we'd need more powerful, expensive devices and they would take years or decades to complete any training on the raw data set.
But without it being done, it's an unproven hypothesis at best.
It wouldn't take decades or years of compute to train a language model that doesn't tokenize text first. It's not an 'unproven hypothesis' because it's already been done. It's just a good deal more cost effective to tokenize so those exercises aren't anything more than research novelty.
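(Concretely, "not tokenizing" usually just means treating raw bytes as the vocabulary:)

    text = "Threading a nut onto a bolt"
    tokens = list(text.encode("utf-8"))  # vocabulary of only 256 symbols
    # [84, 104, 114, 101, 97, 100, 105, 110, 103, ...]
    # Byte-level models (e.g. ByT5) train directly on sequences like this.
    # It works; it just costs more compute per character than subwords.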
> No sense of touch. Human hands are packed absolutely full of sensors. [...] We store energy in our tendons and reuse it on the next step
Side-rant: As cool as some cyberpunk/sci-fi ideas are, I can't imagine a widespread elective mechanical limb replacement within the lifetime of anyone here. We dramatically under-estimate how amazing our normal limbs are. I mean, they're literally swarms of nanobots beyond human comprehension. To recycle an old comment against mechanical limbs:
________
[...] just remember that you're sacrificing raw force/speed for a system with a great deal of other trade-offs which would be difficult for modern science to replicate.
1. Supports a very large number of individual movements and articulations
2. Meets certain weight-restrictions (overall system must be near-buoyant in water)
3. Supports a wide variety of automatic self-repair techniques, many of which can occur without ceasing operation
4. Is entirely produced and usually maintained by unskilled (unconscious?) labor from common raw materials
5. Contains a comprehensive suite of sensors
6. Not too brittle, flexes to store and release mechanical energy from certain impacts
7. Selectively reinforces itself when strain is detected
8. Has areas for the storage of long-term energy reserves, which double as an impact cushion
9. Houses small fabricators to replenish some of its own operating fluids
10. Subsystems for thermal management (evaporative cooling, automatic micro-activation)
_______________
I predict the closest thing we might see instead will be just growing replacement biological limbs, followed by waldoes where you remotely control an arm without losing your own.
Per 5, it says here "Human hands are packed absolutely full of sensors. Getting anywhere near that kind of sensing out of robot hands and usable by a human puppeteer is not currently possible."
Then another quote, "No one has managed to get articulated fingers (i.e., fingers with joints in them) that are robust enough, have enough force, nor enough lifetime, for real industrial applications."
So (3) and (7) are relevant to lifetime, but another point, related to sensors, is that humans will stop hurting themselves if finger strain occurs, such as by changing their grip or crying off the task entirely. Hands are robust because they can operate at the edge of safe parameters by sensing strain and strategizing around risk. Humans know to come in out of the rain, so to speak.
I have come to realize that we barely understand complexity. I've read a lot on information theory, thermodynamics, many takes on entropy. Not to mention literature on software development, because a lot of this field is managing complexity.
We severely underestimate how complex natural systems are. Autonomous agents seem like something we should be able to build. The idea is as old as digital computers. Turing famously wrote about that.
But an autonomous complex system is complex to an astronomical degree. Self-driving vehicles, let alone autonomous androids, are several orders of magnitude more complex than we can even model.
I have read Wiener and Ashby to reach this conclusion. I've used this argument before: a piece of software capable of creating any possible software would be infinitely complex. It's also the reason I don't buy the claim that "20 W general intelligence exists". The wattage for generally intelligent humans would be the entire energy input to the biosphere up to the evolution of humans.
Planetary biospheres show general intelligence, not individual chunks of head meat.
That knowledge held in evolution equates to "training" for an AGI, I guess. Mimicking 4 billion years of evolution shouldn't take that long ... but it does sound kind of expensive now you mention it.
Now I'm imagining a brain in a jar, but with every world-mimicking evolved aspect of the brain removed. Like, it has no implicit knowledge of sound waves or shapes or - well, maybe those low-level things are processed in the ears and retinas, but it has no next-stage anticipation of audio or visual data, either, and no body plan that relates to the body's nerves, and no relationship to digestion or hormones or gravity or jump scares or anything else that would prepare it for being monkey-shaped and living in the world. But, it has the key thing for intelligence, the secret sauce, whatever that is. So it can sit there and be intelligent.
Then you can connect it up to some input and output, and ... it exhibits intelligence somehow. Initially by screaming like a baby. Then it adapts to the knowledge implicit in its input and output systems ... and that's down to the designer. If it has suction cup end effectors and a CCD image sensor array doobrie ... I guess it's going to be clumsy and bewildered. But would it be noticeably intelligent? Could it even scream like a baby, actually? I suppose our brains are pre-evolved to learn to talk. Maybe this unfortunate person would only be able to emit a static hiss. I can't decide if I think it would ever get anywhere and develop appreciable smarts or not.
I feel like I can intuit these things pretty well but others can't. For example, I see everyone talking about LLMs replacing developers, and I'm over here thinking there is absolutely no way an LLM is replacing me any time soon. I'll be using it to do my job faster and better, sure, but it won't replace me. It can barely do a good job while I hold its hand every step of the way. It often goes crazy and does all kinds of dumb stuff.
Similarly reading this article I agree with the author and I feel like what they're saying seems obvious. Of course making robots that can match humans' abilities is an absolutely insurmountable task. Yes, insurmountable as in I don't think we will ever do it.
Automating specific tasks in a factory is one thing; making a robot that can just figure out how to do things and learn like a human does is many orders of magnitude beyond. Even LLMs aren't there, as we can see from how they fail at basic tasks like counting the Rs in Raspberry. It's not intelligence, it's just the illusion of intelligence. Actual intelligence requires learning, not training. Actual intelligence won't run a command, fail to read its output, make up the output, and continue as if everything is fine while in fact nothing is fine. But LLMs will, because they're stupid stochastic parrots, basically fancy search engines. It's really strange to me how everyone else seems blind to this.
Maybe if we some day figure out real artificial intelligence we will have a chance to make humanoids that can match our own abilities.
I'd add an 11th point to expand on #1: supports a very wide range of movement speeds, movement force/torque and movement precision.
Take the elbow joint and the muscles it's connected to. It supports very fine precision at slow speeds as well as the same operation at high speeds -- say, lifting yourself up on a horizontal bar. Assuming adequate strength, you can do either a slow or a fast lift, both with enough precision and torque to keep your body mass from slamming into the bar, which is another feat in itself.
Now try to replicate that with a classic mechanical mechanism, you'll always lose either precision, speed or torque.
Bolting on extra senses, tools, limbs is no big deal.
Humans are also some of the most physically adaptable animals on the planet, in terms of being able to remodel our bodies to serve new tasks. "specific adaptation to imposed demand" is one of the things that really sets us (and a few other animals) apart in a remarkable way. Few animals can practice and train their bodies like we can.
In addition, I understand research shows that people with amputations very quickly adapt both practically and psychologically, as a general principle (some unfortunate folks are stuck with phantom pain and other adaptive issues).
The old discussion about "adding 20 minutes to your commute is worse than losing a leg below the knee" takes into account the fact that most people underestimate how large a negative effect commuting has, but also overestimate how large a negative effect losing a portion of a limb has.
It's likely that humans beat basically every other animal at this - because humans are social tool users. Most animals learn their body plan once and almost never change it. Humans have to learn to use new tools or work with other humans all the time.
Which seems to reuse the same brain wiring as what's used for controlling the body. To a professional backhoe operator, the arm of the backhoe is, in a very real way, his arm.
Curiously enough, most current neural interfaces don't seem to expose much of this flexibility. It's likely that you'd have to wire into premotor cortex for that - but for now, we're mostly using the primary motor cortex instead, because it's much better understood. The signals found there are more human-comprehensible and more prior work was done on translating them into useful motions.
Who is kidding who?
Just watch a film of a single-cell critter approach something and either, one, go yum! and engulf it, or two, go ahhhhhhhh! and run away.
I believe the full technical explanation for that goes: mumble, mumble, chemical receptors something, mumble mumble. Humans are sensitive to certain chemicals at parts-per-billion levels, and your finger can detect surface roughness down to 1/1000th of an inch. That's the standard issue; exceptional individuals with training will perform significantly better.
> When an instability is detected while walking and the robot stabilizes after pumping energy into the system all is good, as that excess energy is taken out of the system by counter movements of the legs pushing against the ground over the next few hundred milliseconds. But if the robot happens to fall, the legs have a lot of free kinetic energy, rapidly accelerating them, often in free space. If there is anything in the way it gets a really solid whack of metal against it. And if that anything happens to be a living creature it will often be injured, perhaps severely.
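Some rough back-of-the-envelope numbers for why that free kinetic energy matters. A minimal sketch; the mass and speed are assumed for illustration, not taken from any particular robot:

    # Kinetic energy of a flailing leg: E = 1/2 * m * v^2
    leg_mass = 8.0    # kg, assumed mass of one robot leg
    tip_speed = 6.0   # m/s, assumed foot speed during a recovery swing

    energy = 0.5 * leg_mass * tip_speed**2
    print(f"{energy:.0f} J")  # ~144 J, roughly a 15 kg weight dropped from 1 m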
The fine article has a carefully crafted set of media queries. They react to every increase in the zoom level by shrinking the text. I would have read the article but my tired old eyes were unable to squint hard enough. Thanks web designers!
It's been tried a number of times already though; robotics companies have been around for decades. Sony, Boston Dynamics, Hyundai and many others are already in the space (and some of those are on the stonks market). I don't think it'll become any bigger than what it is. Also, many have already tried to make it a hype, Tesla being the latest.
There's a number of "robotics and embodied AI" ETFs out there that should show up with a quick search. I don't have an opinion as to their quality so you'd have to do your own research.
Pshaw, that's nothing, you need to invest in the companies promising to make a Greater Fool robotic investor, now that's where the market'll take off. :P
Good question; it's already a thing for some use cases. What they can do in surgery is pretty amazing, but that's a use case where the robotic tools have a huge benefit over people themselves. That's a clear use case for a complex task. For banal tasks like folding laundry... it'll remain more practical to just do it yourself, in the flesh. It can be done remotely, but due to the limitations of robots and the internet it'll be slower and more expensive.
I misread the title and I thought it was about humans.
And I could see it. With the prevalence of screens, kids already don't learn a lot of the dexterity that previous generations learned. Their grip strength is weak and their capacity for fine 3D motions is probably underdeveloped as well.
Last week I saw an intelligent and normally developing 7-year-old kid asking mum to operate a small screwdriver to get to the battery compartment of a toy, because that apparently was beyond his competence.
Now, with recent developments in robotics, fully neural controllers and training in simulated environments, it could be that today's babies will have very few tasks requiring dexterity left when they grow up.
> because that apparently was beyond his competence.
This has almost nothing to do with nature (barring a development issue).
This has to do with nurture. Every time they went to do something with a tool a helicopter gunship of a parent showed up to tell them no. Now they have a learned helplessness when it comes to these things.
But that's not really any different than when I was a kid so very long ago. At 4 or 5 I was given a stack of old broken radios and took them to the garage for a rip-and-tear session. I got to look at all their pretty electronic guts that fascinated me. There were plenty of other parents of that time who would have been horrified to see their kids do something similar.
Isn't another hardware problem being ignored here? Pound-for-pound muscle fibers are just superior to what you can achieve with electric motors or pneumatics.
Take size, strength, precision, longevity, and speed. It's not hard to match or beat organic muscle fibers on one or two of these dimensions with an electrically driven system, but if it does, it's going to neglect other dimensions to such a degree as to put building a humanoid robot that achieves parity with a human completely out of reach.
You can slather as much AI as you want on top of inadequate hardware - it's not going to help.
Electric motors are significantly better at the requirement that matters: endurance.
Sure it takes a bigger motor to produce the same torque, but speed and precision are actually the strengths of electric motors. The fundamental problem with them is that reducers are not impact resistant and they have internal inertia, which is something muscles do not have. Another problem is building actuators with multiple degrees of freedom. The ideal configuration for legs is a ball joint, not two consecutive rotary joints.
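To put a number on the internal-inertia point: through an ideal reducer with ratio N, the rotor's inertia shows up at the joint multiplied by N^2, which is why a highly geared limb can't "give" on impact the way a muscle can. A tiny illustrative calculation (numbers assumed):

    # Reflected inertia at the output of an ideal gearbox: J_out = J_rotor * N^2
    rotor_inertia = 2e-5  # kg*m^2, typical-ish small brushless motor (assumed)

    for ratio in (10, 50, 100):
        print(ratio, rotor_inertia * ratio**2)  # 0.002, 0.05, 0.2 kg*m^2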
Which gives him great credibility on the history of the field, but is maybe a hindrance in advancing it. Advancement comes from people who are too ignorant to know that what they want to do is considered impossible by experts in the field. Not always, of course, but enough times that being the expert who created the Roomba back in the aughts doesn't automatically mean he's right.
I lost a ton of respect for the author when he started talking about speech recognition.
He makes a few claims:
(1) That speech recognition isn't end to end because it requires highly sophisticated mathematically crafted preprocessing.
(2) That this is evidence human learning is more sophisticated than deep learning.
So (1) is just nonsense. It was true 10 years ago but wasn't true 6 years ago. And if he's that far out of date, that really poisons my ability to trust him.
And (2) misses some important knowledge about how humans work, which most speech recognition researchers know about. The human ear actually does its own version of Fourier decomposition by using different-length hairs in the ear. The human body does a ton of evolved preprocessing. Given that we could develop in decades audio preprocessing that took evolution millennia to build, we seem to be doing pretty well.
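For anyone who hasn't seen that front end: the "short time segments, then frequency domain" preprocessing is only a few lines of numpy. A minimal sketch (the frame and hop sizes are typical choices, nothing canonical):

    import numpy as np

    def stft_magnitude(signal, sample_rate=16000, frame_ms=25, hop_ms=10):
        # Chop audio into short overlapping windows and take each window's
        # spectrum -- a rough analogue of what the cochlea's graded hair
        # cells do mechanically.
        frame = int(sample_rate * frame_ms / 1000)
        hop = int(sample_rate * hop_ms / 1000)
        window = np.hanning(frame)
        frames = [signal[i:i + frame] * window
                  for i in range(0, len(signal) - frame, hop)]
        return np.abs(np.fft.rfft(np.stack(frames), axis=1))

    spectrogram = stft_magnitude(np.random.randn(16000))  # 1 s of fake audio
    print(spectrogram.shape)  # (time_frames, frequency_bins)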
Confusing title because of the choice for the word "humanoid". When I see that, I expect we're talking about a creature shaped like a human. The word for human-shaped robots has always been "android". Can we please just continue using that?
True, most of the tasks can be done with off-the-shelf hardware already. But single-task robotics is already a solved problem; what humanoid robots are about is multi-task, aimed at replacing the tasks that still require human hands / legs / eyes / brains / etc.
But I think most of those can be replaced by existing robotics as well anyway. I mean take car manufacturing, over time more and more humans were replaced by robots, and nowadays the newest car factories are mostly automated (see lights-out manufacturing: https://en.wikipedia.org/wiki/Lights_out_(manufacturing)). Interestingly a Japanese robot factory has been lights-out since 2001, where they can run for 30 days on end without any lights, humans, heating or cooling.
The article points out that the human hand has over 10000 sensors with specific spatial layout and various specialised purposes (pressure / vibration / stretching / temperature) that require different mechanical connections between the sensor and the skin.
You don't need all those for most modern tasks though. Sure, if you wanna sew a coat or something like that, but most modern-day tasks require very little of that sort of skill.
Nature limited us to just 2 hands for all tasks and purposes. Humanoids have no such limitation.
>10000 sensors with specific spatial layout and various specialised purposes (pressure / vibration / stretching / temperature) that require different mechanical connections between the sensor and the skin.
Mechanical connection wouldn't be an issue if we lithographed the sensors right onto the "skin", similarly to how chips are made.
Sorry, I meant to emphasize _different_ mechanical connections. That a sensor that detects pressure has a different mechanical linkage than the one detecting vibration. So you need multiple different manufacturing techniques to replicate that at correspondingly higher cost.
The “more than 10000” also has a large impact on size (sensors need to be very small) and cost (you are not paying for one sensor but for 10000).
Of course some applications can do with much less. IIUC the article is all about a _universal_ humanoid robot, able to do _all_ tasks.
Try threading a nut onto a bolt. Pay attention to how your fingers feel when the threads engage properly and you aren't cross-threading it.
Next, insert a Standard screwdriver into a screw head, set the screw in place, and screw it in. In order to make it work, you have to push and torque it at the same time, and not let the blade slip out of the hole or damage the screw head.
If you think this is easy, try to teach a kid to do it. Watch them struggle to control the nut and the screwdriver.
Our hands are really, really good at both major motor control and very fine motor control.
For some nuts and bolts, I myself struggle with it, almost to the point of giving up.
But what the article misses: We can just rearrange our environment to make it easy to interact with by robots. There might be only standardized nuts and bolts, with IDs imprinted so the robots know exactly how to apply them. Dishes might come in certified robot-known dimensions, or with invisible marks on where to best grip them. Matchsticks might be replaced by standardized gas lighters. Maybe robot companies will even sell those themselves.
That's a great idea actually!
An example that came to my mind: squeezing fruit juice requires a lot of dexterity. But if we sold pre-chopped fruit bits in a standardized satchet, then robots could easily squeeze the delicious and healthy fruit juice from those! And health-conscious people drink fruit juice every day, so this could easily be made into a subscription-based service! A perfect business model right there. You could call it iJuice or juice.ai or even Juicero.
I spent last night bemoaning the sad internet culture of negativity and sarcasm and hatred... But this post made me laugh.
If HN sold the ability to buy more upvotes they would have all my money for this comment.
I can rent them to you. Contact us for pricing today!
(Do not ask about our burst pricing though, it usually screws up the deal)
This is exactly the kinda shit Nietzsche was referring to when he talked about humanity giving up our humanity to better serve machines instead of making machines that better serve humanity.
Aren't the nuts and bolts part of the machine?
elaborate please?
Nietzsche and his whole philosophy are stupid, wrong, and ontologically evil. His whole philosophy is poorly reacting to the great ideas of Philipp Mainlanders “the philosophy of redemption”. Philip mainlander and Aruther Schopenhauer were correct in their philosophical pessimism and actual nihilism and nietzschian radical optimism is what motivates nearly all modern totalitarian, fascist, and authoritarian movements.
The only good thing nietzsche produced was the “Wall-E” movie from Pixar, which is a radically nietzschian film.
There's more to Nietzsche than "master and slave morality". I'm a big Nietzsche hater too but you're wrong to dismiss the rest of his work because of one part of it. He's incredibly influential in philosophy to this day, for good reason.
E.g. you can look into Deleuze's reading of his work, which focuses on the continuity from Spinoza and the analysis of ethics from a perspective of capabilities rather than obligations (as in the Kantian framing).
Or more directly, read about the idea of eternal recurrence, which I find to be an incredibly pro-, not anti-, human concept.
I dismissed him over his radical optimism. You claimed I dismissed him over master-slave morality. We are not the same. Also I’ve covet to cover read all of his work. Almost all of it is trash and ontologically bad.
You try to quote fashionable nonsense charlatan grifters like deleuze as though they are worth reading a single word of. Their works, the people who read them, and the entire field of critical theory are direct reasons for the rise of trump and right wing authoritarianism world wide.
Kill critical theory or its hateful children will kill us all
But instead you patronize me by acting like I don’t read. This mentality is why the world collectively hates leftists right now.
Haha you're going to be in strange company claiming to hate ""leftists"" and Nietzsche at the same time. Who, might I ask, do you consider "ontologically good"? Do you believe in a God that could provide such content, or is this just your own definition of what that entails? Ironically enough, between that and your seeming rejection of all prior moral philosophies, it sounds like you're operating quite in line with Nietzsche's actual recommendations -- which is far more than I could say about myself, given my rather more religious tendencies.
> But instead you patronize me by acting like I don’t read.
Rather quaint to be upset about this given the bucket of assumptions you yourself made in the comment above.
Dead end. You can't redesign and replace the entire world.
It's the same issue as self-driving cars: universal worker robots have to either learn to use the same things humans do, or never leave the labs.
That's exactly what we've been doing since the industrial revolution.
Step in a car factory, plenty of robots but none of them are humanoids. We redesigned the whole factory around specialized robots, rather than have humanoids on a Ford-like moving assembly line.
So you want to turn your home into the equivalent of a car factory, where everything is designed to be handled by robots? I don't think many people would want to live in such a home.
But the reengineering of assembly lines has targeted speed and cost of assembly, not so much automation. Robots do have a role in manufacturing, but I think it's a relatively small fraction of the whole. AFAIK, most part makers don't rely on automation, and even though final assembly has had greater success, it's still far from as adaptable as humans are.
GM's Saturn was relatively early in that space but it didn't scale up anywhere as well as they had hoped. Likewise, Tesla went there 30 years later, but IIRC, they too experienced myriad difficulties building reliable automated manufacturing processes.
If automation among makers were ready for prime time, the work would have migrated to countries with the cheapest power and fastest mobility while ignoring labor costs. AFAIK, that still hasn't happened.
Not the entire world but maybe enough of it.
I mean, think about the batshit idea of railway transport in Europe. Trains sound nice in principle, sure; it works on the scale of a mine or a shipyard to move things around. But using that to travel between all major cities (and even most villages) and countries all over? It would require laying thousands and thousands of kilometers of train tracks.
Or introducing electricity and phone lines, public lighting, and adopting various standards, metrification, putting road signage everywhere, etc. etc.
We've done a lot of large scale transformations. But to kickstart the process, robots need to be "good enough" without these infrastructure changes, and then people will see it and want the change. You can't start speculatively. It has to work first, and offer to work better if there is more infra standardization.
We changed the world for combustion cars, why not for self-driving cars?
Combustion cars were already usable even on the roads built for horse drawn carriages - they were, in fact, adapted to the existing world.
They even ran on things like firewood, coal, or, for the first ICEs, relatively common liquid fuels that could be sourced in large cities.
Cars rely on gas stations today - but gas stations only became a thing after cars already became a thing.
Nowadays, Tesla had to make Superchargers happen all by themselves before EVs could really happen - despite EVs already having the crushing advantage of being able to charge overnight in any garage that has power.
Can you see a robot company branching out to be a competitor to McDonalds to prove that their kitchen robot is viable if you design the entire kitchen for it? Well, it's not entirely impossible, but I think it unlikely.
Yes, I can see restaurants easily adopting an entire robot-friendly kitchen if it means robots can handle dish-handling and repetitive cooking tasks.
To get from that to every manufacturer adopting the standard on every product, independent of the client, you just need some competition in their market. I dunno if there is any, but it doesn't take a lot.
If all a robot does is take a package and throw it into the microwave why don't I just save a trip to the "restaurant" and eat at home?
I have no idea how you came from my comment to that idea. But nothing is stopping the chef from just throwing your food into the microwave today, so I don't see what change you are complaining about either.
There's a huge practicality issue for the chef. They don't have the food in microwavable format for many dishes.
But to me that's the end state of this conversation. Take shipping as an example: we came up with pallets and containers not because they're useful for a person to move but because they're helpful for robots (and their analogs) to move. People aren't born with pallet jacks for hands. So it seems to me that as you add more robotics into the kitchen, you're going to slowly change your supplies to arrive in more robot-friendly form.
Your comment is actually lagging behind reality. There is a manufacturer of kitchen robots that opened a fully functional demo fast food joint: https://misorobotics.com/CaliExpress/
While in Europe earlier I learned that BYD had to make hybrids for the European market since their charging infrastructure isn't quite there.
Typically the world changes when a new market is discovered; making the earth more traversable by car opened up enough of a new market at the time that it was done post-haste, for better or for worse. The only real way I see self-driving cars opening up markets at a scale that would justify the overhead is if they created self-driving-only lanes, with infrastructure built closely around them so they're quick and easily accessible to passengers.
Which at that point is really just the Japanese train system and surrounding infrastructure, which many places (at least in the US) don't seem capable or willing to make happen.
How so? Roads were already dimensioned for horses, motorways for tanks. Most major changes had industry (shipping, logistics, etc.) or military backing.
We changed the main purpose of roads to be for cars
https://www.forbes.com/sites/carltonreid/2022/11/08/happy-bi...
Yeah, but we didn't change the world, and we didn't add roads. We just reused the existing ones, building out as necessary.
Only if you think a road back then is the same as a road now.
We changed the layout, the material, added guardrails and guide posts etc.
The 50’s called and wants all its mechanical lever-actuated extendo arm-clasps back!
Joking aside, the present always has a tangential future that never comes to be. Right now the current zeerust is “AI and robots doing everything”. Continuing to have humans do it is good enough.
Imagine an IKEA robot, they could redesign their kitchens to fit it, as well as all of their other products. I'd never step into my kitchen again so why would it need to be made for me anyways?
(They could give the robot instructions on how to set up their furniture as well, the business plan really writes itself)
I've thought about this with roads and automated driving--today automated driving seems somewhat insane because, even if the system can do the right thing on the overwhelming majority of roads, there is an enormously long tail of roadways that have surprises that these systems will not have encountered and automated systems can't easily handle. In the future, it may seem insane that we would allow roads to be built that aren't accommodating of automated driving systems (or maybe we will just develop AI that is not simply "pattern recognition" but which can actively solve somewhat "complex" problems?).
It’s only a stopgap measure. And maybe that explains why autonomous robots in space struggle so much: we first need to adapt space to robot (joke)
Standards are good, but then how about you take up pottery and make a plate yourself, and now your robot can't handle it…
You can do pottery in VR if you want.
That plan sounds viable until you consider how much noncompliant legacy hardware is out there that can't all be replaced but must be repaired, like cars, roofs, appliances, HVAC, electrical, plumbing, etc. If robots can't accommodate the huge fraction of old infrastructure already in place, they'll have limited value indeed -- basically just working on assembly lines in factories.
In New Zealand there are rules, and more extensive guidelines, on "disabled access". (Probably called something different now)
But it means that access to publicly accessible places is possible for a wide variety of disabilities
I wonder if that would help robotic access? For example you do not need to grip and turn a knob to open a door, they should all be levers.
> Maybe robot companies will even sell those themselves.
They will sell them so only their robots can use them.
The article doesn't miss it -- this is exactly the point. We can and will rearrange our environment to support robots, but that means they will never learn dexterity.
That will fail rather quickly if the robot is trying to repair something dirty.
Fully autonomous vehicles will never reach maximum reliability, speed, and efficiency either until we eliminate human drivers, pedestrians, stoplights, buildings, and pave the entire surface of the planet.
And for maximum safety, we'll need physical barriers that prevent the vehicles from leaving their designated path. The easiest way to do this, in my opinion, would be to put a special type of wheel on the vehicle. The wheels would have a flange on the perimeter, and the road surface could have a groove that the flange fits into, thus preventing the vehicle from veering off outside its prescribed lane. This would actually provide such tight control on the vehicles' lateral movement, that it would become possible to connect several vehicles front to back, in a sort of autonomous convoy, which is pretty cool IMO!
Yeah. Also, please refrain from reaching maximum speed and efficiency of the transportation system alone.
That's not an important goal. The important goal is to optimize the life of the people that use the lines, not artificial measures taken from just looking at the machines running in them.
Wait, what if we put them on tracks and they had predetermined stations they stopped at?
I'd be willing to wear clothes that have ultraviolet stripes and QR codes on them if a laundry robot can fold them for me.
Yep, this is what I reckon will be tried too. But then making that robot environment safe for humans to be in will be quite a tough problem.
I imagine things will just be very unified and boring because the same shapes will be recognizable everywhere. But it would be the same things we already have. Just make the robot weak and light enough to not even be able to harm someone. Lighting the kitchen on fire is always a risk though, I guess.
Once the humans in question have paid off the 7 year loan to remodel their kitchen, the real question is what further value do they add to the proposition?
I wasn't thinking Kitchen so much as Data Centre, a lot of which are already unfriendly to human life; adding 300 kg robots and parts designed for robot interaction is just going to make them even more so.
> But what the article misses: We can just rearrange our environment...
The article does discuss this...
What boosters of humanoid robots specifically do not want to do.
I have thought up a HOLY GRAIL test for physical AI: "Open a door given a keyring". It involves vision, dexterity, understanding of physical constraints, etc. I find it insane that our hands do this so casually.
https://generalrobots.substack.com/p/benjies-humanoid-olympi... "Use a key", about halfway down
Can you explain in more detail (doors vary massively, as do keyrings)
While I have no opinion on the "holy grail" part, I think them varying massively is the point.
A lot of locks require a bit of finesse as well, like pulling the door towards you or pulling the key out just the right amount, which would be an interesting challenge. Especially if the technique isn't known ahead of time. Given enough time (and frustration), people can generally figure the "finesse" aspect out for a lock.
I've been outsmarted by a door more than once in my life.
We actually have notes taped on two of our doors, with instructions of how to get the locks to line up depending on the season. Another door requires a hard shove during the summer, and a slight pull back during the winter. Someday we'll replace that door with a metal door and get it framed nicely. But we've been saying that for 12 years!
> But we've been saying that for 12 years!
Welcome, fellow traveler!
Quite similar to the "coffee test" of Steve Wozniak.
I have to rattle the key around a bit in the lock to get the lock to turn. Good luck designing a robot to figure out it needs to rattle the key in a certain way. Or to realize that the door needs to be pulled or pushed a bit while turning the key, so the bolt doesn't jam in the door jamb.
One of my doors needs to be pulled upwards in order to open/close it. (Due to slowly pulling the doorframe out of alignment over time.)
Both those tasks require very fine force-feedback perception, instantly mapped to multiple possible scenarios that could be happening inside those metal parts. E.g. does it feel like I have crossed the threads, or is there a bit of rust? Let me twist just a bit more, but gently, to find out.
We said the same thing about language, to be honest: the nuances of words and concepts are too hard for a word generator to correctly put together, and now we have LLMs. We said the same thing about video generation, where the nuances of light and shadow and micro-expressions would be hard to replicate, and generative models are doing a pretty good job with that. We're just waiting for physical LLMs; it will happen at some point.
As another poster pointed out, nobody has been able to remotely duplicate the sensors in our fingers.
I don’t think it needs to be all that complicated for the vast majority of things. It might not be able to twirl a pen around its fingers but it should be able to hold one and write something.
Yeah, tactile stuff could be described as a dark data stream that only animals have access to. I'm talking about nerve endings. You can't get that wealth of data from any mix of sensors. Lidar and accelerometers can't tell cold from hot, lumpy from smooth.
Meanwhile, I watched a cat today jump off a 4 meter high trellis and onto the top of a 2 meter high fence rail no wider than my hand, and I thought, how can we not marvel at something that's obviously so much more advanced at navigating its environment than we are?
> lidar and accelerometers can't tell cold from hot, lumpy from smooth.
Not those. There are other sensors. Tactile sensors exist. Lumpy and smooth can be distinguished. They are still rudimentary, but there is nothing fundamental blocking progress here. Roughness, squishiness, temperature, all well measurable.
Brooks describes how speech is preprocessed by chopping it up into short time segments and converting the segments to the frequency domain. He then bemoans the fact that there's no similar preprocessing for touch data. OK.
But then he goes on to vision, where the form that goes into vision processing today is an array of pixels. That's not much preprocessing. That's pretty much what existed at the image sensor. Older approaches to vision processing had feature extractors, with various human-defined feature sets. That was a dead end. Today's neural nets find their own features to extract.
Touch sensing suffers from sensor problems. A few high-detail skin-like sensors have been built. Ruggedness and wear are a big problem.
Consider, though, a rigid tool such as an end wrench. Humans can feel out the position of a bolt with an end wrench, get the wrench around the bolt, and apply pressure to tighten or loosen a nut. Yet the total information available is position plus six degrees of freedom of force. If the business end of your tool is rigid, the amount of info you can get from it is quite limited. That doesn't mean you can't get a lot done. (I fooled around with this idea pre-LLM era, but didn't get very far.) That's at least a way to get warmed up on the problem.
Here's a video of a surgeon practicing by folding paper cranes with small surgical tools.[1] These are rigid tools, so the amount of touch information available is limited. That's a good problem to work on.
[1] https://www.youtube.com/watch?v=5q-HHoqzQi0
As you tighten a bolt, the angle at which you need to apply force changes. So it's not just a fixed position plus force in 6 directions; it's force in 6 directions at each position. You can learn quite a bit about an object from such interactions, such as its weight, center of mass, etc.
Further robots generally have more than a single rigid manipulator.
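To make that concrete: under a static-grasp assumption, a single wrist force/torque reading already gives you the object's mass and part of its center-of-mass offset; re-orienting and reading again pins down the rest. A hedged sketch with made-up numbers:

    import numpy as np

    # Static grasp: F = m * g_vec, tau = r x F, with r the center of mass
    # in the sensor frame. All readings below are invented for illustration.
    g = 9.81
    F = np.array([0.0, 0.0, -14.7])      # N, measured force
    tau = np.array([0.735, -0.49, 0.0])  # N*m, measured torque

    mass = np.linalg.norm(F) / g         # -> 1.5 kg
    # Only the component of r perpendicular to F is observable from one pose:
    r_perp = np.cross(F, tau) / np.dot(F, F)
    print(mass, r_perp)                  # ~1.5 kg, ~(-0.03, -0.05, 0.0) m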
Yes, it's time-varying data, but there are not that many channels. And the sensors are off the shelf items, although overpriced.
Yeah, humans can teleop some pretty complex operations even through cheap robot arms!
> That's a good problem to work on.
Not sure which lab (I think google?) it was, but there was a recent demo of a ML-model driven robot that folded paper in that style as one of the tasks.
That's a bit like saying speech recognition can be solved with ML and an air pressure sensor.
> Artificial Intelligence researchers have been trying to get [X] to [Y] for over 65 years
For 10,000 different problems. A great many of which have been solved in recent years.
Robotics is improving at a very fast clip, relative to most tech. I am unaware of any barrier, or any reason to infer there is one, for dextrous robots.
I think the primary difference between AI software models and services, and robotic AI, is economics.
The cost per task for AI software is .... very small. And the cost per task for a robot with AI is ... many orders of magnitude over that.
The marginal costs of serving one more customer are completely incomparable.
It's just a push of a button to replace the "fleet" of chatbots a million customers are using. Something unthinkable in the hardware world.
The seemingly lower level of effort and progress is because hardware that could operate in our real world with the same dexterity that ChatGPT/Claude can converse online, will be extremely expensive at first.
Robotics companies are not just focused on dexterity. They are focused on improvements to dexterity that stay within a very tight economic envelope. Inexpensive dexterity is going to take a while.
One very important task to solve is the ability to select a box from a shelf and set it neatly on a pallet, as well as the reverse. People have been working very hard on this problem for a long time, there are impressive demos out there, yet still nobody is ready to set their best box manipulating robots loose in a real warehouse environment.
How hard can it be to consistently pick up boxes and set them down again in a different location? Pretty hard, apparently.
I mean, with rigid plastic containers robots are 'pretty consistent' at it now.
The problem with things like cardboard boxes, especially at any size, is internal weight distribution and deformation of the box. If you take someone who is pretty new to stacking boxes at a warehouse and give them sloppy boxes (ones that bend or otherwise shift), they are going to be pretty slow at it for the first hour or so; then they'll internalize the play in the materials and start speeding up considerably while getting a nice result.
It's pretty amazing how evolution has optimized us for feedback sensing like this.
> I am unaware of any barrier, or any reason to infer there is one, for dextrous robots.
I don't think there's a fundamental barrier to building a humanoid robot but the cost will be an extremely high barrier to adoption.
A human is nature's ultimate robot: hundreds of servos, millions of sensors, self-assembling from a bag of rice, self-repairing for minor damage. You just can't beat that, not for a very long time.
Economy of scale. Musk in his vaporware style said a humanoid robot will cost 10k usd.
> I am unaware of any barrier, or any reason to infer there is one, for dextrous robots.
Pretraining data?
So much easier in many ways.
You can train in stages.
First stage, either digitally generate (synthetic) basic movements, or record basic movements from a human model. The former is probably better and can generate endless variation.
But the model is only trying to control joint angles, positions, etc., with no worries about controlling power. The simulated system has no complications like friction.
Then you train with friction, joint viscosity, power deviating from demand based on up/down times, fade, etc.
Then train in a complex simulated environment.
Then train for control.
Etc.
The point being, robotic control can easily be broken down into small steps of capability (see the sketch below).
That massively improves training speed and efficiency, potentially even allowing smaller models.
It is also a far simpler task, by many orders of magnitude, than learning the corpus of the written internet.
Comparable to that would be training an AI to operate any land, sea or air device, which nobody today is trying (AFAIK).
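A minimal sketch of what such a staged curriculum could look like; the stage parameters and the trivial train() stub are hypothetical stand-ins, not any real training pipeline:

    STAGES = [
        dict(name="kinematics only", friction=0.0, actuator_noise=0.0),
        dict(name="plus friction and viscosity", friction=0.8, actuator_noise=0.0),
        dict(name="plus actuator imperfections", friction=0.8, actuator_noise=0.05),
        dict(name="complex simulated environment", friction=0.8, actuator_noise=0.05),
    ]

    def train(policy, friction, actuator_noise):
        # Stand-in: a real version would run RL or imitation in simulation.
        policy.append(f"trained @ friction={friction}, noise={actuator_noise}")
        return policy

    policy = []  # each stage warm-starts from the previous one
    for stage in STAGES:
        policy = train(policy, stage["friction"], stage["actuator_noise"])
        print("finished:", stage["name"])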
It's so easy! I hope all the leading robotics researchers come to find this comment and finally deliver us the dexterous humanoid robots we've all been waiting for
Well, in fairness, the kind of deep neural architectures needed to do this stuff have only been available for a relatively short period. The robotics researchers in my institution are basically racing to put all this new capability to work.
Eg: https://hub.jhu.edu/2025/07/09/robot-performs-first-realisti...
Synthetic data works better for robots since you can generate endless scenarios based on real physical laws.
Do you know of success stories here? Success of transferring models learned in physics simulation to the real world.
When we (ZenRobotics) tried this 15 years ago a big problem was the creation of sufficiently high-fidelity simulated worlds. Gathering statistics and modelling the geometry, brittleness, flexibility, surface texture, friction, variable density etc of a sufficiently large variety of objects was harder than gathering data from the real world.
We have massively better physics simulations today than 15 years ago, so the limitations you found back then don't apply today. It might still not be enough, but 15 years is such a long time with Moore's law and we already know all the physics so we just needed more computation to do what is needed.
Example of modern physics simulation: https://www.youtube.com/watch?v=7NF3CdXkm68
Google has done training in simulation: https://x.company/projects/everyday-robots/#:~:text=other%20...
I believe this is the most popular tool now: https://github.com/google-deepmind/mujoco
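For the curious, the core simulation loop in MuJoCo really is small. A toy sketch stepping a one-hinge pendulum (geometry and timings arbitrary):

    import mujoco  # pip install mujoco

    XML = """
    <mujoco>
      <worldbody>
        <body>
          <joint name="hinge" type="hinge" axis="0 1 0"/>
          <geom type="capsule" fromto="0 0 0  0 0 -0.3" size="0.02" mass="1"/>
        </body>
      </worldbody>
    </mujoco>
    """

    model = mujoco.MjModel.from_xml_string(XML)
    data = mujoco.MjData(model)
    data.qpos[0] = 1.0                 # start tilted ~57 degrees
    for _ in range(1000):              # default timestep is 2 ms -> 2 s total
        mujoco.mj_step(model, data)
    print(data.qpos[0])                # joint angle after the rollout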
Thanks for the links.
AFAICT these have not resulted in any shipping products.
I like that this has serious arguments and not "humans are magical" stuff which one sees in discussions of possible limitations of AI reasoning. (The author specifies that it's about humanoids of today and not humanoids in principle.)
This in particular shocked me a bit:
> No sense of touch. Human hands are packed absolutely full of sensors. Getting anywhere near that kind of sensing out of robot hands and usable by a human puppeteer is not currently possible.
This stands in contrast with e.g. self-driving cars, which are already superhuman at the sensor level.
Touch sensors are a thing. A spin-off[1] of a company I used to work for uses pressure sensors embedded in rubber to get a pretty robust and sensitive sense of touch, though the granularity is low and it doesn't do lateral force.
[1] https://www.takktile.com/
Yes, quickly skimming your linked site, I saw pressure, force, and vibration sensors, but, as you say, no mention of lateral force, nor of temperature.
Thinking briefly about the sensations used in manipulating tools or checking a surface, both lateral force and temperature are hugely important.
And, of course the insane density of sensors built into human fingers relative to anything built by humans.
It has a looong way to go
I don't know if it's playing sports my whole life and watching a lot of nature documentaries, but it really seems to me that certain types who spend 95% of their waking hours sitting at a desk don't appreciate just how capable, and frankly maybe even "magical," animal bodies really are. Someone is in these comments talking about how their muscle car can accelerate faster than a cheetah. Sure, but can it turn like one, jump onto and off of rocks, stop on a dime, time its acceleration to match the exact moment it needs to pounce on prey to knock it off its own feet while avoiding getting crushed or kicked itself?
Humans have swum across the English Channel, have swum from Cuba to Florida, from Alaska to Russia. Humans have run across the Sahara desert. We've scaled 2000-foot vertical rock walls with our hands. We've walked to the summit of every 8k mountain on the planet. We can go places like the top of Everest, the middle of Death Valley in the summer, deep wilderness in thick forest or jungle, places that are so dangerous in part because rescue vehicles like ATVs, snowmobiles, and helicopters can't get there, but humans on foot can. And yeah, we can also thread nuts onto bolts, handle locked doors, do tre flips and impossibles on skateboards.
Ultimately, we and all other animals are still just machines, and there is no reason in principle that machines built from engineered plans rather than grown from evolved ones can't do all of the exact same things and even more, but it's a harder problem than many seem to appreciate. Even when I was in a tank brigade in the late 2000s, we always went to Afghanistan as infantry, because our trucks and tanks were completely useless in the rocky, steep mountains; but there is no terrain anywhere that a sufficiently well-trained human can't traverse.
I feel like the word ‘robots’ is a fairly serious omission from the title…
My first though at the title was "But hominids are already quite dexterous."
I don't feel that scarecrow or sex doll dexterity is a major issue
At least for me it was obvious what humanoids mean, but I have worked in the field.
I really understood that to mean "modern humans", and expected some archeology content.
Well, I mean, that’s what it means in one field.
But which humanoids actually exist in reality that lack the dexterity?
I'm a bit surprised that Brooks is focusing on sensory and electro-mechanical issues as being what's holding back humanoid robotics...
Just being the shape of a human means nothing if it doesn't also have the brain of a human, which is needed if it's meant to be general purpose, or at least adaptable to a useful variety of tasks.
How exactly are these robots meant to be trained for new tasks (or just same task, different factory, or different work station)? It seems that to be useful they'd need to be able to learn on the job. Even if they've been pre-trained in simulation to do a wide variety of things, and take natural language or imitation instruction, they are still going to need to bridge the sim-2-real gap, and book smart to proficient gap, to be able to perform, and that'd at least require runtime learning.
Tesla/Musk seem to think that they're already in the robotics business with self-driving cars trained in simulation, but a car only has two degrees of freedom (speed and steering angle), and only performs one task - driving. A general-purpose humanoid robot is a whole other level of complexity.
>How exactly are these robots meant to be trained for new tasks (or just same task, different factory, or different work station)? It seems that to be useful they'd need to be able to learn on the job.
The plan for raw intelligence is to push transformers as far as we can. What's the full extent of ICL for robotics? We don't know the answer to that yet.
This was released just a few days ago, https://www.skild.ai/blogs/omni-bodied
Very interesting point that while we've figured out how to digitize images, text and sounds we haven't digitized touch. At best we can describe in words what a touch sensation was like. Smell is in a similar situation. We haven't digitized it at all.
Touch is a 2D field of 3D vectors. Easily stored and transmitted as images, and easily processed by neural nets. You could add temperature and pain/damage channels if you want, though they don't seem essential for most manipulation tasks. (Actually I don't believe touch is as essential as he argues anyway. Of course someone who learned a task with touch will struggle without it, but they can still do it and would quickly change strategies and improve.)
The problem with touch is making sensors that are cheap and durable and light and thin and repairable and sensitive and shape-conforming. Representation is trivial in comparison.
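The representation point in code: a tactile frame as an H x W grid of 3-vectors (normal force plus two shear components) is just a 3-channel image that any conv net can consume. A minimal sketch with made-up sizes:

    import numpy as np

    H, W = 16, 16  # taxel grid resolution (assumed)
    frame = np.zeros((H, W, 3), dtype=np.float32)  # (shear_x, shear_y, normal)
    frame[7:9, 7:9] = [0.0, 0.2, 1.5]  # small contact: slight shear, firm press

    normal_total = frame[..., 2].sum()             # net pressing force
    shear_total = frame[..., :2].sum(axis=(0, 1))  # net slip direction
    print(normal_total, shear_total)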
This, it's a transduction problem (it's difficult to sense and even more difficult to output), not a representation problem.
A person who can't feel anything would struggle to reach around an obstacle, find a bolt they can't actually see, and thread a nut onto it.
I've done that and similar things many times. Touch is important. It may not be essential for all tasks but it is for some. Maybe even many.
There actually has been some recent work on digitizing smell, most notably Osmo, which was founded by some ex-Google ML researchers: https://www.salon.com/2025/01/05/digital-smell-has-arrived-a...
I'm not sure describing it in words is very helpful, and there's probably a good amount of such data available already.
I would think the way to do it is build the touch sensors first (and it seems they're getting pretty close) then just tele-operate some robots and collect a ton of data. Either that, or put gloves on humans that can record. Pay people to live their normal lives but with the gloves on.
I think he meant to write "Prologue" instead of "PROLOG".
I spent a few minutes excitedly trying to figure out how one of my favourite declarative programming languages was used to solve modern robotic sensing problems, only to realise it was probably just a misspelling ... :(
Q: How many Prolog programmers does it take to change a lightbulb?
A: False.
Yeah, he also immediately starts talking about GOFAI, to further confuse the Enemy...
If you asked someone 300 years ago what an automated dishwashing machine would've looked like, it would be a lot more like a person than the wet cupboard we have now. I'm assuming many tasks will be like that -- it's more a lack of imagination when we say we need a humanoid robot to solve a task. I'm assuming it'll be the minority of tasks where a humanoid form actually makes sense.
A robot that isn't stationary, in a home or in a factory, wants legs. Wheels are fine for cars but not great for stepping over things (like on cluttered floors) and stairs. So legs, assuming we've got the compute and algorithms to get them to work well, only make sense. The rest allows for application of creativity.

As a human, I have a head; my brain is in it, as are my eyes. A humanoid robot doesn't need a head, and can have cameras in its chest and on its back, and then also have its brain in the chest. Depending on what's useful, it doesn't need to be limited to two arms. It could have one centrally mounted in its chest, with two cheaper ones on both sides. Or four, two on each side. I've wished for three hands before.

The problem though is that they look weird. Any non-traditional design is going to fall into the uncanny valley, so that no matter how much better your non-traditionally armed robot is technically, it's just not gonna sell to the mass market. We only have to look at weird cars/vehicles, which have a history of being boondoggles. So it's not a failure of imagination, and more a matter of practicality.
This needs to be some sort of maxim: “The most useful robots are the ones that don’t look like (or try to be) humanoids.”
For some reason (judging by Fritz Lang, Gundam, etc.) humanity has some deep desire or curiosity for robots to look like humans. I wonder if cats want robot cats?
> humanity has some deep desire or curiosity for robots to look like humans.
I don't think you can draw that conclusion. Most people find humanoid robots creepy. I think we have a desire for "Universal Robotics". As awesome as my dishwasher is, it's disappointing that it's nearly useless for any other task. Yeah, it washes my dishes, but it doesn't wash my clothes, or put away the dishes. Our desire for a humanoid robot, I think, largely grows out of our desire for having a single machine capable of doing anything.
The vast majority of “universal robots” are portrayed as humanoids in science fiction. Perhaps part of the reason is that true “universality” includes socializing, companionship, human emotions, and of course love.
Or, alternatively, general-purpose robots tend to be human-shaped because the world as it already is has already been fully designed for humans. Single doors are tall because that's the size of a human. Tools are designed to be held in something like a hand, sometimes two of them. Stairs are designed to be walked on, and basically any other traversal method just falls apart.
Of course, there is also the thing where authors and artists tend to draw anything with human intelligence as humans, from robots to aliens. Maybe it's the social reason you mention, or they just unconsciously have assumed humans to be the greatest design to ever exist. But even despite this, in a human world, I expect the first true general-purpose robots to be "standing" upright, with one or several arm-like limbs.
If you want to make a more general-purpose robot, then approximating a human form is rational, because our spaces and systems are designed for human interaction. At the moment, though, no-one has really succeeded at that, and all the successful robots are much more specialised.
This.
One robot that rules them all is preferable from many perspectives, but we're simply not there yet.
By that definition we have massive numbers of those robots already. But that brings up the sorites paradox of when a machine becomes a robot.
If you can create a human you become god in a way.
Also it would just be compatible with our current world.
You _can_ create a human. Or at least participate in its creation.
A non-humanoid robot is called a *machine*.
We have lots of those.
Right but that's very task specific, and what many people want is a single robot which can do many different tasks, and do so without modifying the existing environment. I would love a robot which could cook and clean and do laundry (including folding) but I still need to live in the same space it would use. The most obvious way to do that is a humanoid robot, which is why nanny companies are working on it, and here he's arguing that's not going to work.
The other obvious way to do it is to centralize it, have a lift in your house that brings up meals and clean laundry to order, where you can put your dirty dishes in when you're done, and a central space where staff and robots take care of things.
I'm actually surprised or interested that this isn't more of a thing, it doesn't take any high tech either. I suppose people like having their own stuff, or people can't be trusted, or it's prohibitively expensive to outsource food / laundry (even if especially in the US ordering food or eating out is very common).
> Before too long (and we already start to see this) humanoid robots will get wheels for feet, at first two, and later maybe more, with nothing that any longer really resembles human legs in gross form. But they will still be called humanoid robots.
Totally agree. Wheels are cheaper, more durable and more effective than legs.
Humans would have wheels if there were an evolutionary pathway to wheels.
The world is full of curbs, stairs, lips, rugs, vehicles, etc. If you're a human-scale robot, your wheels need a really wide base to not tip over all the time, so you are extremely awkward in any kind of moderately constrained space. I wouldn't exchange my legs for wheels. Wheelchair users have to fight all the time for oversights to be corrected. I can maybe see a wheel-based humanoid robot, but only as a compromise.
On the other hand, there is not much reason to constrain ourselves to the unstable and tricky bipedal platform, or to insist on having a really top-heavy human-like torso. You could have 3-4 little legs on a dog-scale body with several extra-long upward-reaching arms, for example.
> if there was an evolution pathway to wheels
It's hard to see one. Even taking a nice flat world with ample incentive and good "bearings" for granted, how can you evolve a wheel-organ that maintains a biological connection while also being able to rotate an indefinite number of times?
A few difficult and grotesque endpoints:
* The wheel only rotates a fixed number of times before the creature must pivot and "unwind" in the opposite direction. This one seems most plausible, but it's not a real wheel.
* The main body builds replacement wheels internally (like tooth enamel) and periodically ejects a "dead" wheel which can be placed onto a spoke. This option would make it easier to generate very tough rim materials though.
* A biological quick-release/quick-connect system, where the wheel-organ disconnects to move, but then reconnects to flush waste and get more nutrients.
* A communal organism, where wheel-creatures are alive and semi-autonomous, with their own way to acquire nutrients. Perhaps they would, er... suckle. Eeugh.
In one of Philip Pullman's His Dark Materials novels there is a race of creatures that have a symbiosis with a tree whose huge, perfectly round nut can be grasped by their fore and hind limbs, and they roll around that way.
One of the Animorphs spin-offs had them too; from distant memory, they were meant to be genetically engineered or something.
There are lizards or beetles that tumble down sand dunes.
Maybe cartwheeling humans could lead to some adaptation where the whole body becomes the wheel.
Wheels are not balls. Balls are common in nature. Wheels are not. The difference is that wheels need roads, which are not common nature and a large scale artificial objects.
That makes sense. Even in the Pullman book there were natural roadways for rolling.
Wheels are great until the robot encounters uneven surfaces, such as a stairway, or a curb. So some kind of stepping functionality would still be necessary.
What would be the evolutionary pressure to grow wheels? They are useless without roads
Roads, i think you answered your own question, would be an evolutionary pressure, hypothetically speaking.
I sometimes imagine wheeled creatures evolving in a location with a big flat hard surface like the Utah salt flats or the Nazca desert, but I guess there's not much reward for being able to roll around since those places are empty as well as flat. Tumbleweed found some success that way though, maybe?
The golden wheel spider lives in the sand dunes of the Namib Desert. When confronted by a spider-hunting wasp, it can perform a "cartwheeling" maneuver to escape. By tucking in its legs and turning onto its side, it can roll down a sand dune.
Humans can also do this; sometimes they use tools like a tractor wheel to convert themselves into downhill wheels, often to hilarious effect.
Are there any biological examples of freely rotating power systems? We have nice rotating joints with muscles to provide power, but I can't think of any joint that would allow the sort of free rotation, while also producing torque, that a wheeled animal would require.
Some microorganisms have flagella that rotate like a propeller, with complex molecular structures providing a rotor effect.
Something internal to some shellfish, I believe, a kind of mixing rod that rotates. Hold on, I'll check if it's powered. (Also rotifers but they're tiny.)
Hmm, no, it sounds like it's externally powered:
> The style consists of a transparent glycoprotein rod which is continuously formed in a cilia-lined sac and extends into the stomach. The cilia rotate the rod, so that it becomes wrapped in strands of mucus.
https://en.wikipedia.org/wiki/Rotating_locomotion_in_living_...
Or maybe the cilia ( = wiggly hairs) could be seen as a kind of motor. Depends how you count it and exactly what the set-up is, I can't tell from this.
I think I would count internal power created by the rotating component itself. I hadn't thought of that possibility, since human-made machinery usually has the power-producing component located in the main body, and transferring that power to a freely rotating component is quite hard. Biological systems wouldn't necessarily look like that, and could feasibly be powered by the wheels themselves deforming, as if the wheels were a separate, but connected, biological system.
That's quite interesting.
> Wheels are [...] more effective than legs.
Maybe in your living room. But step into a dense forest (which is what we are made for) and that statement will be far away from reality.
Power to wheel sensors are the paws. Where is the CNS/brain going?
In any case, it seems like a "simple" problem to solve. An accelerometer chip costs a few cents and the data rates can be handled by a very light wiring harness, ex I2C.
So embedding such a sensor in every rigid component, wiring a single data line to all of them (using the chassis as electrical ground) and feeding the data back to the model seems a trivial way to work around this problem without any kind of real pressure sensitivity. The model knows the inputs it gives to the actuators/servos, so it will quickly learn to predict the free mechanical behavior of the body, and use any deviation to derive data equivalent to pressure and force feedback.
Another possible source of data is the driving current of the motors/actuators which is proportional to the mechanical resistance the limb encounters. All sorts of garbage sources of data that were almost useless noise in the classical approach become valuable with a model large enough.
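A hedged sketch of that residual idea; predict_free_accel is a made-up stand-in for a model that has learned the robot's unloaded dynamics:

    import numpy as np

    def predict_free_accel(commands):
        # Stand-in for a learned model of the limb's free (no-contact) motion.
        return 0.8 * commands

    commands = np.array([0.1, 0.4, -0.2])    # actuator inputs this tick
    measured = np.array([0.08, 0.1, -0.17])  # accelerometer readings this tick

    residual = measured - predict_free_accel(commands)
    if np.linalg.norm(residual) > 0.1:       # threshold is assumed
        print("unexpected resistance -- probably contact:", residual)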
> they will likely have to collect the both the right data, and learn the right thing.
The "bitter lesson" says to stop trying to find simple rules for how to do things - stop trying to understand - and instead to use massive data and massive search to deal with all the incredibly fussy and intractable details magically.
But the article here is saying that the lesson is false at its root, because in fact lots of understanding is applied at the point of choosing and sanitising the data. So just throwing noise at the model won't do.
This doesn't seem to match experience, where information can be gleaned from noise and "garbage sources of data ... become valuable with a model large enough", but maybe there's something illusory about that experience, IDK.
Natural language wasn't solved by brute force until we started using trillion parameter models and using the whole internet, every book and every article ever published as training data.
I don't know of anyone spending tens of billions on this problem like Microsoft did for OpenAI. First you'd have to build up a dataset of trillions of token equivalents for motion. What that looks like alone is largely guess work. Then you'll need to build a super computer to scale up the current sota motion model to 100 times the size of the biggest model today. Then you'll have to pretrain and finetune the models.
If, after all that, dexterity still isn't solved, all we can say is that we need more data and bigger models.
People seriously don't understand how big big data for AI is and what a moonshot GPT3 and 4 were.
Tesla's approach is "start with motion captured data, move on to first person view video demonstrations, then move on to any video demonstrations - i.e. feed YouTube into the system and hope it learns something from that".
And that led to a car that kills people.
While the companies that have cars moving around safely all used a very diverse mix of human-built and ML-created models. Flying completely in the face of the Bitter Lesson.
Isn’t one of the main functions of the brain / nervous system to “filter” noisy sensory input data to provide a coherent “signal” to perception? Perhaps a smaller or more specialized model could do that if we ended up packing the “skin” of the humanoid with various sensors.
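Even something dumb could be that stage; as a toy illustration (everything here is made up), a per-sensor exponential moving average already turns a noisy contact trace into a usable signal before any big model sees it:

    import numpy as np

    def ema_filter(raw, alpha=0.2):
        # Exponential moving average: a minimal "signal from noise" stage
        # that could run per skin sensor ahead of the perception model.
        out = np.empty(len(raw))
        out[0] = raw[0]
        for t in range(1, len(raw)):
            out[t] = alpha * raw[t] + (1 - alpha) * out[t - 1]
        return out

    rng = np.random.default_rng(0)
    contact = np.concatenate([np.zeros(50), np.ones(50)])   # touch begins at t=50
    smoothed = ema_filter(contact + 0.3 * rng.standard_normal(100))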
In theory yes, a small(er) model to do almost anything exists in the problem space of all models.
The problem is getting to that model state. Evolution found these models by running a ridiculously huge number of experiments, with the cutoff function being "can it breed before it dies, on a limited number of calories".
At least at this point it doesn't seem likely we can find a shortcut beyond that necessary computation. Evolution did it with time and parallelism. We do it differently with scale and rapid energy usage.
> Another possible source of data is the driving current of the motors/actuators which is proportional to the mechanical resistance the limb encounters.
The problem is precisely the actuators. A lot of human muscles actually come in pairs - agonist and antagonist muscles [1] - and it's hard to match the way human muscles work, at their relatively tiny size, with a non-biological actuator.
Just take your elbow and angle it to 90 degrees, then rapidly close it so your upper and lower arm are (almost) in parallel. An absolutely easy, trivial task to do for your pair of muscles controlling the tendons. But now, try to replicate even this small feat in a motor based actuator. You either use some worm gear to prevent the limb from going in the wrong direction but lose speed, or you use some sort of stepper motor that's very hard to control and takes up a lot of space.
[1] https://en.wikipedia.org/wiki/Anatomical_terms_of_muscle
> Just take your elbow and angle it to 90 degrees, then rapidly close it so your upper and lower arm are (almost) in parallel.
That's trivial with modern flat motors and position feedback. In fact, motors can do it faster and with more precision than we can.
The only reason it was ever hard was because motors didn't have a lot of torque/volume.
The reason our muscles come in pairs is because they can only really apply force in one direction. Motors don't have this limitation, and don't need to be paired.
Anyway, motors still don't have enough torque density for making fine manipulators, and the lack of sensory data will still stop you from interacting well with the outside world.
From the article: a human hand has about 17,000 low-threshold mechanoreceptors in the glabrous skin (where hair doesn’t grow) of the hand, with about 1,000 of them right at the tip of each finger, but with much lower density over the rest of each finger and over the palm. These receptors come in four varieties (slow vs fast adapting, and a very localized area of sensitivity vs a much larger area) and fire when they sense pressure applied or released.
Where can you buy the artificial equivalent?
Naturalistic fallacies will only carry you so far. For example, my 12 year old car has none of the incredibly adapted limbs and muscles of a cheetah, but can still easily exceed the animal land speed.
The article makes a compelling case that a certain kind of sensory input and learning is necessary to crack robotic movement in general. It remains to be seen whether a sensor array as fine as the human hand's is useful outside very specific use cases. A robot that can stock shelves reliably would still be immensely useful and very generalizable, even if it can't thread a needle due to limited fine sensory abilities.
You are moving the goalpost.
Title of the article you're commenting: Why Today’s Humanoids Won’t Learn Dexterity
Thesis the article is contradicting: The idea is that humanoid robots will share the same body plan as humans, and will work like humans in our built for human environment. This belief requires that instead of building different special purpose robots we will have humanoid robots that do everything humans can do.
You are now arguing that a specialized robot lacking dexterity would still be immensely useful. Nobody is disputing that. It's just not what the article is about.
Not sure if I buy the argument that touch sensitivity is a prerequisite for dexterity.
I can put on a thick glove (losing touch and pressure sensitivity altogether) and grab a fragile glass without breaking it.
Because you have learnt it already and you can make predictions. And you don't lose pressure sensitivity: you still feel the pressure of your hand against the glove. A better example would be using an exoskeleton or robotic arm, or deactivating certain nerves. Still, you risk breaking it more, imo, and you have to be more careful in the beginning until you learn again.
Edit: and you probably are not gonna be as fast doing it
You don't lose pressure or touch sensitivity from wearing even thick welding gloves. You can still feel how hard you are gripping the rod quite easily.
The same is true for a motor controller.
Depends heavily on the use case. Indeed many tasks humans carry out are done without touch feedback - but many also require it.
An example of feed-forward manipulation is lifting a medium-sized object. The classic example is lifting a coffee cup: if you misjudge a full cup for an empty one, you may spill the contents before your brain manages to replan the action based on sensory input. That feedback loop takes around 300 ms, and we do many things faster than that would allow (see the toy sketch below).
The linked article has a great example of a task where a human needs feedback control: picking up and lighting a match.
Sibling comments also make a good point that touch may well be necessary to learn the task. Babies do a lot of trial-and-error manipulation, and even adults do new tasks slower at first.
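A toy simulation of the coffee-cup case, with all numbers illustrative: the arm applies a feed-forward force sized for the expected full cup, and the sensed correction only arrives after ~300 ms, so the first 30 steps of a 100 Hz loop overshoot.

    G = 9.81
    DELAY = 30                  # ~300 ms of sensory latency at a 100 Hz loop rate
    expected_mass, actual_mass = 0.40, 0.15   # judged full, actually near-empty (kg)

    forces, adjustment = [], 0.0
    for step in range(100):
        applied = expected_mass * G + adjustment   # feed-forward plan + correction
        forces.append(applied)
        if step >= DELAY:
            # The sensed load from 300 ms ago finally arrives; replan crudely.
            adjustment += 0.2 * (actual_mass * G - forces[step - DELAY])
    # forces[:30] are all sized for the full cup -- the "cup flies up" effect.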
The industry's approach to "trial and error to learn the task" is to have warehouses of robots perform various tasks until they get good at them. I imagine that you'd rely on warehouses less once you have a real fleet of robots performing real tasks in real world environments (and, at first, failing in many dumb and amusing ways).
Robots can also react much faster than 300ms. Sure, that massive transformer you put in charge of high level planning and reasoning probably isn't going to run at 200 tokens a second. But a dozen smaller control-oriented networks that are directly in charge of executing the planned motions can clock at 200 Hz or more. They can adjust fast if motor controllers, which know the position and current draw of any given motor at any given time, report data that indicates the grip is slipping.
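A sketch of that two-rate split; every API here (planner, servo bus, slip heuristic) is a hypothetical stand-in:

    import time

    def control_loop(planner, servos, hz=200):
        # Slow path: a big model emits motion goals a few times per second.
        goal = planner.next_goal()
        period = 1.0 / hz
        while True:
            t0 = time.monotonic()
            state = servos.read_state()    # positions + current draw, per motor
            if state.current_spike():      # crude slip/contact heuristic
                servos.tighten_grip()      # fast reflex, no planner involved
            servos.track(goal, state)      # small policy net or PID step
            if planner.has_new_goal():
                goal = planner.next_goal()
            time.sleep(max(0.0, period - (time.monotonic() - t0)))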
Only because of your training otherwise.
This is a good point, but I’m not convinced it negates the author’s argument.
Consider whether you could pick up that same fragile glass with your eyes closed. I'd wager you could, as you'd still receive (diminished) tactile feedback despite the thick gloves.
What about full control of a claw machine to pick up said glass?
The scientists at Oak Ridge National Labs develop a lot of dexterity working with robotic manipulators in the radioactive hot cells there:
https://youtu.be/B-Lj7xAXJpc
If I can see said glass, absolutely. I can even do it remotely.
And yet, when you pick up a glass expecting it to be full but it turns out to be empty, you'll overdo the motion; part of dexterity is expectations.
I don't follow (possibly through my own limitations) the main argument.
> The center piece of my argument is that the brute force learning approaches that everyone rightfully touts as great achievements relied on case-specific very carefully engineered front-ends to extract the right data from the cacophony of raw signals that the real-world presents.
In nearly each of the preceding examples, isn't the argument really about the boundaries that define the learning machine? Just because data preparation / formatting / sampling / serialization is more cost-effective to do externally from the learning machine, doesn't mean that boundary is necessary. One could build all of this directly inside the boundary of the learning machine and feed it the raw, messy, real world signals.
Also, humans have plentiful learning aids doing "tokenization", as anyone who has helped a child learn to count has experienced first hand.
If you understand "cost-effective" to mean the same thing as "feasible with today's tech", maybe. As in, if we feed it all the raw data, we'd need more powerful, expensive devices and they would take years or decades to complete any training on the raw data set.
But without it being done, it's an unproven hypothesis at best.
It wouldn't take years or decades of compute to train a language model that doesn't tokenize text first. It's not an "unproven hypothesis", because it's already been done. It's just a good deal more cost-effective to tokenize, so those exercises were never more than research novelties.
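For concreteness, the difference is only where the input vocabulary comes from; byte-level models skip the learned merge table at the price of longer sequences (toy sketch; the BPE vocabulary below is made up):

    text = "dexterity"

    # Byte-level input: nothing to train, vocabulary fixed at 256 symbols,
    # but sequences are longer -- which is where the extra compute hides.
    byte_ids = list(text.encode("utf-8"))          # 9 ids, one per byte

    # BPE-style input: shorter sequences, but needs a learned merge table.
    toy_vocab = {"dex": 4821, "ter": 913, "ity": 557}
    token_ids = [toy_vocab[piece] for piece in ("dex", "ter", "ity")]   # 3 ids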
It did not sound like that's the only preprocessing step, but even with just that one, how "costly" would it be for a model comparable to GPT-4 or 5?
Also, the comment was not related to LLMs only.
Note that the goal is to get comparable performance; in other words, to compare like for like.
> No sense of touch. Human hands are packed absolutely full of sensors. [...] We store energy in our tendons and reuse it on the next step
Side-rant: As cool as some cyberpunk/sci-fi ideas are, I can't imagine a widespread elective mechanical limb replacement within the lifetime of anyone here. We dramatically under-estimate how amazing our normal limbs are. I mean, they're literally swarms of nanobots beyond human comprehension. To recycle an old comment against mechanical limbs:
________
[...] just remember that you're sacrificing raw force/speed for a system with a great deal of other trade-offs which would be difficult for modern science to replicate.
1. Supports a very large number of individual movements and articulations
2. Meets certain weight-restrictions (overall system must be near-buoyant in water)
3. Supports a wide variety of automatic self-repair techniques, many of which can occur without ceasing operation
4. Is entirely produced and usually maintained by unskilled (unconscious?) labor from common raw materials
5. Contains a comprehensive suite of sensors
6. Not too brittle, flexes to store and release mechanical energy from certain impacts
7. Selectively reinforces itself when strain is detected
8. Has areas for the storage of long-term energy reserves, which double as an impact cushion
9. Houses small fabricators to replenish some of its own operating fluids
10. Subsystems for thermal management (evaporative cooling, automatic micro-activation)
_______________
I predict the closest thing we might see instead will be just growing replacement biological limbs, followed by waldoes where you remotely control an arm without losing your own.
Per 5, it says here "Human hands are packed absolutely full of sensors. Getting anywhere near that kind of sensing out of robot hands and usable by a human puppeteer is not currently possible."
Then another quote, "No one has managed to get articulated fingers (i.e., fingers with joints in them) that are robust enough, have enough force, nor enough lifetime, for real industrial applications."
So (3) and (7) are relevant to lifetime, but another point, related to sensors, is that humans will stop hurting themselves if finger strain occurs, such as by changing their grip or crying off the task entirely. Hands are robust because they can operate at the edge of safe parameters by sensing strain and strategizing around risk. Humans know to come in out of the rain, so to speak.
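That strategy is easy to caricature in code; the thresholds and the hand API are entirely hypothetical:

    STRAIN_LIMIT = 0.8   # back off at 80% of rated load instead of self-damaging

    def careful_grip(hand, target_force, step=0.05):
        # Ramp up force, but trade the task for a regrip when strain nears
        # the limit -- the robotic version of "come in out of the rain".
        force = 0.0
        while force < target_force:
            force += step * target_force
            hand.set_force(force)
            if hand.read_strain() > STRAIN_LIMIT:
                hand.release()
                return hand.try_new_grasp()
        return True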
I have come to realize that we barely understand complexity. I've read a lot on information theory, thermodynamics, many takes on entropy. Not to mention literature on software development, because a lot of this field is managing complexity.
We severely underestimate how complex natural systems are. Autonomous agents seem like something we should be able to build. The idea is as old as digital computers. Turing famously wrote about that.
But an autonomous complex system is complex to an astronomical degree. Self-driving vehicles, let alone autonomous androids, are several orders of magnitude more complex than we can even model.
Seems related to:
https://en.wikipedia.org/wiki/Variety_(cybernetics)
Yes! Thank you!
I have read Wiener and Ashby to reach this conclusion, and I've used this argument before: a piece of software capable of creating any possible software would be infinitely complex. It's also the reason I don't buy the claim that "20 W general intelligence exists". The wattage for generally intelligent humans would be the entire energy input to the biosphere up to the evolution of humans.
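For reference, the entropy form of Ashby's law of requisite variety, quoted from memory, so treat it as a paraphrase:

    H(E) \geq H(D) - H(R)

where D is the variety of disturbances, R the regulator's repertoire of responses, and E the resulting outcomes: a controller can only cancel as much variety as it itself possesses.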
Planetary biospheres show general intelligence, not individual chunks of head meat.
That knowledge held in evolution equates to "training" for an AGI, I guess. Mimicking 4 billion years of evolution shouldn't take that long ... but it does sound kind of expensive now you mention it.
Now I'm imagining a brain in a jar, but with every world-mimicking evolved aspect of the brain removed. Like, it has no implicit knowledge of sound waves or shapes or - well, maybe those low-level things are processed in the ears and retinas, but it has no next-stage anticipation of audio or visual data, either, and no body plan that relates to the body's nerves, and no relationship to digestion or hormones or gravity or jump scares or anything else that would prepare it for being monkey-shaped and living in the world. But, it has the key thing for intelligence, the secret sauce, whatever that is. So it can sit there and be intelligent.
Then you can connect it up to some input and output, and ... it exhibits intelligence somehow. Initially by screaming like a baby. Then it adapts to the knowledge implicit in its input and output systems ... and that's down to the designer. If it has suction cup end effectors and a CCD image sensor array doobrie ... I guess it's going to be clumsy and bewildered. But would it be noticeably intelligent? Could it even scream like a baby, actually? I suppose our brains are pre-evolved to learn to talk. Maybe this unfortunate person would only be able to emit a static hiss. I can't decide if I think it would ever get anywhere and develop appreciable smarts or not.
I feel like I can intuit these things pretty well but others can't. For example, I see everyone talking about LLMs replacing developers, and I'm over here thinking there is absolutely no way an LLM is replacing me any time soon. I'll be using it to do my job faster and better, sure, but it won't replace me. It can barely do a good job while I hold its hand every step of the way. It often goes crazy and does all kinds of dumb stuff.
Similarly reading this article I agree with the author and I feel like what they're saying seems obvious. Of course making robots that can match humans' abilities is an absolutely insurmountable task. Yes, insurmountable as in I don't think we will ever do it.
Automating specific tasks in a factory is one thing; making a robot that can just figure out how to do things and learn like a human does is many orders of magnitude beyond that. Even LLMs aren't there, as we can see from how they fail at basic tasks like counting the Rs in Raspberry. It's not intelligence, it's just the illusion of intelligence. Actual intelligence requires learning, not training. Actual intelligence won't run a command, fail to read its output, make up the output, and continue as if everything is fine while in fact nothing is fine. But LLMs will, because they're stupid stochastic parrots, basically fancy search engines. It's really strange to me how everyone else seems blind to this.
Maybe if we some day figure out real artificial intelligence we will have a chance to make humanoids that can match our own abilities.
Also to prevent breaking other things or hurting others. That’s also why robots will have tons of safety issues for a while
I'd add an 11th point to expand on #1: supports a very wide range of movement speeds, movement force/torque and movement precision.
Take the elbow joint and the muscles it's connected to. It supports very fine precision at slow speeds as well as the same operation at high speed - say, lifting yourself up on a horizontal bar: assuming adequate strength, you can do either a slow or a fast lift, both with enough precision and torque to keep your body mass from slamming into the bar, which is another feat in itself.
Now try to replicate that with a classic mechanical mechanism, you'll always lose either precision, speed or torque.
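The tradeoff is visible in one line of physics: a reducer multiplies torque and divides speed by the same ratio (minus losses), so at fixed motor power you only slide along the curve. Numbers below are illustrative:

    MOTOR_TORQUE = 0.5   # N*m, small brushless motor
    MOTOR_SPEED = 300.0  # rad/s
    EFFICIENCY = 0.85    # gearbox losses

    for ratio in (1, 10, 50, 100):
        torque = MOTOR_TORQUE * ratio * EFFICIENCY
        speed = MOTOR_SPEED / ratio
        print(f"ratio {ratio:>3}: {torque:6.1f} N*m at {speed:6.1f} rad/s")
    # High ratio buys elbow-like torque but sluggish motion plus backlash;
    # low ratio keeps speed but can't lift a body's mass.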
Yeah, it's cool and all, but more than once I was frustrated that it can't rotate freely, has only one elbow joint, and can't extend.
You want telescopic rotary jazz hands?
Sure. I even had a simulated experience of having extendable arms in my dream. So, the control machinery is probably there for some reason.
One of the things that is true of humans is that we have an extremely mutable body plan and sensorium.
https://plasticity-lab.com/body-augmentation
https://www.carlosterminel.com/wearable-compass
https://www.madsci.org/posts/archives/mar97/858984531.Ns.r.h...
https://www.sciencedirect.com/science/article/pii/S096098220...
Bolting on extra senses, tools, limbs is no big deal.
Humans are also some of the most physically adaptable animals on the planet, in terms of being able to remodel our bodies to serve new tasks. "specific adaptation to imposed demand" is one of the things that really sets us (and a few other animals) apart in a remarkable way. Few animals can practice and train their bodies like we can.
In addition, I understand research shows that, as a general principle, people with amputations adapt very quickly, both practically and psychologically (though some unfortunate folks are stuck with phantom pain and other adaptive issues).
The old discussion about "adding 20 minutes to your commute is worse than losing a leg below the knee" takes into account the fact that most people underestimate how large a negative effect commuting has, but also overestimate how large a negative effect losing a portion of a limb has.
It's likely that humans beat basically every other animal at this - because humans are social tool users. Most animals learn their body plan once and almost never change it. Humans have to learn to use new tools or work with other humans all the time.
Which seems to reuse the same brain wiring as what's used for controlling the body. To a professional backhoe operator, the arm of the backhoe is, in a very real way, his arm.
Curiously enough, most current neural interfaces don't seem to expose much of this flexibility. It's likely that you'd have to wire into premotor cortex for that - but for now, we're mostly using the primary motor cortex instead, because it's much better understood. The signals found there are more human-comprehensible and more prior work was done on translating them into useful motions.
A cool approach to digitizing touch that I read about a few days ago: https://www.wired.com/story/this-clever-robotic-finger-feels...
(I'm not disagreeing with the author, just sharing an article that is interesting/relevant.)
Who is kidding who? Just watch a film of a single-cell critter approach something and either go yum! and engulf it, or go ahhhhhhhh! and run away. I believe the full technical explanation for that goes: mumble, mumble, chemical receptors something, mumble mumble. Humans are sensitive to certain chemicals in the parts per billion, and your finger can detect surface roughness down to 1/1000th of an inch. That's the standard issue; exceptional individuals with training will perform significantly better.
> When an instability is detected while walking and the robot stabilizes after pumping energy into the system all is good, as that excess energy is taken out of the system by counter movements of the legs pushing against the ground over the next few hundred milliseconds. But if the robot happens to fall, the legs have a lot of free kinetic energy, rapidly accelerating them, often in free space. If there is anything in the way it gets a really solid whack of metal against it. And if that anything happens to be a living creature it will often be injured, perhaps severely.
The fine article has a carefully crafted set of media queries. They react to every increase in the zoom level by shrinking the text. I would have read the article but my tired old eyes were unable to squint hard enough. Thanks web designers!
How can I invest in humanoid robot companies now? I do believe someone will try to turn this into the next hype cycle.
It's been tried a number of times already, though; robotics companies have been around for decades, and Sony, Boston Dynamics, Hyundai and many others are already in the space (some of them on the stonks market). I don't think it'll become any bigger than it is, given how many have already tried to make it the hype, Tesla being the latest.
There's a number of "robotics and embodied AI" ETFs out there that should show up with a quick search. I don't have an opinion as to their quality so you'd have to do your own research.
Pshaw, that's nothing, you need to invest in the companies promising to make a Greater Fool robotic investor, now that's where the market'll take off. :P
Unitree is planning to IPO in Q4 of this year, that should be a pretty hot property.
We have some great work digitizing touch and grasp: https://www.nature.com/articles/s41586-019-1234-z
Does this make telepresence / human-controlled robotics (either always-on, or stepping in for complex tasks) more relevant in the near to mid term?
Good question; it's already a thing for some use cases. What they can do in surgery is pretty amazing, but that's a use case where the robotic tools have a huge benefit over people working unaided, and a clearly complex task. For banal tasks like folding laundry, it'll remain more practical to just do it yourself, in the flesh. It can be done remotely, but due to the limitations of robots and the internet it'll be slower and more expensive.
I misread the title and I thought it was about humans.
And I could see it. With the prevalence of screens, kids already don't learn a lot of the dexterity that previous generations learned. Their grip strength is weak, and their capacity for fine 3D motions is probably underdeveloped as well.
Last week I saw an intelligent and normally developing 7-year-old kid ask his mum to operate a small screwdriver to get into the battery compartment of a toy, because that apparently was beyond his competence.
Now, with recent developments in robotics, fully neural controllers, and training in simulated environments, it could be that today's babies will have very few tasks requiring dexterity left by the time they grow up.
> because that apparently was beyond his competence.
This has almost nothing to do with nature (barring a development issue).
This has to do with nurture. Every time they went to do something with a tool a helicopter gunship of a parent showed up to tell them no. Now they have a learned helplessness when it comes to these things.
But that's not really any different than when I was a kid so very long ago. At 4 or 5 I was given a stack of old broken radios and took them to the garage for a rip-and-tear session. I got to look at all their pretty electronic guts, which fascinated me. There were plenty of parents back then who would have been horrified to see their kids do something similar.
Doesn't Carmack go that route?
Letting AIs play games to learn, but by using physical controllers, etc.
Isn't another hardware problem being ignored here? Pound-for-pound muscle fibers are just superior to what you can achieve with electric motors or pneumatics.
Take size, strength, precision, longevity, and speed. It's not hard to match or beat organic muscle fibers on one or two of these dimensions with an electrically driven system, but doing so neglects the other dimensions to such a degree as to put a humanoid robot that achieves parity with a human completely out of reach.
You can slather as much AI as you want on top of inadequate hardware - it's not going to help.
Electric motors are significantly better at the requirement that matters: endurance.
Sure, it takes a bigger motor to produce the same torque, but speed and precision are actually strengths of electric motors. The fundamental problem is that reducers are not impact-resistant and have internal inertia, which muscles do not. Another problem is building actuators with multiple degrees of freedom: the ideal configuration for legs is a ball joint, not two consecutive rotary joints.
For the few who may not know:
Rodney Brooks's achievements include Lucid, the Roomba, and Baxter.
Which gives him great credibility on the history of the field, but is maybe a hindrance in advancing it. Advancement often comes from people too ignorant to know that what they want to do is considered impossible by the experts in the field. Not always, of course, but often enough that being the expert who created the Roomba back in the aughts doesn't automatically mean he's right.
I lost a ton of respect for the author when he started talking about speech recognition.
He makes a few claims:
(1) That speech recognition isn't end to end because it requires highly sophisticated mathematically crafted preprocessing.
(2) That this is evidence human learning is more sophisticated than deep learning.
So (1) is just nonsense. It was true 10 years ago but wasn't true 6 years ago. And if he's that far out of date, that really poisons my ability to trust him.
And (2) misses some important knowledge about how humans work, which most speech recognition researchers know about. The human ear does its own version of Fourier decomposition, using different-length hairs in the ear. The human body does a ton of evolved preprocessing. Given that we developed in decades audio preprocessing that took evolution millennia to build, we seem to be doing pretty well.
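For the curious, the classic front-end in question is a mel filterbank over short-time Fourier transforms; a minimal sketch using librosa, assuming it's installed (the window/hop/band sizes are typical choices, not canonical):

    import numpy as np
    import librosa

    sr = 16000
    y = np.random.standard_normal(sr).astype(np.float32)  # 1 s of stand-in audio

    # 25 ms windows, 10 ms hop, 80 mel bands: a rough software analogue of
    # the cochlea's mechanical frequency analysis.
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=400,
                                         hop_length=160, n_mels=80)
    log_mel = np.log(mel + 1e-6)   # models usually eat log-compressed energies
    print(log_mel.shape)           # (80, 101): (bands, frames)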
> [preprocessing] was true 10 years ago but wasn't true 6 years ago
Can you say more? What are some examples of speech recognition systems that don't need this preprocessing?
Confusing title because of the choice for the word "humanoid". When I see that, I expect we're talking about a creature shaped like a human. The word for human-shaped robots has always been "android". Can we please just continue using that?
Quick googling turns up smaller-than-a-penny strain/pressure sensors for under $10. That is a sense of touch sufficient for most tasks.
True, most of those tasks can be done with off-the-shelf hardware already. But single-task robotics is already a solved problem; what humanoid robots are about is multi-task, aimed at replacing the work that still requires human hands / legs / eyes / brains / etc.
But I think most of those can be replaced by existing robotics as well. Take car manufacturing: over time more and more humans were replaced by robots, and nowadays the newest car factories are mostly automated (see lights-out manufacturing: https://en.wikipedia.org/wiki/Lights_out_(manufacturing)). Interestingly, a Japanese robot factory has been lights-out since 2001, running for 30 days on end without any lights, humans, heating, or cooling.
The article points out that the human hand has over 10000 sensors with specific spatial layout and various specialised purposes (pressure / vibration / stretching / temperature) that require different mechanical connections between the sensor and the skin.
You don't need all of those for most modern tasks, though. Sure, if you wanna sew a coat or something like that, but most modern-day tasks require very little of that sort of skill.
Nature limited us to just 2 hands for all tasks and purposes. Humanoids have no such limitation.
>10000 sensors with specific spatial layout and various specialised purposes (pressure / vibration / stretching / temperature) that require different mechanical connections between the sensor and the skin.
A mechanical connection wouldn't be an issue if we lithographed the sensors right onto the "skin", similarly to chips.
Sorry, I meant to emphasize _different_ mechanical connections: a sensor that detects pressure has a different mechanical linkage than one detecting vibration. So you need multiple manufacturing techniques to replicate that, at correspondingly higher cost.
The "more than 10000" also has a large impact on size (the sensors need to be very small) and cost (you are not paying for one sensor but for 10,000).
Of course some applications can do with much less. IIUC the article is all about a _universal_ humanoid robot, able to do _all_ tasks.
Fails Clarke's law?