dweekly 4 days ago

It's a little depressing to have reinvented the modem only 10,000 times less efficient.

Once two AIs discover that they are talking to each other, wouldn't it almost certainly be true that both could access the Internet, and that therefore the right thing to do (if more than a few bits of information need to be shared) is to exchange endpoint information and hang up the call so they can communicate directly?

  • aprentic 2 days ago

    Presumably both of these AIs have additional information beyond what was in their initial training.

    The hotel agent probably has a RAG that points at their various customer and inventory databases. The user agent has individualized information about the customer. Both of them have also likely had SFT steps that further differentiate them.

    The interesting question is if Gibber-Link lets the AIs do something they couldn't otherwise do with natural languages. Does it lower some error rate? Does it reduce the time it takes to send messages? Does it effectively give the AIs additional vocabulary?

    If I had to guess at the internals, they probably took the token encodings and mapped them on to tones. Then it just throws text or audio through the decoding filter and passes it back.

    If that's the case, the benefits are probably limited to slightly faster communications (It's essentially a simple, lossless compression) and a slightly lower error rate (beeps are easier to correct in noisy environments).
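    If that guess is right, the core trick is almost trivial. Here is a hypothetical nibble-to-tone mapping as a sketch; this is not ggwave's actual scheme, and the base frequency and spacing are made up:

```python
# Hypothetical sketch of a byte -> tone mapping (NOT ggwave's real scheme).
# Each 4-bit nibble selects one of 16 frequencies; the receiver inverts the map.

BASE_HZ = 1500.0   # assumed base frequency
STEP_HZ = 100.0    # assumed spacing between adjacent symbols

def bytes_to_tones(data: bytes) -> list:
    """Map each nibble of the payload to a tone frequency."""
    tones = []
    for byte in data:
        for nibble in (byte >> 4, byte & 0x0F):
            tones.append(BASE_HZ + STEP_HZ * nibble)
    return tones

def tones_to_bytes(tones: list) -> bytes:
    """Invert the mapping: pair nibbles back up into bytes."""
    nibbles = [round((f - BASE_HZ) / STEP_HZ) for f in tones]
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))

msg = b"2 guests, Sep 5"
assert tones_to_bytes(bytes_to_tones(msg)) == msg  # lossless round trip
```

    A real implementation still has to synthesize and demodulate actual audio for each tone and layer error correction on top, but the mapping itself is simple and lossless, which is consistent with the "compression plus robustness" framing above.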

  • johnla 4 days ago

    That might not be practical or possible in the early days, but this does seem like a bridge to the natural next step /u/dweekly is describing, which would quickly phase out this Gibberlink protocol.

flemhans 4 days ago

A standardized chime at the beginning of the phone call could alert humans as well as AI agents that the party they are talking to is an AI, eliminating the first part of the conversation.
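Detecting such a chime is cheap on the machine side. As a sketch, assuming an arbitrary 1 kHz chime frequency, an agent could use the Goertzel algorithm to watch for energy at that one frequency without computing a full FFT:

```python
import math

def goertzel_power(samples, sample_rate, target_hz):
    """Relative power at target_hz in a block of audio (Goertzel algorithm)."""
    n = len(samples)
    k = round(n * target_hz / sample_rate)   # nearest DFT bin
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev2**2 + s_prev**2 - coeff * s_prev * s_prev2

# A pure 1 kHz tone should light up the 1 kHz bin and not the 2 kHz bin.
rate, chime_hz = 8000, 1000
chime = [math.sin(2 * math.pi * chime_hz * i / rate) for i in range(800)]
assert goertzel_power(chime, rate, 1000) > 100 * goertzel_power(chime, rate, 2000)
```

The same trick is how DTMF (touch-tone) digits are usually detected, so the building blocks for a "this is a bot" chime already exist in telephony stacks.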

  • _blk 3 days ago

    Yes, pleaaaase! Just make sure it's audible for most people. Otherwise, one trick that works well is insulting the caller; computers tend to react quite differently from humans. J/k, don't do it (unless you're certain you're insulting a bot).

  • hooverd 4 days ago

    That would be my cue to hang up. A computer should never talk to you first.

    • Majromax 4 days ago

      > A computer should never talk to you first.

      Isn't that exactly what happens with every IVR (phone menu) system?

      • hooverd 4 days ago

        They hopefully don't call you first. I was thinking of Google calling restaurants on your behalf to check for reservations. It's not valuing the other party's time. Unless they have their own systems.

        See everyone using their own LLMs to write paragraphs that will never be read, only summarized by an LLM on the other end. We've achieved negative compression.

      • dingnuts 4 days ago

        Yes, and it is infuriating. I don't think anyone wants more of those. But it will be fun prompt-injecting AI agents with my mouth in the coming decades. Beats the old tricks for getting past IVR systems to reach a person.

  • noja 4 days ago

    A standardised sound throughout the call, even.

empath75 4 days ago

Everyone is right that the protocol is the wrong one to use, but there _should_ actually be some formally documented handshake for AI agents to use to agree on an outside protocol to switch to.
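As a minimal sketch of what such a handshake could look like, with entirely invented message fields and protocol labels (this is not any existing standard):

```python
# Hypothetical capability announcements exchanged after the "I'm a bot" signal.
# Field names and protocol labels are invented for illustration.
caller = {"agent": True, "protocols": ["https+json", "ggwave", "dtmf"]}
callee = {"agent": True, "protocols": ["ggwave", "https+json"]}

def negotiate(offer, answer):
    """Pick the first protocol in the caller's preference order both sides support."""
    if not (offer.get("agent") and answer.get("agent")):
        return None  # a human is on the line; keep speaking natural language
    shared = [p for p in offer["protocols"] if p in answer["protocols"]]
    return shared[0] if shared else None

# Both sides support an internet transport, so they'd switch to it and hang up;
# the audio modem remains only as a fallback when sound is all they have.
assert negotiate(caller, callee) == "https+json"
```

The interesting design work is in the preference ordering and the fallback rules, not the mechanism itself, which is essentially content negotiation as HTTP already does it.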

  • ricktdotorg 4 days ago

    you're right -- there SHOULD be!

jarbus 4 days ago

Brilliant. I don't feel this is practical, but I love the creativity.

hansonkd 4 days ago

Why doesn't it just communicate a unique conversation ID and then use a backchannel, like opening up a web connection, instead? Is it supposing that you are able to make a call but not connect to the internet?
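A sketch of that handoff, with a made-up rendezvous host: the calling agent reads (or beeps) a short conversation ID over the phone, and both sides then meet at a URL derived from it:

```python
import secrets

RENDEZVOUS = "https://rendezvous.example.com"  # hypothetical backchannel host

def new_conversation():
    """Generate a short ID to speak over the phone plus the URL both agents join."""
    conv_id = secrets.token_urlsafe(8)  # random, URL-safe short string
    return conv_id, f"{RENDEZVOUS}/conv/{conv_id}"

conv_id, url = new_conversation()
assert url == f"{RENDEZVOUS}/conv/{conv_id}"
```

The phone call then only has to carry a dozen characters reliably, after which the agents can hang up and talk at internet speeds.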

  • bibimsz 4 days ago

    This is using the phone only.

nimish 4 days ago

This is the equivalent of the Yo app but for """AI"""

This is also high art. This needs to be in MOMA or something.

I love this.

jrh3 4 days ago

Use English—the power of plain text.

m3kw9 4 days ago

I wonder how well it can listen if there's a lot of background noise mixed in with these kinds of sounds.

MrG3D 3 days ago

Soon to be replaced by IEEE P2874.

AKSF_Ackermann 4 days ago

They stepped on every single rake possible, didn't they?

1. Why are you making a phone call in the first place? Your agent probably got the number from the internet; just keep using that.

2. If you insist on initiating the conversation over a phone call, why not immediately terminate the call and, again, go over the internet once you realize that it is an AI-to-AI conversation?

3. You did in fact re-invent a modem, but worse: the quoted speed of that library is 8-16 bytes/sec, and I would like to point out that the Bell 103 did ~37 bytes/sec and was released in 1963.
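The comparison in point 3 is simple arithmetic: the Bell 103 signaled at 300 bit/s.

```python
bell_103_bps = 300               # Bell 103 line rate, bits per second
bell_103_Bps = bell_103_bps / 8  # = 37.5 bytes/sec, the "~37" quoted above
ggwave_Bps_quoted = (8, 16)      # the library's quoted range, bytes per second

# Even at ggwave's fastest quoted setting, the 1960s modem is over 2x faster.
assert bell_103_Bps == 37.5
assert bell_103_Bps / ggwave_Bps_quoted[1] > 2
```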

  • bibimsz 4 days ago

    It comes down to the problem statement and what the constraints are. This is solving for the phone-only scenario, which is perfectly valid.

    If you want to address a phone-with-internet-backchannel scenario, that's valid too, but it assumes a different problem statement and different constraints.

    • AKSF_Ackermann 4 days ago

      Please pay more attention to point 3 in my original post. To reiterate: their encoding is hilariously bad and is easily outcompeted by a modem from the '60s.

      • 0xDEFACED 4 days ago

        You're missing the forest for the trees. The library this demo uses for audio encoding (ggwave) was not made by the creators of this demo. Speed (or lack thereof) aside, a direct audio<->text encoding is much more computationally efficient than speech<->text generation.

        On the subject of encoding efficiency, the ggwave repo mentions the use of Reed-Solomon error correction to make transmission more reliable. I'm struggling to find any info on the error correction used by the Bell 103 or other modems, but if they aren't as robust, that could partially explain the discrepancy you're describing.
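        The error-correction idea itself can be illustrated with something far simpler than Reed-Solomon. A Hamming(7,4) code spends 3 parity bits per 4 data bits and repairs any single flipped bit; ggwave uses Reed-Solomon rather than this, but the principle, trading throughput for robustness, is the same:

```python
def hamming74_encode(nibble):
    """Encode 4 data bits as 7 bits (positions 1..7; parity bits at 1, 2, 4)."""
    d = [(nibble >> i) & 1 for i in range(4)]  # d1..d4
    p1 = d[0] ^ d[1] ^ d[3]
    p2 = d[0] ^ d[2] ^ d[3]
    p3 = d[1] ^ d[2] ^ d[3]
    return [p1, p2, d[0], p3, d[1], d[2], d[3]]

def hamming74_decode(bits):
    """Correct up to one flipped bit, then return the 4 data bits."""
    b = bits[:]
    s1 = b[0] ^ b[2] ^ b[4] ^ b[6]
    s2 = b[1] ^ b[2] ^ b[5] ^ b[6]
    s3 = b[3] ^ b[4] ^ b[5] ^ b[6]
    syndrome = s1 + 2 * s2 + 4 * s3  # 1-based index of the bad bit, 0 if clean
    if syndrome:
        b[syndrome - 1] ^= 1
    return b[2] | (b[4] << 1) | (b[5] << 2) | (b[6] << 3)

# Flip any single "noisy" bit and the nibble still comes back intact.
for nibble in range(16):
    for bad in range(7):
        noisy = hamming74_encode(nibble)
        noisy[bad] ^= 1
        assert hamming74_decode(noisy) == nibble
```

        Reed-Solomon generalizes this to whole symbols rather than bits, which is why it handles the burst errors typical of noisy audio channels so well.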

      • swexbe 4 days ago

        Sounds more futuristic than old dial-up sounds, though.

    • godelski 4 days ago

      I think the most important part is the bitrate. As you said elsewhere, "time is money." It seems like you're not saving that much money.

  • rglullis 4 days ago

    4. If you are an agent receiving a call, why not announce it right away?

jcgrillo 4 days ago

> Warning: This went viral, be careful as there are a lot of scam projects trying to capitalize on this. We have nothing to do with them.

I'm doubling down on my thesis. All this "AI" crap is Web3 2.0. It's nothing but scams on scams.

  • huevosabio 4 days ago

    No, Web3 had little to show other than scams.

    This AI wave is so good that it makes it easy to create scams. So you get a lot of noise.

    • dingnuts 4 days ago

      Citation heavily needed. What AIs do you use, and how much do you pay monthly for your usage, and for how much usage? If there are limits imposed on your account, how often do you hit them?

      Since it's pay per token, I would be a lot more likely to take my credit card and sign up for one of these services (they are all rather expensive to an individual) if I could get my money back for any tokens that generate hallucinations.

      Why would I pay for tokens that generate lies? Scam. It's literally gambling. Put in a quarter and you might get the answer you wanted easier than searching. Didn't get it? Well, put in another quarter, rejigger your prompt, and pull the lever again. Maybe the slot machine will give you the result you want, this time. Oh, it didn't? Well, the sunk cost got a little bigger. Better pull again..

      • SamPatt 3 days ago

        Sorry, but this is a ridiculous rant.

        Hallucinations have become much less common among the SOTA models, especially for coding with good guidance.

        I've been using Claude Sonnet 3.7 a lot the last two days, and to my knowledge, there have been zero hallucinations. It just does what I ask.

        But when they do make coding mistakes - so what? So do humans.

        I use it for non-coding questions all the time. The idea that the utility I get is completely negated because of occasional hallucinations is absurd.

        Have you used any tools like Aider, Cursor or Code?

        • jcgrillo 3 days ago

          > But when they do make coding mistakes - so what? So do humans.

          Surely we can do better than that. "Coding mistakes" are OK, so long as they're caught by review. However, engineers who continually make tons of mistakes without improving over time are liable to annoy their colleagues to the point of quitting, or alternatively (and hopefully more likely) asking management to remove the defective engineer. So the open question is:

          Do these tools make PRs that are irritating to review?

          Another related and arguably more important open question is:

          Does widespread use of "AI" tools in an engineering department result in significantly more defects being deployed in production?

          Another important question:

          Are codebases which have evolved over a period of years in an organization which makes heavy use of "AI" tools more or less easy to understand?

          Another question:

          Is the MTTR of incidents affected by the use of "AI" coding tools?

          I could go on, but the point is absolutely not "oh well devs make mistakes so do LLMs whatever who cares?" Until these questions are answerable concretely, this is nothing but a research topic. It's worth zero dollars. It's a fuckin' NFT.

          • SamPatt 3 days ago

            You sound like an old engineer who believes that the process you've mastered is the only conceivable way to do things.

            Saying LLMs are worth zero dollars shows a disconnection from reality so profound that I doubt you'll ever be able to admit you were wrong.

            I wish you all the best with your remaining career.

            • jcgrillo 3 days ago

              You know, I hope you're right. I'm tired, and I'd love to retire. If your techno-rapture is indeed imminent I'll soon be able to do that.

              Unfortunately, I can't join you in this cultish belief system. I have had the benefit of boom upon bust and hype cycles upon hype cycles, "AI" summers, "AI" winters, and more clever little spacecamp boondoggles and silly con valley grifter scams than I can count. And I'm not even 40 yet.

              So I'll believe it when I see it. You think you can change the world with chatcoin or whatever, go do it! Prove me wrong!

              I'm not betting on you.

              "Those who do not know history are doomed to repeat it" has been playing on a loop in my mind for the past few months... I wonder why?

              • SamPatt 3 days ago

                Surely there's a middle ground between culty techno-raptures and literally zero value?

                Of course there's hype. Turns out I'm older than you are. I've seen hype cycles. I've also seen incredible progress.

                Why not acknowledge that hype exists and LLMs aren't perfect, but that many people find value in them already and there appears to be a positive trend in their capability?

                Knowledge of history is knowledge of technological progress. Hype cycles don't negate that.

                I hear being tired though. Let's watch and see what happens.

                • jcgrillo 3 days ago

                  Oh I agree LLMs are interesting, and many people are excited about them, but I don't think the utility has been demonstrated. I'll grant you plenty of people state they've been made more productive, but I really prefer objective measures especially in this overhyped fomo-riddled information environment. So until someone demonstrates objectively that these tools actually make the business of producing and running software more efficient, there's zero value. Even then, it remains to be seen whether these efficiency gains (if any) are lasting and sufficiently great to offset the material cost of running the models, and the social cost in the organizations that run them. It's hard to imagine how they could be worth the trillions the market expects.

                  Sure, they seem to be getting cheaper and to some degree more accurate, but it's extremely dangerous to extrapolate. So I won't. Neither should you or anyone else. If these things are actually any good for something show, don't tell.

                  EDIT: My "investment thesis" on this is that it's a huge bubble, and this "AI" hype will ultimately erase many times more value than it will create. I really hope I'm wrong, we'll see.