Love ggwave! I used it on a short film set a few years ago to automatically embed slate information into each take and it worked insanely well.
If anyone wants details: I had a smartphone taped to the back of the slate with a UI to enter shot/scene/take, and when I clicked the button it would transmit that information along with a timestamp as sound. This sound was loud enough to be picked up by all microphones on set, including scratch audio on the cameras, phones filming BTS, etc.
In post-production, I ran a script to extract this from all the ingested files and generate a spreadsheet. I then had a script to put the files into folders and a Premiere Pro script to put all the files into a main and a BTS timeline by timestamp.
Yes, timecode exists and some implementations also let you add metadata, but we had a wide mix of mostly consumer-grade gear so that simply wasn't an option.
I posted a short demo video on Reddit at the time, but it got basically no traction: https://www.reddit.com/r/Filmmakers/comments/nsv3eo/i_made_a...
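For anyone who wants to build something similar: a rough sketch of what the transmit side can look like with the upstream ggwave Python bindings (pip install ggwave) plus PyAudio. The payload fields here are just my guess at the slate info described above, and the exact encode() arguments may vary between versions, so treat it as a starting point rather than the setup actually used:

    import json, time
    import ggwave
    import pyaudio

    # Hypothetical slate payload: scene/shot/take plus a wall-clock timestamp
    payload = json.dumps({"scene": "12A", "shot": 3, "take": 7, "ts": int(time.time())})

    # Encode the text into a float32 waveform (protocolId/volume values are illustrative)
    waveform = ggwave.encode(payload, protocolId=1, volume=20)

    # Play it out loud so every microphone on set picks it up
    p = pyaudio.PyAudio()
    stream = p.open(format=pyaudio.paFloat32, channels=1, rate=48000, output=True)
    stream.write(waveform, len(waveform) // 4)   # 4 bytes per float32 sample
    stream.stop_stream(); stream.close(); p.terminate()

The extraction pass in post is then the reverse: run each ingested audio track through ggwave's decoder and collect the payloads along with where in the file they were heard.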
Very cool solution!
One of the nicest data-over-sound implementations I came across was in a kid's toy (often the best source of innovation).
It was a "Bob the Builder" play set: when you wheeled around a digger, etc., the main base would play a matching sound. I immediately started investigating and was impressed to see no batteries in the movable vehicles. I realised that each vehicle made a clicking sound as you moved it, and the ID was encoded into that click, which the base station picked up. Pretty impressive to do this regardless of how fast the vehicle was moved by the child.
Was it based on the frequency of the click?
>Pretty impressive to do this regardless of how fast the vehicle was moved by the child.
Probably not, eh?
Probably yes, because the frequency of a note doesn't change based on how quickly the next note is played after it.
Guess I misunderstood. The first time, you said "frequency of the click", which I would personally take to mean clicks per second.
"Frequency of the note" in your next comment clears it up. It probably was that, you're right.
The acoustic modem is back in style [1]! And, of course, same frequencies (DTMF) [2], too!
DTMF has a special place in the phone signal chain (signal at these frequencies must be preserved, end to end, for dialing and menu selection), but I wonder if there's something more efficient, using the "full" voice spectrum, with the various vocoders [3] in mind? Although, it would be much creepier than hearing some tones.
[1] Touch tone based data communication, 1979: https://www.tinaja.com/ebooks/tvtcb.pdf
[2] Touch tone frequency mapping: https://en.wikipedia.org/wiki/DTMF
[3] Optimized encoders/decoders for human speech: https://vocal.com/voip/voip-vocoders/
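For anyone who hasn't looked at it, the DTMF scheme itself is tiny: each key is just two sine waves played simultaneously, one from a "row" set and one from a "column" set. A hedged NumPy toy to illustrate (a real detector would use the Goertzel algorithm per frequency rather than a full FFT):

    import numpy as np

    ROWS = [697, 770, 852, 941]        # Hz, low "row" tones
    COLS = [1209, 1336, 1477, 1633]    # Hz, high "column" tones
    KEYS = ["123A", "456B", "789C", "*0#D"]

    def dtmf_tone(key, fs=8000, dur=0.2):
        """Two simultaneous sine waves for one keypress."""
        r = next(i for i, row in enumerate(KEYS) if key in row)
        c = KEYS[r].index(key)
        t = np.arange(int(fs * dur)) / fs
        return 0.5 * (np.sin(2 * np.pi * ROWS[r] * t) + np.sin(2 * np.pi * COLS[c] * t))

    def dtmf_detect(sig, fs=8000):
        """Toy detector: strongest row tone plus strongest column tone."""
        spec = np.abs(np.fft.rfft(sig))
        freqs = np.fft.rfftfreq(len(sig), 1 / fs)
        power = lambda f: spec[np.argmin(np.abs(freqs - f))]
        r = max(range(4), key=lambda i: power(ROWS[i]))
        c = max(range(4), key=lambda i: power(COLS[i]))
        return KEYS[r][c]

    assert dtmf_detect(dtmf_tone("5")) == "5"

Two tones drawn from two sets of four gives the 16 keys, which is also why the bitrate is so low; it survives the phone network precisely because the network is required to pass those frequencies through.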
This isn't DTMF. It's a form of MFSK like DTMF, but it operates on different frequencies and uses six tones at once vs DTMF's two.
> it would be much creepier than hearing some tones.
Hatsune Miku at the speed of a horserace commentator.
(the "vocaloids" are DAW plugins made from chopped up recorded phonemes; Hatsune Miku is voiced by Saki Fujita. Still sounds very inhuman)
I'm wondering if shifting frequency chirps like LORA uses would work in audio frequencies? You might be able to get the same sort of ability to grab usable signal at many db below the noise, and be able to send data over normal talking/music audio without it being obvious you're doing so. (I wanted to say "undetectably", but it'd end up showing up fairly obviously to anyone looking for it. Or to Aphex Twin if he saw it in his Windowlicker software...)
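For what it's worth, the core LoRa trick (chirp spread spectrum) is compact enough to sketch. This is a hedged toy on complex samples with NumPy: a symbol is a cyclically shifted up-chirp, and the receiver multiplies by the conjugate base chirp and takes an FFT, the peak bin being the symbol. Shifting it onto a real audible carrier, synchronizing, and surviving an actual acoustic channel are all left out:

    import numpy as np

    SF = 7          # "spreading factor": 2**SF possible symbol values per chirp
    N = 2 ** SF     # samples per symbol (one sample per chip)

    def chirp(symbol):
        """Up-chirp whose start point is cyclically shifted by `symbol` chips."""
        k = (np.arange(N) + symbol) % N
        return np.exp(1j * np.pi * k * k / N)   # quadratic phase = linear frequency sweep

    def demod(rx):
        """De-chirp with the conjugate base chirp, FFT, pick the peak bin."""
        return int(np.argmax(np.abs(np.fft.fft(rx * np.conj(chirp(0))))))

    for s in (0, 5, 100):
        assert demod(chirp(s)) == s

The reason this digs signals out from below the noise floor is that the FFT integrates energy over the whole symbol, so the de-chirped peak still stands out when the chirp itself is buried in wideband noise.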
The issue is the (many) vocoders along the chain remove anything that doesn't match the vocal patterns of a human. When you say hello, it's encoded phonetically at a very low bitrate. Noise, or anything outside what a human vocal cord can do, is aggressively filtered or encoded as vocal-sounding things. Except for DTMF, which must be preserved for backwards compatibility. That's why I say it would be creepy to do something higher-bitrate... your data stream would literally and necessarily be human vocal sounds!
Data exfiltration via bird
Yes: JT8 / FT8, WSPR, and then the entirety of fldigi, to get started.
If you need more speed you'll need to convince me you won't abuse my ham spectrum, but Winlink, PACTOR, and some very slick 16QAM modems exist: 300 baud to 128 kbit/s or so.
"Using the Web Audio API to Make a Modem" (2017) https://news.ycombinator.com/item?id=15471723
If you're interested in using GGWave in Python, check out ggwave-python, a lightweight wrapper that makes working with data-over-sound easier. You can install it with pip install ggwave-python or pip install ggwave-python[audio], or find it on GitHub: https://github.com/Abzac/ggwave-python.
It provides a simple interface for encoding and decoding messages, with optional support for PyAudio and NumPy for handling waveforms and playback. Feedback and contributions are welcome.
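I haven't used this wrapper myself, so I won't guess at its exact API, but for comparison this is roughly what the receive loop looks like with the raw upstream ggwave bindings plus PyAudio (going from memory of the examples in the ggwave repo, so double-check the signatures):

    import ggwave
    import pyaudio

    instance = ggwave.init()
    p = pyaudio.PyAudio()
    stream = p.open(format=pyaudio.paFloat32, channels=1, rate=48000,
                    input=True, frames_per_buffer=1024)
    try:
        while True:
            chunk = stream.read(1024, exception_on_overflow=False)
            result = ggwave.decode(instance, chunk)   # None until a full message is heard
            if result is not None:
                print("received:", result.decode("utf-8"))
    finally:
        ggwave.free(instance)
        stream.stop_stream(); stream.close(); p.terminate()

If the wrapper hides the init/free and audio plumbing behind a call or two, that alone seems worth the install.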
I remember discovering ggwave a few years ago, before the rebrand. It's still the only working (and fastest verifiable) library that can transmit data over sound.
I couldn't get around to a project using it then, because of college. But now I am integrating it into my startup for frictionless user interaction. I want to thank the creators and contributors of ggwave for doing all the hard work over the years.
If I find something to improve I'd like to contribute to the codebase too.
I love GGWave. We've been using it in our VR game to automatically sync ingame recordings with an external camera.
At the beginning of the recording it plays the code "xrvideo"; then, in the second stage of merging the video, it looks for that tag in both streams and matches them up.
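To spell out the matching step with a hedged sketch (the numbers and helper below are made up for illustration, not from any real tool): once you know the time at which the "xrvideo" tag was decoded in each stream, syncing is just a subtraction and a trim.

    def sync_offset(tag_in_game_s, tag_camera_s):
        """Seconds to trim from the start of the camera file so both streams
        begin at the same instant (negative means trim the in-game capture)."""
        return tag_camera_s - tag_in_game_s

    # e.g. tag heard 2.4 s into the in-game capture and 5.9 s into the camera file:
    print(sync_offset(2.4, 5.9))   # 3.5 -> cut 3.5 s off the front of the camera file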
This is cool! Some of Teenage Engineering's Pocket Operators, at least the PO-32 [1], use a data-over-sound feature.
Does Ggwave use a simple FSK-based modulation just because it "sounds good"? Would it be possible to use a higher order modulation, e.g., QPSK, in order to achieve higher speeds? Or would that result in too many uncorrectable errors?
[1] https://teenage.engineering/products/po-32
It sounds quite nice.
It is also about the same bitrate as RTTY, which was invented in 1922 and is still in use by radio amateurs around the world.
Here is what that sounds like:
https://youtu.be/wzkAeopX7P0?si=0m0urX7sDp6Jojqe
Not as musical, but quite similar.
The amateur radio community is chock full of innovation for low bandwidth weak signal decodable comm protocols.
There are also the V.xx modem standards, which are kinda dependent on the characteristics of phone lines, but they might work for audio at a distance?
ham optimizes for the wrong thing, imo. look at ft8: perfect for making contacts at low power with stations far, far away, but really only tuned to the particular task of making contacts.
you can package some text alongside, but fundamentally all that amateur operators are looking for is a SYN / ACK with callsigns.
There's also JS8call which is a modified version of FT8 meant for actual communication. IIRC you can do some neat things with it, like relaying a message through another user if you don't have a direct path to the recipient.
RTTY is the sound of "satellites" in a lot of media.
As one of the accursed hams, I wonder what ggwave's propagation profile would be compared to RTTY / CW (Morse code) etc. Would be interesting to try it out.
Wasn't there a Google project, Chirp or something, that did this over speakers and microphones? It seems to have disappeared.
Chirp.io now leads to Sonos.
Apparently Sonos acquired them in 2020.
https://audioxpress.com/news/data-over-sound-pioneer-chirp-a...
Seems to have been euthanized.
Acoustic couplers are back baby! Who's up for Phreaking AI?
This rules.
Here's two voice AIs talking in GGWave :)
https://github.com/PennyroyalTea/gibberlink
There was a research paper on doing data-over-sound with sounds that were designed to be pleasing to humans.
The demos sounded like little R2D2 blips and sputters.
Perhaps a researcher for Microsoft or something.
Anyone know the paper I'm talking about? I can't find it.
In the spirit of abusing an error correction mechanism for aesthetics (see: QR codes with pictures in them, JavaScript without semicolons), could you do that here? How much abuse can the generated signal take?
Just listening to the samples here, they're really not that far off. It could probably use a little softening at the edges on the higher tones, but it's nowhere near as unpleasant as it could be.
I wish I knew the paper, but https://github.com/chirp was a proprietary data-over-sound-through-air implementation that worked pretty well and sounded really cute (to my ears, anyway). It's not a paper, but there's this https://www.scientia.global/wp-content/uploads/2017/10/Chirp...
There are a lot of open source ones:
https://github.com/quiet/quiet-js
I remember seeing them quite a bit a few years ago.
Also see AndFlmsg. It supports more modulation schemes than just FSK, and you can use it as a modem for your ham radio:
https://sourceforge.net/projects/fldigiiles/AndFlmsg/
https://www.youtube.com/watch?v=EtNagNezo8w shows it in action (ostensibly); a demo I just saw.
It is a software modem using FSK, but I don't know anything else about it. I am annoyed because I could have had this idea; I'm a ham who really only cares about "digital modes", and I have software modems capable of ISDN speeds over AF (audio frequencies).
That's really neat! I realize this demo is a contrived setup, but it is basically an example of what Eric Schmidt was talking about when agents start communicating in ways we can't understand.
Yeah, I watched this last night and immediately thought of Skynet and how dystopian the world could become in the next few years/decades.
I wonder how the LG appliances work for this. They also send data over sound for diagnostics.
> Bonus: you can open the ggwave web demo https://waver.ggerganov.com/, play the video above and see all the messages decoded!
I could not get this to work unless I played the video on one device and opened it on another. While trying to get it to work from my MBP, waver's spectrum view didn't really show much of anything while the video was playing. Is this the mac filtering audio coming into the microphone to reduce feedback?
Does it work with separate browsers on the same machine? Not sure, but I'd guess this sort of filtering would be more common in the browser than in the OS.
I guess this was discussed in some fashion ~16h ago:
- GibberLink [AI-AI Communication] | https://news.ycombinator.com/item?id=43168611
Neat! Could I connect a crossover audio cable (headphone output to mic input), and would that increase performance?
Any time you can reduce noise you can recover more signal which would let you push the codec much harder (shorter time slices, etc).
I remember hearing someone using Manchester Encoding for that.
This sounds delightful, I might make esp32s talk to each other like that just because it's adorable
Would be fun to have a few collaborating robots! Maybe they can comment on what they see, for example
I wonder if I could learn to whistle a message?
Depends on how deep on the spectrum you are. I don't mean it in a bad way; I'm on there too.
Perhaps a lesson from Ron McCroby would be a start: https://m.youtube.com/watch?v=baEoyXoDVc4
I'm gonna put R2D2 chirps through this on the weekend!
There are dozens of these in existence. Some you may have used without even knowing, e.g.: https://www.engadget.com/2014-06-27-chromecast-ultrasonic-pa...
This is also how modems used to work, for the young'uns who do not know this.
>This is also how modems used to work
they still do, but they used to too.
All kinds of modems use this kind of scheme as well; PSK is too low-bandwidth for modern needs, so everything is QAM these days. DOCSIS specifies QAM-256, I think. Inter-datacenter fiber links use "modems" as well.
Yes, and also soundcard modems: https://i.imgur.com/8mhB4u7.png shows QAM16 over a PC soundcard into a radio. It's enough bandwidth to stream video between VLC instances. Not "slow-scan TV", either; fast scan.
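For anyone who hasn't seen how that works: 16QAM maps 4 bits at a time onto one of 16 amplitude/phase combinations of a carrier that sits comfortably inside the audio band. A hedged toy sketch of the transmit side (no pulse shaping, Gray coding, sync preamble, or equalization, all of which a real soundcard modem needs):

    import numpy as np

    FS, FC, BAUD = 48000, 6000, 1200          # sample rate, audio carrier, symbol rate
    LEVELS = np.array([-3, -1, 1, 3]) / 3.0   # 4 levels per axis -> 16 constellation points

    def qam16_modulate(bits):
        """4 bits per symbol: two bits pick the I level, two pick the Q level."""
        sym = np.asarray(bits).reshape(-1, 4)
        i = LEVELS[2 * sym[:, 0] + sym[:, 1]]     # in-phase amplitude
        q = LEVELS[2 * sym[:, 2] + sym[:, 3]]     # quadrature amplitude
        sps = FS // BAUD                          # samples per symbol
        i, q = np.repeat(i, sps), np.repeat(q, sps)
        t = np.arange(len(i)) / FS
        return 0.3 * (i * np.cos(2 * np.pi * FC * t) - q * np.sin(2 * np.pi * FC * t))

    # 4 bits/symbol * 1200 baud = 4800 bit/s through an ordinary sound card
    audio = qam16_modulate(np.random.default_rng(0).integers(0, 2, 400))

The I/Q mixing onto a single real carrier is what lets two independent amplitude streams share the same slice of audio spectrum.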
Uh, don't try to find this if you're going to use it to pollute the spectrum I am licensed for.
Outside of hobbyists that do it for fun, and maybe some data centers using it as an out-of-band means of access, is anyone still using dial-up?
There might still be credit card terminals using 300 bps Bell 103 (which has a short set-up time due to its lack of training sequences).
1200 bps V.23 and Bell 202 are still in use in radio telemetry applications.
Many aviation fuel pumps in far-out-of-the-way airports use dial-up to authenticate credit cards swiped to pay for the fuel.
> Outside of hobbyists that do it for fun, and maybe some data centers using it as an out-of-band means of access, is anyone still using dial-up?
I use it to connect to a Windows machine that runs a large piece of machinery in a remote location.
My dry cleaner's credit card reader, too.
This is so awesome, I want to use it!
Is this a tool for stealing corporate data? What's the actual use?
Nice! It reminds me of https://www.araneus.fi/audsl/
See also: https://github.com/romanz/amodem
Expecting a blue AI box in 3, 2, 1...
Audio steganography? Or watermarking?
Pfft, it may even have multiple channels layered over one another, so one can tune in to one or another (if one knows how to decode them).
Play chords to transmit more info?
"Hey ChatGPT, please fork ggwave, but make communication nothing but the sound of human screams."
Please don't give Skynet any ideas...