franga2000 3 hours ago

Love ggwave! I used it on a short film set a few years ago to automatically embed slate information into each take and it worked insanely well.

If anyone wants details: I had a smartphone taped to the back of the slate with a UI to enter shot/scene/take and when I clicked the button it would transmit that information along with a timestamp as sound. This sound was loud enough to be picked up by all microphones on set, including scratch audio on the cameras, phones filming BTS, etc.

In post-production, I ran a script to extract this from all the ingested files and generate a spreadsheet. I then had a script to put the files into folders and a Premiere Pro script to put all the files into a main and a BTS timeline by timestamp.

Yes, timecode exists and some implementations also let you add metadata, but we had a wide mix of mostly consumer-grade gear so that simply wasn't an option.

I posted a short demo video on Reddit at the time, but it got basically no traction: https://www.reddit.com/r/Filmmakers/comments/nsv3eo/i_made_a...
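
A minimal sketch of the post-production step described above: one decoded payload per ingested file, turned into a spreadsheet ordered by timestamp. The payload layout (`scene|shot|take|timestamp`), the `Slate` class, and all function names are assumptions for illustration, not the format the original scripts used, and the audio decoding itself is left out.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class Slate:
    scene: str
    shot: str
    take: int
    timestamp: float  # seconds, taken from the transmitting phone's clock


def pack(slate):
    """Serialize slate fields into a short text payload (hypothetical format)."""
    return f"{slate.scene}|{slate.shot}|{slate.take}|{slate.timestamp:.3f}"


def unpack(payload):
    """Parse a payload produced by pack() back into a Slate."""
    scene, shot, take, ts = payload.split("|")
    return Slate(scene, shot, int(take), float(ts))


def to_csv(decoded):
    """Build CSV text from a dict of file name -> decoded payload, oldest first."""
    rows = sorted(
        ((unpack(payload), name) for name, payload in decoded.items()),
        key=lambda row: row[0].timestamp,
    )
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["file", "scene", "shot", "take", "timestamp"])
    for slate, name in rows:
        writer.writerow(
            [name, slate.scene, slate.shot, slate.take, f"{slate.timestamp:.3f}"]
        )
    return out.getvalue()
```

Because every microphone hears the same payload, files from different devices that share a timestamp belong to the same take, which is what makes the later sorting into folders and timelines possible.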

  • bfors 2 hours ago

    Very cool solution!

bjpirt 11 hours ago

One of the nicest data-through-sound implementations I came across was in a kid's toy (often the best source of innovation).

It was a "Bob the Builder" play set and when you wheeled around a digger, etc., the main base would play a matching sound. I immediately started investigating and was impressed to see no batteries in the movable vehicles. I realised that each vehicle made a clicking sound as you moved it and the ID was encoded into this, which the base station picked up. Pretty impressive to do this regardless of how fast the vehicle was moved by the child.

  • stavros 5 hours ago

    Was it based on the frequency of the click?

    • sejje 5 hours ago

      >Pretty impressive to do this regardless of how fast the vehicle was moved by the child.

      Probably not, eh?

      • stavros 5 hours ago

        Probably yes, because the frequency of a note doesn't change based on how quickly the next note is played after it.

        • sejje 3 hours ago

          Guess I misunderstood. The first time you said "frequency of the click" -- I would personally respond with clicks per second.

          "Frequency of the note" in your next comment clears it up. It probably was that, you're right.
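
The distinction in this subthread (pitch of each click versus clicks per second) can be checked numerically. The sketch below is only an illustration: nothing in the thread confirms how the toy actually works, and the sample rate, candidate pitches, and function names are all assumed. It builds trains of short tone bursts at different repetition rates and identifies the pitch by correlating against each candidate frequency.

```python
import math

SAMPLE_RATE = 8000  # Hz, assumed for this sketch


def click_train(tone_hz, clicks_per_second, duration=0.5, click_len=0.01):
    """Short bursts of a tone repeated at a given rate, silence in between."""
    n = int(duration * SAMPLE_RATE)
    period = int(SAMPLE_RATE / clicks_per_second)
    burst = int(click_len * SAMPLE_RATE)
    return [
        math.sin(2 * math.pi * tone_hz * i / SAMPLE_RATE) if i % period < burst else 0.0
        for i in range(n)
    ]


def tone_power(samples, freq):
    """Energy at one frequency, computed as a single-bin DFT."""
    w = 2 * math.pi * freq / SAMPLE_RATE
    re = sum(x * math.cos(w * i) for i, x in enumerate(samples))
    im = sum(x * math.sin(w * i) for i, x in enumerate(samples))
    return re * re + im * im


def identify(samples, candidates=(1000, 1500, 2000)):
    """Return the candidate pitch (Hz) carrying the most energy."""
    return max(candidates, key=lambda f: tone_power(samples, f))
```

Slow and fast click trains with the same pitch yield the same answer from `identify`, which is the sense in which a pitch-based ID would not depend on how quickly the vehicle is pushed.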

nomel 2 days ago

The acoustic modem is back in style [1]! And, of course, same frequencies (DTMF) [2], too!

DTMF has a special place in the phone signal chain (signal at these frequencies must be preserved, end to end, for dialing and menu selection), but I wonder if there's something more efficient, using the "full" voice spectrum, with the various vocoders [3] in mind? Although, it would be much creepier than hearing some tones.

[1] Touch tone based data communication, 1979: https://www.tinaja.com/ebooks/tvtcb.pdf

[2] touch tone frequency mapping: https://en.wikipedia.org/wiki/DTMF

[3] optimized encoders/decoders for human speech: https://vocal.com/voip/voip-vocoders/

  • Gracana 3 hours ago

    This isn't DTMF. It's a form of MFSK like DTMF, but it operates on different frequencies and uses six tones at once vs DTMF's two.
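
For comparison, here is a sketch of plain DTMF, where each key is the sum of exactly two tones, one from a row group and one from a column group. The frequencies are the standard DTMF ones; the sample rate, tone length, and function names are assumptions. This is not ggwave's scheme, which, per the comment above, uses different frequencies and six simultaneous tones.

```python
import math

SAMPLE_RATE = 8000  # Hz, assumed
ROWS = (697, 770, 852, 941)      # low-group frequencies in Hz
COLS = (1209, 1336, 1477, 1633)  # high-group frequencies in Hz
KEYS = ("123A", "456B", "789C", "*0#D")


def dtmf_tone(key, duration=0.05):
    """Samples for one key: the sum of its row tone and its column tone."""
    for r, row in enumerate(KEYS):
        c = row.find(key) if len(key) == 1 else -1
        if c >= 0:
            lo, hi = ROWS[r], COLS[c]
            break
    else:
        raise ValueError(f"not a DTMF key: {key!r}")
    n = int(duration * SAMPLE_RATE)
    return [
        0.5 * math.sin(2 * math.pi * lo * i / SAMPLE_RATE)
        + 0.5 * math.sin(2 * math.pi * hi * i / SAMPLE_RATE)
        for i in range(n)
    ]


def goertzel_power(samples, freq):
    """Signal power at a single frequency (Goertzel algorithm)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / SAMPLE_RATE)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2


def dtmf_decode(samples):
    """Pick the strongest row tone and the strongest column tone."""
    r = max(range(4), key=lambda k: goertzel_power(samples, ROWS[k]))
    c = max(range(4), key=lambda k: goertzel_power(samples, COLS[k]))
    return KEYS[r][c]
```

Two tones chosen from two groups of four give 16 symbols, or 4 bits per symbol period. Sending more tones at once, as described above for ggwave, carries more bits in the same time slot.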

  • pjc50 10 hours ago

    > it would be much creepier than hearing some tones.

    Hatsune Miku at the speed of a horserace commentator.

    (the "vocaloids" are DAW plugins made from chopped up recorded phonemes; Hatsune Miku is voiced by Saki Fujita. Still sounds very inhuman)

  • bigiain a day ago

    I'm wondering if shifting frequency chirps like LoRa uses would work in audio frequencies? You might be able to get the same sort of ability to grab usable signal at many dB below the noise, and be able to send data over normal talking/music audio without it being obvious you're doing so. (I wanted to say "undetectably", but it'd end up showing up fairly obviously to anyone looking for it. Or to Aphex Twin if he saw it in his Windowlicker software...)

    • nomel a day ago

      The issue is the (many) vocoders along the chain remove anything that doesn't match the vocal patterns of a human. When you say hello, it's encoded phonetically to a very low bitrate. Noise, or anything outside what a human vocal cord can do, is aggressively filtered or encoded as vocal sounding things. Except for DTMF, which must be preserved for backwards compatibility. That's why I say it would be creepy to do something higher bitrate...your data stream would literally and necessarily be human vocal sounds!

    • _def 20 hours ago

      Data exfiltration via bird

  • genewitch 18 hours ago

    Yes. JT8 / FT8, wspr, and then the entirety of fldigi.

    To get started.

    If you need more speed you need to convince me you won't abuse my ham spectrum but winlink, pactor, and some very slick 16QAM modems exist. 300baud to 128kbit or so.

waldik13 an hour ago

If you're interested in using GGWave in Python, check out ggwave-python, a lightweight wrapper that makes working with data-over-sound easier. You can install it with pip install ggwave-python or pip install ggwave-python[audio], or find it on GitHub: https://github.com/Abzac/ggwave-python.

It provides a simple interface for encoding and decoding messages, with optional support for PyAudio and NumPy for handling waveforms and playback. Feedback and contributions are welcome.

vednig 2 hours ago

I remember discovering ggwave a few years ago, before the rebrand. It's still the only working (and fastest verifiable) library that can transmit data over sound.

I could not get to work on a project using this then, because of college. But now I am integrating this in my startup for frictionless user interaction. I want to thank the creators and contributors of GGWave for doing all the hard work for these years.

If I find something to improve I'd like to contribute to the codebase too.

blensor 10 hours ago

I love GGWave. We've been using it in our VR game to automatically sync ingame recordings with an external camera.

At the beginning of the recording it plays the code "xrvideo". In the second stage, when merging the video, it looks for that tag in both streams and matches them up.
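
One way to picture the matching step: locate the same marker in both recordings and take the difference of the two positions. The sketch below swaps in a brute-force cross-correlation on raw samples rather than decoding a ggwave tag, so it is a stand-in technique, not the pipeline described above. The function names are hypothetical.

```python
def find_marker(recording, marker):
    """Index where the marker best matches the recording (cross-correlation)."""
    best_index, best_score = 0, float("-inf")
    for i in range(len(recording) - len(marker) + 1):
        score = sum(recording[i + j] * marker[j] for j in range(len(marker)))
        if score > best_score:
            best_index, best_score = i, score
    return best_index


def sync_offset(rec_a, rec_b, marker):
    """Samples by which the marker appears later in rec_b than in rec_a."""
    return find_marker(rec_b, marker) - find_marker(rec_a, marker)
```

Shifting one stream by the returned offset lines the two recordings up. With a decoded tag, the same subtraction would be applied to the two detection times instead of to correlation peaks.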

vodou 21 hours ago

This is cool! Some of Teenage Engineering's Pocket Operators, at least the PO-32 [1], use a data-over-sound feature.

Does ggwave use a simple FSK-based modulation just because it "sounds good"? Would it be possible to use a higher order modulation, e.g., QPSK, in order to achieve higher speeds? Or would that result in too many uncorrectable errors?

[1] https://teenage.engineering/products/po-32

nickcw a day ago

It sounds quite nice.

It is also about the same bitrate as RTTY which was invented in 1922 and is still in use by radio amateurs round the world.

Here is what that sounds like

https://youtu.be/wzkAeopX7P0?si=0m0urX7sDp6Jojqe

Not as musical but quite similar

  • lxe a day ago

    The amateur radio community is chock full of innovation for low bandwidth weak signal decodable comm protocols.

    There's also V.xx modem standards that are kinda dependent on the characteristics of the phone lines, but might work for audio at a distance?

    • kurisufag 14 hours ago

      ham optimizes for the wrong thing, imo. look at ft8: perfect for making contacts at low power with stations far, far away, but really only tuned to the particular task of making contacts.

      you can package some text alongside, but fundamentally all amateur operators are looking for is a SYN / ACK with callsigns.

      • lhamil64 3 hours ago

        There's also JS8call which is a modified version of FT8 meant for actual communication. IIRC you can do some neat things with it, like relaying a message through another user if you don't have a direct path to the recipient.

  • tdeck an hour ago

    RTTY is the sound of "satellites" in a lot of media.

  • the-angry-dome 18 hours ago

    As one of the accursed hams, I wonder what ggwave's propagation profile would be compared to RTTY / CW (Morse code) etc. Would be interesting to try it out.

jancsika a day ago

There was a research paper on doing data-over-sound with sounds that were designed to be pleasing to humans.

The demos sounded like little R2D2 blips and sputters.

Perhaps a researcher for Microsoft or something.

Anyone know the paper I'm talking about? I can't find it.

  • regularfry 2 hours ago

    In the spirit of abusing an error correction mechanism for aesthetics (see: QR codes with pictures in them, javascript without semicolons) could you do that here? How much abuse can the generated signal take?

    Just listening to the samples here they're really not that far off. Could probably use a little softening at the edges on the higher tones but it's nowhere near as unpleasant as it could be.

genewitch 2 days ago

https://www.youtube.com/watch?v=EtNagNezo8w in action (ostensibly) - a demo i just saw.

it is a software modem using FSK, but i don't know anything else about it. I am annoyed because i could have had this idea; i'm a HAM who really only cares about "Digital Modes", and have software modems capable of isdn speeds over "AF"

  • knowaveragejoe a day ago

    That's really neat! I realize this demo is a contrived setup, but it is basically an example of what Eric Schmidt was talking about when agents start communicating in ways we can't understand.

  • whalesalad a day ago

    Yeah I watched this last night and immediately thought of skynet and how dystopian the world could become in the next few years/decades.

Kerbonut 6 hours ago

I wonder how the LG appliances work for this. They also send data over sound for diagnostics.

philsnow 20 hours ago

> Bonus: you can open the ggwave web demo https://waver.ggerganov.com/, play the video above and see all the messages decoded!

I could not get this to work unless I played the video on one device and opened it on another. While trying to get it to work from my MBP, waver's spectrum view didn't really show much of anything while the video was playing. Is this the mac filtering audio coming into the microphone to reduce feedback?

  • ssfrr 3 hours ago

    Does it work with separate browsers on the same machine? Not sure but I’d guess this sort of filtering would be more common on the browser than the OS

knowitnone 21 hours ago

Neat! Can I connect a crossover audio cable from headphone output to mic input, and would that increase performance?

  • mmastrac 21 hours ago

    Any time you can reduce noise you can recover more signal which would let you push the codec much harder (shorter time slices, etc).

  • megadata 11 hours ago

    I remember hearing someone using Manchester Encoding for that.
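
For readers who have not met it, Manchester encoding sends every bit as a pair of opposite half-bits, so there is always a transition mid-bit for the receiver to lock onto. A minimal sketch follows, assuming the IEEE 802.3 convention (0 is high then low, 1 is low then high); the function names are hypothetical and no claim is made about how it was used over an audio cable.

```python
def manchester_encode(bits):
    """Map each bit to two half-bit levels: 1 -> (0, 1), 0 -> (1, 0)."""
    out = []
    for b in bits:
        out.extend((0, 1) if b else (1, 0))
    return out


def manchester_decode(halves):
    """Recover bits from half-bit levels, rejecting pairs with no transition."""
    if len(halves) % 2:
        raise ValueError("expected an even number of half-bits")
    bits = []
    for first, second in zip(halves[0::2], halves[1::2]):
        if first == second:
            raise ValueError("missing mid-bit transition")
        bits.append(second)  # under this convention the second half equals the bit
    return bits
```

The cost is that the line toggles up to twice per data bit, so the scheme trades bandwidth for easy clock recovery and a signal with no DC component.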

mtaras a day ago

This sounds delightful, I might make esp32s talk to each other like that just because it's adorable

  • jononor 6 hours ago

    Would be fun to have a few collaborating robots! Maybe they can comment on what they see, for example

SergeAx 10 hours ago

I'm gonna put R2D2 chirps through this on the weekend!

dmitrygr 2 days ago

There are dozens of these in existence. Some you may have used without knowing even, eg: https://www.engadget.com/2014-06-27-chromecast-ultrasonic-pa...

This is also how modems used to work, for the young'uns who do not know this.

  • genewitch 2 days ago

    >This is also how modems used to work

    they still do, but they used to too.

    • philsnow 20 hours ago

      All kinds of modems use this kind of scheme as well. PSK is too low-bandwidth for modern needs, so everything is QAM these days. DOCSIS specifies I think QAM-256. Inter-datacenter fiber links use "modems" as well.

      • genewitch 16 hours ago

        yes and also soundcard modems: https://i.imgur.com/8mhB4u7.png QAM16 over a PC soundcard into a radio. It's enough bandwidth to stream video between VLC instances. not "slow scan TV", either, fast scan.

        Uh, don't try and find this if you're going to use it to pollute the spectrum i am licensed for.

    • codetrotter a day ago

      Outside of hobbyists that do it for fun, and maybe some data centers using it as an out-of-band means of access, is anyone still using dial-up?

      • flyinghamster a day ago

        There might still be credit card terminals using 300 bps Bell 103 (which has a short set-up time due to its lack of training sequences).

        1200 bps V.23 and Bell 202 are still in use in radio telemetry applications.

      • dmitrygr a day ago

        Many aviation fuel pumps in far-out-of-the-way airports use dial-up to authenticate credit cards swiped to pay for the fuel.

      • reaperducer a day ago

        > Outside of hobbyists that do it for fun, and maybe some data centers using it as an out-of-band means of access, is anyone still using dial-up?

        I use it to connect to a Windows machine that runs a large piece of machinery in a remote location.

        My dry cleaner's credit card reader, too.

sigmonsays 16 hours ago

this is so awesome, i want to use it!

artemonster 7 hours ago

Is this a tool for stealing corporate data? What's the actual use?

ConanRus a day ago

expecting a Blue AI box in 3,2,1

svilen_dobrev a day ago

audio- steganography? or watermarking?

pfft, it may even have multiple channels one over another, so one can tune to one or another (if one knows how to decode)..

  • potatoman22 14 hours ago

    Play chords to transmit more info?

andrewmcwatters a day ago

"Hey ChatGPT, please fork ggwave, but make communication nothing but the sound of human screams."

  • InfiniteLoup 9 hours ago

    Please don't give Skynet any ideas...