areoform 6 days ago

> The company said the users who were affected chose the option in Facebook’s Messenger app to have their voice chats transcribed. The contractors were checking whether Facebook’s artificial intelligence correctly interpreted the messages, which were anonymized.

Where exactly is this setting? I've looked through Facebook's settings and Messenger's settings, but this option is rarer than a cheap white truffle. Does anyone know?

minimaxir 7 days ago

Unlike previous articles about tech-companies-listening-to-user-audio, this is over voice transcription rather than smart speaker QA.

Facebook does have a smart speaker (Portal) with voice commands (https://portal.facebook.com/help/2149102838698668/) but that isn't mentioned in the article.

pmantas 6 days ago

Am I the only one who sees a problem with them actively working to convert a non-indexable data source into an indexable and searchable one?

kerng 7 days ago

Is it from WhatsApp audio conversations? Or just random recordings through the apps - which Zuckerberg denied before?

  • Calvin02 7 days ago

    RTFA - "The company said the users who were affected chose the option in Facebook’s Messenger app to have their voice chats transcribed. The contractors were checking whether Facebook’s artificial intelligence correctly interpreted the messages, which were anonymized."

    • dekhn 6 days ago

      This sounds totally reasonable to me? Low quality machine learning algorithm needs human labellers?

      • vokep 5 days ago

        But why sample on real-world data from non-employees?

        • dekhn 4 days ago

          Huh? Because that's the product you're trying to improve!

sgt101 7 days ago

Is it fair to wonder why they are using people when automagical AI transcription should do this for them, like the man from Google/DeepMind/Amazon/IBM/Microsoft said? Or is FAIR really not up to much?

  • ipsum2 7 days ago

    Not exactly sure what you're asking, but all tech companies hire people to transcribe audio precisely to gather data to train ML models to do transcription.

    • derefr 7 days ago

      Is there a reason that these ML models are being hoarded as "secret sauce" when, for these companies, all the rivals they're concerned about also have all the resources required to build one that's nearly as good? It feels strange that we've got six different tech giants that have all independently spent tons of capital building up the training data required to sell people smart speakers/mobile speech control/etc. with these ML models, without any of them entering into cross-licensing agreements.

      It seems like it'd make a lot more sense for Apple, Google, Amazon, Facebook, etc. to all pool their training data in an "industry working group" to build and license out one "best" model, the way that IWGs are formed to build and license out e.g. AV codecs.

      • Calvin02 7 days ago

        > "to all pool their training data"

        The press would skewer them alive and politicians would have a field day about tech companies violating privacy and sharing data.

      • Smithalicious 7 days ago

        It's bad enough that one BigCorp has my data. I'd rather not have them also give it out to every other BigCorp.

      • etaioinshrdlu 7 days ago

        ML is an extremely competitive field right now and everyone's trying to get an advantage over everyone else. Not exactly ripe for cooperation.

      • solarkraft 7 days ago

        It's the same reason car makers don't all use the same platform. Everyone is hoping to get a slight edge over the others to perform better in the market.

    • sgt101 6 days ago

      Hang on, everyone's been at this for years now - are we seriously saying that Facebook et al. don't have large training sets for speech transcription? Why are they still labelling this? Why do they need 100s of contractors?

      I can see that a couple of folks might be engaged in carefully reviewing low-confidence transcription events, but 100s?