Hey, nice one! I'm trying to innovate in the STT and TTS space as well (completely different field), feel free to contact me at the email in my profile if you want to exchange some knowledge :). I hear some bugs in Portuguese, but I'm guessing the trained model has some issues. Congrats and good luck with your tool! PS: give users a preview of the speaker's voice so we can test it before converting audio; it can save you some resources.
Hello, we can meet on Twitter: grsl_en :) Thanks for your comment, it's still a v1 in beta but I will continue to improve it. And yes, as you can see in the UI there are preview buttons, but they are not functional yet :)
It does not handle apostrophes well. It says "we L L" instead of "we'll" or "Andrea S" instead of "Andrea's". It also has some problems with pacing around dashes, ellipses and quotes: it speeds up a lot and runs together the words before and after them. Overall I would not use it in this state to turn articles into podcasts or something like that.
What about something that can do the opposite? Like converting video and/or audio to articles?
Most of the content I consume fits best (for me) in article format, so I can read it at my own speed, but some really good information can only (annoyingly) be found in videos or podcasts.
I've been using Whisper from OpenAI to transcribe stuff this month, and it's incredibly accurate. Would be a good base for something like this.
This is a good idea, I just started the project :) So this is a feature that can be added in the future.
Here is the roadmap for the future:
- Audio sharing
- Convert Text To Audio
- Convert PDF To Audio
- Convert Photo To Audio
- Chrome extension - to convert while browsing
- Mobile App - to manage audios everywhere, simply
and adding the possibility to do the opposite is also a great idea!
Working on this very topic right now! But specific to Podcast audio content
https://readable.fm/
oh mate I so agree and as a deaf person this would be a godsend. way too much shit is in videos or podcasts. please just let me read...
I got super psyched to try this. Always wanted a good TTS extension or app.
However I cannot get it to work. I've logged in, input an article but nothing happens after I click "Convert article to audio" or preview.
Linux Mint/Firefox 107/Chrome
Edit:
I checked devtools and it shows a 500 error with the message "Something is broken. Please let us know what you did"
The link I was trying to convert was:
https://www.daemonology.net/blog/2020-09-20-On-the-use-of-a-...
Here is a bookmarklet that does TTS: https://locserendipity.com/Speaker.html
Also: https://www.locserendipity.com/TTS.html
Except that the quality of the audio rendering is... not great.
It defaults to whatever your browser defaults to. It is easy to change that setting in Chrome to a more natural voice: https://support.google.com/chromebook/answer/11221616?hl=en
I'm looking into how this can happen. Feel free to sign up to come back later and convert your article! I can't wait for you to use Article.Audio
Hey, your tool is not working great with German articles. It can't pronounce umlauts and also has trouble with some pretty standard, simple words.
I used this article as a test: https://www.saarbruecker-zeitung.de/saarland/landespolitik/s...
I will look into why characters with umlauts are not pronounced well. Do you have examples of other words? I will investigate :)
Thank you for your feedback, it really helps me.
On the same topic, in Latvian all the characters with diacritics are stripped out completely. For example, "iedzīšana" is pronounced as "iedzana", making the audio pretty funny, though hard to understand.
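For anyone curious how "iedzīšana" can turn into "iedzana": one way to reproduce that exact symptom is an ASCII-only cleanup step that silently drops everything it can't encode. This is only a sketch under that assumption; the function names are made up and nothing here is confirmed to be what the tool actually does.

```python
import unicodedata


def strip_non_ascii(text: str) -> str:
    """Hypothetical lossy step: drops every character that is not ASCII."""
    return text.encode("ascii", errors="ignore").decode("ascii")


def normalize_keep_diacritics(text: str) -> str:
    """Safer alternative: compose to NFC so letters with diacritics survive as single code points."""
    return unicodedata.normalize("NFC", text)


word = "iedz\u012b\u0161ana"  # "iedzīšana" written with explicit code points
print(strip_non_ascii(word))            # iedzana
print(normalize_keep_diacritics(word))  # iedzīšana
```

If something like the first function sits anywhere in the pipeline, replacing it with plain NFC normalization would keep the letters intact for the voice model.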
I'm working on solving these pronunciation problems for the special characters; if I solve one, everything else will follow. Don't hesitate to sign up to be kept informed!
Huh, wow, really cool! For the most part it's excellent (in English), however I notice that quotation marks (both single and double) are handled strangely. The leading quotation mark is pronounced as a short A, and the trailing as a long A. This can be rather confusing! But otherwise, I'm incredibly impressed by the results.
Thank you very much for your comment!! It's still a beta version and I have to improve some pronunciations! Especially for the special characters... don't hesitate to sign up to be kept informed :)
It's almost certainly because non-ASCII UTF-8 characters (like curly single and double quotes) aren't interpreted properly; everything seems to be decoded as ISO8859-15.
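A quick way to see what that kind of mix-up does, as a minimal sketch: encode text as UTF-8, then decode the bytes as ISO8859-15. The helper names are hypothetical, and whether the tool's pipeline really does this is an assumption based on the symptoms reported in this thread.

```python
def misdecode(text: str) -> str:
    """Simulate the suspected bug: UTF-8 bytes read back as ISO8859-15."""
    return text.encode("utf-8").decode("iso8859-15")


def repair(garbled: str) -> str:
    """Undo the mix-up by reversing the two steps."""
    return garbled.encode("iso8859-15").decode("utf-8")


print(misdecode("é"))             # Ã©
print(repr(misdecode("\u201c")))  # a left curly quote becomes "â" plus two control characters
print(repair(misdecode("é")))     # é
```

Note that a curly quote comes out starting with "â", which might be what gets read aloud as an A sound. The real fix would be to decode the input as UTF-8 in the first place rather than repairing it afterwards.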
I also noticed the same issue with quotation marks. But other than that this is a really nice application
Nice! The German version breaks umlauts though. Apparently the preprocessing converts e.g. "ä" to "ae", "ö" to "oe" and so on, and the text2voice model subsequently pronounces them as e.g. "a-e" instead of how "ä" actually sounds.
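To illustrate why that kind of transliteration is hard to paper over afterwards, here is a small sketch. The mapping table and function names are assumptions for illustration, not the tool's actual preprocessing.

```python
# Hypothetical transliteration table of the kind described above.
UMLAUT_TRANSLIT = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
}


def transliterate(text: str) -> str:
    """Replace each umlaut (and ß) with its ASCII spelling."""
    return "".join(UMLAUT_TRANSLIT.get(ch, ch) for ch in text)


def naive_reverse(text: str) -> str:
    """Try to put the umlauts back; this also rewrites words that never had one."""
    for umlaut, ascii_form in UMLAUT_TRANSLIT.items():
        text = text.replace(ascii_form, umlaut)
    return text


print(transliterate("Käse"))    # Kaese
print(naive_reverse("Kaese"))   # Käse
print(naive_reverse("Poesie"))  # Pösie
```

Since the mapping can't be reversed reliably ("Maße" and "Masse" both end up as "Masse"), the safer route is probably to skip the transliteration and pass the original characters through to the voice model.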
Indeed, we have a problem with special characters in German (and also Polish… :( ). I am investigating why they are not pronounced correctly.
I studied German at school but not enough.
Thank you very much for your comment, it helps me improve the tool :)
Could you make a paid TTS engine App on Android and iOS?
Yes, it's on the roadmap, and also a Chrome extension :) Would you be interested?
Yes, I currently use the Read Aloud Android app, which lets me use any installed TTS. The free Google network TTS voices are quite okay, and I know there are better premium voices, but unfortunately I haven't found any high-quality, human-like TTS in the Play Store yet.
Dumb question: how is “AI” used for text-to-speech?
Text-to-Speech (TTS) technology uses artificial intelligence (AI) to translate written text in a given language into sound, a voice or speech with a human accent. The AI had to be trained with many parameters, so the pronunciation improves from version to version :)
Tested Icelandic! Sounds really good except that it ignores all the special Icelandic characters such as ð, ó, á, ö and so on.
Thanks for your comment! I'm currently working on this to fix it as FAST as possible! Sorry about that...
I had the same issue with the Hungarian language.
I'm working on solving these pronunciation problems for the special characters; if I solve one, everything else will follow. Don't hesitate to sign up to be kept informed!
I will! Great stuff
Great work! I'm testing if I could use it in my project. It would be good to be able to just paste some text.
Thank you so much! What's your project?
Here is the roadmap for the future:
· Audio sharing
----> FOR YOU · Convert Text To Audio
· Convert PDF To Audio
· Convert Photo To Audio
· Chrome extension - to convert while browsing
· Mobile App - to manage audios everywhere, simply
I had this idea a few months ago, obviously I never got around to executing it. I’m glad someone else did
That famous moment when you have the idea of a new side project, you buy the domain name and then... you have a new idea (loop).
This time I developed it, I'm happy with this 1st version (which must be improved).
Anyway, thanks for your comment
And you, why didn't you develop it in the end?
I can’t remember the list of thousands of excuses I came up with :)
Haha. destroy this list and go for it :D
I would like to have a tool that does it the other way around. Audio to a somewhat cohesive article.
I can only get "200 internal server error" entries in the dev console :)
Wow... :( Is it still happening now?
It works on Chromium though. However it has trouble with UTF-8 obviously: it interprets "é" as "Ã©", i.e. ISO8859-15.
I see that for "French" it selects "Alain" as a voice in Chromium, but "Joe" in Firefox. However, in Firefox I must first change the language; it then selects another voice, and when I switch back to French it properly switches to Alain and then it works. It probably doesn't initialise the voice selector properly when loading the page in Firefox (if I don't change the language first, I can't select any voice, the selector isn't working).
Same problem with accents as Chromium though :)
It works amazingly fast; are you using GPUs and ONNX to reach such performance?
Thanks for the great feedback. I think we can still improve the result, it's only a v1 in beta. I don't use anything very advanced for rendering, just a rather simple stack and tools. :)
I believe it's using Microsoft TTS voices (at least for some of them)
That's right we use Azure TTS! :)
Nice, what’s the tech stack?
Simple: PHP (Symfony), JS (Vanilla), HTML, TailwindCSS :)
Not usable with Hungarian.
I'm looking into it, a Polish user is having the same problems. Don't hesitate to sign up so I can let you know when it's fixed :)
Thank you for your comment, it helps me! :)