Monday, October 30, 2017

Text-to-speech, multiple card

Overview: A single sound card setup isn't always easy, but this post has more than enough information for single-card text-to-speech. With multiple cards, ALSA can become complicated. My prior espeak post on multiple cards may help if the answers aren't all here, and there is also a great multiple-card site covering the /usr/share/alsa/alsa.conf file. The deepest problem with two cards and HDMI is that the cards can be re-ordered to set the default and reduce ALSA errors, but the re-ordering can then conflict with HDMI and cause HDMI errors. So which does the user want, HDMI errors or ALSA errors? I prefer ALSA errors, given how difficult HDMI is to re-configure. I mostly take the easy way out and simply comment out the lines inside /usr/share/alsa/alsa.conf.
I followed this hack, substituting espeak for festival since I typically use espeak. Espeak works fine, but several problems can appear once one gets to the point of xbindkeys. Not mentioned in these tutorials is that both ~/.xbindkeysrc and ~/.xbindkeysrc.scm may be required for consistent operation.
  1. Verify espeak, eg...
    $ espeak "this is a test"
    This site has loads of options for the command line.
  2. Install xsel. Xsel writes the currently selected text to stdout, which we can pipe to any subsequent process we want, in this case, espeak.
  3. Select some text and then:
    $ xsel | espeak -s 100
    ... and whatever other espeak modifiers you want.
  4. Make a script from the two actions.
    $ nano talk.sh
    #!/bin/bash
    # Speak the current X selection at 110 words per minute.
    xsel | espeak -s 110
  5. Make talk.sh executable.
    $ chmod +x talk.sh
  6. Test it. Select/highlight some webpage or document text and then, in a terminal,
    $ ./talk.sh
    This speaks the selected text as many times as it's executed, that is, it's not a "one off" of the selected text.
  7. Here is where it becomes difficult: binding the script to a hotkey, eg, with xbindkeys. We first need to detect the numbers associated with pressing keyboard keys.
    # pacman -S xbindkeys
    $ xbindkeys -k
    xbindkeys -k reports the numeric codes (modifier mask and keycode) needed inside the ~/.xbindkeysrc file. One difficulty was the super key, aka the "windows" key. Eventually, I gave up on this key and opted for the CTRL key in combination with the letter "k". I was then able to get numbers which worked within the ~/.xbindkeysrc configuration file (see below).
  8. Tie talk.sh to the key combo by creating an ~/.xbindkeysrc file.
    $ nano .xbindkeysrc
    "./talk.sh"
    m:0x4 + c:45
  9. Now start xbindkeys
    $ xbindkeys
  10. When I tried CTRL + k the selected text was spoken only once; the text had to be selected with the mouse each time I wanted it to speak. Somehow the text was being "lost" between usage.
  11. I troubleshot using
    $ xbindkeys -n -v
    which runs it in the foreground and provides info. Here, I received a message that there was no ~/.xbindkeysrc.scm. Accordingly, I next ran xbindkeys_config, the GUI front-end (obtained via yaourt xbindkeys_config-gtk2). Through the GUI, I reran the configuration, which tweaked the ~/.xbindkeysrc file. After that, I had consistent repetitive behavior from the hotkey combination.
  12. (Optional) I could have continued to call the script with the hotkey combination but, since the script was only one line, I didn't truly need a script in this case. I deleted the script. Within .xbindkeysrc, I replaced the script call with the script's contents, "xsel | espeak -s 110".
  13. Once consistent, I added xbindkeys to my window manager startup file.
    $ nano .icewm/startup
    xbindkeys &
  14. Exit and restart X (or run $ killall xbindkeys and then $ xbindkeys) to be certain xbindkeys initializes using the ~/.xbindkeysrc configuration.
  15. Testing any of the configurations above is always the same: a) select text with the mouse, b) hit your key combination (e.g., CTRL + k), and c) verify that it speaks the text every time the combination is pressed.
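
The script from steps 4 through 6 can be made a little more defensive. What follows is only a sketch, not the script used above: the function names normalize_text and speak_selection are made up for illustration, and the whitespace cleanup is an assumption about what helps with hard-wrapped text. It relies only on xsel -o (print the selection) and espeak -s (speed), and it stays silent when nothing is selected.

```shell
#!/bin/bash
# Sketch of a more defensive talk.sh. Function names are hypothetical.

# Collapse newlines, tabs and repeated spaces into single spaces so that
# hard-wrapped text is read as continuous sentences.
normalize_text() {
    tr -s '[:space:]' ' ' | sed -e 's/^ //' -e 's/ $//'
}

# Read the primary X selection and speak it; do nothing if it is empty.
# Optional first argument: speed in words per minute (default 110).
speak_selection() {
    local text
    text="$(xsel -o | normalize_text)"
    [ -n "$text" ] || return 0
    printf '%s\n' "$text" | espeak -s "${1:-110}"
}

# Speak only when asked to, e.g.  ./talk.sh --speak 100
if [ "${1:-}" = "--speak" ]; then
    speak_selection "${2:-110}"
fi
```

With this version, the command in ~/.xbindkeysrc would call the script with --speak (and optionally a speed) rather than calling it bare.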

other

If there are audible delays for a word or two and you get errors like these...
$ xsel | espeak -s 100
ALSA lib pcm_dsnoop.c:638:(snd_pcm_dsnoop_open) unable to open slave
ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.HDA-Intel.pcm.front.7:CARD=1'
ALSA lib conf.c:4554:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5033:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2501:(snd_pcm_open_noupdate) Unknown PCM front
ALSA lib pcm.c:2501:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2501:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
ALSA lib pcm.c:2501:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
...then your speech datastream is probably moving along while ALSA attempts, unsuccessfully, to connect to JACK and so on. I have never been able to stop these JACK calls. They seem to be embedded in the binary lib files; I've searched with grep throughout the tangle of ALSA files. There's also speculation that SPDIF sleep modes can cause this. In my case I think it's a multiple-card issue, since I never get the delay on single-soundcard systems. Secondly, HDMI validation with the display monitor also handshakes with the sound card, and must see the same device numbers in ALSA as those assigned by udev. Think of the HDMI process as one which cascades from the ALSA setup, but has a dotted line to the kernel through the ELD verification. ALSA can rule it: scroll down to the ELD verification process in this prior post.
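
Before changing anything for a multiple-card problem, it helps to confirm which index each card currently holds. A minimal sketch, assuming the usual layout of /proc/asound/cards (an index followed by a bracketed short ID on the first line of each entry); the function name list_cards is made up for illustration.

```shell
#!/bin/bash
# Print "index id" for each sound card, reading text in the format of
# /proc/asound/cards. list_cards is a hypothetical helper name.
list_cards() {
    awk '/^ *[0-9]+ \[/ {
        idx = $1
        id = $0
        sub(/^[^[]*\[/, "", id)   # drop everything up to and including "["
        sub(/ *\].*$/, "", id)     # drop the padding and everything from "]"
        print idx, id
    }' "${1:-/proc/asound/cards}"
}

# Only read the live file when it exists (it will not without ALSA loaded).
if [ -r /proc/asound/cards ]; then
    list_cards
fi
```

The ALSA messages themselves go to stderr, so appending 2>/dev/null to the espeak command hides them, but that only silences the text and should not be expected to remove the delay.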

more about audible delays
Whatever the delays from the speaker, the most important delays are those that occur when writing WAV files. We should be able to write...
$ espeak -s 100 "cut back" -w cutback.wav
...without delays. If we have ALSA speaker delays, however, we typically have them when writing to files also.

One workaround is to add an extra word. I find the word "delay" takes about the right amount of time to cover the SPDIF initialization, or perhaps the multiple-card JACK issues, so that...
$ espeak -s 100 "delay cut back" -w cutback.wav
...will create the file with only "cut back" spoken.
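
The padding-word trick can be wrapped in a small helper so every file gets the same treatment. A sketch only: wav_name and speak_to_wav are hypothetical names, "delay" is simply the padding word that worked above, and whether the padding is actually dropped from the file will depend on the hardware.

```shell
#!/bin/bash
# Hypothetical helpers for writing padded phrases to WAV files.

# Derive a file name from a phrase: lowercase, letters and digits only.
# "cut back" becomes "cutback.wav".
wav_name() {
    printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -cd '[:alnum:]'
    printf '.wav\n'
}

# Write "<pad> <phrase>" to a WAV file named after the phrase alone.
# Second argument overrides the padding word (default: delay).
speak_to_wav() {
    local phrase="$1" pad="${2:-delay}"
    espeak -s 100 "$pad $phrase" -w "$(wav_name "$phrase")"
}
```

Calling speak_to_wav "cut back" runs the same espeak command as above, with "delay cut back" as the text and cutback.wav as the output file.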
I renamed the unresponsive ALSA libs for PulseAudio (I don't ever use PulseAudio anyway)...
# cd /usr/lib/alsa-lib/
# mv libasound_module_ctl_pulse.so zz_libasound_module_ctl_pulse.so
# mv libasound_module_pcm_pulse.so zz_libasound_module_pcm_pulse.so
...this caused "not found" errors for them, but that is better than them not responding.
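
Since a rename like this may need to be undone later, it can be kept reversible. A minimal sketch, assuming the same two file names and directory as above; toggle_pulse_plugins is a made-up helper name, and it needs to run as root against the real directory.

```shell
#!/bin/bash
# Hypothetical helper: disable or restore the PulseAudio ALSA plugins by
# adding or removing a "zz_" prefix.
# Usage: toggle_pulse_plugins disable|restore [dir]
toggle_pulse_plugins() {
    local action="$1" dir="${2:-/usr/lib/alsa-lib}" name
    for name in libasound_module_ctl_pulse.so libasound_module_pcm_pulse.so; do
        case "$action" in
            disable) [ -e "$dir/$name" ] && mv "$dir/$name" "$dir/zz_$name" ;;
            restore) [ -e "$dir/zz_$name" ] && mv "$dir/zz_$name" "$dir/$name" ;;
            *) echo "usage: toggle_pulse_plugins disable|restore [dir]" >&2
               return 2 ;;
        esac
    done
    return 0
}
```

Files that are already in the requested state are skipped, so running disable twice does no harm.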

