How to make your own HQ sound library with Audacity - 1.0.1

After several tests, I found this way to make high-quality sounds (.sln) for Asterisk:

  • Recording is done with a $5 headset with a microphone, directly into Audacity.
  • The microphone is positioned very close to the lower lip (the voice has more presence this way) but mustn’t touch it!
  • I tried to keep this easy for people without audio skills; however, you still have to tell Audacity at least which device you use for sound I/O!
  • This works fine with Audacity V. 1.3.3-beta on Debian Sid. However, mine is in French, so option names may differ from my rough translation :wink:

Check Preferences > Sampling > 48000 Hz, 32-bit float,
Check Preferences > Conversion > HQ interpolation & Dither Wave

RECORDING:

  • First (and very important for good interpolation when downsampling), go to the window’s lower left corner and set Project Rate to 48000 (Hz),
  • Click on the arrow next to the microphone icon and start monitoring,
  • Speak normally and check that the meter does not go beyond ~80-90% of the scale (I usually raise the microphone level to 50%),
  • Get your script on paper (I know very few people who can speak clearly and without hesitation without one), start recording, read normally, stop recording,
  • Check that NO peak touches the edges (clipping); if one does, lower the microphone level and record again,
  • If needed, remove the silence before and after your speech (watch out! always leave at least 250 ms of silence, otherwise the assembled prompts will sound awful; after a sentence, leave 500 ms).
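
As a rough illustration of the clipping check and the silence padding above, here is a minimal Python sketch. It assumes the recording is already available as a list of floats between -1.0 and 1.0; the function names are hypothetical and have nothing to do with Audacity itself.

```python
def silence_samples(milliseconds, rate=48000):
    """Number of samples that cover the given duration at the given rate."""
    return rate * milliseconds // 1000


def is_clipping(samples, limit=1.0):
    """True if any sample reaches or exceeds full scale (assumed to be 1.0)."""
    return any(abs(s) >= limit for s in samples)


def peak_percent(samples):
    """Largest absolute sample value, as a percentage of full scale."""
    return 100.0 * max((abs(s) for s in samples), default=0.0)
```

At a 48000 Hz project rate, 250 ms of silence is 12000 samples and 500 ms is 24000 samples.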

EDITING:

  • First select everything: click once on the waveform, then CTRL-A,
  • Menu Effect > Normalize @ -3.0 dB,
  • Menu Effect > [31 - 45] > Declipper,
  • Now find a silence (the longest one), click and hold at its beginning, then drag to its end and release (silence selection):
    ** Menu Effect > Noise Killing (parameters: don’t care; this is probably called “Noise Removal” in the English version), click on the profile button,
    ** Re-select the whole sound,
    ** Menu Effect > Noise Killing (parameters: 48 dB, 100 Hz, 0.15 sec),
  • Menu Effect > [106 - 120] > Multiband EQ:
    • Here are the parameters (freely adapted from an acoustic measurement device, the MLSSA 10W; they should suit both male and female voices):
            Hz  dB (gain)
      **    50  -7.4
      **   100  -7.0
      **   156  -5.2
      **   220  -3.2
      **   311  -1.7
      **   440  -1.5
      **   622  -1.3
      **   880  -0.45
      **  1250  -0.4
      **  1750  +4.6
      **  2500  +4.7
      **  3500  +5.2
      **  5000  +6.4
      ** 10000  +8.0
      ** 20000  +8.0
  • Menu Effect > Normalize @ -0.0 dB,
  • Menu Effect > [31 - 45] > Declipper,
  • Menu Track > Resample @ 8000 (Hz),
  • Click the track name (“audio track”) at the top of the grey panel to the left of the waveform > Choose Sampling Rate > 16-bit PCM (despite the name in my translation, this sets the sample format, not the rate),
  • Go to the window’s lower left corner and set Project Rate to 8000 (Hz),
  • Menu File > Export (parameters: Other, RAW, Signed 16-bit PCM),
  • Add the “.sln” extension to the file name and click OK.
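
To make the normalize and export steps a little more concrete, here is a minimal Python sketch of what they amount to numerically. It is not how Audacity implements them: it assumes float samples between -1.0 and 1.0, the function names are hypothetical, and resampling and the other effects are left to Audacity.

```python
import struct


def normalize(samples, peak_db):
    """Scale float samples so the largest absolute value sits at peak_db dBFS."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0:
        return list(samples)  # pure silence: nothing to scale
    target = 10 ** (peak_db / 20)  # -3.0 dB is roughly 0.708 of full scale
    return [s * target / peak for s in samples]


def to_sln_bytes(samples):
    """Pack floats in [-1.0, 1.0] as signed 16-bit little-endian PCM."""
    ints = [max(-32768, min(32767, round(s * 32767))) for s in samples]
    return struct.pack("<%dh" % len(ints), *ints)
```

Writing the bytes returned by `to_sln_bytes` to a file gives the same layout as the RAW export: headerless, signed 16-bit, little-endian, one channel.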

That’s it, folks.

RE-OPEN A .sln FILE

Menu File > Import > RAW Data (parameters: Signed 16-bit PCM, Little-Endian, 1 Channel (Mono), 0 (byte), 100 (%), 8000 (Hz))
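
The same import parameters can be used outside Audacity. As a minimal sketch, assuming the file really is headerless signed 16-bit little-endian mono at 8000 Hz, the following Python wraps a .sln file in a WAV header so that any player can open it. The function name is hypothetical.

```python
import wave


def sln_to_wav(sln_path, wav_path, rate=8000):
    """Copy raw 16-bit mono PCM into a WAV file; return the sample count."""
    with open(sln_path, "rb") as f:
        pcm = f.read()
    with wave.open(wav_path, "wb") as w:
        w.setnchannels(1)     # 1 channel (mono)
        w.setsampwidth(2)     # 2 bytes per sample = 16-bit
        w.setframerate(rate)  # 8000 Hz by default
        w.writeframes(pcm)    # WAV PCM is little-endian, so no byte swapping
    return len(pcm) // 2
```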

I hope this helps you get good results (my two cents :smile:)

Remember: Speak nor-ma-lly (not like a robot), don’t rub your mike on your lips, triple-check your recording isn’t clipping, leave a tiny bit of silence before and after your words.

One last piece of advice: I found it much more natural to read whole sentences and cut the words out of them instead of recording word by word; you’ll have to speak a little slower so that the words do not run into each other (close phonetic liaisons).

BTW: if anybody knows how to batch this process, tell me!
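
One part that can be batched outside Audacity is the final export. The sketch below is only a partial answer: it does not automate any of the Audacity effects. It assumes the edited sounds were saved as mono, 16-bit, 8000 Hz WAV files in one directory, and it writes the matching .sln files. The function name and directory layout are hypothetical.

```python
import pathlib
import wave


def wav_dir_to_sln(src_dir, dst_dir):
    """Write one .sln file per suitable .wav file; return the names written."""
    src, dst = pathlib.Path(src_dir), pathlib.Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for wav_path in sorted(src.glob("*.wav")):
        with wave.open(str(wav_path), "rb") as w:
            fmt = (w.getnchannels(), w.getsampwidth(), w.getframerate())
            if fmt != (1, 2, 8000):
                continue  # skip anything that is not mono / 16-bit / 8000 Hz
            pcm = w.readframes(w.getnframes())
        out = dst / (wav_path.stem + ".sln")
        out.write_bytes(pcm)  # raw samples only, no header
        written.append(out.name)
    return written
```

Files in any other format are skipped rather than converted, so nothing is resampled behind your back.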