I am using sox to convert mp3 files generated by Amazon Polly to sln. The output file always reports a sampling rate of 8,000 Hz, and I can’t increase it. Does it even matter? Would I get better sound quality at a higher sampling rate? I have tried using the mp3 file directly in Asterisk, and the sound seems smoother at 22,050 Hz, but the volume is much lower.
https://wiki.asterisk.org/wiki/display/AST/Asterisk+Audio+and+Video+Capabilities
Any comments or recommendations appreciated.
root@asterisk:~/sounds# soxi input.mp3
Input File : 'input.mp3'
Channels : 1
Sample Rate : 22050
Precision : 16-bit
Duration : 00:00:01.67 = 36846 samples ~ 125.327 CDDA sectors
File Size : 10.1k
Bit Rate : 48.2k
Sample Encoding: MPEG audio (layer I, II or III)
root@asterisk:~/sounds# sox input.mp3 -t raw -b 16 -r 32000 output.sln
root@asterisk:~/sounds# soxi output.sln
Input File : 'output.sln'
Channels : 1
Sample Rate : 8000
Precision : 16-bit
Sample Encoding: 16-bit Signed Integer PCM
Maybe it’s just an issue with the sox mp3 decoder? I don’t typically start with mp3 files; most of the time I’m dealing with .wav files that were spat out by avconv. But I’ve never had an issue getting sox to create sln files with a sampling rate higher (more fidelity) than 8000.
Yes, you’ll get better sound quality by not down-sampling to 8000Hz.
.sln is really an Asterisk-only file extension, and is always 8 kHz, 16-bit, mono, signed, raw PCM. It looks like sox has been updated to recognize that extension.
Asterisk also supports .sln16, which may or may not be recognized by sox. That is sampled at 16 kHz. If it is not recognized by sox, you can always specify all the parameters explicitly.
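A minimal sketch of specifying all the parameters explicitly. The function names sln_ext_for_rate and to_sln are made up for illustration, and the rate-to-extension table reflects my understanding of the names Asterisk accepts, so check it against your Asterisk version. Because -t raw is given, sox does not need to recognize the output extension at all.

```shell
#!/bin/sh
# Map an output sample rate (Hz) to the Asterisk signed-linear extension.
# Assumption: this is the set of .slnNN names your Asterisk build supports.
sln_ext_for_rate() {
    case "$1" in
        8000)   echo sln ;;
        12000)  echo sln12 ;;
        16000)  echo sln16 ;;
        24000)  echo sln24 ;;
        32000)  echo sln32 ;;
        44100)  echo sln44 ;;
        48000)  echo sln48 ;;
        96000)  echo sln96 ;;
        192000) echo sln192 ;;
        *)      echo "unsupported rate: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: convert any sox-readable file (mp3, ogg, wav) to
# headerless 16-bit signed little-endian mono PCM at the requested rate,
# named so that Asterisk infers the rate from the extension.
to_sln() {
    src=$1
    rate=$2
    ext=$(sln_ext_for_rate "$rate") || return 1
    out="${src%.*}.$ext"
    # -t raw forces the output type, so the extension is only a label.
    sox "$src" -t raw -e signed-integer -b 16 -c 1 -L -r "$rate" "$out"
}

# Usage (not run here): to_sln input.ogg 16000   # writes input.sln16
```

Picking a rate above that of the source (for example 32000 from a 22050 Hz mp3) only upsamples; it does not add detail.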
The PSTN operates at 8 kHz, and most VoIP codecs do as well, so increasing the sampling rate will not necessarily improve the audio, as it will have to be downsampled before it is used.
In sox land, you have to output to .sln and then rename, based on the output rate you chose, to the proper .slnXY format name for Asterisk.
It doesn’t matter whether the input file is mp3 or ogg. sox conversion to .sln always results in an 8,000 Hz output file. If anyone can suggest a way to convert either mp3 or ogg to real sln16, I’m all ears.
root@asterisk:~/sounds# soxi input.ogg
Input File : 'input.ogg'
Channels : 1
Sample Rate : 16000
Precision : 16-bit
Duration : 00:00:11.73 = 187670 samples ~ 879.703 CDDA sectors
File Size : 46.2k
Bit Rate : 31.5k
Sample Encoding: Vorbis
root@asterisk:~/sounds# sox input.ogg -t raw -b 16 -r 16000 output.sln
root@asterisk:~/sounds# soxi output.sln
Input File : 'output.sln'
Channels : 1
Sample Rate : 8000
Precision : 16-bit
Sample Encoding: 16-bit Signed Integer PCM
There is no metadata in .sln files, so soxi has no information, other than the filename, to determine the sample rate. It may well be sampled at 16kHz.
If not, why not output to .sln16 directly, or to .raw and rename?
Similarly, Asterisk only has the filename so will always assume 8kHz.
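One way to check what rate the data was really written at, since the file carries no header, is to compare the duration implied by its size with the duration of the source. This is a rough sketch; raw_duration_ms is a made-up helper name, and it assumes 16-bit mono samples (2 bytes per sample).

```shell
#!/bin/sh
# Duration in milliseconds of a headerless 16-bit mono file of $1 bytes
# when it is interpreted at $2 Hz (2 bytes per sample).
raw_duration_ms() {
    echo $(( $1 * 1000 / ($2 * 2) ))
}

# The ogg above has 187670 samples, so a 16 kHz conversion should be
# about 187670 * 2 = 375340 bytes.
raw_duration_ms 375340 16000   # prints 11729, matching the 11.73 s source
raw_duration_ms 375340 8000    # prints 23458, twice as long as the source

# Usage on a real file (not run here):
#   raw_duration_ms "$(wc -c < output.sln)" 16000
```

If the size only matches the source duration when read at 16000 Hz, the file holds 16 kHz audio regardless of what soxi prints, and it needs the .sln16 name for Asterisk to play it at the right speed.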
Yes. soxi lies. It doesn’t know. Take an Opus-capable phone and have Asterisk play back an .sln file that you downsampled, using sox, from 48 kHz to 8 kHz, and then have it play back an .sln48 file that you didn’t downsample. You’ll hear the difference.