I’ve spent many hours in reading the source code and the logs (see att.).
Can’t figure out what exactly causing the write factory to be exhausted, or its pointer is invalid?
It looks to me there are no packets in the buffer while audiohook reads frames from it.
The reading call stack:
apps/app_mixmonitor.c:mixmonitor_thread()
main/audiohook.c:ast_audiohook_read_frame_all()
main/audiohook.c:audiohook_read_frame_helper()
main/audiohook.c:audiohook_read_frame_both()
And here the line
usable_write = (ast_slinfactory_available(&audiohook->write_factory) >= samples ? 1 : 0);
sets usable_write to 0, and then the debug message is being generated:
ast_debug(3, "Failed to get %d samples from write factory %p\n", (int)samples, &audiohook->write_factory);
And here is the call stack for putting frames into the write buffer:
main/channel.c:ast_write_stream()
main/audiohook.c:ast_audiohook_write_list()
main/audiohook.c:audio_audiohook_write_list()
main/audiohook.c:ast_audiohook_write_frame()
main/slinfactory.c:ast_slinfactory_feed()
include/asterisk/linkedlists.h:AST_LIST_INSERT_TAIL()
Unfortunately this bug is being reproduced only on production system under the load, so it is not possible to debug with gdb and catch the real stack traces and variables’ values.
The calling context in extensions.conf is using MixMonitor() and Dial() as follows:
[DLPN_Internet_prov1_company2] ;company2 OUT from prefix *2
exten => _X.,1,NoOp(${EXTEN:-10},${CALLERID(num)},${CALLERID(name)},${CHANNEL},${SIPCALLID} ----- company2 OUT CALL XXXX to ${EXTEN})
same => n,Gosub(sub-recordcall-ecom,s,1(${UNIQUEID},out))
same => n,Set(CDR(userfield)=company2 OUT)
same => n,Set(ORIGCALLERID=${CALLERID(num)})
same => n,Set(QUEUENUMBER=company2)
same => n,Set(EXTENNUMBER=${EXTEN:-10})
same => n,Set(CALLERID(num)=4051234567) ; MAIN FOR 0800XXXXXX
same => n,Dial(SIP/prov1_Trunk/${EXTEN:-10},,Tt)
same => n,StopMixMonitor()
same => n,Hangup
exten => h,1,StopMixMonitor()
exten => h,n,AGI(pyagi_is_out.py)
[sub-recordcall]
;
; Basic call recording subroutine
;
; ${ARG1} - UNIQUEID
; ${ARG2} - path
; ${ARG3} - Call direction: [in|out]
;
exten => s,1,Set(CALLFILENAME=${ARG1})
same => n,Set(CALLFILEPATH=${ARG2})
same => n,Set(FULLPATH=${RECORDS_DIR}/${CALLFILEPATH})
same => n,Set(CALLFILENAME_MP3=${CALLFILENAME}.mp3)
same => n,Set(CALLFILENAME_WAV=${CALLFILENAME}.wav)
same => n,Set(WAV_R=${IF($["${ARG3}" = "in"]?R:L)})
same => n,Set(WAV_L=${IF($["${ARG3}" = "in"]?L:R)})
same => n,NoOp(${WAV_R} -> ${WAV_L})
same => n,Set(CDR(filename)=${CALLFILEPATH}/${CALLFILENAME_MP3})
same => n,Set(MONOPT=/usr/bin/nice -n 19 /usr/bin/ffmpeg -y -i ${FULLPATH}/${CALLFILENAME}.R.wav -i ${FULLPATH}/${CALLFILENAME}.L.wav -filter_complex "[0:a][1:a]join=inputs=2:channel_layout=stereo[a]" -map "[a]" -codec:a libmp3lame -qscale:a 5 ${FULLPATH}/${CALLFILENAME_MP3} && rm -f ${FU
LLPATH}/${CALLFILENAME}.[LR].wav)
same => n,MixMonitor(,br(${FULLPATH}/${CALLFILENAME}.${WAV_R}.wav)t(${FULLPATH}/${CALLFILENAME}.${WAV_L}.wav),${MONOPT})
same => n,Return()
I would appreciate if someone will drop a hint how to find a root cause of this problem and fix it.
PS: Configs and logs are obfuscated due to data privacy
log_bad.txt (2.6 MB)
log_good.txt (3.0 MB)