Asterisk 13.18-cert2 (C++ server) - FRACK

Hello,
I have been working with Asterisk 13 for the last few months. I believed I had worked through all my doubts and problems, but right now I am facing something very difficult to identify.

History:
I am using Asterisk 13.18-cert2. I have developed a server that communicates with Asterisk using ARI. The first version seems to work fine and is already in production, but I have now refactored my server and the classes responsible for communicating with Asterisk, and I am getting FRACK errors. The Asterisk server also seems to be crashing (and the service restarts).

  • ERROR[11660] astobj2.c: FRACK!, Failed assertion bad magic number 0x0 for object 0x7fe66c003c68 (0)

I have already read a few related issues in this community to make sure this isn’t a duplicate question. As far as I understand, the problem is related to memory being accessed after it was freed. I also understand that I need to send you detailed backtrace logs and the configuration files.
I can send those as soon as you request them.

In my investigations I also saw a common request to upgrade the servers because maybe the problem was already fixed.

Questions:

  1. Do you think I should upgrade my Asterisk 13? To which version should I upgrade?

  2. My server sends a lot of async HTTP requests to Asterisk, and I believe this is related to the problems. Do you think Asterisk 13.18 might be crashing because multiple requests are being sent by different threads at the same time? For example, I sometimes send a request to mute a channel and a request to put it on hold at the same time. Or I send a create bridge request and add a channel to that bridge at the same time. Should Asterisk handle this easily? Or should I wait for a ChannelCreated response before adding a channel to that bridge, for example?

P.S. I mentioned that the first version of my server is already running fine in production and that this problem only happens in my new, refactored version. The difference between them is basically my HTTP client. In my initial version I also had threads firing different POSTs/DELETEs, but I was sure they would happen in the order I sent them:

(1) server->createBridge("10");
(2) server->addChannelToBridge("10", "1232342.123");

In my new implementation, my HTTP client uses the C++ Beast library to send the requests and receive the responses. It is a little more complex, but more flexible and with better performance. However, the requests sometimes do not go out in the order I issue them:

(2) server->createBridge("10");
(1) server->addChannelToBridge("10", "1232342.123");

In this case (and it’s a real case), I saw the add channel request being written to the socket before the create bridge request. I fixed that, but it showed me that I have a potential problem when sending a sequence of POSTs that needs to keep its order.
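
One way to keep dependent requests in order is to start each request only after the previous response has arrived. The following is a minimal sketch of that idea, not the code used in this server. It assumes every call happens on a single thread (for example one io_context thread or a strand), and the names `OrderedSender`, `Request` and `Done` are hypothetical, not part of Beast or ARI.

```cpp
#include <deque>
#include <functional>
#include <utility>

// Hypothetical helper: starts queued requests strictly one after another.
// Not thread-safe; all calls are assumed to come from the same thread.
class OrderedSender {
public:
    // Called by a request when its response (or error) has been received.
    using Done = std::function<void()>;
    // Starts one request and must eventually call the Done it was given.
    using Request = std::function<void(Done)>;

    void enqueue(Request request) {
        pending_.push_back(std::move(request));
        if (!in_flight_) {
            start_next();
        }
    }

private:
    void start_next() {
        if (pending_.empty()) {
            in_flight_ = false;
            return;
        }
        in_flight_ = true;
        Request request = std::move(pending_.front());
        pending_.pop_front();
        // The next request is started only when this one reports completion.
        request([this] { start_next(); });
    }

    std::deque<Request> pending_;
    bool in_flight_ = false;
};
```

With this shape, createBridge would be enqueued first and addChannelToBridge second, and the second would not be written to the socket until the first has completed. Unrelated requests could still use separate queues so they do not wait on each other. If requests complete synchronously the callbacks nest, so a real implementation would probably post the continuation to the event loop instead.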

My question is whether that behavior can crash Asterisk. Maybe not the createBridge and addChannelToBridge example, but perhaps a recording on a snoop channel, combined with a mute, with adding a new channel to the bridge to monitor the call, and with moving a call from one bridge to another while all of that is happening.

Thanks in advance.
Fabio

Today I restarted the Asterisk service and it seems to be working much better. I ran a lot of tests and I didn’t see any FRACK.
It appears that yesterday and two or three days ago Asterisk was in a faulty state, so it was throwing errors very easily.
I am not sure whether Asterisk was really crashing, but when things were not working I was kicked out of the CLI (so I assumed it was crashing and the service was being restarted).
Today was the first time I restarted the service manually, and things worked much better.

Still, I got one error that I hadn’t seen before (or probably hadn’t paid attention to).
Please check the error below:

– Executing [h@ac_agent:1] Verbose(“PJSIP/clickproxytrunk-00000027”, “1,Agent hanging up. Agent ID: 28”) in new stack
Agent hanging up. Agent ID: 28
– Executing [h@ac_agent:2] Hangup(“PJSIP/clickproxytrunk-00000027”, “”) in new stack
== Spawn extension (ac_agent, h, 2) exited non-zero on ‘PJSIP/clickproxytrunk-00000027’
[Mar 15 08:41:55] NOTICE[8870]: ari/ari_websockets.c:180 ast_ari_websocket_session_write: Problem occurred during websocket write to 192.168.127.22:37842, websocket closed
[Mar 15 08:41:55] NOTICE[8870]: ari/ari_websockets.c:180 ast_ari_websocket_session_write: Problem occurred during websocket write to 192.168.127.22:37842, websocket closed

There was an error while the Asterisk websocket tried to write to my server, and it closed the connection. After that I didn’t receive any events from Asterisk, and I had to restart my server to create a new connection.
Maybe this is related to the initial problem with the FRACK.
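
Restarting the whole server to get a new websocket could be avoided by reconnecting when the close is detected. The following is only a sketch of a retry policy with exponential backoff. The names `reconnect_delay` and `reconnect_with_backoff`, the 500 ms base and the 30 s cap are all assumptions chosen for illustration, and the actual connect and sleep operations are passed in by the caller.

```cpp
#include <algorithm>
#include <chrono>

// Hypothetical policy: delay before retry number `attempt` (0-based),
// doubling from `base` and never exceeding `cap`.
std::chrono::milliseconds reconnect_delay(
    unsigned attempt,
    std::chrono::milliseconds base = std::chrono::milliseconds(500),
    std::chrono::milliseconds cap = std::chrono::milliseconds(30000)) {
    std::chrono::milliseconds delay = base;
    for (unsigned i = 0; i < attempt && delay < cap; ++i) {
        delay *= 2;
    }
    return std::min(delay, cap);
}

// Tries to reconnect up to `max_attempts` times. `try_connect` returns true
// on success; `sleep_for` receives the delay to wait before the next attempt.
template <typename Connect, typename Sleep>
bool reconnect_with_backoff(Connect try_connect, Sleep sleep_for,
                            unsigned max_attempts) {
    for (unsigned attempt = 0; attempt < max_attempts; ++attempt) {
        if (try_connect()) {
            return true;
        }
        sleep_for(reconnect_delay(attempt));
    }
    return false;
}
```

In a blocking client, `sleep_for` could simply wrap `std::this_thread::sleep_for`. In an asynchronous client it would more likely arm a timer. Events that Asterisk emitted while the websocket was down are not replayed by this sketch.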

In case someone wants to help, here is a backtrace:

Activating Stasis app ‘agent-request-fabio’
[Mar 16 09:07:45] ERROR[1090]: astobj2.c:131 INTERNAL_OBJ: FRACK!, Failed assertion bad magic number 0x35 for object 0x7fd094001018 (0)
Got 17 backtrace records
#0: [0x45b5c4] /usr/sbin/asterisk(__ao2_lock+0x154) [0x45b5c4]
#1: [0x7fd0a462c487] /usr/lib64/asterisk/modules/res_stasis.so(+0x10487) [0x7fd0a462c487]
#2: [0x7fd0a4629c07] /usr/lib64/asterisk/modules/res_stasis.so(+0xdc07) [0x7fd0a4629c07]
#3: [0x7fd0a4625a90] /usr/lib64/asterisk/modules/res_stasis.so(stasis_app_unsubscribe+0x210) [0x7fd0a4625a90]
#4: [0x7fd0095baf43] /usr/lib64/asterisk/modules/res_ari_applications.so(+0x1f43) [0x7fd0095baf43]
#5: [0x7fd0095bab59] /usr/lib64/asterisk/modules/res_ari_applications.so(+0x1b59) [0x7fd0095bab59]
#6: [0x7fd0bde01c3b] /usr/lib64/asterisk/modules/res_ari.so(ast_ari_invoke+0x37b) [0x7fd0bde01c3b]
#7: [0x7fd0bde02e2e] /usr/lib64/asterisk/modules/res_ari.so(+0x6e2e) [0x7fd0bde02e2e]
#8: [0x5244e0] /usr/sbin/asterisk() [0x5244e0]
#9: [0x524a69] /usr/sbin/asterisk() [0x524a69]
#10: [0x5d9afd] /usr/sbin/asterisk() [0x5d9afd]
#11: [0x5e84ca] /usr/sbin/asterisk() [0x5e84ca]

More logs…

I fixed an issue where the number of open file descriptors kept increasing on my socket connections. It is now stable, and all opened connections are being closed properly. I have no memory leaks in my program.

Do you experience the same problem under the latest Asterisk 13 as well? If you do, then unless you have a support agreement, a fix would not be available in 13.18-cert.

Hi Joshua,
I haven’t tried a different version yet.
I have a few updates on the problem.

It seems to me all the FRACKs I get are related to
stasis_app_subscribe and stasis_app_unsubscribe

Whenever my server runs, I send:
DELETE /ari/applications/agent-request-fabio/subscription?eventSource=endpoint:PJSIP
POST /ari/applications/agent-request-fabio/subscription?eventSource=endpoint:PJSIP

Right now I have 4 Asterisk servers running, so when my server starts, I connect to all 4 and send each of them those 2 HTTP messages.
If only my server (agent-request-fabio) is running, everything is fine, but if another developer runs his server (agent-request-joshua), we start facing FRACK issues. The problem only happens in the newest version, which I developed using the subscriptions. In my old version I didn’t use subscriptions, and we could all run our servers without problems.

For now, we will try to use one Asterisk server per developer, and I will look for an alternative. Maybe I will only send a subscription if the application is not registered there yet, to avoid subscribing an application that is already subscribed.

Have you seen problems related to subscriptions in the past? What would you recommend I do?
A few minutes ago I changed the code to subscribe only when my server starts, and to unsubscribe when my server is stopping (to avoid doing the unsubscribe and subscribe at the same time), but that didn’t work.
My next step will be to check whether the application is already subscribed. If so, I won’t try to subscribe again.
I will also try a variation of eventSource=endpoint:PJSIP. Maybe I can restrict the subscription to “agent-request-fabio” only (I don’t want to receive events coming from the “agent-request-joshua” app).
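
The "subscribe only if not already subscribed" step could be sketched on the client side as follows. This is an illustration under stated assumptions, not a known fix for the FRACK: `subscription_target` and `SubscriptionTracker` are hypothetical names, the path layout is copied from the two requests shown above, and the tracker only remembers what this process has sent, not what Asterisk currently holds.

```cpp
#include <set>
#include <string>

// Builds the request target used for both POST (subscribe) and DELETE
// (unsubscribe). No URL encoding is applied; the inputs are assumed to be
// already safe to place in a query string.
std::string subscription_target(const std::string& app,
                                const std::string& event_source) {
    return "/ari/applications/" + app +
           "/subscription?eventSource=" + event_source;
}

// Hypothetical client-side record of the event sources this process has
// subscribed to, used to skip redundant requests.
class SubscriptionTracker {
public:
    // Returns true if a POST should be sent (source was not tracked yet).
    bool should_subscribe(const std::string& event_source) {
        return active_.insert(event_source).second;
    }

    // Returns true if a DELETE should be sent (source was tracked).
    bool should_unsubscribe(const std::string& event_source) {
        return active_.erase(event_source) > 0;
    }

private:
    std::set<std::string> active_;
};
```

A caller would send the POST to `subscription_target("agent-request-fabio", "endpoint:PJSIP")` only when `should_subscribe("endpoint:PJSIP")` returns true. If the request fails, the entry should be removed again so a later retry is not skipped.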

I don’t recall anything in regards to subscriptions.