ExternalMedia/AudioSocket devicestate topics accumulate and lead to "Excessive refcount 100000" on ao2 object

Hi all.

I am using Asterisk 20.1.0 in a production environment. I am running into an issue that appears to be related to Stasis devicestate topics and ao2 reference counting, and I was hoping you might have some insight or suggestions.

In my scenario, calls are initiated from an external application via ARI using the originate operation. After the call is created, it is passed into a Stasis application, where the business logic is implemented. This Stasis application uses AudioSocket as part of the media handling.

After each call, when I run stasis show topics, I see a new entry like:

devicestate:all/AudioSocket/MP2D66QC:62330-98f1f148-0714-4384-b1ec

These entries keep accumulating and never seem to disappear. Once the number of such topics reaches around 100,000, Asterisk starts logging messages of the form:

FRACK!, Failed assertion Excessive refcount 100000 reached on ao2 object

From my understanding of astobj2 and the Stasis code in main/stasis.c, this suggests that there is a reference counting leak related to these devicestate topics or their subscriptions. Over time, this looks like it could lead to instability or incorrect behavior of the system, especially under high load.

I have tested this behavior not only on Asterisk 20.1.0 but also after upgrading to Asterisk 22.5.0, and the problem still occurs in the same way (the devicestate topics continue to accumulate and eventually trigger the excessive refcount assertion).
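A rough way to quantify that growth between test calls is to capture the output of `asterisk -rx "stasis show topics"` before and after a batch of calls and count the AudioSocket entries. The helper below is only a sketch: `count_topics_with_prefix` is a hypothetical name, and it assumes each topic name starts its own line in the captured text (possibly after some indentation), which may not hold for every Asterisk version.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Count the lines in `text` that start with `prefix`, ignoring leading
 * spaces and tabs. `text` is assumed to be captured CLI output; nothing
 * about its column layout is relied upon beyond the topic name coming
 * first on the line.
 */
size_t count_topics_with_prefix(const char *text, const char *prefix)
{
    size_t count = 0;
    size_t plen;
    const char *line = text;

    assert(text != NULL && prefix != NULL);
    plen = strlen(prefix);

    while (*line != '\0') {
        const char *start = line;
        const char *eol = strchr(line, '\n');

        /* Skip indentation, but never step over the end of the line. */
        while (*start == ' ' || *start == '\t') {
            start++;
        }
        if (strncmp(start, prefix, plen) == 0) {
            count++;
        }
        if (eol == NULL) {
            break;              /* last line had no trailing newline */
        }
        line = eol + 1;
    }
    return count;
}
```

Calling it with the prefix `devicestate:all/AudioSocket/` on two captures and comparing the results shows whether the count rises by one per call.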

Right now, I am investigating how to access and inspect the underlying ao2 containers (for topics and subscriptions) from a custom module, so that I can see exactly what is being kept alive and why. The idea is to build a module that iterates over the relevant Stasis topics (those matching the devicestate:all/AudioSocket/... pattern), inspects their subscribers and reference counts, and helps identify what is leaking.

My questions are:

  • Have you seen this pattern before with devicestate topics (especially in combination with ARI‑initiated calls, Stasis applications, and AudioSocket) where stasis show topics grows without bound and eventually hits the Excessive refcount 100000 assertion?

  • Do you know of any existing fixes, configuration changes, or best practices that prevent these devicestate topics from accumulating (for example, known patches, options to avoid per‑call topic creation, or recommended unsubscribe/cleanup patterns)?

  • From your perspective, is the approach of writing a dedicated inspection/cleanup module to work with the Stasis topic/subscription containers reasonable, or would you recommend a different direction (e.g., changing how the ARI application uses originate and Stasis, or integrating with existing Stasis APIs in a specific way)?

If you have any pointers to relevant commits, issues, or documentation, or any high‑level guidance on how you would approach this, that would be extremely helpful.

Yes. It happens when ephemeral channels are created and device state information is created alongside them in the core, but nothing tells the core that the channel has gone away. It has happened in the past; I don’t recall any further specifics, as it was long ago. This is also not ARI or Stasis related.

This is jumping immediately to code or some kind of resolution without actually understanding things. Look at how the device state topic is created and managed. Understand that. Look at other usage and history.

Thank you so much for your reply! Could you suggest any research paths? I do not know where to go next.

I would start where I stated in my previous response, looking at what actually creates the topic, look at other users in Asterisk, look at past commits and see if this same issue was resolved for something else already, and gain an understanding of the expectations of the devicestate topic and usage.

If you’re asking me for specifics I don’t have anything further.

I’ll also add that if you’re trying to use AI to resolve or solve this issue, I wouldn’t expect much from it.

You’re right, AI can’t really help me with this. That’s why I came to the people. :)

I think this is the core issue:

The Problem:

  • AudioSocket channels create device state topics like devicestate:all/AudioSocket/MP2D66QC:62330-...

  • When channels hang up, these topics are never cleaned up

  • After ~100,000 calls, you hit Excessive refcount 100000 and Asterisk becomes unstable

Root Cause: The audiosocket_hangup() function in chan_audiosocket.c doesn’t notify Asterisk’s device state system when channels are destroyed. This leaves orphaned device state topic subscriptions that accumulate indefinitely.

The Fix: Add device state notification to the hangup handler:

static int audiosocket_hangup(struct ast_channel *ast)
{
    struct audiosocket_instance *instance;

    instance = ast_channel_tech_pvt(ast);
    
    if (instance != NULL) {
        /* Notify device state system that this channel is gone */
        ast_devstate_changed(AST_DEVICE_UNKNOWN, AST_DEVSTATE_NOT_CACHABLE,
                           "AudioSocket/%s-%s", instance->server, instance->id);
        
        if (instance->svc > 0) {
            close(instance->svc);
        }
    }

    ast_channel_tech_pvt_set(ast, NULL);
    ast_free(instance);

    return 0;
}

This single call triggers cleanup of the device state topic subscriptions, preventing the leak.

I don’t think that would resolve the issue or that its understanding is quite correct, but I could be wrong.

The code now looks like this:

/*! \brief Function called when we should hang the channel up */
static int audiosocket_hangup(struct ast_channel *ast)
{
    struct audiosocket_instance *instance;

    /* The channel should always be present from the API */
    instance = ast_channel_tech_pvt(ast);
    if (instance != NULL && instance->svc > 0) {
        close(instance->svc);
    }

    ast_channel_tech_pvt_set(ast, NULL);
    ast_free(instance);

    return 0;
}

Are you proposing to supplement it, or to replace it with your code?

Replace it with the suggested code and test it, if you can.

There is no server member in struct audiosocket_instance:

make
CC="cc" CXX="g++" LD="" AR="" RANLIB="" CFLAGS="" LDFLAGS="" make -C menuselect CONFIGURE_SILENT="--silent" makeopts
make[1]: Entering directory '/usr/src/asterisk-20.1.0/menuselect'
make[1]: 'makeopts' is up to date.
make[1]: Leaving directory '/usr/src/asterisk-20.1.0/menuselect'
   [CC] chan_audiosocket.c -> chan_audiosocket.o
chan_audiosocket.c: In function ‘audiosocket_hangup’:
chan_audiosocket.c:133:106: error: ‘struct audiosocket_instance’ has no member named ‘server’
  133 |         ast_devstate_changed(AST_DEVICE_UNKNOWN, AST_DEVSTATE_NOT_CACHABLE, "AudioSocket/%s-%s", instance->server, instance->id);
      |                                                                                                          ^~
make[1]: *** [/usr/src/asterisk-20.1.0/Makefile.rules:165: chan_audiosocket.o] Error 1
make: *** [Makefile:396: channels] Error 2

Maybe that’s why the build fails? I’m not good at C programming or the Asterisk source code.

It doesn’t exist in 20.1.0, but does in more recent versions.

Indeed, it worked fine on Asterisk 22.5.0, but it didn’t solve the problem. There was still a topic left after the call.

stasis statistics show topic devicestate:all/AudioSocket/arn-ext-dev-lg1:44701-3b94d19d-4c4b-41ce-9fec
Topic: devicestate:all/AudioSocket/arn-ext-dev-lg1:44701-3b94d19d-4c4b-41ce-9fec
Pointer Address: 0x14e85801ca20
Number of messages published that went to no subscriber: 0
Number of messages that went to at least one subscriber: 4
Lowest amount of time (in milliseconds) spent dispatching message: 0
Highest amount of time (in milliseconds) spent dispatching messages: 0
Number of subscribers: 5
Subscribers:
        app_queue.c:devicestate:all-3
        devicestate.c:devicestate:all-1
        pbx.c:devicestate:all-2
        res_stasis_device_state.c:devicestate:all-5
        stasis_cache.c:devicestate:all-0

And it doesn’t disappear.
Are there any other options?

The fact that this topic stays after the call:

devicestate:all/AudioSocket/arn-ext-dev-lg1:44701-3b94d19d-4c4b-41ce-9fec

does not automatically mean the fix failed.
Asterisk’s devicestate code keeps one topic per device string and basically never deletes it. For ephemeral device names (each call has a unique AudioSocket/...-UUID), this means one permanent topic per call, by design.
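To make the "one permanent topic per call" effect concrete, here is a toy model in plain C. It is not Asterisk code: every `toy_` name is hypothetical, and the assumption that each per-device topic holds one reference on the aggregate `devicestate:all` topic is mine, inferred from the refcount message earlier in this thread rather than confirmed by it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a per-device topic; not the real Stasis type. */
struct toy_topic {
    char *name;
    struct toy_topic *next;
};

/* Toy stand-in for the device state topic pool. */
struct toy_pool {
    struct toy_topic *topics;  /* one entry per distinct device string */
    size_t topic_count;
    long all_refcount;         /* references held on the aggregate topic */
};

void toy_pool_init(struct toy_pool *pool)
{
    assert(pool != NULL);
    pool->topics = NULL;
    pool->topic_count = 0;
    pool->all_refcount = 1;    /* the pool's own reference */
}

/*
 * Look up the topic for `device`, creating it on first use.
 * Returns 1 if a topic was created, 0 if it already existed,
 * and -1 if memory could not be allocated.
 * Nothing in this model ever removes a topic.
 */
int toy_pool_get_topic(struct toy_pool *pool, const char *device)
{
    struct toy_topic *t;
    size_t len;

    for (t = pool->topics; t != NULL; t = t->next) {
        if (strcmp(t->name, device) == 0) {
            return 0;
        }
    }
    t = malloc(sizeof(*t));
    if (t == NULL) {
        return -1;
    }
    len = strlen(device) + 1;
    t->name = malloc(len);
    if (t->name == NULL) {
        free(t);
        return -1;
    }
    memcpy(t->name, device, len);
    t->next = pool->topics;
    pool->topics = t;
    pool->topic_count++;
    pool->all_refcount++;      /* assumed: the new topic pins the aggregate */
    return 1;
}

/* Free everything; the toy equivalent of shutting the core down. */
void toy_pool_destroy(struct toy_pool *pool)
{
    while (pool->topics != NULL) {
        struct toy_topic *next = pool->topics->next;
        free(pool->topics->name);
        free(pool->topics);
        pool->topics = next;
    }
    pool->topic_count = 0;
    pool->all_refcount = 1;
}
```

A stable device name such as a fixed endpoint hits the lookup path again and again, so the pool stays small. A name containing a per-call UUID never repeats, so in this model both `topic_count` and `all_refcount` grow by one per call until the process exits.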

ast_devstate_changed() in audiosocket_hangup() fixes the subscription / refcount issue, but it doesn’t remove the topic object itself – the core doesn’t really GC those.

How to tell if you’re still leaking

After your fix, check:

  • Does stasis show subscriptions stay roughly constant as calls come and go?

  • Does the FRACK! Excessive refcount 100000 still show up under heavy load?

If subscriptions stay flat and the FRACK is gone, the leak is fixed, even though topics remain listed.

The forum is not an issue tracker, and no one has filed a GitHub issue.

@qa87 @locateparcels

Can you open an issue on GitHub?

I do not use ARI to test.

Also, please confirm that this issue occurs on the latest version.

I wasn’t sure if this problem had already been solved, so I came to the community with a question. I’m not sure if this is a problem at all. Since the community does not know the solution, I will submit an issue.

Here’s what I got as a result of my experiments, but it does not clear the topic.

static int audiosocket_hangup(struct ast_channel *ast)
{
    struct audiosocket_instance *instance;
    const char *chan_name;
    char *copy = NULL;
    char *dev = NULL;
    char *last_dash = NULL;
    instance = ast_channel_tech_pvt(ast);
    if (instance != NULL) {
        chan_name = ast_channel_name(ast);
        if (chan_name && *chan_name) {
            ast_log(LOG_DEBUG,"AudioSocket: hangup — requesting devstate change for '%s'\n",chan_name);
            copy = ast_strdup(chan_name);
            if (!copy) {
                ast_log(LOG_WARNING, "AudioSocket: strdup failed for chan_name\n");
                goto cleanup;
            }
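            /* Drop everything from the last '-' onward to derive the device name */
            last_dash = strrchr(copy, '-');
            if (last_dash) {
                dev = ast_strndup(copy, last_dash - copy);
            }
            if (!dev) {
                /* No '-' found (or allocation failed): fall back to the full name */
                dev = ast_strdup(copy);
            }
            if (!dev) {
                ast_log(LOG_WARNING, "AudioSocket: strdup failed for device name\n");
                goto cleanup;
            }
            ast_log(LOG_DEBUG, "AudioSocket: resolved device id '%s'\n", dev);

            ast_devstate_changed(AST_DEVICE_UNKNOWN, AST_DEVSTATE_NOT_CACHABLE, "%s", dev);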
        } else {
            ast_log(LOG_WARNING,
                "AudioSocket: hangup — channel name empty\n");
        }
cleanup:
        if (dev) {
            ast_free(dev);
        }
        if (copy) {
            ast_free(copy);
        }
        if (instance->svc > 0) {
            close(instance->svc);
        }
    }
    ast_channel_tech_pvt_set(ast, NULL);
    ast_free(instance);
    return 0;
}

I have filed an issue. If anyone can add to it or correct something, please do so: "[bug]: ExternalMedia/AudioSocket devicestate topics accumulate and previously led to "Excessive refcount 100000" on ao2 object", Issue #1638 in asterisk/asterisk on GitHub.
Maybe @locateparcels?

To close the topic: benphone has made a commit that works absolutely great and completely resolves the issue. Huge thanks to him for the fix and his help.

Thanks! And big thanks to @jcolp as well - his insights from the chan_iax2/chan_local history made finding the right solution much easier. The AST_FLAG_DISABLE_DEVSTATE_CACHE approach is exactly what was needed here.
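For readers who land here later: the commit itself is not reproduced in this thread, so the following is only a toy model of why a per-channel "do not cache device state" flag can stop the topics from piling up. Apart from the flag name it imitates, every identifier is hypothetical, and the rule it encodes (a non-cachable update does not create a per-device topic that does not already exist) is my assumption about the core's behaviour, not something quoted from the fix.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_TOPICS 64
/* Toy counterpart of AST_FLAG_DISABLE_DEVSTATE_CACHE; the bit value is arbitrary. */
#define TOY_FLAG_DISABLE_DEVSTATE_CACHE (1u << 0)

struct toy_channel {
    const char *device;   /* e.g. "AudioSocket/host:port-uuid" */
    unsigned int flags;
};

struct toy_core {
    /* Names of per-device topics created so far. Pointers are borrowed,
     * so callers must keep the strings alive. */
    const char *topics[TOY_MAX_TOPICS];
    size_t topic_count;
    size_t sent_to_all_only;  /* updates that went to the aggregate topic only */
};

static int toy_topic_exists(const struct toy_core *core, const char *device)
{
    size_t i;

    for (i = 0; i < core->topic_count; i++) {
        if (strcmp(core->topics[i], device) == 0) {
            return 1;
        }
    }
    return 0;
}

/*
 * Publish a device state update.
 * Returns 1 if this call created a new per-device topic, 0 otherwise.
 */
int toy_publish(struct toy_core *core, const char *device, int cachable)
{
    if (toy_topic_exists(core, device)) {
        return 0;                 /* existing topic is reused, never removed */
    }
    if (!cachable) {
        core->sent_to_all_only++; /* assumed: no per-device topic is created */
        return 0;
    }
    assert(core->topic_count < TOY_MAX_TOPICS);
    core->topics[core->topic_count++] = device;
    return 1;
}

/* A channel state change; cachability is derived from the channel flag. */
int toy_channel_state_changed(struct toy_core *core, const struct toy_channel *chan)
{
    int cachable = !(chan->flags & TOY_FLAG_DISABLE_DEVSTATE_CACHE);

    return toy_publish(core, chan->device, cachable);
}
```

Under these assumptions the model also matches what was observed earlier in the thread: once ordinary state changes have created a topic for a channel, a single non-cachable update at hangup finds the topic already present and leaves it in place, whereas a channel flagged from creation never gets a per-device topic at all.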
