AMI client programs & socket blocking

General question about the resiliency of the AMI interface.

FWIW (not blaming any particular tech, just an observation), I’ve seen scenarios where a dead CDR database causes blocking within the Asterisk process, leaving call threads unterminated, etc.

Also seen things like this:
https://issues.asterisk.org/jira/browse/ASTERISK-855

With AMI, there can be a lot of events. What happens if my client program does not “keep up”?

Will Asterisk give me any indication, i.e. will it close the socket? Judging from that Jira post, this is difficult to handle. So what do we do?

To test this, I opened an AMI session via telnet and used the telnet escape in an attempt to block the telnet receive.

On the Asterisk system I can see the send buffer on the AMI socket grow and then reset to zero. Is the data being discarded? Or maybe I’m not blocking successfully?

If I need an intermediary buffer in my program to ensure that AMI is never blocked by it, I want to know that this is required. I’d also like a way to test the situation, so I can be sure my app does not block the server or crash the system.
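One way to approach such a test is a client that deliberately stops reading for a while and then checks whether the server has hung up. The following is a minimal Python sketch, not a tested AMI tool: `stall_then_probe` and its parameters are hypothetical names, and it only reports what the socket shows (end-of-file or reset versus still connected).

```python
import socket
import time


def stall_then_probe(sock, stall_seconds, drain_seconds=5.0):
    """Stop reading for stall_seconds, then report whether the peer hung up.

    Returns ("closed" | "open", bytes_drained_after_the_stall).
    """
    time.sleep(stall_seconds)      # simulate a client that does not keep up
    sock.settimeout(1.0)           # each recv waits at most one second
    received = 0
    deadline = time.monotonic() + drain_seconds
    try:
        while time.monotonic() < deadline:
            chunk = sock.recv(65536)
            if not chunk:          # empty read: the peer closed the connection
                return "closed", received
            received += len(chunk)
    except socket.timeout:
        return "open", received    # backlog drained, connection still up
    except ConnectionError:
        return "closed", received  # connection was reset by the peer
    return "open", received        # still receiving when the drain window ended
```

Against a real server, the socket would first connect to the manager port (5038 by default) and send a Login action before stalling. Note that the client's own kernel receive buffer absorbs data first, so the stall has to be long enough, or the event volume high enough, for the backlog to reach the server side, where it can be watched as Send-Q in `netstat`.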

I’d also like an idea of which versions are affected and which, if any, are fixed. I don’t want to misinterpret the Jira post.

Thanks very much!

m

Threads won’t get blocked on AMI stuff these days, as events are queued up asynchronously and have been for a long time (I think as of 1.8). As for disconnecting the session: if we attempt to write a message out and the write times out, we will disconnect the client.

Cool - Thanks for the reply! In striving for universality is there anything to worry about in 1.4.x and so on ?

Do you know what happens to the overflow, or how I can find out or configure how big the buffer is (i.e. is it simply the OS TCP Send-Q, or is there another internal buffer)?

I’m just trying to work out how disaster-proof I need to be, and how to detect if I’m not disaster-proof enough ;)

In this scenario, if my database writes are slow and I don’t handle the situation, at some point my socket will block.

Presumably asynchronous queuing will prevent Asterisk from blocking, but at some point that buffer will fill, at which point you disconnect us? Seems reasonable :) But before 1.8 thar be monsters, i.e. a full pipe could crash Asterisk?

AMI -> My Tool -> Log DB

If I had to, I could look at splitting the process in two:

AMI -> My Tool A -> FIFO -> My Tool B -> Log DB

Adding that FIFO would provide a buffer stored on the file system if I needed time (for example during a loss of DB connection, a failover, or a restart).
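A file-backed buffer along these lines could be sketched in Python as an append-only spool file: Tool A appends one JSON line per event, and Tool B reads from the byte offset it last reached. This is only an illustration under assumptions; `spool_append`, `spool_read`, and the one-JSON-line-per-event format are hypothetical choices, not anything Asterisk provides.

```python
import json
import os


def spool_append(path, raw_event: bytes):
    """Tool A: append one event as a single JSON line and force it to disk."""
    line = json.dumps(raw_event.decode("utf-8", "replace")) + "\n"
    with open(path, "a", encoding="utf-8", newline="\n") as f:
        f.write(line)
        f.flush()
        os.fsync(f.fileno())


def spool_read(path, offset=0):
    """Tool B: return (events, new_offset) for lines at or after byte offset."""
    events = []
    with open(path, "rb") as f:
        f.seek(offset)
        while True:
            line = f.readline()
            if not line.endswith(b"\n"):  # end of file, or a half-written line
                break
            events.append(json.loads(line))
            offset += len(line)
    return events, offset
```

Tool B would persist the returned offset only after its database write succeeds, so a DB outage or restart simply means it resumes from the last committed offset. Calling `fsync` on every event is the safest choice but also the slowest; batching is a possible trade-off.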

Basically, if I understand correctly, the extra buffer is still a good idea: a slow client won’t crash Asterisk 1.8+, but could crash 1.4 and earlier?

Thanks again :)

I really can’t remember how things behaved back in 1.4… that’s far, far, far out of my memory.

Generally though you should accept and read events as fast as you can.
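One common way to follow this advice is to give the socket its own reader thread that does nothing but receive and enqueue, so that slow work such as database writes never delays the next `recv`. Below is a minimal Python sketch. It relies on AMI messages being terminated by a blank line (`\r\n\r\n`); the names `split_events`, `reader_loop`, and `start_reader` are hypothetical.

```python
import queue
import socket
import threading


def split_events(buffer: bytes):
    """Split raw bytes into complete messages and the unparsed remainder."""
    *complete, remainder = buffer.split(b"\r\n\r\n")
    return complete, remainder


def reader_loop(sock, events):
    """Drain the socket as fast as possible; no slow work happens here."""
    pending = b""
    while True:
        try:
            chunk = sock.recv(65536)
        except OSError:
            chunk = b""            # treat a reset or error like a disconnect
        if not chunk:
            events.put(None)       # sentinel: the server side went away
            return
        complete, pending = split_events(pending + chunk)
        for raw in complete:
            events.put(raw)


def start_reader(sock):
    """Start the reader thread and return the queue the consumer reads from."""
    events = queue.Queue()         # unbounded: grows in memory if the consumer stalls
    thread = threading.Thread(target=reader_loop, args=(sock, events), daemon=True)
    thread.start()
    return events
```

The consumer calls `events.get()` in its own loop and treats `None` as "disconnected, reconnect and log in again". Because the queue is unbounded, a long consumer outage shows up as growing memory in the client rather than as back-pressure on Asterisk, so monitoring `events.qsize()` is a reasonable early warning.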

Thanks again! I’ll keep an eye on this thread and see if anyone remembers the old days…