Given that Asterisk uses non-blocking writes for AMI responses, and would report a timeout if the requestor was failing to read responses, I think we have to assume that Asterisk is failing to obtain a lock, rather than that there is a round trip flow control issue. That will be very sensitive to the actual command being used, but it is possible that it is trying for a conditional lock and the resource is locked so much of the time that it can never catch it when it is free.
Unfortunately, to be able to positively confirm this, you need to compile with thread debugging enabled, which in itself has a significant performance penalty. Just in case it has been built with thread debugging, you could try to see whether the CLI command “core show locks” is accepted.
Actually, you may be able to get some way towards seeing what is happening by forcing a core dump. This can be done with the gcore command. It will probably freeze the application for a fraction of a second, but it will have to be done when there is a high load, so there is going to be some disruption. You can then use gdb to identify the AMI thread and see what it is doing. It is rather easier to do this if the code was built with optimisation disabled.
The exact command that is stalling is likely to be significant in terms of trying to get a black box diagnosis.
If it is related to a conditional lock, one would expect the delays to start rising quite quickly beyond a certain level of load.
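To illustrate what I mean by a conditional lock, here is a minimal Python sketch of the pattern. It is not Asterisk's code; the function name, the retry count and the single-threaded setup are all made up for illustration. The point is only that a try-lock never waits: if the resource happens to be held on every attempt, the caller never gets in.

```python
import threading

def try_conditional_lock(lock, attempts):
    """Hypothetical helper: try a non-blocking acquire up to `attempts` times.

    Returns the number of failed tries before success, or None if the lock
    was never found free. On success the caller owns the lock.
    """
    for failed in range(attempts):
        if lock.acquire(blocking=False):  # never waits; False means "busy"
            return failed
        # A real server would typically back off briefly here before retrying.
    return None

resource = threading.Lock()

# Resource idle: the first attempt gets the lock.
idle_result = try_conditional_lock(resource, attempts=5)
resource.release()

# Resource held for the whole time we are trying: every attempt fails.
resource.acquire()
busy_result = try_conditional_lock(resource, attempts=5)
resource.release()

print(idle_result, busy_result)
```

With a blocking lock the waiter would eventually be served; with this pattern, once the resource is busy nearly all the time, the chance of any given attempt succeeding becomes very small, which is why I would expect the delays to climb sharply rather than gradually.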
By the way, do you mean 240 calls or 240 channels? If you have all the channels busy on point to point calls, that would be 120 calls, but if they were all on IVR or voicemail, it could be 240. This makes a difference to the number of processes running.
Also, do you use parking? There was an issue, which may or may not have been fixed, that the parking code used the select system call, which is limited to 1024 file descriptors. 240 calls would almost certainly exceed that and cause strange behaviour.
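For what it is worth, the select limit is on the descriptor number itself, which has to be below FD_SETSIZE (1024 on typical Linux builds). A rough way to see it from Python, assuming a POSIX build of CPython, is below. The helper name and the value 2000 are just for illustration. Python checks the range and raises an error; C code calling select directly gets no such check, which is where the strange behaviour comes from.

```python
import os
import select

def fits_in_select(fd):
    """Hypothetical helper: True if select() accepts this descriptor number."""
    try:
        select.select([fd], [], [], 0)  # zero timeout: poll and return at once
    except ValueError:
        # CPython raises ValueError when fd is not below FD_SETSIZE.
        return False
    return True

# A freshly created pipe normally gets a low descriptor number.
read_end, write_end = os.pipe()
low_ok = fits_in_select(read_end)

# 2000 is simply a number above the assumed 1024 limit; it need not be open,
# because the range check happens before the system call is made.
high_ok = fits_in_select(2000)

os.close(read_end)
os.close(write_end)
print(low_ok, high_ok)
```

On a typical Linux system I would expect the low descriptor to be accepted and the high one rejected.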