I installed Asterisk on my server, but it hangs occasionally. When I tried to debug the issue, I found that something is locking up Asterisk. The following is the output of ‘core show locks’ on it:
=======================================================================
=== 11.10.2
=== Currently Held Locks
After I restart Asterisk, it works fine for a day or two and then goes down again. I tried to work out what is happening but could not find a specific cause. Could someone please help me find the reason behind this issue?
There is no deadlock evident there, but deadlocks can involve CPU as well as locks. Does this pattern of locks stay constant? Is Asterisk running 100% on at least one CPU?
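One way to answer the CPU question on Linux is to read the per-thread counters under /proc. The following is only a rough sketch: it assumes a Linux /proc filesystem, and the function names are made up for illustration. It sums user and system clock ticks (fields 14 and 15 of each thread's stat file) so that two samples taken a few seconds apart show which thread, if any, is spinning.

```python
import os


def parse_cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from one /proc/<pid>/task/<tid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain spaces,
    # so split on the last ')' before splitting the remaining fields.
    fields = stat_line.rsplit(")", 1)[1].split()
    # After the command name, field 3 (state) is index 0, so utime (field 14)
    # is index 11 and stime (field 15) is index 12.
    return int(fields[11]) + int(fields[12])


def thread_cpu_ticks(pid):
    """Map each thread ID of the process to its accumulated CPU ticks (Linux only)."""
    task_dir = f"/proc/{pid}/task"
    ticks = {}
    for tid in os.listdir(task_dir):
        with open(f"{task_dir}/{tid}/stat") as stat_file:
            ticks[int(tid)] = parse_cpu_ticks(stat_file.read())
    return ticks
```

Calling thread_cpu_ticks() twice with the Asterisk PID and dividing the per-thread difference by the interval and by os.sysconf("SC_CLK_TCK") gives the fraction of one CPU each thread used. A value near 1.0 would point to a busy loop rather than a lock wait.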
The only unusual lock is the one in the logging subsystem. I’ve never seen one of those before, but they are possible in normal operation.
Do you have a debug build? If so, you need to obtain backtraces.
I assume you understand what a lock is. You really need to look at the source code to understand a particular lock.
The one you mention suggests that something has stalled whilst trying to write to one of the asterisk logs. That may mean that you have a very large amount of logging going on and nothing is actually wrong, or you are using some form of logging that has limited throughput.
You don’t seem to be short of memory or CPU.
You can get a core file by using the gcore command, or by manually attaching gdb and either debugging the process in RAM, or forcing a dump. (gcore just automates the latter.)
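As a rough illustration of the two approaches, the sketch below only builds the command lines and does not run them. It assumes gdb and gcore are installed and that you have permission to attach to the process; the PID and the output prefix are placeholders, not values from this thread.

```python
import shlex


def backtrace_command(pid):
    """Build a gdb command that attaches, prints a backtrace of every thread, and exits."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt full"]


def core_dump_command(pid, prefix="/tmp/asterisk-core"):
    """Build a gcore command; gcore writes the dump to <prefix>.<pid>."""
    return ["gcore", "-o", prefix, str(pid)]


if __name__ == "__main__":
    pid = 12345  # placeholder: substitute the PID of the running asterisk process
    print(shlex.join(backtrace_command(pid)))
    print(shlex.join(core_dump_command(pid)))
```

The backtraces are only really useful if Asterisk was built with debugging symbols, which is why the debug build matters.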
I rotated the Asterisk logs yesterday to deal with the lock related to the logger. But Asterisk is showing locks again today and my number is returning busy.
=======================================================================
=== 11.10.2
=== Currently Held Locks
The problem is that no calls have been placed since yesterday on the number mapped to this server. The only calls going on are the ones generated by the qualify settings. But something still locks up Asterisk and the number starts showing busy.
The problem with diagnosing these locks is that nothing is waiting for these locks. Locks are only interesting when there is something shown as waiting. It could be that something is busy waiting by doing conditional locks, but you still don’t have the makings of a deadlock based purely on locks. For a start that requires more than one thread.
There has to be something else that is causing the code to stall. That’s not a tight CPU loop, although it could be one with waits in it. You really need the core dump and backtraces to find the problem.
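To illustrate why held locks with no waiters cannot amount to a lock-only deadlock, here is a minimal sketch of the usual wait-for check. The thread and lock names are invented for the example and are not taken from the ‘core show locks’ output. A deadlock shows up as a cycle: a thread waits on a lock whose holder is, directly or indirectly, waiting on a lock held by the first thread.

```python
def find_deadlock(holders, waiters):
    """Return the threads forming a wait-for cycle, or None if there is no cycle.

    holders maps lock name -> thread currently holding it.
    waiters maps thread -> lock name it is blocked on.
    """
    for start in waiters:
        path = []
        thread = start
        # Follow the chain: thread -> lock it waits on -> holder of that lock.
        while thread is not None and thread in waiters:
            if thread in path:
                return path[path.index(thread):]
            path.append(thread)
            thread = holders.get(waiters[thread])
    return None


# Locks are held, but nothing is waiting: no cycle is possible.
print(find_deadlock({"logger": "t1", "channels": "t2"}, {}))

# Each thread holds one lock and waits on the other's: a cycle.
print(find_deadlock({"A": "t1", "B": "t2"}, {"t1": "B", "t2": "A"}))
```

With an empty set of waiters the function has nothing to follow, which is the situation in the output posted so far.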
Perhaps you can provide more information about your environment. Is this hardware or a VPS? If a VPS, what kind? Are you compiling Asterisk or installing RPMs? 32-bit or 64-bit OS?
If you are compiling, are you just running ./configure or adding any config options to that? Did you check your config.log after running ./configure to see if you have any missing dependencies?
This is physical hardware, and we compiled Asterisk on our 64-bit machine.
We did not add any config options when compiling, and all dependencies were installed at the time.
Also, I have a copy of this server with the same configuration, on which I have mapped a different number with a different service provider, and I do not encounter such issues on that server. Can I somehow detect whether this is being caused by the provider I am using for this server?
What do you mean by “copy”? Did you recompile on the new hardware? You need to recompile Asterisk if you move/copy/restore the software to different hardware.
Other than that, I suppose it could be a hardware issue, but it is most likely software.
Packages like AsteriskNOW would not work if you had to recompile.
Of course, what the packages do is compile for a machine which only supports common processor options, whereas the default build process tries to compile for maximum performance on the processor on which it is being compiled.
However, in this case there was no suggestion that the hardware was different, and hardware incompatibilities generally show up as illegal instruction exceptions.
If you compile from source with default settings, you should recompile if you move to different hardware. AsteriskNOW uses RPMs, and the payload was probably compiled with the “Build Native” flag disabled.