How to distribute MeetMe conf on separate CPU processes?

Hi, I am setting up a non-profit conference server for church services.

With 225 conference participants split across 44 MeetMe conferences, I get the CPU readings (from top) shown below. The hardware has 8 CPU cores (Intel Xeon X5355, 2.66 GHz) and 2 GB RAM, and uses G.711 only on a 50 Mbit IAX trunk to the phone network. The software is Asterisk 1.6.0.10 / trixbox 2.8.0.3.

It all runs very well, but I fear I am near the CPU limit, as it seems asterisk is using only one CPU:

trixbox web interface shows 10-11% CPU. (Load Average = 2.46)
This agrees with what “top” shows in its header: 89.1% idle time (see “top” output below)

But as seen below, one asterisk process (PID 3652) uses 77.2% CPU (I guess this must be a percentage of one core).
I fear that if this process reaches 100% (which would leave the total system with 7/8 = 87.5% idle time), calls will start dropping.

[b]It seems all conferences are run by this single asterisk process, although e.g. ntop shows that more than 25 asterisk processes are running but not doing much.

What do I need to do to run conferences in separate processes, so that I can use all 8 CPU cores?[/b]

top - 10:57:28 up 1 day, 21:52, 3 users, load average: 2.20, 2.46, 2.15
Tasks: 179 total, 1 running, 178 sleeping, 0 stopped, 0 zombie
Cpu(s): 1.2%us, 8.9%sy, 0.0%ni, 89.1%id, 0.0%wa, 0.2%hi, 0.6%si, 0.0%st
Mem: 2054252k total, 1285236k used, 769016k free, 171108k buffers
Swap: 779144k total, 0k used, 779144k free, 814424k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3652 asterisk 15 0 146m 71m 9420 S 77.2 3.6 32:17.73 asterisk
3399 mysql 25 0 138m 21m 4888 S 2.0 1.1 1:15.70 mysqld
18466 asterisk 15 0 36116 14m 4636 S 0.7 0.7 0:08.42 httpd
18467 asterisk 15 0 36112 14m 4588 S 0.7 0.7 0:07.54 httpd
12071 asterisk 15 0 36100 14m 4576 S 0.3 0.7 0:04.24 httpd
18465 asterisk 15 0 31080 9244 4020 S 0.3 0.4 0:07.98 httpd
1 root 15 0 2072 632 540 S 0.0 0.0 0:01.84 init

Peter :smile:

Type “H” in top and make sure that it responds “Show Threads On”. Then see if you really have a problem.

If you are using a 2.6 kernel, which you need for a supported build of recent Asterisk versions, top and ps consolidate all of a process’s threads into a single overall process entry. It is the individual threads that can be scheduled onto a CPU. That is not to say that Asterisk doesn’t have one thread for all conferences; I haven’t looked at the code.
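To look at the individual threads without top, the per-thread counters under /proc/<pid>/task/<tid>/stat can be read directly. Here is a minimal Python sketch, assuming a Linux /proc layout as described in proc(5); the function names are my own and not part of any Asterisk tool:

```python
import os

def parse_stat_cpu_ticks(stat_line):
    """Return (comm, utime + stime in clock ticks) from one /proc stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the LAST ')' instead of splitting on whitespace.
    left, right = stat_line.rsplit(")", 1)
    comm = left.split("(", 1)[1]
    fields = right.split()
    # fields[0] is the state (field 3 in proc(5)), so utime (field 14)
    # and stime (field 15) sit at indexes 11 and 12.
    return comm, int(fields[11]) + int(fields[12])

def thread_cpu_seconds(pid):
    """Map each thread ID of `pid` to its accumulated CPU time in seconds."""
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    task_dir = "/proc/%d/task" % pid
    result = {}
    for tid in os.listdir(task_dir):
        try:
            with open(os.path.join(task_dir, tid, "stat")) as f:
                _comm, ticks = parse_stat_cpu_ticks(f.read())
        except OSError:
            continue  # the thread exited between listdir() and open()
        result[int(tid)] = ticks / ticks_per_second
    return result
```

Calling thread_cpu_seconds(3652) on the box in question (3652 being the asterisk PID from the top output) would list the CPU seconds used by each thread, which should roughly match the per-thread TIME+ column that top shows in thread mode.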

“core show threads” at an Asterisk CLI prompt will tell you which thread processes do what, at an overview level.

/proc/stat contains the raw per-CPU load data.
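The counters in /proc/stat are cumulative ticks, so a load figure only comes out of the difference between two samples. A rough Python sketch of that calculation follows; the function names are hypothetical, and it assumes the usual "cpuN user nice system idle iowait irq softirq steal" column order:

```python
import time

def parse_cpu_times(proc_stat_text):
    """Map 'cpu0', 'cpu1', ... to (busy_ticks, total_ticks)."""
    times = {}
    for line in proc_stat_text.splitlines():
        parts = line.split()
        # Keep only per-CPU lines such as "cpu0 ..."; the aggregate "cpu"
        # line has no digit suffix and is skipped.
        if not parts or not parts[0].startswith("cpu") or not parts[0][3:].isdigit():
            continue
        # Use at most the first 8 columns (user .. steal); later guest
        # columns are already included in user time on newer kernels.
        values = [int(v) for v in parts[1:9]]
        idle = values[3] + (values[4] if len(values) > 4 else 0)  # idle + iowait
        total = sum(values)
        times[parts[0]] = (total - idle, total)
    return times

def busy_percent(before, after):
    """Per-CPU busy percentage between two parse_cpu_times() results."""
    result = {}
    for cpu, (busy1, total1) in after.items():
        busy0, total0 = before.get(cpu, (0, 0))
        delta_total = total1 - total0
        result[cpu] = 100.0 * (busy1 - busy0) / delta_total if delta_total else 0.0
    return result

def sample_busy_percent(interval=1.0):
    """Read /proc/stat twice, `interval` seconds apart, and compare."""
    with open("/proc/stat") as f:
        before = parse_cpu_times(f.read())
    time.sleep(interval)
    with open("/proc/stat") as f:
        after = parse_cpu_times(f.read())
    return busy_percent(before, after)
```

If one CPU sits near 100% in the result of sample_busy_percent() while the others are idle, a single thread is the bottleneck; an even spread suggests the work is already distributed.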

Thanks for the tip about “H” in top. I also found another really nice tool, “htop”, which shows threads in a fine way and shows CPU percentage for each CPU.

htop shows 37 asterisk threads running.

After a few hours of conferencing, I read the accumulated CPU time that htop shows for the different asterisk threads.
Thirteen asterisk threads have each accumulated more than a minute of CPU time:

One thread: 140 minutes
Another thread: 22 minutes
Ten other threads each: 6 minutes
One thread: 2 minutes

Additionally, another task has been running for 68 minutes. It is the script op_server.pl, which is html-related; I guess it is because I have a web page for each conference that updates every 15 seconds to show who is listening.

My conclusion is that asterisk is doing most of its work (measured in CPU time) in a single thread, at least in my configuration. I guess it means that I cannot use my 8 CPU cores at all? Two would be just as good, but then I need more servers instead to host more conferences. Which is bad, because this nice second-hand server was donated for this purpose.

There must be a way to tweak it to start a new thread for each conference… - any ideas are welcome!

By the way, how can I be sure it is the conferences that are using most of the CPU time?
Could it be the IAX trunk, with 200+ simultaneous lines in use?

Anyway, my suggestion is to make asterisk more multithreaded in the near future, because all new servers are at least quad core, and as far as I understand it, this would greatly increase the number of simultaneous lines asterisk can handle.

Peter :smile:

Use core show threads to find out what each thread is doing.

Yes, right, “core show threads” shows them, but I don’t see any reference to meetme, nor to the PIDs shown in top/htop, so it is difficult to match these threads to what top/htop is showing:

0xb4027b90 netconsole started at [1088] asterisk.c listener()
0xb7044b90 ast_make_file_from_fd started at [161] tcptls.c ast_tcptls_server_root()
0xb7080b90 monitor_sig_flags started at [3483] asterisk.c main()
0xb72e3b90 iax2_process_thread started at [10395] chan_iax2.c start_network_thread()
0xb70c7b90 network_thread started at [10406] chan_iax2.c start_network_thread()
0xb7103b90 sched_thread started at [10405] chan_iax2.c start_network_thread()
0xb713fb90 iax2_process_thread started at [10395] chan_iax2.c start_network_thread()
0xb717bb90 iax2_process_thread started at [10395] chan_iax2.c start_network_thread()
0xb71b7b90 iax2_process_thread started at [10395] chan_iax2.c start_network_thread()
0xb72a7b90 iax2_process_thread started at [10395] chan_iax2.c start_network_thread()
0xb731fb90 iax2_process_thread started at [10395] chan_iax2.c start_network_thread()
0xb726bb90 iax2_process_thread started at [10395] chan_iax2.c start_network_thread()
0xb735bb90 iax2_process_thread started at [10395] chan_iax2.c start_network_thread()
0xb722fb90 iax2_process_thread started at [10395] chan_iax2.c start_network_thread()
0xb71f3b90 iax2_process_thread started at [10395] chan_iax2.c start_network_thread()
0xb7397b90 do_monitor started at [8419] chan_dahdi.c restart_monitor()
0xb73d3b90 do_monitor started at [3515] chan_mgcp.c restart_monitor()
0xb7487b90 network_thread started at [2233] pbx_dundi.c start_network_thread()
0xb744bb90 process_precache started at [2234] pbx_dundi.c start_network_thread()
0xb740fb90 process_clearcache started at [2235] pbx_dundi.c start_network_thread()
0xb74c3b90 do_monitor started at [19560] chan_sip.c restart_monitor()
0xb74ffb90 do_monitor started at [4601] chan_unistim.c restart_monitor()
0xb7c4bb90 device_state_thread started at [6869] app_queue.c load_module()
0xb764ab90 scan_thread started at [525] pbx_spool.c load_module()
0xb7686b90 do_monitor started at [2763] chan_ooh323.c restart_monitor()
0xb76c2b90 ooh323c_stack_thread started at [44] ooh323cDriver.c ooh323c_start_stack_thread()
0xb76ffb90 do_monitor started at [1145] chan_phone.c restart_monitor()
0xb7c87b90 do_monitor started at [5803] chan_skinny.c restart_monitor()
0xb7cc3b90 accept_thread started at [6036] chan_skinny.c reload_config()
0xb7cffb90 do_parking_thread started at [4059] features.c ast_features_init()
0xb7e41b90 device_state_thread started at [8361] pbx.c load_pbx()
0xb7e7db90 do_devstate_changes started at [532] devicestate.c ast_device_state_engine_init()
0xb7eb9b90 desc->accept_fn started at [346] tcptls.c ast_tcptls_server_start()
0xb7ef5b90 logger_thread started at [928] logger.c init_logger()
0xb7f31b90 listener started at [1144] asterisk.c ast_makesocket()
0xb7f6db90 ast_event_dispatcher started at [818] event.c ast_event_init()
36 threads listed.
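A listing like this is easier to read when grouped by the thread's start function. Here is a small Python sketch that does the grouping, assuming every thread line has the shape shown above (hex identifier first, function name second); count_threads_by_function is a name I made up:

```python
from collections import Counter

def count_threads_by_function(listing):
    """Count threads per start function in 'core show threads' output."""
    counts = Counter()
    for line in listing.splitlines():
        parts = line.split()
        # Thread lines begin with the identifier in hex, e.g. "0xb4027b90",
        # followed by the function name. Other lines (such as the
        # "36 threads listed." summary) are ignored.
        if len(parts) >= 2 and parts[0].startswith("0x"):
            counts[parts[1]] += 1
    return counts
```

Run over the listing above, this groups the ten iax2_process_thread entries and the eight do_monitor entries together, which gives a quick overview of where the threads belong, although it still does not map them to the thread IDs that top/htop display.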

That’s not very useful. However, it does suggest that there might not be a thread that owns a conference.

The code is too complex for me to work out what actually happens without a personal need to know.

The numbers may be the same ones that gdb uses, so it might be possible to get a gdb threads listing and correlate that to the core show threads and top output.

I found the answer myself: trixbox/asterisk actually splits its load fine over all 8 CPUs with just a standard installation.

The problem I saw with top was not real. When using the thread option (H), I can see that no thread ever goes over 15% CPU. The reading I had, with top showing an asterisk process using 77% CPU, is the sum of the underlying threads, which are spread over several CPUs.
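The numbers from the first post fit together once the process figure is treated as a percentage of one CPU, which is top's default way of reporting per-process load as far as I know. A quick Python check of the arithmetic, using only the values from the top output (the function name is just for illustration):

```python
def share_of_total_capacity(process_cpu_percent, cpu_count):
    """Convert a 'percent of one CPU' figure into percent of the whole machine."""
    return process_cpu_percent / cpu_count

# 77.2% of one CPU, summed over all asterisk threads, on an 8-core machine.
asterisk_share = share_of_total_capacity(77.2, 8)

# Busy share of the whole machine according to the top header (89.1% idle).
system_busy = 100.0 - 89.1

print("asterisk: %.2f%% of total capacity" % asterisk_share)
print("whole system busy: %.2f%%" % system_busy)
```

So asterisk accounts for just under 10% of the total capacity, which is in line with the 10-11% that the trixbox web interface reported, with mysqld, httpd and the rest making up the small difference.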

In htop, the graphs for the 8 CPUs show the same thing: the most heavily loaded CPU running an asterisk thread never goes over 15%.

One problem with htop is that there seems to be no way to distinguish between processes and threads, which makes its listing a little confusing.