Backwards time leap issue

It seems Asterisk starts to consume nearly 100% CPU if the system time in Linux jumps backwards. Granted, this is a bigger problem than just Asterisk, but maybe it should still tolerate this.

This was tested with Asterisk version 21.2.0 by setting the date to the year 2027 (NOTE: a forwards time jump doesn’t cause any issues) and then setting the date back to 2025. Within a few seconds after this, Asterisk starts to consume CPU even though all SIP activity is idle. The only way to recover is to restart Asterisk or issue the “core reload” command.

One clock_gettime() call was found in /include/asterisk/time.h:
clock_gettime(CLOCK_REALTIME, &ts);

When this is changed to the following, the issue seems to be fixed:
clock_gettime(CLOCK_MONOTONIC, &ts);

Could that be a possible solution? Could it result in some other issues?
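For anyone comparing the two clocks, here is a minimal sketch of the difference. It is not Asterisk code, and the helper names timespec_diff_ms and monotonic_elapsed_ms are made up for illustration. CLOCK_REALTIME follows the wall clock, so an interval measured with it can come out negative after a backwards step. CLOCK_MONOTONIC is not affected by someone setting the date, but its values are not calendar time, so swapping it in wherever a real timestamp is expected could cause other problems.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: b - a in milliseconds. The result is negative
 * if b is earlier than a, which is what a CLOCK_REALTIME interval
 * looks like after the wall clock has been set backwards. */
static int64_t timespec_diff_ms(struct timespec a, struct timespec b)
{
    return (int64_t)(b.tv_sec - a.tv_sec) * 1000
         + (b.tv_nsec - a.tv_nsec) / 1000000;
}

/* Hypothetical helper: milliseconds elapsed since 'start', where 'start'
 * was read from CLOCK_MONOTONIC. Returns 0 on success, -1 on failure.
 * Stepping the wall clock does not affect this value. */
static int monotonic_elapsed_ms(struct timespec start, int64_t *elapsed_ms)
{
    struct timespec now;

    if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
        return -1;
    *elapsed_ms = timespec_diff_ms(start, now);
    return 0;
}
```

The idea would be to use the monotonic clock for timeouts and intervals only, and keep CLOCK_REALTIME for anything that has to be a real date.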

No, I take it back.

Changing that one clock_gettime() call doesn’t fix the issue.

One thing I noticed is that once the backwards time leap has happened, “core show uptime” no longer prints anything. I don’t know if it is related to the CPU issue, but either way the utilization still goes sky high.

Welcome back to the future, wait, hmm, FORUMS! :-)

I thought about asking about theoretical negative leap seconds… but even the positive leap seconds will be done in ten years, and then it might be another century until we have to worry about it:

Yeah, as I said, this is a bigger problem than just asterisk.

I can just say to our customer (who provides the NTP), as I have already told them, that it’s not okay. I was just wondering if Asterisk should somehow handle this. In their eyes, our software (Asterisk) is the culprit.

I have seen oddball problems like this with PCs whose motherboard RTC runs significantly faster or slower than actual time. What happens is that NTP either injects or removes seconds to keep the clock synchronized. A number of times the actual root cause was that the 3 V lithium coin battery on the motherboard had worn out. Replacing the battery fixed the motherboard clock.

Take the server down, replace the coin battery, and see if the problem goes away. If that does not help, disable the NTP daemon on the Asterisk system and set up a nightly cron script that runs ntpdate, pauses for a few seconds, and then does a service stop/service start on Asterisk.

In my view, any daemon that is started automatically needs to be able to tolerate one negative time step.

I think we all agree on that one :-/ What I don’t think we all agree on is that this problem even exists in Asterisk. I’ve never seen it myself. I was trying to be nice and polite to the OP and gently encourage him to check out his hardware first - before immediately assuming asterisk is the issue, or assuming the upstream time source is dirty, etc.

I’m afraid the problem very much exists. I can reproduce it over and over just by setting the clock to the future and then back again with the Linux ‘date’ command.

In the production environment this happens because an external NTP server gives a time from the future. As I said, that is clearly the problem that needs to be fixed, and I have explained this to our customer.

I just brought this up because Asterisk is currently the only application in our system that has this sort of problem when the time leaps backwards. Anyway, maybe we will just devise some workaround to restart Asterisk if the time jumps and it starts consuming CPU.
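A rough idea of how such a workaround could detect the jump, as a sketch only: sample the offset between CLOCK_REALTIME and CLOCK_MONOTONIC at intervals, and treat a sudden drop in that offset as a backwards step of the wall clock. The function names and the threshold parameter are hypothetical, and this says nothing about what Asterisk does internally.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Convert a timespec to whole milliseconds. */
static int64_t ts_to_ms(struct timespec ts)
{
    return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Hypothetical helper: wall clock minus monotonic clock, in ms.
 * This offset stays roughly constant unless the wall clock is stepped.
 * Returns 0 on success, -1 if either clock cannot be read. */
static int read_clock_offset_ms(int64_t *offset_ms)
{
    struct timespec rt, mono;

    if (clock_gettime(CLOCK_REALTIME, &rt) != 0 ||
        clock_gettime(CLOCK_MONOTONIC, &mono) != 0)
        return -1;
    *offset_ms = ts_to_ms(rt) - ts_to_ms(mono);
    return 0;
}

/* Hypothetical helper: returns 1 if the offset dropped by more than
 * threshold_ms between two samples (wall clock stepped backwards),
 * otherwise 0. A forwards step makes the offset grow and is ignored. */
static int stepped_backwards(int64_t prev_offset_ms, int64_t cur_offset_ms,
                             int64_t threshold_ms)
{
    return (prev_offset_ms - cur_offset_ms) > threshold_ms;
}
```

A small watchdog could call read_clock_offset_ms() once a second, compare each sample with the previous one using stepped_backwards() and a threshold of a few seconds, and then trigger the restart or “core reload” when it returns 1.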

Ah. Another one asking for help and ignoring the polite and less polite directives to check the coin battery and thoroughly check out the HARDWARE. Gotcha.

Do as I suggest, replace the battery, and then if you are STILL seeing problems, I’ll test it myself and post the results.

I can gin up software workarounds that make it look like flaky hardware is working correctly, also. That is just what you are doing here and relying on as “proof” that there’s a problem.