Page MenuHomeFeedback Tracker

Server performance / Scripting performance loss after ~5 hours of runtime
Need More Info, NormalPublic

Description

What?
Server scripting / callLater perforamnce impact after ~5hours of server runtime.
This might impact ALL scripts or other game functionality

Why?
No idea.

How?
I ran a vanilla server with only 1 mod, which logs the scripting performance / server FPS using callLater to a file.
The FPS is usually in sync with the value seen in the server console,
but after ~5 hours, the FPS readout using callLater will start to show numbers extremly low (4 FPS - 50 FPS)

During the time when server console displays FPS values in thousands, the scripting one displays extremly low numbers
The average FPS from scripintg during normal runtime (until ~5 hours), is returned as ~1000

Details

Severity
Major
Resolution
Open
Reproducibility
Always
Operating System
Windows 7 x64
Operating System Version
Windows 7 Ultimate
Category
Scripting
Steps To Reproduce
  1. Run server without FPS LIMIT
  2. Log time between CallLater calls
  3. At a certain point the script calls start getting delayed, causing the "FPS measurement" to return low values <--- this usually happens ~4-5 hours in

This works for me 100% of the time, with 0 players ever connecting to the server, with players on the server and with and without mods (only using the framelogger when running without mods).

add the frame logger function to the CallLater:

// add the frame log to the que
GetGame().GetCallQueue(CALL_CATEGORY_SYSTEM).CallLater(dayxSrv_frameLogs_logFrames, 0, true);

Code used inside the frame logger function

dayxSrv_frameLogs_m_Counter ++;

// only check every 100 times it runs
if (dayxSrv_frameLogs_m_Counter % 100 != 0) {
    return;
};

// calculate time from last tick
dayxSrv_frameLogs_m_FPS = (GetGame().GetTickTime() - dayxSrv_frameLogs_m_LastTick);
dayxSrv_frameLogs_m_LastTick = GetGame().GetTickTime();


// calculate the average over time
dayxSrv_frameLogs_m_FPS = dayxSrv_frameLogs_m_FPS / 100;  // time it took on avg for 1 frame
dayxSrv_frameLogs_m_FPS = 1/dayxSrv_frameLogs_m_FPS;


// then I write this value into a file:
//dayxSrv_frameLogs_m_FPS
Additional Information

I've been running my tests on a very weak machine.
8GB ram and old CPU (i5-2410m), maybe this is a reason why this happens to me this fast (I haven't tested on other machines)

I've been loggin time it's taken to reach the point when the callLater starts getting delayed:
It's always around 4-5 hours
https://i.imgur.com/RqsrJdu.jpg
https://i.imgur.com/rZTrasU.jpg
https://i.imgur.com/Lw6xCQj.jpg

Also Attached: server logs from the time of running

Event Timeline

nigel created this task.May 29 2020, 7:18 PM
nigel edited Steps To Reproduce. (Show Details)
Geez changed the task status from New to Need More Info.Jun 1 2020, 4:30 PM
Geez added a subscriber: Geez.

Hello nigel.
Do you notice any increase in memory usage after these five hours?
Regards,
Geez

nigel added a comment.Jun 2 2020, 9:43 PM

If it's any help, the server will stay up and running without hitches for 20+ hours (restart due to auto-updating mods) when using the launch parameter -limitFPS=200

nigel added a comment.Jun 3 2020, 4:24 PM

Is there any way to see what is currently in the call queue?

tjensen added a subscriber: tjensen.Jan 1 2021, 5:21 PM

Hello @Geez! It seems like ScriptCallQueue completely stops working after a server has been running for over 4.5 hours. My mods depend on ScriptCallQueue so I have to restart my servers every 4 hours to work around the problem.

I have created a small, self-contained mission init.c to exhibit the issue:

Also, here is the server config file I used:

Here is the command line I used to run the server with the attached mission and server config: (note that no mods are loaded)

DayZServer_x64.exe "-config=test.cfg" "-profiles=testprofile" "-mission=P:\test.chernarusplus"

Here are the log files generated by the server:


My example code uses ScriptCallQueue.CallLater with both repeat=false and repeat=true as I wasn't sure if the issue was specific to a particular argument value. As you can see, I started the server at about 2:20 PM and every 30 minutes there are three log lines showing the CallLater functions being invoked.

SCRIPT       : 1826909 | 1826.91 | CallSelfLater
SCRIPT       : 1826909 | 1826.91 | Repeated
SCRIPT       : 1827909 | 1827.91 | Delayed

The final grouping of those log lines is at about 6:50 PM, but the server continues running for another 2 hours. For some reason, the CallLater calls do not happen again at 7:20 PM, 7:50 PM, 8:20 PM, or 8:50 PM.

I also monitored the process memory usage while the server was running: (each line is 10 minutes apart)

PS C:\Users\tjensen> while (1) {get-process -Name dayzserver_x64; start-sleep -seconds 600}

Handles  NPM(K)    PM(K)      WS(K)     CPU(s)     Id  SI ProcessName
-------  ------    -----      -----     ------     --  -- -----------
    584    5106  1058088    1042736       7.61  22276   1 DayZServer_x64
    602   32653  4049412    3669440     769.75  22276   1 DayZServer_x64
    602   32671  4049988    3669996   1,386.70  22276   1 DayZServer_x64
    594   32678  4047056    3668124   2,003.45  22276   1 DayZServer_x64
    594   32678  4047056    3668124   2,619.03  22276   1 DayZServer_x64
    594   32678  4047056     420004   3,234.61  22276   1 DayZServer_x64
    594   32679  4049176     423960   3,850.70  22276   1 DayZServer_x64
    594   32683  4049304     433700   4,466.42  22276   1 DayZServer_x64
    594   32683  4047232     449436   5,078.22  22276   1 DayZServer_x64
    594   32684  4047248     451824   5,690.77  22276   1 DayZServer_x64
    594   32684  4047248     453464   6,302.61  22276   1 DayZServer_x64
    594   32684  4047248     458908   6,915.45  22276   1 DayZServer_x64
    594   32684  4047260     467044   7,529.86  22276   1 DayZServer_x64
    594   32693  4047540     480772   8,146.36  22276   1 DayZServer_x64
    594   32717  4048276     491376   8,763.69  22276   1 DayZServer_x64
    594   32717  4050372     493744   9,381.63  22276   1 DayZServer_x64
    594   32717  4050372     494752   9,996.59  22276   1 DayZServer_x64
    594   32756  4049480     507600  10,613.22  22276   1 DayZServer_x64
    594   32773  4050008     515768  11,229.61  22276   1 DayZServer_x64
    594   32775  4050088     520964  11,846.11  22276   1 DayZServer_x64
    594   32776  4050108     523752  12,462.56  22276   1 DayZServer_x64
    594   32777  4050140     527044  13,077.41  22276   1 DayZServer_x64
    594   32781  4050272     530428  13,692.38  22276   1 DayZServer_x64
    594   32782  4050280     533964  14,308.48  22276   1 DayZServer_x64
    594   32782  4050280     536660  14,925.13  22276   1 DayZServer_x64
    596   32782  4050280     539452  15,542.25  22276   1 DayZServer_x64
    596   32788  4050488     546108  16,159.23  22276   1 DayZServer_x64
    596   32788  4050488     551048  16,775.41  22276   1 DayZServer_x64
    596   32788  4050488     554100  17,391.50  22276   1 DayZServer_x64
    596   32788  4050488     556968  18,008.09  22276   1 DayZServer_x64
    596   32788  4050488     562180  18,624.89  22276   1 DayZServer_x64
    596   32796  4050728     568276  19,242.14  22276   1 DayZServer_x64
    596   32818  4051420     580804  19,858.11  22276   1 DayZServer_x64
    596   32819  4052476     581928  20,475.58  22276   1 DayZServer_x64
    596   32819  4052480     583796  21,093.30  22276   1 DayZServer_x64
    596   32820  4052520     590832  21,711.88  22276   1 DayZServer_x64
    596   32832  4052880     602804  22,329.72  22276   1 DayZServer_x64
    596   32836  4053024     610648  22,949.66  22276   1 DayZServer_x64
    596   32838  4055148     622132  23,565.20  22276   1 DayZServer_x64
    596   32838  4056200     626960  24,183.02  22276   1 DayZServer_x64

As you can see, there is no significant memory usage increase. There is a sudden drop in the WorkingSet (WS) column, but it happens during the first hour and doesn't seem related.

Please let me know if you have any questions.

Thanks for the post @tjensen
Try adding a fps limit to the server and see if that circumvents the problem!
It does for me, but this does not fix the problem as running the server at limited fps is not ideal

-limitFPS=200

Maybe this will somehow help nail down the issue!

Yes, adding the -limitFPS=200 command line argument appears to work around the issue.

Here are the logs from a 12-hour run with it added to the command line:


Interestingly, for the first 4.5 hours, the CallLater calls happen like clockwork, almost exactly 1800 seconds apart when you check the printed GetTime() or GetTickTime() values. However, the difference between the 4.5-hour and 5-hour values is 1733 seconds apart:

4.5-hour:

SCRIPT       : 16227908 | 16227.9 | CallSelfLater
SCRIPT       : 16227908 | 16227.9 | Repeated
SCRIPT       : 16228918 | 16228.9 | Delayed

5-hour:

SCRIPT       : 17961129 | 17961.1 | CallSelfLater
SCRIPT       : 17961129 | 17961.1 | Repeated
SCRIPT       : 17962076 | 17962.1 | Delayed

The values after 5 hours are all a minute or so short of 30 minutes from the previous values. Is it a coincidence that the time deltas shrink at roughly the same time that CallLater stops working without the FPS limit?

Is the GetTime() value stored internally as a 32-bit float? That could explain the issue, since 32-bit floats have only 24 bits of mantissa and therefore integer values above 16777216 can lose precision when converted to 32-bit floats.