Page MenuHomeFeedback Tracker

Network message XXXXXX is pending (SOLVED !)
Closed, ResolvedPublic

Description

Arma 3 dedicated server start freezing and as result kicked all players with messages BattlEye: Query Timeout and BattlEye: Client Not Responding. After that mission continue working, players can reconnect and continue playing until next freeze. No mission, settings or server hardware were changed when issue start occurs.

After 2 months of investigation I found what next things are not related: mods usage (with/withot extDB3, without mods at all), operation systems (Windows/Linux), CPU vendor (Intel/AMD), Virtualization(QEMU, direct hardware usage), Arma 3 settings and startup parameters.

With latest profiling build (200_146952 v06) BattlEye: Query Timeout is gone. But players still kicked with BattlEye: Client Not Responding, game freezes at this time and Network message XXXXXX is pending still in logs.

Server and HC placed on one machine with 12 cores (24 threads). HC handle AI groups without vehicles and their waypoints, nothing else. At freeze time CPU load is about 35-45%, disk (nvme) used as usual and network has a huge peak. This issue could occurs even if players run in the field with no AI or other objects near this place.
All time server and HC FPS is about 98-100 (limited to 110). Players FPS about 40-50.

Detailed logs show next:

Server: Network message 779b01 is pending (player: HC1, (2:2285) - B Charlie 2-1:2)
Server: Network message 779b01 is pending (player: HC1, (2:2283) - B Charlie 2-1:1)
Server: Network message 779b01 is pending (player: HC1, (2:118) - 21aeac99600# 1813719: pier_f.p3d)
Server: Network message 779b01 is pending (player: HC1, (2:1385) - 21aeebc4b80# 1814887: lampharbour_f.p3d)
Server: Network message 779b02 is pending (player: HC1, (2:1585) - 21aeeadf580# 1815087: lampharbour_f.p3d)
Server: Network message 779b02 is pending (player: HC1, (2:1441) - 21aeeb83580# 1814943: lampharbour_f.p3d)
Server: Network message 779b02 is pending (player: HC1, (2:1449) - 21aeeb7a080# 1814951: lampharbour_f.p3d)

Strange thing here is indexes, they are not unique. For example, look at 779b01 - one message index related to groups, pier and lampharbour.
After 1-2 seconds indexes become unique:

Server: Network message 779d58 is pending (player: HC1, (2:2442) - 21b03bc4b80# 1818853: dummyweapon.p3d)
Server: Network message 779d64 is pending (player: HC1, (2:1952) - 21af6770100# 1815730: czechhedgehog_01_new_f.p3d)
Server: Network message 779d75 is pending (player: HC1, (2:348) - 21aeb34a080# 1813949: portablelight_02_double_f.p3d)
Server: Network message 779d7e is pending (player: HC1, (2:59) - 21ae838cb80# 1813660: runway_edgelight.p3d)
Server: Network message 779d83 is pending (player: HC1, (2:530) - 21aeb194b80# 1814131: tentlamp_01_standing_f.p3d)
Server: Network message 779d87 is pending (player: HC1, (2:379) - 21aeb256080# 1813980: mil_wallbig_4m_f.p3d)

So server try to send information to headless client (player: HC1) but it's failed to send because something huge eat all network bandwith.

Let's look players networking diagnostics at freeze time (-networkDiagInterval=30):

*** Start: Players Networking Diagnostics ***
  HC1 bandwidth=<16777216,16777216,16777216> ping=<0,0,0> desync=0
  Хьюи  bandwidth=<11951,12443,12195> ping=<60,65,62> desync=0
  Sezam Sky bandwidth=<10777,10997,10777> ping=<73,74,73> desync=0
  DiDra bandwidth=<5039,7987,7825> ping=<55,58,56> desync=0
  SOL bandwidth=<5181,9729,9535> ping=<51,53,52> desync=0
  Talos bandwidth=<5175,9642,9449> ping=<81,82,82> desync=0
  keiZ bandwidth=<4206,9709,9514> ping=<59,60,59> desync=0
  ToxaBes bandwidth=<3768,9604,9411> ping=<1,1,1> desync=0
  [RT] Bit-RamoN bandwidth=<4508,9812,9812> ping=<42,43,42> desync=0
  gr1nv1ch bandwidth=<5115,10575,10363> ping=<31,31,31> desync=0
  Banka_RedBulla bandwidth=<3866,9961,9762> ping=<48,52,49> desync=0
*** End: Players Networking Diagnostics ***

All fine, even HC1. So it's something very private, between server and headless client.

Next, look at network usage on Server at freeze time (-NetStrStatsExport):

-----------------------------------------------------------------------------------------------------------------------------------------------------------------
         player | sent msg | sent msg/s |     sent B |   sent B/s | rec msg | rec msg/s |     rec B |   rec B/s |        BW min |        BW max |        BW avg |
 headlessclient |   562739 |     346.24 |  148883164 |  407146.09 | 1409146 |    425.54 | 302626726 |  86533.77 | 2147483648.00 | 2147483648.00 | 2147483648.00 |
           Хьюи |   240664 |      77.59 |   96131803 |  353779.06 |   12750 |     11.70 |   1451879 |    931.80 |     445132.00 |    1592764.00 |    1506949.00 |
      Sezam Sky |   223164 |      63.90 |   92343089 |  351080.34 |   12338 |     13.43 |   1287035 |    988.53 |     242795.00 |    1407564.00 |     839165.00 |
          DiDra |   187093 |      29.60 |   84818081 |  345087.31 |   23235 |     28.03 |   2777526 |   4039.36 |     228513.00 |    1002295.00 |     363873.00 |
            SOL |   216055 |      35.20 |   52840695 |  346318.50 |   10373 |     11.60 |   1214854 |    895.61 |     121391.00 |    1245368.00 |     374198.00 |
          Talos |   176393 |      22.53 |   44954244 |  343675.22 |   17433 |     14.10 |   2275027 |   1382.54 |     133401.00 |    1234210.00 |     373729.00 |
           keiZ |   225959 |      33.23 |   55033326 |  346607.47 |   10090 |      8.20 |   1195383 |    672.79 |     106201.00 |    1049556.00 |     303739.00 |
        ToxaBes |   209428 |      26.60 |   52106671 |  345149.81 |   11375 |      6.47 |   1332064 |    463.70 |      95102.00 |     940235.00 |     272072.00 |
 [RT] Bit-RamoN |   163201 |      34.80 |   42466062 |  346863.66 |   19485 |     18.87 |   2606303 |   1849.18 |     125256.00 |    1124727.00 |     325513.00 |
       gr1nv1ch |   217349 |      37.80 |   53861338 |  347110.66 |   16362 |     16.57 |   1980491 |   1625.49 |     142182.00 |    1276289.00 |     369415.00 |
 Banka_RedBulla |   210721 |      27.36 |   52954773 |  345186.25 |   12984 |      8.23 |   1454944 |    655.29 |     118169.00 |     964748.00 |     279172.00 |
          Total |  5324337 |     734.85 | 1688568268 | 3878004.25 | 1780211 |    562.73 | 343323830 | 100038.06 |                                               | 
-----------------------------------------------------------------------------------------------------------------------------------------------------------------

And network usage on server by object types:

-----------------------------------------------------------------------
Server - Sent
-----------------------------------------------------------------------
Type       |      msg |    msg/s |          B |          B/s | avg. B
Type_10    |    40289 |    11.13 |  704037391 |   3729904.25 | 17474.7
Type_92    |  1880702 |   801.38 |   96531615 |     41600.86 |   51.3
Type_127   |  6110441 |   458.34 |  454597656 |     30634.19 |   74.4
Type_120   |   710787 |   420.17 |   36493523 |     20708.72 |   51.3
Type_462   |   122874 |    46.26 |   54125905 |     20092.79 |  440.5
Type_113   |   899612 |   104.46 |   59327649 |      6888.47 |   65.9
Type_453   |   447791 |    48.00 |   33046968 |      3538.06 |   73.8
Type_160   |   249597 |   127.12 |    5048211 |      2535.26 |   20.2
Type_468   |    19581 |     8.00 |    6095213 |      1526.90 |  311.3
Type_126   |   119986 |    13.27 |   13068884 |      1441.34 |  108.9
-----------------------------------------------------------------------
Server - Received
-----------------------------------------------------------------------
Type       |      msg |    msg/s |          B |          B/s | avg. B
Type_127   |  2666170 |   862.04 |  196683132 |     55845.51 |   73.8
Type_114   |  1130251 |   452.97 |   29821401 |     11961.27 |   26.4
Type_113   |   323537 |    84.66 |   20276566 |      5335.08 |   62.7
Type_462   |    17097 |     5.80 |    8405195 |      2822.41 |  491.6
Type_126   |    89285 |    23.73 |    9646303 |      2599.93 |  108.0
Type_92    |   179203 |    42.23 |    9384610 |      2077.76 |   52.4
Type_453   |    18462 |    12.63 |    1510719 |      1040.86 |   81.8
Type_163   |    88241 |    34.50 |    2400551 |      1014.63 |   27.2
Type_120   |    58590 |    12.37 |    4154599 |       897.17 |   70.9
Type_130   |    31748 |    15.33 |    2046373 |       869.41 |   64.5
-----------------------------------------------------------------------

Type_10 and Type_127 are quite high in server sent part.

Object types are under witness protection program so we don't really know who is who.
But I want to beleive, what Type_127 is related to vehicles and always quite high.

Type_10 is related to public variables, and this is really too high for it. This situation with Type_10 occurs in 100% of cases when players freezes and kicked out.
I doublechecked mission - no any huge data sending via public variables or some mass sending in cycles.

I don't know answers on next questions:
1: if this issue is public variable related, why it send it to HC only?
2: why this issue occurs only with HC using no matter of amount AI on it?

Event Timeline

ToxaBes created this task.Nov 25 2020, 2:44 AM
ToxaBes updated the task description. (Show Details)
ToxaBes added a comment.EditedNov 25 2020, 2:08 PM

Added detailed logs, server.cfg, CPU usage screenshots etc.

MaxQ added a subscriber: MaxQ.Nov 25 2020, 9:46 PM
MaxQ removed a subscriber: MaxQ.

UPDATE

After adding next lines to basic.cfg (for 100Mb/s channel):
`
MinBandwidth = 90000000; 90mbit guaranteed available bandwidth
MaxBandwdith = 100000000;
more than 100mbit will never be available
`

issue become more smooth but not gone. So I start checked public variables (Type_10) on server sent via publicVaruable and found only one variable sending complex type (array of arrays) in cycle.

Example of code on server related to issue:

[] spawn {
    while {true} do {
        SOME_DATA = [];
        {
            SOME_DATA pushBack [_x,_x,_x];
        } forEach [1,2,3];   
        publicVariable "SOME_DATA"; // array of 3 arrays: [[1,1,1],[2,2,2],[3,3,3]]
        sleep 60;
    };
};

This code should send array with 3 elements each is also array (x,y,z) every 60 seconds but after 30-60 mins SOME_DATA = []; stop clearing array and it increases 3,6,9,etc. As result, this code produce huge newtwork traffic from server to headless client and kick all other players. This issue occurs only when Headless client used. Resulting array on players PCs is correct all the time.

I removing this variable and amount of Type_10 messages decreased in ~25 times. No more freezes and no players kicking.

I think this behaviour is somehow related to complex types serialization logic like serialization for array of arrays.

Solution:

  1. Use correct basic.cfg for decrasing lag
  2. Do not share complex types via network with publicVariable if you use headless clients. Use only string, number and boolean types.

Solution tested on Windows and Linux - working fine.

I want to say thank you to @Dwarden and @dedmen for their help with catching this issue. Guys, without you I'd never catch this.

ToxaBes renamed this task from Network message XXXXXX is pending to Network message XXXXXX is pending (SOLVED !).Nov 26 2020, 7:49 PM
Dwarden closed this task as Resolved.Nov 27 2020, 6:17 PM
Dwarden changed Resolution from Open to Not A Bug.