Network congestion / server chokes
Closed, Resolved | Public

Description

Since the 1.52 update I have been getting server chokes / network congestion on our A3Wasteland servers. This causes major lag and unresponsiveness.
I have updated to the perf2 binary and still get these errors.

The log is filled with lines like:

2015/10/02, 15:16:36 NetServer::SendMsg: cannot find channel #1879553408, users.card=15
2015/10/02, 15:16:36 NetServer: users.get failed when sending to 1879553408
2015/10/02, 15:16:36 Message not sent - error 0, message ID = ffffffff, to 1879553408 (Sparks)

2015/10/06, 14:40:19 Server: Network message 13a526 is pending

See attached logs.
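
To get a feel for how often each of these messages shows up in an RPT file, the lines can be tallied per message type. The following is only a rough Python sketch, assuming the log lines look like the samples above; the function name `tally_rpt_lines` and the pattern labels are made up for illustration and are not part of any Arma tooling.

```python
import re
from collections import Counter

# Patterns derived from the sample log lines in this report.
# The labels on the left are hypothetical names chosen for this sketch.
PATTERNS = {
    "cannot_find_channel": re.compile(r"NetServer::SendMsg: cannot find channel #(\d+)"),
    "users_get_failed": re.compile(r"NetServer: users\.get failed when sending to (\d+)"),
    "message_not_sent": re.compile(r"Message not sent - error \d+, message ID = \w+, to (\d+)"),
    "message_pending": re.compile(r"Server: Network message \w+ is pending"),
}


def tally_rpt_lines(lines):
    """Count how many lines match each known network error pattern."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
                break  # each line is counted under at most one label
    return counts


# Sample lines copied from the report above.
SAMPLE_LINES = [
    "2015/10/02, 15:16:36 NetServer::SendMsg: cannot find channel #1879553408, users.card=15",
    "2015/10/02, 15:16:36 NetServer: users.get failed when sending to 1879553408",
    "2015/10/02, 15:16:36 Message not sent - error 0, message ID = ffffffff, to 1879553408 (Sparks)",
    "2015/10/06, 14:40:19 Server: Network message 13a526 is pending",
]

if __name__ == "__main__":
    for label, count in sorted(tally_rpt_lines(SAMPLE_LINES).items()):
        print(f"{label}: {count}")
```

For a real log, the same function can be fed an open file object instead of `SAMPLE_LINES`, since it only iterates over lines.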

Also see ASM screendumps:
http://i.imgur.com/wDWZl7k.png (A3Wasteland)
http://i.imgur.com/gtBQHsr.png (Exile)

These drops/chokes, as seen in the screenshots, happen repeatedly at random. {F27210} {F27211} {F27212} {F27213} {F27214} {F27215}

Details

Legacy ID
3994001533
Severity
None
Resolution
Open
Reproducibility
Random
Category
Server
Additional Information

Already tried disabling mArma, which didn't help. Tried changing basic.cfg settings, didn't help. Tried disabling addons in the mission, didn't help. Now disabled ASM to see if that fixes it.

Topic on the A3Wasteland forums about this problem, showing I'm not the only one experiencing this:
http://forums.a3wasteland.com/index.php?topic=2150.0

My mission: https://github.com/LouDnl/ArmA3_Wasteland.Altis
Vanilla mission: https://github.com/A3Wasteland/ArmA3_Wasteland.Altis

Event Timeline

loudnl edited Additional Information. Oct 7 2015, 2:20 PM
loudnl set Category to Server.
loudnl set Reproducibility to Random.
loudnl set Severity to None.
loudnl set Resolution to Open.
loudnl set Legacy ID to 3994001533. May 8 2016, 12:54 PM
loudnl added a subscriber: loudnl. May 8 2016, 12:54 PM
loudnl added a comment. Oct 7 2015, 2:21 PM

Added basic.cfg

Added my RPTs and my basic config as well.

We are experiencing the same issue. We've tried running default A3 Wasteland but we are seeing this in Exile as well.

Server FPS stays steady when the server is full, always 15+ (sometimes dipping to 13), until the server randomly starts to hang. After a few minutes, and sometimes a bunch of disconnects, it starts responding again.

The issue did not exist at all before this, and we actually had more people playing then than we do now (70 vs 60).

We get the following messages in our .rpt as well; however, our logs do not tend to be filled with them.
17:33:45 NetServer::SendMsg: cannot find channel #422567950, users.card=17
17:33:45 NetServer: users.get failed when sending to 422567950
17:33:45 Message not sent - error 0, message ID = ffffffff, to 422567950 (fbr)

loudnl added a comment. Oct 7 2015, 2:47 PM

I would like to add that we get this regardless of the player count or server FPS. As seen in my ASM screenshot, the server runs at 50 FPS with only 13 players when it happens. Sometimes these chokes are really short and sometimes longer.
extDB2 stops responding as well. It really seems as if the server.exe stops all traffic, as if it hangs for a short period.

loudnl added a comment. Oct 7 2015, 3:15 PM

Disabling ASM didn't help. Still getting drops. BEC and RCON lose connection to the server as well.

Ultimate Wasteland server has the same problem. All server traffic stops completely for anywhere from 10 seconds to 2 minutes. I recently saw it freeze with only two people online when the server FPS was at 47; the freeze lasted more than 60 seconds. When the server finally resumes, the FPS starts out near 0 and works its way back up to 40-50. On the physical server itself, RAM and CPU utilization were both below 20% during the hang.

The problem appears worst immediately after a restart, when the server has high FPS and low player count. I've noticed with higher player counts and lower server FPS the freezing is less frequent.

I'm actually a software developer, and from the symptoms it would appear to me that a deadlock is occurring. I assume this is some blocking code during thread synchronization. Here is my reasoning: the system resource utilization remains normal, so it's not like an intense process is maxing out the CPU. It's more like a wait on a locked resource. Plus, it happens more often during high FPS periods, which makes sense if a "thread safe" resource were being accessed 2-4x as often.
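
To illustrate the reasoning above (a stall with low CPU usage points at waiting on a lock rather than heavy computation), here is a small Python sketch. It is only a toy model and says nothing about how the Arma server is actually implemented. Note that it demonstrates a temporary lock wait rather than a true deadlock, since the wait ends once the lock is released; the function name `measure_blocked_wait` is made up for this example.

```python
import threading
import time


def measure_blocked_wait(hold_seconds=0.3):
    """Return (wall_seconds, cpu_seconds) spent waiting for a lock held elsewhere."""
    lock = threading.Lock()
    holder_has_lock = threading.Event()

    def holder():
        # Stand-in for a thread that keeps a shared resource locked for a while.
        with lock:
            holder_has_lock.set()
            time.sleep(hold_seconds)

    worker = threading.Thread(target=holder)
    worker.start()
    holder_has_lock.wait()  # make sure the lock is really held before measuring

    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    with lock:  # blocks here until the holder releases the lock
        pass
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start

    worker.join()
    return wall, cpu


if __name__ == "__main__":
    wall, cpu = measure_blocked_wait()
    print(f"blocked for {wall:.3f}s wall time, using {cpu:.3f}s CPU time")
```

The blocked thread spends most of the hold time waiting while consuming almost no CPU, which is the same pattern as a server that stops responding while RAM and CPU utilization stay low.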

GamersInc servers have the same problem.

We from SAD have the same issue. I ran a test with an empty server and let it run for about 24 hours. There was not a single server freeze so that makes me think it is caused by players.

loudnl added a comment. Oct 9 2015, 2:28 PM

From what I see it seems to happen more on Altis than on Stratis. But I could be wrong.

I have 10,000+ lines like this:
11:37:30 No player found for channel 982774144 - message ignored
Usually preceded by:
11:32:26 NetServer::finishDestroyPlayer(178633756): DESTROY immediately after CREATE, both cancelled

Seems to only happen on A3Wasteland servers. Our Exile servers do not suffer this problem.

This is still an issue, even with the newest perf binary and tbbmalloc.

Jermin added a subscriber: Jermin. May 8 2016, 12:54 PM

I'm running a Linux server with A3W. I'm getting the same lines in the log. The FPS is at least 25% less than in 1.50, even with the perf v4 binary.

This is still an issue! Tried disabling mission-related addons, scripts, missions, etc. It doesn't seem extDB-related, as couchDB has this too.

xai added a subscriber: xai. May 8 2016, 12:54 PM
xai added a comment. Oct 31 2015, 7:21 PM

I have the exact same issue with an Arma 3 Epoch server, on both Windows and Linux.

Is anyone still getting the lag and desync chokes? We switched to v8 of the perf binaries which pretty much fixed it. When we switched to v9 the desync and lag came back. So the fix for us was somewhere in v8.

https://github.com/A3Armory/ArmA3_Wasteland.Altis

Jermin added a comment. Dec 4 2015, 1:59 PM

This issue remains in 1.54.

dedmen closed this task as Resolved. May 18 2020, 11:09 AM