During normal gameplay, generally after roughly 2-3 hours of playtime, the server process crashes with a segfault reported in the OS logs. The Arma server itself reports no fault, and the RPT files contain no record of the crash, since the operating system simply terminates the process (SIGSEGV). Nothing about the gameplay at the time, the server load, or any player action appears to contribute to the crashes. They seem purely random, except that they only occur after roughly 2-3 hours of playtime.
Typical log in /var/log/messages:
Apr 29 14:20:11 arma303 kernel: arma3server[2760]: segfault at 4b97e500 ip 0000000009b6f64f sp 00000000d03fe0b0 error 4 in arma3server[8048000+2693000]
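For anyone wanting to decode that kernel line, here is a minimal Python sketch. It assumes the usual x86 Linux "segfault at ..." message layout and the standard x86 page-fault error-code bits (bit 0 = page was present, bit 1 = write access, bit 2 = user mode). The names SEGFAULT_RE and decode_segfault are my own, not from any tool.

```python
import re

# Layout of the x86 Linux "segfault at ..." kernel message, as in the log line above.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def decode_segfault(line):
    """Return a dict describing one kernel segfault line, or None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m["err"], 16)  # the kernel prints the error code in hex
    info = {
        "pid": int(m["pid"]),
        "fault_addr": int(m["addr"], 16),
        "ip": int(m["ip"], 16),
        "page_present": bool(err & 1),             # 0 = page not mapped
        "access": "write" if err & 2 else "read",
        "mode": "user" if err & 4 else "kernel",
    }
    if m["base"]:
        # Offset of the faulting instruction inside the mapped object.
        info["offset_in_object"] = info["ip"] - int(m["base"], 16)
    return info


line = ("Apr 29 14:20:11 arma303 kernel: arma3server[2760]: segfault at 4b97e500 "
        "ip 0000000009b6f64f sp 00000000d03fe0b0 error 4 in arma3server[8048000+2693000]")
info = decode_segfault(line)
print(info["access"], info["mode"], info["page_present"], hex(info["offset_in_object"]))
```

For the line above, this reads "error 4" as a user-mode read of an unmapped page, and places the faulting instruction at offset 0x1b2764f inside the arma3server mapping.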
This crash is mission and mod independent - I've witnessed these crashes on the following missions:
- BECTI Warfare
- Community made "classic" Escape (not the Bohemia one: you start in the Hesco prison)
- Antistasi
- Antistasi RHS (modded)
The server(s):
Virtual Machine(s) running on ESX6.5
Underlying physical host is an HP ProLiant DL360 Gen7 rackmount server (2x 4-core 2.13GHz CPUs, 124GB RAM).
VM equipped with 2x CPU (2.13 GHz), 8GB RAM, 64GB Disk (LVM, all virtual drives sitting on directly attached physical drive)
OSes tested:
- CentOS Linux release 7.7.1908 (Core)
- CentOS Linux 8
- CentOS Linux 7 (AltArch 32 bit i386)
- Debian 10 Buster
Packages installed that differ from base minimal installation:
- ld-linux.so.2 libstdc++.so.6 open-vm-tools python3 sysstat
The arma3server executable is run under standard user privileges with SELinux off (both permissive and disabled were tested), using the following startup line:
/home/arma303/arma3/arma3server -name=arma303server -port=2312 -config=/home/arma303/arma3/server.cfg -mod=@RHSAFRF\;@RHSGREF\;@RHSUSAF >> serverconsolelog.log 2>&1
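Since the server itself logs nothing when it dies, one option is to launch it from a small wrapper that records how the process ended. The following is only a sketch of that idea, not what I am actually running: describe_exit, enable_core_dumps and run_and_report are hypothetical names, and the command and log path in the usage comment are just the ones from the startup line above.

```python
import resource
import signal
import subprocess


def describe_exit(returncode):
    """Turn a subprocess return code into a readable description."""
    if returncode < 0:
        # On POSIX, a negative return code means the child was killed by that signal.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with status {returncode}"


def enable_core_dumps():
    """Raise the soft core-size limit to the hard limit (runs in the child before exec)."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))


def run_and_report(cmd, log_path):
    """Run cmd, append its stdout/stderr to log_path, and describe how it ended."""
    with open(log_path, "ab") as log:
        proc = subprocess.run(
            cmd, stdout=log, stderr=subprocess.STDOUT, preexec_fn=enable_core_dumps
        )
    return describe_exit(proc.returncode)


# Example usage (paths taken from the startup line above):
# print(run_and_report(
#     ["/home/arma303/arma3/arma3server", "-name=arma303server", "-port=2312",
#      "-config=/home/arma303/arma3/server.cfg", "-mod=@RHSAFRF;@RHSGREF;@RHSUSAF"],
#     "serverconsolelog.log"))
```

With a wrapper like this, a segfault would show up as "killed by SIGSEGV" with a timestamp of your choosing, rather than only appearing in /var/log/messages.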
RCON and server management performed by py3rcon (revolving MOTD messages, automated shutdown and such).
Headless clients appear to make no difference to crash frequency.
Tests performed to isolate the issue:
We've tried simply logging back onto the server after the crash - usually the server will crash at some point within the next 1-3 hours.
Tried rebooting the VM after the crash - no change to crash frequency.
Tried assigning the VM more memory, up to 16GB - no change.
Tried removing all mods that aren't RHS (generally 1 or 2 additional at most), leaving only USAF, AFRF, SAF and GREF - no change.
Tried removing persistent saves (such as the Antistasi save) and re-starting the campaigns/missions from scratch - no change.
Tried deleting the entire Arma 3 server install folder except server.cfg and reinstalling from scratch via steamcmd - no change.
Tried running the server with no RCON interaction at all - no change.
Tried creating a new server completely from scratch and copying meaningful files over (like server.cfg, saves, etc) and re-running server - no change.
Tried moving virtual disks (VM) between NAS host and directly connected high performance disks - no change.
Physically reseated the RAM in the underlying physical host server.
Ran a full memtest on the underlying physical host server: all 120GB of memory was checked over almost 24 hours, with 0 errors found. No change.
The only thing I have been able to do to prevent the crashes is to run the mission on a Windows Server (2016 Datacenter) machine using the x64 executable. In fact, the exact same server configuration that was crashing every 3 hours has run stably for more than 24 hours just by moving it to Windows.
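GDB's "info registers" output shows several registers (eax, ecx, edx, ebp, edi) holding values at or above 0x80000000, i.e. beyond the upper limit of a signed 32 bit integer, which is why GDB prints them as negative numbers.
The following GDB output is from a crash on the 29th April 2020 (crash04 is the associated log below).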
[root@arma303 arma3]# gdb arma3server core.2746
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-115.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/...
Reading symbols from /home/arma303/arma3/arma3server...Missing separate debuginfo for /home/arma303/arma3/arma3server
Try: yum --enablerepo='*debug*' install /usr/lib/debug/.build-id/a8/b7592a547af6c97391830d206a6133a579ebee.debug
(no debugging symbols found)...done.
[New LWP 2760]
[New LWP 2747]
[New LWP 2761]
[New LWP 2773]
[New LWP 2762]
[New LWP 2772]
[New LWP 2763]
[New LWP 2764]
[New LWP 2765]
[New LWP 2767]
[New LWP 2771]
[New LWP 2774]
[New LWP 3478]
[New LWP 2751]
[New LWP 2746]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Missing separate debuginfo for /home/arma303/arma3/libsteam_api.so
Try: yum --enablerepo='*debug*' install /usr/lib/debug/.build-id/16/bcac040eda9bb32cd8504be30a429c9aa92331.debug
Missing separate debuginfo for /home/arma303/arma3/steamclient.so
Try: yum --enablerepo='*debug*' install /usr/lib/debug/.build-id/12/8bd9ee9c294925ea030d60506e3cc16422bcf7.debug
Missing separate debuginfo for /home/arma303/arma3/libsteam.so
Try: yum --enablerepo='*debug*' install /usr/lib/debug/.build-id/63/b30c1969f3486fe8711c54d21fdb29ac80bf19.debug
Core was generated by `./arma3server -name=arma303server -port=2312 -config=server.cfg -mod=@rhsafrf;@'.
Program terminated with signal 11, Segmentation fault.
#0 0x09b6f64f in ?? ()
Missing separate debuginfos, use: debuginfo-install glibc-2.17-292.el7.i686 libgcc-4.8.5-39.el7.i686 libstdc++-4.8.5-39.el7.i686
(gdb) bt
Python Exception <class 'gdb.MemoryError'> Cannot access memory at address 0x80000004:
(gdb) info registers
eax 0xcb97e400 -879238144
ecx 0x80000000 -2147483648
edx 0x80000100 -2147483392
ebx 0xa6f89d0 175081936
esp 0xd03fe0b0 0xd03fe0b0
ebp 0x80000000 0x80000000
esi 0x10 16
edi 0xde1f63e0 -568368160
eip 0x9b6f64f 0x9b6f64f
eflags 0x10286 [ PF SF IF RF ]
cs 0x23 35
ss 0x2b 43
ds 0x2b 43
es 0x2b 43
fs 0x0 0
gs 0x63 99
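As a quick sanity check on those register values, the sketch below redoes the arithmetic in Python. It is only arithmetic on the numbers from the GDB output and the kernel log line; without a disassembly of the instruction at eip I cannot say which addressing mode was really used, so treat any match as a hint rather than a diagnosis. The helper names are my own.

```python
MASK32 = 0xFFFFFFFF
INT32_MAX = 0x7FFFFFFF

# General-purpose register values copied from the "info registers" output above.
regs = {
    "eax": 0xCB97E400, "ecx": 0x80000000, "edx": 0x80000100, "ebx": 0x0A6F89D0,
    "ebp": 0x80000000, "esi": 0x00000010, "edi": 0xDE1F63E0,
}
FAULT_ADDR = 0x4B97E500  # "segfault at" address from the kernel log line
BT_ADDR = 0x80000004     # address GDB could not read while running "bt"


def to_signed32(value):
    """Interpret a 32-bit value as a signed integer, as GDB does in its second column."""
    return value - (1 << 32) if value > INT32_MAX else value


def matching_pairs(registers, target):
    """Find register pairs whose sum, wrapped to 32 bits, equals target."""
    names = sorted(registers)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if (registers[a] + registers[b]) & MASK32 == target
    ]


over_limit = sorted(name for name, value in regs.items() if value > INT32_MAX)
print("above signed 32-bit max:", over_limit)
print("eax as signed:", to_signed32(regs["eax"]))
print("pairs summing to fault address:", matching_pairs(regs, FAULT_ADDR))
print("ebp + 4 == bt address:", (regs["ebp"] + 4) & MASK32 == BT_ADDR)
```

Two things fall out of this. First, eax + edx wraps around 32 bits to exactly 0x4b97e500, the faulting address in the kernel log. Second, the address GDB failed to read during "bt" is ebp + 4, which is where a frame-pointer based unwind would look for the return address. Both are consistent with (but do not prove) pointers or offsets of 0x80000000 and above ending up where the 32 bit binary did not expect them.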