I have been getting intermittent server crashes on my life server and I haven't been able to see anything in the scripting that could be causing this.
Memory usage, cpu usage, and network usage all seem normal. {F26965} {F26966} {F26967} {F26968}
None that have been identified, except having people on my life server. Sometimes it survives until the restarts (which are in 5-hour windows) and sometimes it doesn't. This likely means that it's indirectly related to something in the mission, but:
1.) I have no means to debug this issue myself, since the files produced cannot be used (at least by any means I am aware of).
2.) The issue does not show up anywhere else. Nothing is thrown to the Event Viewer. No dumps are created (I have user dumps enabled for the arma3server.exe process).
It does create bidmp files, but I have no way to work with them. If there IS a way to work with them (even one meant for advanced users who can use WinDBG to debug operating system issues and applications), please point me to that information and I can debug this to save you guys some time.
Since I had to re-create an issue that was never fixed, I have put up all the files, each archive containing a bidmp and its corresponding RPT file (I have enabled RPT output since this started happening). You can get them from the following links:
https://paronity.com/i/crash1.zip
https://paronity.com/i/crash2.zip
https://paronity.com/i/crash3.zip
https://paronity.com/i/crash4.zip
https://paronity.com/i/crash5_default_malloc.zip
https://paronity.com/i/dump_default_malloc_with_mdmp.zip
https://paronity.com/i/crash_6_default_malloc.zip
https://paronity.com/i/crash_7_default_malloc.zip
https://paronity.com/i/crash_8_default_malloc.zip
https://paronity.com/i/crash_9_default_malloc.zip
It's not Windows. If it were Windows, there would be events being thrown and a memory dump would be created (because I have user-mode dumps enabled for that executable).
A memory-related crash is possible, but you, as the software that is running, are responsible for managing those memory calls and handling them should they fail.
Is there anything I can do to help further debug the issue on your end? I'm an operating system level debugger for a living and have a lot of experience in this area; I just need to know what I can do with your software to do so. I am doing everything possible from the OS level and am getting nowhere, which leads me to believe it's on the ARMA side of things.
The only thing on the box is this particular server (and the MySQL instance for it) - and the server has 32 GB of RAM.
Most of the time, the ARMA server idles around 110-200 MB of RAM (with the 3rd party malloc) and 1.4GB-1.8GB (with the default malloc), depending on player count, object count, etc.
The system NEVER gets above 25% RAM usage (if we are doing testing with another server, or something else with the box at that point in time). Point being, we aren't even coming close to running out of RAM.
https://paronity.com/i/8i8I3.png
For the sake of proving absolutely nothing, I ran a server on the box (default map with no customization) since my last post, with a tool restarting it every 12 hours, and it never crashed. So, as was already obvious, it is something in the custom code that is causing the issue. That doesn't help us (content creators and server admins) figure out what is causing the issue, though, because it's YOUR engine that isn't handling the exception correctly. What's more, your code is "handling" it enough to prevent Windows from seeing the failure (hence no user-mode dumps and no event log entries) but not enough to do its job.
I would like to be able to figure out what code is causing the issue, but you, the creator of the engine, provide me no valid way of doing so.
I poked around in the bidmp files with a hex editor and can see the error that it "thinks" is causing the issue, which is:
"Out of memory (requested 4203 KB).
footprint 536870912 KB. pages 81920 KB.
B, mapped 50040832 B), free 60440576 B"
But as I said before, the server is on a box with 32 GB of RAM (https://paronity.com/i/6Z7g3.png) and it's the only thing running. Its RAM consumption stays around the 110MB-169MB range (https://paronity.com/i/1E2l4.png).
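For anyone else who wants to pull the same message out of their own bidmp files without paging through a hex editor, here is a minimal sketch. It assumes the message is stored as plain ASCII in the file (that is how it looked in the hex editor, but nothing about the bidmp format is documented here), and the function names `find_strings` and `scan_dump` are just made up for this example.

```python
import re
from pathlib import Path
from typing import List


def find_strings(data: bytes, keyword: bytes, min_len: int = 6) -> List[str]:
    """Return printable-ASCII runs of at least min_len bytes containing keyword.

    Assumption: the text of interest is stored as single-byte ASCII.
    The keyword match is case-insensitive.
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    needle = keyword.lower()
    return [
        m.group().decode("ascii")
        for m in pattern.finditer(data)
        if needle in m.group().lower()
    ]


def scan_dump(path: str, keyword: bytes = b"out of memory") -> List[str]:
    """Read a dump file fully into memory and search it (hypothetical helper)."""
    return find_strings(Path(path).read_bytes(), keyword)


if __name__ == "__main__":
    import sys

    # Usage: python scan_bidmp.py <file.bidmp> [<file.bidmp> ...]
    for dump_path in sys.argv[1:]:
        for line in scan_dump(dump_path):
            print(f"{dump_path}: {line}")
```

This only finds text that happens to be embedded in the file; it does not parse the dump or give a call stack, so it is no substitute for a proper way to open bidmp files.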
This was the "expired" issue that was never resolved, assisted with, or debugged in any way.
Heya! You might have noticed that the issue is assigned. That means we have taken a look at the issue and analyzed the crashdump files! Our programmers are currently investigating the analyzed information from the dump files to find a fix for this issue.
I've been playing Arma 3 since March of 2014. I've paid for all your DLCs, and I've been playing on this server because they represent the Arma gaming community professionally. That said, I hope you can find a fix @Adam, because it's frustrating for me to see BI not taking this ticket seriously.
Can we please get something done about this? If this gets fixed, more Arma servers will benefit from it, making the Arma community better. Stop ignoring these reports.
90% of the time I am still seeing a crash at least ONCE per day. Sometimes it's after the server has been up for 45 minutes, and sometimes it's 3 and a half hours.
I can find no common denominator. The only thing that is odd and consistent is that it never creates an MDMP.
I still can't get Windows to catch the user-mode dump either.
I uploaded 6 files. 3 are with a memory error (http://puu.sh/lCoWf/537b0e4721.png) using malloc4tbb.
The other 3 are:
http://puu.sh/lKGv6/edfa97b073.png
http://puu.sh/lKGuS/b73151da84.png
using any other malloctbb and the other DLLs, not including malloc4tbb.
Server crashes with 60+ players at 1 hour +/- 15 minutes.
Recreated here on 2 separate servers when they have many players. I've seen as long as 2 hours or so with 30-40 players, and 0.75-1.25 hours with 60-70. When the servers have 0-15 players, we have experienced uptime upwards of 8 hours without issue.
Happening as stated since the last major patch (1.50? I don't remember).