Acrodania
Joined: 02 Jan 2005 Posts: 208
|
Posted: Tue May 31, 2005 21:56 Post subject: Odd crashes |
|
|
System specs:
Win2K Pro-custom load with very little running
MySQL 4.0.18
Intel P4 1.8 on Intel 845 main board
512 Megs ram
Issue:
I just started working on the mod/world again after 6 months off. I wanted to use the new object-storing functions to implement new features. During testing everything seemed fine on my test system. I updated NWNX on my main server and fired up the module on it. Again everything seemed fine: no errors in the log files (with debugging turned all the way up), and the server was stable but used more CPU (30% when empty) than previously. I attributed that to the changes in the module.
Since then NWServer has crashed every 4-6 hours. There are no errors and nothing in the NWNX-ODBC or NWServer log files. Checkpoints in scripts do not show any that start and don't finish. Nothing in the Win2K log files. It doesn't matter whether the server is empty or there are players on. I could discern no pattern other than time. RCO/SCO was turned off.
I do have LetoScript (vers. 18) and nwnx_functions loaded. They are used ONLY during server start to populate NPCs and set up player corpses. They are not accessed at all after the first 5 minutes of server uptime.
Server resource usage did not seem to rise, but I have been unable to monitor it at the moment of a crash. I also tried both ODBC and Direct MySQL connections, with no change in crash behavior.
After three days of fighting with the module and making numerous changes, I rolled NWNX back to a previous version I was using before and have had no further crashes in the last 24 hrs. CPU usage is back down to the 5-13% (empty) that it was before the updates and the module feels smoother. I don't know what version of the NWNX plugins I am currently running, as my FTP client insists on altering their dates, but the sizes are as follows:
NWNX-MODULE.DLL -- 57,344
NWNX_ODBC.DLL -- 98,304
NWNX_FUNCTIONS.DLL -- 40,960
Any ideas? I would really like to convert my custom NPC spawn/creation scripts from StoreCampaignObjects to NWNX.....
//-----------------edited to add----------------
The server and module, along with the previous version of NWNX, have been running for over a year and a half and have never had any stability issues with player loads up to 15 at a time. I cannot remember it ever crashing except when I had a runaway army spawn script that spawned and levelled 150 creatures to 15 CR at one time on an 8x8 map...... |
|
Lokey
Joined: 02 Jan 2005 Posts: 158
|
Posted: Wed Jun 01, 2005 6:43 Post subject: |
|
|
Latest version of Leto (23beta4) is here. It has new syntax for some functions, but I can't recall any heinous crash bugs from version 18 (first NWNxLETO release).
Does nothing in the log files mean nothing there at all (look at permissions stuff) or nothing obvious? Maybe something is running out of control...it's hard to make NWN crash reliably. If everything is set up all right (I'd clean install all the latest versions of the NWNx apps you use)...I'd guess something running out of control (recursion that doesn't terminate). _________________ Neversummer PW NWNx powered mayhem |
|
Acrodania
Joined: 02 Jan 2005 Posts: 208
|
Posted: Wed Jun 01, 2005 8:20 Post subject: |
|
|
Lokey wrote: | Latest version of Leto (23beta4) is here. It has new syntax for some functions, but I can't recall any heinous crash bugs from version 18 (first NWNxLETO release).
Does nothing in the log files mean nothing there at all (look at permissions stuff) or nothing obvious? Maybe something is running out of control...it's hard to make NWN crash reliably. If everything is set up all right (I'd clean install all the latest versions of the NWNx apps you use)...I'd guess something running out of control (recursion that doesn't terminate). |
Leto is running fine, even with my older version of NWNX, as is NWNX_FUNCTIONS. The syntax for the newer Leto has changed too much for me to convert right now; I only just figured out how to reliably do what I want with the version 18 syntax.
By nothing in the logs I meant nothing unusual. With checkpointing and log searches I can tell that every script that runs completes, and no errors are showing up in the logs (NWN or System). Not even Dr. Watson fires when NWServer stops responding; it just stops. Plus, whenever it does stop and NWNX tries to restart the server, I get the infamous "Port in use" errors.... CPU usage was up as much as 30% higher even with the server empty.
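For anyone hitting the same "Port in use" symptom, here is a minimal sketch of a check that could be run before a restart attempt. It is not part of NWNX. It assumes the server listens on UDP port 5121 (the usual NWN default; the port is not stated in this thread, so treat it as an assumption), and the helper names `udp_port_is_free` and `wait_until_free` are made up for illustration.

```python
import socket
import time


def udp_port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a UDP socket can be bound to (host, port) right now."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        probe.bind((host, port))
    except OSError:
        # Binding failed, most likely because another process still holds the port.
        return False
    finally:
        probe.close()
    return True


def wait_until_free(port: int, timeout_s: float = 60.0,
                    poll_s: float = 1.0, host: str = "0.0.0.0") -> bool:
    """Poll until the port can be bound or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if udp_port_is_free(port, host):
            return True
        time.sleep(poll_s)
    return udp_port_is_free(port, host)


if __name__ == "__main__":
    NWN_PORT = 5121  # assumption: default NWN server port
    if wait_until_free(NWN_PORT, timeout_s=30.0):
        print("Port is free; a restart should be able to bind it.")
    else:
        print("Port still in use; the old server process is probably hung.")
```

If the port never frees up, that would suggest the old NWServer process is still alive but hung, which matches Dr. Watson staying silent.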
I tried downloading the latest version of NWNX fresh: same issues. I extracted the previous build from January out of my archive directory: same issue. The copy I am running now (I don't know which version; I'm just glad I copied the entire NWN directory when I made major alterations last December) has now run perfectly for 20 hours and the module is much smoother. Memory creep has been minimal and CPU usage hasn't risen with the server lightly loaded (6 players this evening). This module in its previous form ran for over 6 weeks without reboots or lag with the version of NWNX that is currently running, and almost all script changes have been in the OnLoad parts of the system.
Guess I will just stick with this setup until the next version of NWNX comes out and I can test it.....
//------------------------edited to add--------------------
{sighs} I feel like an idiot......
According to the log files, the version that is working fine for me is:
NWNX 2.5.3
ODBC 0.8.8 |
|
Papillon x-man
Joined: 28 Dec 2004 Posts: 1060 Location: Germany
|
Posted: Wed Jun 01, 2005 21:07 Post subject: |
|
|
The root of this problem might be hard to find. If I were you, I would not wait for a new version of NWNX that magically solves all troubles - since I do not know what is causing the problems on your system, I can do nothing to fix it.
The oddest thing is the 30% CPU usage that you see. Does that happen with nwnx and the odbc plugin alone (i.e. with no other plugins active)? Does it happen with the aps_demo module? Did you check the profiler for any apparent script problems (that maybe only occur with the current nwnx)? You could also enable the highest profiler logging option to see which scripts ran last before a crash occurs.
If you are running ODBC 0.8.8, it is already very close to the current version, minus the SCORCO hooks, which makes it even stranger. Please try to enable the SCORCO hooks and see if that changes anything. You could also temporarily use SQLite instead of MySQL.
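One way to organise these checks is to change a single factor per test run, so that any change in crash behaviour can be attributed to that factor. The following is only a sketch of that bookkeeping; the dictionary keys and the helper name `one_change_at_a_time` are made up for illustration, and the baseline values are assumptions about the setup described in this thread.

```python
# Assumed baseline: the setup that currently crashes.
BASELINE = {
    "extra_plugins": True,    # Leto and Functions loaded alongside ODBC
    "module": "production",
    "scorco_hooks": False,
    "database": "MySQL",
}

# The alternative value to try for each factor, one at a time.
VARIATIONS = {
    "extra_plugins": False,   # nwnx and the odbc plugin alone
    "module": "aps_demo",
    "scorco_hooks": True,
    "database": "SQLite",
}


def one_change_at_a_time(baseline, variations):
    """Return (changed_factor, config) pairs that each differ from baseline in one key."""
    runs = []
    for factor, value in variations.items():
        config = dict(baseline)  # copy so the baseline itself is never modified
        config[factor] = value
        runs.append((factor, config))
    return runs


if __name__ == "__main__":
    for factor, config in one_change_at_a_time(BASELINE, VARIATIONS):
        print(f"Run with only '{factor}' changed: {config}")
```

Each run should be left up well past the usual 4-6 hour crash window before it is counted as stable.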
These are just some ideas off the top of my head. I am sure the problem can be found, but it requires some assistance from your side. _________________ Papillon |
|
Acrodania
Joined: 02 Jan 2005 Posts: 208
|
Posted: Wed Jun 01, 2005 22:09 Post subject: |
|
|
ThanX Papillon!!!
I will do some further testing and see if I can come up with anything of help. It might be a while though, as my time to work on things is very limited. And what I am running is stable, so there is really no big reason to even update if I can't find out why it's not working right.
I did try with SCORCO both active and inactive, since I wondered whether that was the cause, and got the same crashes either way. As for disabling the other plugins, I could pre-load the NPC databases and run without players on my test server, but not having them functional with others on WILL cause issues with my death system's corpses. I can run the profiler on both and see what the differences are. My test box still has the current NWNX loaded, though it isn't the same hardware. Leto, Functions and ODBC are the only plugins registered. |
|
Acrodania
Joined: 02 Jan 2005 Posts: 208
|
Posted: Sat Jun 04, 2005 8:14 Post subject: |
|
|
Update:
Over the last 4 days I have adjusted and cleaned the module up, cutting my script delta in half with the help of the profiler. The module has not crashed once during this time. The last change I made was with the new module at 11:00 pm Thursday evening.
This evening at 4:30 pm (after 17.5 hours uptime) I put the current version of the DLLs and the NWNX2 application (from a fresh download) on the server and restarted it. CPU usage was equivalent to what the older install was (yay!). After 4 hours and 35 minutes NWServer crashed. Again there were no errors in any of the log files with full debugging turned on. I made note of the script that was last to run (one of PrC's) and started the server again. 2 hours and 15 minutes later it crashed again; this time the last script was one of the default Bioware scripts. The first time there were two people on the server, the second time there were none.
I guess I stay with the old version.... |
|
Acrodania
Joined: 02 Jan 2005 Posts: 208
|
Posted: Sun Jun 05, 2005 9:10 Post subject: |
|
|
Continued update:
I re-added the current NWNX and pulled both Leto and Functions. The server stayed up for exactly 4 hours and 23 minutes, then crashed. This setup used MySQL direct connect. Three minutes before NWServer crashed it stopped talking to MySQL. When the auto-restart attempted to restart the server it came back up but didn't reconnect to the database. No errors were noted in the logs. The log files did NOT cycle; instead the server started appending current information to the end of the logs from before the crash. There were no players on at the time of the crash.
The same setup with an ODBC connection ran 4 hours 5 minutes. The system continued to talk to the database right until it went down. NWNX failed to restart NWServer due to the port conflict error. No errors were noted in the logs. Logs appeared to cycle properly. There was one player on at the time of the crash; the player was sitting in an area, not using any skills and not in combat. The last script fired was not the same as in the previous crash.
I set the system back to NWNX 2.5.3 but left the ODBC plugin at 0.9.2.4. The system has been up for 8 hours with no issues using ODBC and lightly loaded.
Full debug mode has yielded no clues in either ODBC logs or NWServer log. Win2K does not register any error logs for the crash either.
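To put numbers on the "no pattern other than time" observation, here is a small sketch that summarises the four uptimes reported so far with the current NWNX (4 h 35 min, 2 h 15 min, 4 h 23 min and 4 h 5 min). The function name `summarize_uptimes` is made up for illustration; only the durations come from this thread.

```python
from datetime import timedelta

# Uptimes before each crash with the current NWNX, as reported in this thread.
UPTIMES = [
    timedelta(hours=4, minutes=35),
    timedelta(hours=2, minutes=15),
    timedelta(hours=4, minutes=23),
    timedelta(hours=4, minutes=5),
]


def summarize_uptimes(uptimes):
    """Return (shortest, mean, longest) uptime as timedelta values."""
    mean = sum(uptimes, timedelta()) / len(uptimes)
    return min(uptimes), mean, max(uptimes)


if __name__ == "__main__":
    shortest, mean, longest = summarize_uptimes(UPTIMES)
    print(f"shortest={shortest}, mean={mean}, longest={longest}")
```

Three of the four runs fall within about half an hour of each other, which may hint at a time- or count-based trigger rather than player activity, though four samples are far too few to conclude anything.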
Any ideas?
MDAC is 2.8
Win2K SP4 with all critical service packs and most recommended.
AVG anti-virus
MySQL 4.0.18
LetoScript vers. 18
NWNX-Functions
Ultra-VNC
TeamSpeak
No other processes are running that can be turned off and still leave Win2K running.. |
|
Papillon x-man
Joined: 28 Dec 2004 Posts: 1060 Location: Germany
|
Posted: Sun Jun 05, 2005 10:11 Post subject: |
|
|
Thanks for this information.
The most interesting thing for now is that going back to nwnx 2.5.3 in combination with the current ODBC2 plugin seems to be stable. Please verify this for a couple of days.
I was thinking that maybe ODBC2 might lose the connection to your MySQL server and crash after a while, but this seems not to be the case. _________________ Papillon |
|
Acrodania
Joined: 02 Jan 2005 Posts: 208
|
Posted: Wed Jun 08, 2005 3:01 Post subject: |
|
|
Papillon wrote: | Thanks for this information.
The most interesting thing for now is that going back to nwnx 2.5.3 in combination with the current ODBC2 plugin seems to be stable. Please verify this for a couple of days.
I was thinking that maybe ODBC2 might lose the connection to your MySQL server and crash after a while, but this seems not to be the case. |
Server has now been up for over 72 hours without issues.
I just changed it from ODBC to Direct MySQL connection and will report back in a couple of days.... |
|
Acrodania
Joined: 02 Jan 2005 Posts: 208
|
Posted: Sun Jun 12, 2005 5:38 Post subject: |
|
|
100 hours straight with no failures under Direct MySQL connection.
NWNX 2.5.3
ODBC 0.9.2.4
Any other information I can give you, Papillon?
Incidentally, I was unable to duplicate this error on another machine also running Win2K with the same version of MySQL. |
|
Papillon x-man
Joined: 28 Dec 2004 Posts: 1060 Location: Germany
|
Posted: Sun Jun 12, 2005 13:31 Post subject: |
|
|
I know I am asking a lot, but could you conduct one more test, please? Just replace NWNX.EXE with the current version, without changing anything else. This test is meant to rule out false positives, i.e. to make absolutely sure that nwnx.exe itself is the root of your problem.
If it crashes within 2-8 hours, I know where to search. _________________ Papillon |
|
Acrodania
Joined: 02 Jan 2005 Posts: 208
|
Posted: Sun Jun 12, 2005 17:31 Post subject: |
|
|
ThanX! Papillon!
NWNX2.exe was replaced at 8:22 PST this morning. The logs still show the older version; I take it that's because of the .dlls? We will see how it runs through the day.
Thank you for all your work. I can't imagine running a server without NWNX and its plugins; everything else just doesn't get the job done.... |
|
Papillon x-man
Joined: 28 Dec 2004 Posts: 1060 Location: Germany
|
Posted: Sun Jun 12, 2005 22:02 Post subject: |
|
|
NWNX.TXT should show the correct version, the other logfiles show the version of the corresponding DLL. Thanks for testing this. _________________ Papillon |
|
Acrodania
Joined: 02 Jan 2005 Posts: 208
|
Posted: Mon Jun 13, 2005 16:21 Post subject: |
|
|
After running for 10 hours with the ODBC connection without issues I shifted to Direct MySQL. It's still running after 12 hours.... |
|
Acrodania
Joined: 02 Jan 2005 Posts: 208
|
Posted: Tue Jun 14, 2005 3:08 Post subject: |
|
|
Update:
After 20 hours running on the ODBC connection, the server crashed inexplicably. No one was on at the time. No errors showed in the log files.
Shifted to Direct MySQL to see if it stays running longer times..... |