On Wed, Aug 23, 2006 at 12:32:23PM +1000, Stephen Thorne wrote:
> For correctness, "int alarm_count;" should be "sig_atomic_t
> alarm_count;".

Thanks, I've taken this.

I've propagated the SIGALRM changes to the other robots and run a few tests, with no problems observed.

When it came time to propagate the changes to ntserv, the main loop wasn't just a simple pause(2), it was a select(2). This introduces the well-known race condition. See man select ...

"Suppose the signal handler sets a global flag and returns. Then a test of this global flag followed by a call of select() could hang indefinitely if the signal arrived just after the test but just before the call."

In the case of ntserv, the select had a timeout, so people may have seen occasional two-second pauses when this happens.

The popular solution I've seen in other projects is a signal pipe, which is a fifo into which the signal number is written when the signal handler is called. The select then includes the fifo in the set of file descriptors to monitor.

This is implemented now; look at ntserv/sigpipe.c in my repo, and the changes to ntserv/input.c (the main event loop).

-- 
James Cameron    mailto:quozl at us.netrek.org     http://quozl.netrek.org/