KernelPanic


Reboot on kernel panic
Isn't this the same as the one that Don says won't work? Or does it only work under some circumstances?

> #!/bin/sh
> #
> # /etc/rc.d/rc.local: Local system initialization script.
> #
> # entry to reboot 30 secs after a kernel panic
> echo 30 > /proc/sys/kernel/panic

> This is what I put in my rc.local on the mesh box here.
> For the rc.local entry to take effect you would need to reboot the mesh box,
> but you can make it work without rebooting by running
> echo 30 > /proc/sys/kernel/panic
> from a root login on the mesh box. I had occasional problems here with
> kernel panics, and since implementing this solution the box reboots and
> comes up cleanly after a kernel panic. Putting the above in rc.local means
> the box will always reboot on a kernel panic.
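The runtime setting described above can be wrapped in a small helper that writes the timeout and reads it back to confirm the kernel accepted it. This is only a sketch: the function name `set_panic_timeout` and the `PANIC_FILE` override are made up for illustration (the override exists so the function can be tried against a scratch file), and it assumes a POSIX shell running as root on the node.

```shell
#!/bin/sh
# Sketch: set the kernel panic reboot timeout and confirm it took effect.
# PANIC_FILE is a hypothetical override so the function can be exercised
# against an ordinary file instead of the real /proc entry.
PANIC_FILE="${PANIC_FILE:-/proc/sys/kernel/panic}"

# set_panic_timeout SECONDS
# Writes SECONDS to $PANIC_FILE, then reads the value back and compares.
set_panic_timeout() {
    secs="$1"
    # This sketch accepts only non-negative whole numbers.
    case "$secs" in
        ''|*[!0-9]*)
            echo "usage: set_panic_timeout SECONDS" >&2
            return 2
            ;;
    esac
    echo "$secs" > "$PANIC_FILE" || return 1
    current=$(cat "$PANIC_FILE") || return 1
    # Succeed only if the file now holds the value we wrote.
    [ "$current" = "$secs" ]
}

# Example (as root on the node):
#   set_panic_timeout 30
```

On systems that ship the sysctl tool, `sysctl -w kernel.panic=30` should do the same thing as the echo, and a `kernel.panic = 30` line in /etc/sysctl.conf is the usual alternative to an rc.local entry.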

>> Yeah, this doesn't work. I tried it a number of times, and when a kernel
>> panics, it's too late. Sorry I didn't report on this sooner.

This might help you guys:

>>> If you experience occasional kernel panics (machine freezes; keyboard
>>> LEDs, if attached, blink), try the following:
>>>
>>> Write the number 30 to /proc/sys/kernel/panic (which normally holds 0,
>>> meaning never reboot). This will then reset the node 30 seconds after
>>> a kernel panic.
>>> (copied from the qorvus bbs)

>>> I looked in the wiki and also searched through my emails and couldn't find anything on making a mesh AP recover from a kernel panic. I thought I had read about how to do it, but dang if I can find it. Is it the internal watchdog setting in WiANA? If it is, all mine are set to yes and always have been. I have some messages from Kenny and you on using a pager for reset, which is NOT an option for me right now.
>>>
>>> BTW, since I reduced the size of my mesh, I have gone about 6 weeks without anything really going crazy. Almost all the nodes I have now are on car batteries (at the WRAP boards) and UPSes on the mini-ITX nodes. I have one PC Engines WRAP board running dev 76 that is on a car battery and has not required a power reset in almost 5 months. The car battery is actually about 3 feet from the node and is on top of a water tower. I just use a regular 12 volt 1 amp power supply at the bottom of the tower to feed the battery and node. Just thought I would throw that in there for someone who might still be using WRAP boards.
>>>
>>> One thing I have learned is that the WRAP boards will work on any voltage in their range, BUT they DO NOT HAVE enough filtering on their power input. VERY large filter capacitors (4000 µF or larger) RIGHT at the power plug solve a lot of issues, and a battery is even better.
>>>
>>> The graphs are really cool, but right now they are way over my head technically to implement.

>>> It may be hard, but don't give up. Just work with what you have. My piece of advice is to go with Qcode. Since we went with Qcode, our crashes have almost stopped, and our total throughput and ping times are much better. We had major problems before, with nodes crashing every day and clients just hating us.
>>>
>>> One of the big fixes, which is actually pretty simple, was to have the node reboot if it hit a kernel panic. Now, instead of the whole network going down, it just has a few minutes of rebooting. Then, since I track our reboots, I can go out and deal with the hardware issues on my time, not in an emergency state.
>>>
>>> Another thing to do to really 'see' your mesh is to graph, graph, graph. I have the meshpoints FTPing much info, which I parse and then insert into an RRD database. Here are some of my graphs.
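The message above does not say how the reboots are tracked. One minimal way to sketch the idea is to append a timestamp to a log each time the node boots (for example from rc.local) and count the lines later. The log path and the function names `record_boot` and `count_boots` below are assumptions for illustration, not the poster's actual setup.

```shell
#!/bin/sh
# Sketch: keep a simple boot history so reboots can be reviewed later.
# BOOT_LOG is a hypothetical path; pick one that survives reboots.
BOOT_LOG="${BOOT_LOG:-/var/log/boot-history.log}"

# record_boot: append one UTC timestamp line per boot.
# Intended to be called once from the startup script.
record_boot() {
    date -u '+%Y-%m-%dT%H:%M:%SZ' >> "$BOOT_LOG"
}

# count_boots: print how many boots have been recorded (0 if no log yet).
count_boots() {
    if [ -f "$BOOT_LOG" ]; then
        wc -l < "$BOOT_LOG" | tr -d ' '
    else
        echo 0
    fi
}
```

A log like this cannot tell a panic-triggered reboot from a manual one; it only shows when the node came back up, which is enough to spot a node that restarts more often than expected.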

Use a pager
Bob, this too is excellent; unfortunately, we cannot recompile the kernel! Take a look at: http://www.faqs.org/docs/Linux-HOWTO/Linux-Crash-HOWTO.html. It has a small C program that calls panic to force a panic.
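If recompiling is not possible, one commonly used alternative for testing the reboot-on-panic setting is the magic SysRq trigger: on kernels built with SysRq support, writing `c` to /proc/sysrq-trigger as root forces a crash. Whether the mesh firmware's kernel has this enabled is an assumption to check first. The sketch below is a hypothetical helper, not something from the HOWTO; it refuses to run unless called with an explicit `yes`, and `SYSRQ_TRIGGER` can be pointed at a scratch file for a dry run.

```shell
#!/bin/sh
# Sketch: deliberately crash the machine to test the panic reboot timeout.
# WARNING: on a real node this takes the machine down immediately.
# SYSRQ_TRIGGER is a hypothetical override for dry runs against a plain file.
SYSRQ_TRIGGER="${SYSRQ_TRIGGER:-/proc/sysrq-trigger}"

# force_test_panic yes
# Requires the literal argument "yes" as a guard against accidents.
force_test_panic() {
    if [ "$1" != "yes" ]; then
        echo "refusing: call as 'force_test_panic yes' to crash this machine" >&2
        return 1
    fi
    # Flush pending writes to disk before crashing.
    sync
    # 'c' asks the SysRq handler to crash the system.
    echo c > "$SYSRQ_TRIGGER"
}

# Example (as root, on a node you can afford to lose for a few minutes):
#   force_test_panic yes
```

If the node comes back by itself after the configured delay, the timeout is working for this kind of crash; a hang that never reboots would match the report above that the setting does not always help.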

We all have crashes.. but nobody likes to talk about it. We have created a pager reset to deal with the