[lm-sensors] Fw: Processes causing CPU to overheat

Craig Sylla csyllac at gmail.com
Tue Aug 23 18:44:45 CEST 2005


From your description, this only happens at 100% CPU load.  If that is
correct, this sounds like the system is just not cooling well enough. 
It could be a bad/inadequate fan, an insufficient heatsink, bad case
airflow, etc.

There is a program called CPUburn that will give you instant 100% load
and push the CPU temp as hard as possible.  I would suggest Googling
for it; I use it for stress-testing my watercooled box when I do
overclocking experiments.  That will give you an on-demand way to
cause a temperature rise.  Then I'd check the system and see if it's
actually dropping the fan RPM or if the stock HSF set is just not up
to the task.  I've seen off-the-shelf name-brand systems that were
actually not capable of running 100% load full-time due to poor case
design/layout.  Mini-ITX fanless boards are a good example of this.
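
If CPUburn is not at hand, the same idea can be roughed out in a few
lines of Python.  This is only a sketch of an on-demand load generator,
not CPUburn itself: plain busy loops will not heat the CPU as hard as
CPUburn's hand-tuned code, and the function names burn() and
load_all_cores() are made up for this example.

```python
import multiprocessing
import os
import time


def burn(seconds):
    """Spin in a tight loop for roughly `seconds`; return the iteration count."""
    deadline = time.monotonic() + seconds
    count = 0
    while time.monotonic() < deadline:
        count += 1
    return count


def load_all_cores(seconds):
    """Run one busy-loop worker per logical CPU and wait for all of them."""
    workers = os.cpu_count() or 1
    with multiprocessing.Pool(workers) as pool:
        return pool.map(burn, [seconds] * workers)
```

Calling load_all_cores(60) from a script (under the usual
`if __name__ == "__main__":` guard) should peg every core for about a
minute, which is long enough to watch the temperature and fan readings
respond.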

I use sensors just by loading the modules and checking the /sys files,
although that requires knowing the chips and what the driver code puts
into each file entry.  I've never seen lm-sensors leave a perl script
running, even in normal use.
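
For what it's worth, here is a rough sketch of reading those files
directly.  It assumes the hwmon sysfs layout of current kernels
(/sys/class/hwmon/hwmon*/), where temp*_input holds millidegrees
Celsius and fan*_input holds RPM; older kernels may put the files
elsewhere, and which channel is actually the CPU depends on the chip
and board.  The function names are only for illustration.

```python
import glob
import os


def read_text(path):
    """Return the stripped contents of a sysfs file, or None if unreadable."""
    try:
        with open(path) as handle:
            return handle.read().strip()
    except OSError:
        return None


def read_sensors(root="/sys/class/hwmon"):
    """Return (chip, file, value, unit) tuples for temperatures and fans."""
    readings = []
    for chip_dir in sorted(glob.glob(os.path.join(root, "hwmon*"))):
        # The "name" file identifies the driver; fall back to the directory name.
        chip = read_text(os.path.join(chip_dir, "name")) or os.path.basename(chip_dir)
        for path in sorted(glob.glob(os.path.join(chip_dir, "temp*_input"))):
            raw = read_text(path)
            if raw is not None and raw.lstrip("-").isdigit():
                # Assumed unit: millidegrees Celsius, converted to degrees.
                readings.append((chip, os.path.basename(path), int(raw) / 1000.0, "C"))
        for path in sorted(glob.glob(os.path.join(chip_dir, "fan*_input"))):
            raw = read_text(path)
            if raw is not None and raw.isdigit():
                readings.append((chip, os.path.basename(path), int(raw), "RPM"))
    return readings


if __name__ == "__main__":
    for chip, name, value, unit in read_sensors():
        print(f"{chip:12s} {name:14s} {value} {unit}")
```

Running it in a loop while the box is under load shows whether the fan
RPM really drops as the temperature climbs.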

Good luck!
Craig


On 8/22/05, Jon Roland <jroland at linux-migration.net> wrote:
> Thanks for your response. No, it's not dust. Problem hasn't recurred, so I can't
> open the case to see if the fan has stopped or slowed when the temp begins to
> rise, but that is an obvious next move if it does.
> 
> Of course, mechanical problems would not correlate with CPU load, processes, and
> disk load, stopping the CPU fan (or increasing power to the CPU) when the nice
> usage (not system or user) pegs at 100%, and dropping back to normal when the
> first perl process is killed, which also causes all the other perl processes to
> end, and the nice usage to drop to zero. Nothing else causes the CPU temp to rise.
> I have done this process kill twice now, so it is not a coincidence.
> 
> I don't know why my log file is truncated at 80 columns, which hides further
> information about those perl processes.
> 
> Phil Edelbrock wrote:
> >
> > Weird.  Something I would check is to make sure that dust hasn't caused
> > the CPU fan to slow (or stop).  I had problems with a server this
> > summer which had lots of dust, and a slightly mis-installed CPU
> > heatsink.  The result was rapidly rising temps (and hangs) when the
> > computer was loaded.
> >
> > I think you need to look inside the server to see just what is going on
> > there at the heatsink and CPU.  Are things installed right?  Is there
> > dust or something causing the heatsink/fan to not work properly?  Is
> > the environment the server is in OK (temp and humidity should be low)?
> > Heatsinks sometimes have a piece of plastic which needs to be peeled
> > off the heatsink compound before installation.
> >
> > BTW- I hope you know what's causing the strange loading (those Perl
> > processes?).  If that's unusual, then you could have some nasty spammer
> > or somebody using your box to do evil things.  BBTW- Your top output
> > doesn't show much because it's cut off on the right side, where
> > things are just getting interesting.  Oh, and I have no idea why sensor
> > values disappear unless the driver or something is being unloaded(?).
> >
> >
> > Phil
> 
> --
> 
> ----------------------------------------------------------------
> Linux Migration Network   7793 Burnet Road #37, Austin, TX 78757
> 512/374-9585 www.linux-migration.net jroland at linux-migration.net
> ----------------------------------------------------------------
> 



