Kernel 2.6.9 & adm1021 & i2c_piix4 temperature monitoring

Justin Piszcz jpiszcz at lucidpixels.com
Tue Nov 9 18:19:16 CET 2004


How has it been reworked? Is the option no longer valid?

I believe it was in early 2.6.x, possibly 2.6.5 or 2.6.6, that I tried
it without the option, and the machine shut itself off (because
lm-sensors etc. wrote "their" values for what is too hot, forcing the
machine to shut down).

$ sensors
max1617-i2c-0-1a
Adapter: SMBus PIIX4 adapter at 0850
Board:       +45 C  (low  =   -55 C, high =  +127 C)
CPU:         +41 C  (low  =   -55 C, high =  +110 C)

If I run without the option, the limits change to something like -20 C to
50 or 60 C. Hence, when you compile a kernel the box heats up and shuts
itself off with no warning at all; it just powers off.
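
One way to check whether the limits were rewritten is to pull the high
limit out of the sensors output and compare it against a threshold. This
is only a rough sketch: high_limit and limit_at_least are made-up helper
names, the threshold is whatever the caller picks, and it assumes the
whole-degree output format shown above (it will not cope with other
formats).

```shell
#!/bin/sh
# Hypothetical helper (not part of lm-sensors): print the "high" limit in
# degrees C for one label, reading `sensors`-style text on stdin.
# Assumes lines shaped like:
#   CPU:         +41 C  (low  =   -55 C, high =  +110 C)
high_limit() {
    awk -v label="$1:" '
        $1 == label {
            for (i = 2; i + 2 <= NF; i++)
                if ($i == "high" && $(i + 1) == "=") {
                    print $(i + 2) + 0   # numeric value of e.g. "+110"
                    exit
                }
        }'
}

# Succeed only if the high limit for label $1 is at least $2 degrees C.
# The threshold is an arbitrary example value chosen by the caller.
limit_at_least() {
    limit=$(high_limit "$1")
    [ -n "$limit" ] && [ "$limit" -ge "$2" ]
}
```

It could be used as, for example:
sensors | limit_at_least CPU 100 || echo "CPU high limit looks too low"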

I may be able to try this later this week but not right now, as the 
machine needs to remain online.

Justin.

  On Mon, 8 Nov 2004, Jean Delvare wrote:

>
> Hi Justin,
>
>> I am monitoring the temperature sensors on my Dell GX1 box, and I get the
>> following in dmesg:
>>
>> i2c_adapter i2c-0: Failed! (01)
>> i2c_adapter i2c-0: Failed! (01)
>> i2c_adapter i2c-0: Failed! (01)
>> i2c_adapter i2c-0: Failed reset at end of transaction (01)
>> i2c_adapter i2c-0: Failed! (01)
>> i2c_adapter i2c-0: Failed! (01)
>> i2c_adapter i2c-0: Failed! (01)
>> i2c_adapter i2c-0: Failed! (01)
>> i2c_adapter i2c-0: Failed reset at end of transaction (01)
>> i2c_adapter i2c-0: Failed! (01)
>> i2c_adapter i2c-0: Failed! (01)
>> i2c_adapter i2c-0: Failed! (01)
>> i2c_adapter i2c-0: Failed! (01)
>
> First note that Dell systems are well known for their improperly
> configured SMBus devices.
>
>> Of course the ``secret'' to getting the temperature sensors to work on a
>> Dell is two options when loading the i2c driver and the adm1024 driver.
>
> I guess you really mean adm1021 here?
>
>>   TYPE="/sbin/modprobe"
>>
>>   $TYPE i2c_piix4 force=1
>>   $TYPE adm1021 read_only=1
>>
>> The force is required to enable it, otherwise I do not believe it even
>> loads.
>
> True, because Dell's BIOS won't configure it.
>
> As noted in lm_sensors' documentation here:
>  http://www2.lm-sensors.nu/~lm78/cvs/lm_sensors2/doc/busses/i2c-piix4
> you should be careful when using this option.
>
> One possible cause for the trouble you experience would be that the
> piix4's address happens to conflict with another device address on your
> system. The driver should have complained if it were the case, but only
> if the other device has a Linux driver. If, say, this is an I/O address
> range the BIOS uses on its own, the i2c-piix4 driver may not know about
> this. You can use "i2cdetect -l" (or browse /sys) to find out the
> address the PIIX4 is using, as it shows in the I2C bus name.
>
> One thing you could try would be to use the force_addr parameter of the
> i2c-piix4 driver. As noted in the documentation, this is DANGEROUS, so
> you had better be extremely careful if you do. Basically, you would force
> the address to a supposedly unused address (check /proc/ioports for
> ranges to NOT use). The address must be a multiple of 16.
>
> Maybe you'll have better results when using this, maybe not. Please let
> us know. Be warned that using an improper address may cause SEVERE
> DAMAGE, and that there is unfortunately no way to know which addresses
> are suitable.
>
> See this thread:
>  http://archives.andrew.net.au/lm-sensors/msg21973.html
> Now, what you want to risk is left to you.
>
> This post:
>  https://bugzilla.redhat.com/bugzilla/long_list.cgi?buglist=73730
> suggests that 0x6000 MIGHT be a good candidate.
>
> BTW, you really should first try:
> 1* Upgrading BIOSes if possible.
> 2* Contacting Dell about the problem. This is really a BIOS issue; we may
> try to work around it in the kernel, but this is not where the original
> problem lies.
>
>> The read_only=1 is so it does not change the temperature range in the
>> monitoring chip, if read_only=1 is not set and the machine gets hot, say
>> from compiling the kernel, the machine shuts itself off.
>
> This is theoretically not necessary anymore since Linux kernel 2.6.6
> (where adm1021 chip init was reworked). Could you please try without it
> and let us know how it goes?
>
> As a side note, I think that we should get rid of that "read_only"
> module parameter. It makes the code more complex with no obvious benefit
> (assuming you'll confirm that you don't need it anymore). No other
> hardware monitoring chip driver has this.
>
>> My question is, why all the failed transactions?
>> I graph the temperature without any issues (system and CPU) but I still
>> get some of these in dmesg, any ideas?
>
> I doubt that you really have no issue. Reads fail when you have a
> "Failed! (01)" error. Since the adm1021 driver doesn't handle that
> kind of error, this should result in bogus temperature readings
> (typically -1).
>
> Thanks,
> --
> Jean Delvare
>
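
The two checks Jean describes for a force_addr candidate (a multiple of
16, and not inside any range listed in /proc/ioports) can be sketched as
below. This is only an illustration: range_is_free is a made-up helper,
and the 16-byte window it tests is an assumption. Passing it is necessary
but NOT sufficient, since, as noted above, the BIOS may use ranges that
/proc/ioports does not list.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the I/O window starting at hex
# address $1 is a multiple of 16 and overlaps no range listed in the
# /proc/ioports-style file $2 (default: /proc/ioports).
# The 16-byte window size is an assumption made for illustration.
# Returns 0 = looks free, 1 = misaligned or overlapping, 2 = bad input.
range_is_free() {
    hex=${1#0x}
    file=${2:-/proc/ioports}
    case "$hex" in
        ''|*[!0-9a-fA-F]*) return 2 ;;
    esac
    base=$((0x$hex))
    [ $((base % 16)) -eq 0 ] || return 1
    end=$((base + 15))

    # Each line looks like "0850-085f : name"; nested entries are indented.
    while read -r range rest; do
        case "$range" in
            *-*) ;;
            *) continue ;;
        esac
        lo=${range%-*}
        hi=${range#*-}
        case "$lo$hi" in
            ''|*[!0-9a-fA-F]*) continue ;;
        esac
        # Overlap test: the two closed intervals intersect.
        if [ "$base" -le $((0x$hi)) ] && [ "$end" -ge $((0x$lo)) ]; then
            return 1
        fi
    done < "$file"
    return 0
}
```

For example, "range_is_free 0x6000 && echo candidate" only prints when
nothing listed overlaps 0x6000-0x600f. Every listed range counts as in
use, including enclosing parent entries, so the result may be overly
pessimistic on some systems.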


