SCIENTIFIC-LINUX-USERS Archives

December 2012

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Subject:
From:
Vladimir Mosgalin <[log in to unmask]>
Reply To:
Vladimir Mosgalin <[log in to unmask]>
Date:
Wed, 26 Dec 2012 23:21:40 +0400
Content-Type:
text/plain
Parts/Attachments:
text/plain (103 lines)
Hello everybody.

For a few months I've been experiencing this problem. It was a bit hard
to track down because it usually happens only during shutdown, when the
network interfaces go down, so I just didn't notice it. A kernel panic
happens when one of the interfaces driven by the tg3 driver goes down.
"ifdown eth2" is enough to cause it.

It doesn't matter whether this interface was actively used or even
whether the link was up: I can unplug the cable, boot up (the interface
is configured to use dhcp, so it will attempt to come up and fail), then
execute "ifdown eth2" and the system will crash.

It's a bit hard to get the full crash message because this happens on
the very machine I use for remote logging. The best I can get right away
are screenshots, although some information might be missing from them.

It goes like this on shutdown (or "ifdown eth2", or "service network
restart" etc):
1) interfaces are being brought down; at some point eth2 is brought
   down
2) nothing happens for about 10 seconds, the system appears to hang
3) lots of lines with call traces appear and scroll off the screen.
   These are the last lines, which I captured in a screenshot:
   http://img202.imageshack.us/img202/5459/20121225205828.png
4) another pause of about 10 seconds
5) a kernel panic happens and more lines scroll by. Again, here are
   some of the last ones:
   http://img5.imageshack.us/img5/397/20121225205838.png
6) the system hangs completely

This happens on the latest kernel-2.6.32-279.19.1.el6.x86_64. It also
happened on 2.6.32-279.11.1.el6.x86_64 and 2.6.32-279.14.1.el6.x86_64.

It didn't happen on SL6.2 with the kmod-tg3-3.122 package installed
(the official one from the 6.2-fastbugs repository, not the one from
elrepo).

I found some information about tg3 crashes, like this
http://elrepo.org/bugs/view.php?id=315
or this
http://bugs.centos.org/view.php?id=5428
but in both cases version 3.122 of the tg3 driver solved the problem.
However, I'm already using 3.122 and still experience the crash.

The controller in question is a Broadcom NetXtreme BCM5701, the PCI-X
version, installed in a PCI-X slot of a Supermicro X7SBE. There haven't
been any hardware changes lately and it works stably. I'm pretty sure
that this bug appeared somewhere along the 6.2->6.3 upgrade or in one of
the 6.3 kernels. It's a bit hard to pin down because it shows up simply
as a "hang during reboot or shutdown", and reboots or shutdowns are rare
for this system, but I'm sure that a few months ago it rebooted and
powered off just fine.

This is the interface used for the internet connection. VLANs are not
used. There is a sixxs-based IPv6 interface in the system, configured to
work over this interface. The problem doesn't happen with the other
(Intel e1000e) network interfaces.

$ cat /etc/sysconfig/network-scripts/ifcfg-eth2 
DEVICE=eth2
BOOTPROTO=dhcp
ONBOOT=yes
TYPE=Ethernet
HWADDR=00:02:A5:E7:0A:10
PEERDNS=no
NOZEROCONF=yes
$ ifconfig eth2
eth2      Link encap:Ethernet  HWaddr 00:02:A5:E7:0A:10  
          inet addr:<skipped...>
          inet6 addr: fe80::202:a5ff:fee7:a10/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:130906804 errors:0 dropped:0 overruns:0 frame:0
          TX packets:178575110 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100 
          RX bytes:83971053482 (78.2 GiB)  TX bytes:205754543966 (191.6 GiB)
          Interrupt:52 
$ dmesg|grep '\(eth2\|tg3\)'
tg3.c:v3.122 (December 7, 2011)
tg3 0000:03:02.0: PCI INT A -> GSI 52 (level, low) -> IRQ 52
tg3 0000:03:02.0: eth2: Tigon3 [partno(253212-001) rev 0105] (PCIX:133MHz:64-bit) MAC address 00:02:a5:e7:0a:10
tg3 0000:03:02.0: eth2: attached PHY is 5701 (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[0])
tg3 0000:03:02.0: eth2: RXcsums[0] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[0]
tg3 0000:03:02.0: eth2: dma_rwctrl[76db000f] dma_mask[64-bit]
ADDRCONF(NETDEV_UP): eth2: link is not ready
tg3 0000:03:02.0: eth2: Link is up at 100 Mbps, full duplex
tg3 0000:03:02.0: eth2: Flow control is on for TX and on for RX
ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
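
For anyone comparing driver versions across kernels, the banner line in
the dmesg output above ("tg3.c:v3.122 ...") can be pulled out with a
small shell helper. This is only a sketch: the function name tg3_version
is made up for illustration, and it assumes the banner keeps the
"tg3.c:v<version>" form shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of any package): read kernel log text on
# standard input and print the tg3 driver version from the first banner
# line of the form "tg3.c:v3.122 (December 7, 2011)".
tg3_version() {
    # Keep only the version token that follows "tg3.c:v"; lines without
    # the banner produce no output.
    sed -n 's/.*tg3\.c:v\([0-9][0-9a-z.]*\).*/\1/p' | head -n 1
}
```

It would be used as "dmesg | tg3_version"; empty output would mean no
tg3 banner was found in the log.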




Does anyone know of a solution or workaround?
I'm fine with installing another version of this driver as a kmod (if I
knew where to get a better version), but I'm not very comfortable with
using kernel-3.5/3.6/3.7 etc. from elrepo.


-- 

Vladimir
