SCIENTIFIC-LINUX-USERS Archives

December 2012

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Subject:
From:
Phil Perry <[log in to unmask]>
Reply To:
Phil Perry <[log in to unmask]>
Date:
Thu, 27 Dec 2012 13:59:49 +0000
On 26/12/12 19:21, Vladimir Mosgalin wrote:
> Hello everybody.
>
> For a few months I've been experiencing this problem - it was a bit hard
> to track because it usually happens only during shutdown, when network
> interfaces go down, so I just didn't notice it. A kernel panic happens
> when one of the interfaces, the one provided by the tg3 driver, goes
> down. "ifdown eth2" is enough to cause it.
>
> It doesn't matter if this interface was actively used or even if the
> link was up - I can unplug the cable, boot up (interface is configured
> to use DHCP, it will attempt to go up and fail), then execute "ifdown
> eth2" and the system will crash.
>
> It's a bit hard to get the full crash message, as this happens on the
> very machine I use for remote logging. The best I can get right away
> are screenshots, though some information might be missing from them.
>
> It goes like this on shutdown (or "ifdown eth2", or "service network
> restart" etc):
> 1) interfaces are being brought down, at some point eth2 is being
>     brought down
> 2) nothing happens for about 10 seconds, the system appears to hang
> 3) lots of lines with call traces appear and scroll through the
>     screen. These are the last lines, which I captured in a screenshot:
>     http://img202.imageshack.us/img202/5459/20121225205828.png
> 4) another pause of about 10 seconds
> 5) a kernel panic happens, more lines scroll. Again, here are some of the
>     last ones:
>     http://img5.imageshack.us/img5/397/20121225205838.png
> 6) system hangs completely
>
> This happens on the latest kernel-2.6.32-279.19.1.el6.x86_64. It also
> happened on 2.6.32-279.11.1.el6.x86_64 and 2.6.32-279.14.1.el6.x86_64.
>
> It didn't happen in SL6.2 with the (official, not from elrepo)
> kmod-tg3-3.122 package installed, which was present in the
> 6.2-fastbugs repository.
>
> I found some information about tg3 crashes like this
> http://elrepo.org/bugs/view.php?id=315
> or this
> http://bugs.centos.org/view.php?id=5428
> but in either case version 3.122 of the tg3 driver solved the problem.
> However, I'm already using 3.122 and still experience the crash.
>
> The controller in question is a Broadcom NetXtreme BCM5701, the PCI-X
> version, inserted into a PCI-X slot of a Supermicro X7SBE. There haven't
> been any hardware changes lately and it has been working stably. I'm
> pretty sure that this bug appeared somewhere along the 6.2->6.3 upgrade
> or in one of the 6.3 kernels. It's a bit hard to track because it
> appears simply as a "hang during reboot or shutdown", and this system is
> rarely rebooted or shut down, but I'm sure that a few months ago it
> rebooted and powered off just fine.
>
> This is the interface used for the internet connection. VLANs are not
> used. There is a sixxs-based IPv6 interface in the system, configured to
> work over this interface. The problem doesn't happen with the other
> (Intel e1000e) network interfaces.
>
> $ cat /etc/sysconfig/network-scripts/ifcfg-eth2
> DEVICE=eth2
> BOOTPROTO=dhcp
> ONBOOT=yes
> TYPE=Ethernet
> HWADDR=00:02:A5:E7:0A:10
> PEERDNS=no
> NOZEROCONF=yes
> $ ifconfig eth2
> eth2      Link encap:Ethernet  HWaddr 00:02:A5:E7:0A:10
>            inet addr:<skipped...>
>            inet6 addr: fe80::202:a5ff:fee7:a10/64 Scope:Link
>            UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>            RX packets:130906804 errors:0 dropped:0 overruns:0 frame:0
>            TX packets:178575110 errors:0 dropped:0 overruns:0 carrier:0
>            collisions:0 txqueuelen:100
>            RX bytes:83971053482 (78.2 GiB)  TX bytes:205754543966 (191.6 GiB)
>            Interrupt:52
> $ dmesg|grep '\(eth2\|tg3\)'
> tg3.c:v3.122 (December 7, 2011)
> tg3 0000:03:02.0: PCI INT A ->  GSI 52 (level, low) ->  IRQ 52
> tg3 0000:03:02.0: eth2: Tigon3 [partno(253212-001) rev 0105] (PCIX:133MHz:64-bit) MAC address 00:02:a5:e7:0a:10
> tg3 0000:03:02.0: eth2: attached PHY is 5701 (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[0])
> tg3 0000:03:02.0: eth2: RXcsums[0] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[0]
> tg3 0000:03:02.0: eth2: dma_rwctrl[76db000f] dma_mask[64-bit]
> ADDRCONF(NETDEV_UP): eth2: link is not ready
> tg3 0000:03:02.0: eth2: Link is up at 100 Mbps, full duplex
> tg3 0000:03:02.0: eth2: Flow control is on for TX and on for RX
> ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
>
>
>
>
> Does anyone know of a solution or workaround?
> I'm fine with installing another version of this driver from a kmod (if
> I knew where to get a better version), but not very comfortable with
> using kernel-3.5/3.6/3.7 etc. from elrepo.
>
>

Elrepo has an updated kmod package for the tg3 driver you could try.

With the elrepo repository installed:

yum install kmod-tg3

and reboot.
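
After the reboot it may be worth confirming which tg3 version actually got loaded, since the original report identifies the driver by the "tg3.c:v3.122" banner in dmesg. A rough sketch, assuming the banner keeps that format; the helper name tg3_version_from_log is made up for illustration and is not part of any package:

```shell
# Hypothetical helper: read kernel log text on stdin and print the
# version from the first tg3 banner line, e.g. "tg3.c:v3.122 (...)".
# Assumes the banner format seen in the report above; prints nothing
# if no such line is present.
tg3_version_from_log() {
    sed -n '/tg3\.c:v/{
        s/.*tg3\.c:v\([0-9][0-9.a-z]*\).*/\1/p
        q
    }'
}
```

It could be used as "dmesg | tg3_version_from_log". Separately, "modinfo -F version tg3" should report the version of the module file that would be loaded, which helps tell whether the kmod package or the stock kernel driver is being picked up.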

If it doesn't fix the issue, try giving the elrepo folks a ping to see
if there is a more recent version you could try.

Hope that helps.
