SCIENTIFIC-LINUX-USERS Archives

December 2013

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Subject:
From:
Paul Robert Marino <[log in to unmask]>
Reply To:
Paul Robert Marino <[log in to unmask]>
Date:
Wed, 4 Dec 2013 10:57:18 -0500
Well, while the NIC firmware may be an issue, I would suspect some of
the switch settings.
Based on the message "... failed; no link present." here are a few more
things to check.

1) Please make sure everything is set to full auto on both sides, and
if you can, limit the advertised auto-negotiation rates and duplex to
just the ones you want. Almost all business-class switches can be
configured to do this. Do not hard-set the speed and duplex, because
that causes other intermittent problems: it also disables other
safety checks that depend on the auto-negotiation process. For
example, many copper switches and NICs, and all fiber optic switches
and NICs, have a built-in TDR (Time Domain Reflectometer) to detect
cable problems. The TDR requires at least limited support for
reflecting the signal on the opposite side to work, and that is
determined during the auto-negotiation process.

2) Check whether the switch has any flapping-prevention hold-down
timers; if so, disable them. The writers of these features often
assume MS Windows-like behavior, which is similar to how
NetworkManager would handle a delayed link up, but traditional *ux
detects it as a failure to bring the interface up on boot.
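
As a rough illustration of both points from the Linux side, here is a minimal shell sketch. The helper names advertise_mask and wait_for_carrier are hypothetical (they are not part of ethtool or the initscripts), and eth0 is just the interface from this thread. The bit values are the ones listed in the ethtool(8) man page for the "advertise" option. The switch side has to be configured separately, and how to do that is vendor-specific.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical names made up for this example.

# Print the ethtool "advertise" bitmask for the given link modes.
# Bit values are taken from the ethtool(8) man page.
advertise_mask() {
    mask=0
    for mode in "$@"; do
        case "$mode" in
            10baseT/Half)   bit=1  ;;  # 0x001
            10baseT/Full)   bit=2  ;;  # 0x002
            100baseT/Half)  bit=4  ;;  # 0x004
            100baseT/Full)  bit=8  ;;  # 0x008
            1000baseT/Half) bit=16 ;;  # 0x010
            1000baseT/Full) bit=32 ;;  # 0x020
            *) echo "unknown link mode: $mode" >&2; return 1 ;;
        esac
        mask=$(( mask | bit ))
    done
    printf '0x%03x\n' "$mask"
}

# Poll a carrier file once per second until it reads 1 or the timeout
# (in seconds, default 10) expires. On Linux the file for an interface
# is /sys/class/net/<iface>/carrier.
wait_for_carrier() {
    carrier_file=$1
    timeout=${2:-10}
    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        # Reading the file fails while the interface is administratively
        # down, so errors are silenced and treated as "no link yet".
        if [ "$(cat "$carrier_file" 2>/dev/null)" = "1" ]; then
            return 0
        fi
        sleep 1
        waited=$(( waited + 1 ))
    done
    return 1
}
```

Assuming the helpers above are sourced, something like the following (as root) would keep auto-negotiation on while advertising only gigabit full duplex, then wait up to 15 seconds for link before asking for an address:

    ethtool -s eth0 autoneg on advertise "$(advertise_mask 1000baseT/Full)"
    ip link set eth0 up
    wait_for_carrier /sys/class/net/eth0/carrier 15 && ifup eth0

The stock initscripts also document a LINKDELAY=<seconds> setting for ifcfg-eth0, which may be a simpler way to get the same kind of delay at boot.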





On Wed, Dec 4, 2013 at 9:19 AM, Mark Stodola <[log in to unmask]> wrote:
> On 12/3/2013 10:04 PM, ~Stack~ wrote:
>>
>> I think we are on to something!
>>
>> On 12/03/2013 09:41 PM, ~Stack~ wrote:
>>>
>>> On 12/03/2013 09:16 PM, ~Stack~ wrote:
>>>>
>>>> On 12/03/2013 08:37 PM, Nico Kadel-Garcia wrote:
>>>>>
>>>>> On Tue, Dec 3, 2013 at 6:36 PM, ~Stack~<[log in to unmask]>  wrote:
>>>>>>
>>>>>> On 12/01/2013 10:36 AM, olli hauer wrote:
>>>>>>>
>>>>>>> Have you tried 'service network restart'? Does that bring up your
>>>>>>> nic?
>>>>>>
>>>>>> Well now. That is interesting. This is consistent even with a fresh
>>>>>> kickstart install.
>>>>>> $ service network restart
>>>>>> Shutting down interface eth0:      [  OK  ]
>>>>>> Shutting down loopback interface:  [  OK  ]
>>>>>> Bringing up loopback interface:    [  OK  ]
>>>>>> Bringing up interface eth0:
>>>>>> Determining IP information for eth0... failed; no link present.  Check
>>>>>> cable?  [FAILED]
>>>>>> $ ifup eth0
>>>>>> Determining IP information for eth0... done.
>>>>>>
>>>>>> Errr...what? *scratches head* What exactly is 'ifup eth0' doing that
>>>>>> 'service network restart' isn't?
>>>>>
>>>>> It's running significantly later. Even dumb switches, and supported
>>>>> network drivers, can take time to recognize the available MAC
>>>>> address. This is especially the case with DHCP, which requires
>>>>> communications all the way upstream to whatever DHCP server is in
>>>>> place.
>>>>
>>>> The weird part for me is that this is after the box is booted and I have
>>>> logged in. When I manually run 'service network restart' it fails in the
>>>> same way _every_ time. Then as soon as I run 'ifup eth0' it works! I
>>>> think I am going to experiment with this a bit.
>>>
>>> Also, I have been tinkering with this a bit. In /etc/init.d/functions on
>>> line ~536 (I have been editing a bit but I think that is right) there is
>>> a line like this in the action function:
>>> "$@" && success $"$STRING" || failure $"$STRING"
>>>
>>> When I dumped out the variables it is just running './ifup eth0' but it
>>> is on this line that everything seems to choke. What I find odd though
>>> is:
>>> * If I run it on the command line it works. Running it as a service, it
>>> fails. Thus I am wondering if it is an environmental variable setting?
>>> That is my next investigation.
>>>
>>> * If I run 'ifup eth0' and get an IP, I can run 'service network restart'
>>> and get an IP! If I run 'ifdown eth0' or reboot then the service kicks
>>> back the error about a missing cable (which is obviously wrong).
>>>
>>> Very very odd.
>>
>> I checked out the environment variables; that is not it. I tried a few
>> other things and nothing. I don't understand why running '/sbin/ifup
>> eth0' works on the command line but not in the service command.
>>
>> So I just started adding '/sbin/ifup eth0' statements into the start
>> command till it worked. After some tweaking, I found that to reliably
>> get a DHCP IP (even on reboot!) I just need *two* copies of the ifup
>> command in the start section. I put mine at the end just before the
>> ";;" of the "start)" case section. One copy alone will not do it. Thus
>> the command is essentially called three times in a row.
>>
>> So there *is* a timing issue going on and just hammering it will
>> eventually get it to work. Now to find the best place to put the timing
>> delay...
>>
>> Thanks!
>
> I would suggest trying a NIC that uses a different driver or getting a newer
> driver from ELrepo (kmod-tg3).  Broadcom has been known to have issues in my
> experience.  Personally, I try to stick with Intel.
>
> -Mark
