Hi Troy,

On Mon, 19 Nov 2007, Troy Dawson wrote:
> [log in to unmask] wrote:
>> Hallo Thomas,
>>
>> On Thu, 15 Nov 2007, Thomas Mueller wrote:
>>
>>> Hi Stephan,
>>>
>>> On Thu, 15 Nov 2007, [log in to unmask] wrote:
>>>
>>>> How about updating openafs to the new 1.4.5 release? I put up an SRPM in
>>>> http://www-zeuthen.desy.de/~wiesand/SL5/
>>> There is an open issue with the fileserver.
>>> Jeff Altman is about to track this down - see
>>> http://rt.central.org/rt/Ticket/Display.html?id=74708
>>>
>>> It seems the problem is really strange and will not often occur -
>>> but anyway ...
>>
>> thanks for the heads-up! Yes, I noticed that issue, and the fileserver
>> patches in this SRPM are in there because it scares me. I had some hope
>> that it was sorted out already, but that was before I just read your last
>> reply in RT...
>>
>> If this problem still persists when SL5 is about to be released, I
>> can provide a 1.4.4 package with most of the bug fixes that went into
>> 1.4.5 (see the changelog). This one had serious testing in our cell,
>> on many clients and a dozen production fileservers, with not a single
>> fileserver problem on record - but then we don't have 200
>> "XtendedProblems" clients rebooting at the same time...
>>
>> Cheers,
>> Stephan
>>
>
> So ... we're waiting to see if the bug gets fixed? If it does, we update, if
> it doesn't, we get the more patched version of 1.4.4?
> Is this correct? Just working on what goes in and what doesn't. Hopefully
> we can get this release out fairly quickly.

Correct ... but maybe a bit more complicated.

First, I'm not convinced the bug isn't present in 1.4.4 or earlier
releases - after all, it hasn't been identified yet, so we don't know.
Thomas, do you have evidence (even if it's only a feeling, that would
be sufficient for me) that the bug is 1.4.5-only?

Then, there are clearly bugs fixed in 1.4.5. And probably (I haven't had
time to check yet) some of the post-1.4.5 fixes in the SRPM I put up
apply to 1.4.4 as well.

And then, unfortunately, 1.4.5 is not just a bug fix release. My
understanding of the changes relative to 1.4.4 is pretty incomplete, but I
think they basically fall into three categories:

1) relatively minor fixes all over the place, adjustments to current
   Linux kernels, compilers etc., enhancements extremely unlikely to
   break anything
2) changes that help the fileserver survive (inadvertent) DoS attacks
   from misbehaving clients, typically not-so-well-administered
   Windows clients - I'm not sure those should be called bug fixes
   in the SL context
3) performance enhancements, mainly for those running fileservers
   on ZFS, by making [the formerly synchronous] fsyncs [the fileserver
   executes abundantly] asynchronous [by moving them into a separate
   thread]

Maybe we should agree on and write down a policy for what we put into SL
releases. But all the SRPMs I have offered to the project so far followed
this rule: "Provide the latest "stable" openafs release, with all
post-release patches from the project's CVS added that, to the very best
of my knowledge, belong in category 1. Try hard to give the client and
the fileserver some serious testing before the SL release."

In this case, we're in a hurry to get 5.1 out a few days after
openafs-1.4.5 was released, and there's a serious bug reported against the
fileserver. While that bug is being chased very actively right now, and
it's not too likely to bite the typical SL site, it may be better to
roll out 1.4.4 with "category 1" type patches applied in SL5.1.

Hence my proposal: Let's put the 1.4.5 build I offered into the 5.1 beta.
If additional fixes come up in the course of tracking Thomas' bug, let's
add them. If SL5.1 has to be released and we're not convinced that the
then-current openafs build is good enough, let's fall back to a
"1.4.4+cat1" build that has had serious real-life testing.

To that end, I have put up my last 1.4.4 build (openafs.SLx-1.4.4-52.src.rpm)
in http://www-zeuthen.desy.de/~wiesand/SL5/ . This is what I currently
deploy on production fileservers if I have to touch them. It differs from
1.4.4-51 (found in the same directory) only by a rather tiny patch (by
Rainer Toebbicke from CERN, pulled into the 1.4.5 release without any
discussion). But to be completely honest, it's the 1.4.4-51 build
that has run on many of the fileservers here (at DESY's Zeuthen site),
including the most critical ones, for months.

Pick your poison ;-)

Cheers,
Stephan

--
Stephan Wiesand
DESY - DV -
Platanenallee 6
15738 Zeuthen, Germany