CMS_UAF_USERS Archives

September 2016, Week 2

CMS_UAF_USERS@LISTSERV.FNAL.GOV

Subject:
From:
David A Mason <[log in to unmask]>
Date:
Tue, 13 Sep 2016 13:37:53 +0000


Reminder of the downtime beginning this Friday at 5 pm (FNAL time):





> On Sep 7, 2016, at 10:09 AM, David A Mason <[log in to unmask]> wrote:

> 

> 

> Good morning!

> 

> Two of the transformers feeding FCC, where all our servers, storage, and LPC interactive nodes are housed, have lost the ability to auto-failover to generator and require repair.  This work has been scheduled to begin early in the morning on September 17th, and we’ll need to shut down our computing infrastructure ahead of that.  We’ll begin doing this at the end of business on Friday the 16th (5 pm FNAL time).  We will begin bringing up infrastructure after the work is completed on the weekend, but users should not expect to be able to use the LPC resources until the following Monday morning FNAL time.  The interactive cmslpc nodes will be unavailable, EOS data will be offline (even remotely via xrootd), and any running jobs on the LPC farm will be terminated as we bring things down on the 16th.  The FNAL Tier 1 and the data it houses will also be down during this work.  FCC being down also affects much of the lab computing infrastructure in general: Kerberos will be down, and so will web services.  However, work is under way to ensure that at least FNAL email services stay up during this outage.  The village should still have power and network connectivity to the outside that weekend, but Wilson Hall will be closed due to cooling work there.

> 

> Also, for the LPC EOS in particular, two of the EOS storage nodes need to have batteries replaced before this downtime, so they will come down briefly (one at a time) for that change.  If your data is all in replicated space (the default), you should not notice.  However, if you are trying to access data in noreplica that happens to live on one of these servers, you might see a brief outage.  (It is important that we do this _before_ the power outage to ensure the nodes come back up properly afterwards!)

> 

> Thanks for your patience; we will update you as we know more!

> 

> —Dave



