SCIENTIFIC-LINUX-USERS Archives

April 2012

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Subject:
From:
Steven Timm <[log in to unmask]>
Reply To:
Steven Timm <[log in to unmask]>
Date:
Fri, 20 Apr 2012 08:58:51 -0500
On Fri, 20 Apr 2012, Niels_Walet wrote:

> When moving my virtual machines (libvirt/qemu-kvm) from one server to
> another (from amd to intel hardware), I seem to have suddenly hit the
> time-out issues that have been discussed in many places (the dreaded
> "blocked more than 120s" message), after which the systems both become
> totally unresponsive. Since the time-out involves the filesystem, I can
> only take screenshots, which I attach; nothing appears in the syslog.

This timeout issue is not specific to virtual machines. At Fermilab
we see it on bare-metal machines just as often as we do on
virtual machines.

>
> I have updated the virtual machines from SL 5.5 to 5.7; with some change but
> similar crashes; I added a few boot parameters (having to do with idle=),
> with no change at all.
>
> Does anyone have any suggestion what I could try to further diagnose this
> problem, or maybe even a solution?
>
> Niels Walet
>
More than half the time when I've seen those timeouts, the task has
been blocked on some sort of network operation.  Do you have
any kind of network file system mounted, such as NFS, AFS, or GFS?
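
One quick way to answer that question is to look at /proc/mounts on the
affected guest. The following is only a minimal sketch in Python, not a
tool from this thread: the function name network_mounts and the set
NETWORK_FS_TYPES are made up for illustration, and the list of file
system types is an assumption that is certainly incomplete.

```python
# Sketch: pick out network file system entries from /proc/mounts-style text.
# NETWORK_FS_TYPES is an illustrative, incomplete set (an assumption);
# extend it for whatever network file systems are used at your site.
NETWORK_FS_TYPES = {"nfs", "nfs4", "afs", "gfs", "gfs2", "cifs"}


def network_mounts(mounts_text):
    """Return (device, mount_point, fs_type) for each network mount found."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Each line holds: device, mount point, fs type, options, dump, pass.
        if len(fields) >= 3 and fields[2] in NETWORK_FS_TYPES:
            found.append((fields[0], fields[1], fields[2]))
    return found
```

To use it on a live system, pass it the contents of the file, for
example network_mounts(open("/proc/mounts").read()). An empty list
suggests no network file system of the listed types is mounted, which
would point the diagnosis toward local disk or hypervisor I/O instead.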

Steve Timm


------------------------------------------------------------------
Steven C. Timm, Ph.D  (630) 840-8525
[log in to unmask]  http://home.fnal.gov/~timm/
Fermilab Computing Division, Scientific Computing Facilities,
Grid Facilities Department, FermiGrid Services Group, Group Leader.
Lead of FermiCloud project.
