SCIENTIFIC-LINUX-USERS Archives

November 2011

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Subject:
From:
"Andrey Y. Shevel" <[log in to unmask]>
Reply To:
Andrey Y. Shevel
Date:
Fri, 11 Nov 2011 02:34:26 -0600
Content-Type:
text/plain
Parts/Attachments:
text/plain (152 lines)
Hi everybody,

I spent some time tracking down the cause of the problem described below.
Here is what I found.


1. In my particular case (described below), the only way to get tape
operations working again was to reboot the server. It was unavoidable.

2. Anatoly Oreshkin pointed me to a message from the author of the 'st'
driver (please see http://www.spinics.net/lists/linux-scsi/msg48626.html).
In short, the guess about memory was right. At the same time, the effect
described below results from a particular combination of parameters:
  - SCSI adapter
  - size of main memory
  - number of active processes

In our case we had used the described scheme for a year or more. However,
after deploying Xen on the same machine we ran short of main memory. That in
turn led to memory fragmentation, which prevents tar/dd from getting enough
memory with our type of SCSI adapter (the adapter is quite old).

The only thing we could change in our case was to decrease the size of the
block written to tape to 4K * 128 = 512K.

After we changed the size, everything started to work as expected.
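
As a rough sketch of the arithmetic only: GNU tar counts --blocking-factor
in 512-byte records, so a 512K block corresponds to a blocking factor of
1024 (the earlier 2048 corresponded to 1M). The variable names below are
just for illustration, DIR is a placeholder for whatever is being archived,
and the device name is the one from the original message. The tar commands
are printed, not executed.

```shell
#!/bin/sh
# The message gives the new block size only as the product 4K * 128.
unit_bytes=4096      # "4K" from the message
unit_count=128       # "128" from the message
block_bytes=$((unit_bytes * unit_count))    # 524288 bytes = 512K

# GNU tar expresses --blocking-factor in records of 512 bytes each.
blocking_factor=$((block_bytes / 512))      # 1024

echo "block size in bytes: $block_bytes"
echo "tar blocking factor: $blocking_factor"

# Commands one might then run by hand (DIR is a hypothetical placeholder):
echo "tar -c --blocking-factor=$blocking_factor --file=/dev/st0l DIR"
echo "tar -t --blocking-factor=$blocking_factor --file=/dev/st0l"
```

A tape written with the smaller block should be read back with the same
blocking factor.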

To avoid this kind of problem, one needs modern hardware and a reserve of
main memory.

Andrey

On Mon, 22 Aug 2011 03:05:26 -0400, Andrey Y. Shevel <[log in to unmask]> wrote:

>Hello,
>
>I ran into a problem with SL55 when I tried to read data from a SCSI tape.
>
>====================
># tar -t --blocking-factor=2048 --file=/dev/st0l
>tar: /dev/st0l: Cannot read: Device or resource busy
>tar: At beginning of tape, quitting now
>tar: Error is not recoverable: exiting now
>====================
>
>It was quite surprising because a couple of days ago everything was
>running well.
>
>I did
>
>===========================
>service xend restart
>===========================
>
>and after that everything was fine.
>
>
>Unfortunately, several days later I hit exactly the same problem.
>
>
>I tested dd
>
>=========================
># dd if=/dev/nst0l of=test-tape count=1
>dd: reading `/dev/nst0l': Cannot allocate memory
>0+0 records in
>0+0 records out
>0 bytes (0 B) copied, 0.052914 seconds, 0.0 kB/s
>=========================
>
>and when I set the right block size I see
>
>=================================
># dd if=/dev/nst0l of=test-tape bs=1048576 count=1
>dd: reading `/dev/nst0l': Device or resource busy
>0+0 records in
>0+0 records out
>0 bytes (0 B) copied, 0.000933 seconds, 0.0 kB/s
>=================================
>
>The commands 'fuser' and 'lsof' gave nothing (no process uses this
>device).
>
>
>My colleagues reminded me that in the past one of them solved such an
>issue by rebooting. I do not think a reboot is a good idea for a server in
>production, so I am continuing the investigation.
>
>
>The man page for the 'st' driver (the driver for SCSI tape drives) says
>
>=========================
>EBUSY         The device is already in use or the driver was unable to
>allocate a buffer.
>=========================
>
>
>
>The main memory is quite busy
>
>===========================
># free
>              total       used       free     shared    buffers     cached
>Mem:       2097152    2087972       9180          0      68624     387956
>-/+ buffers/cache:    1631392     465760
>Swap:      8193064        652    8192412
>===========================
>
>I did
>==================================
># echo 2 > /proc/sys/vm/drop_caches; free
>              total       used       free     shared    buffers     cached
>Mem:       2097152     334528    1762624          0        848      32364
>-/+ buffers/cache:     301316    1795836
>Swap:      8193064        652    8192412
>==================================
>
>This apparently freed more memory, but it made no difference for dd and/or tar.
>
>Searching the Internet on this issue turned up very little.
>
>Does anybody know how to get the 'dd' and 'tar' commands working again
>without a reboot?
>
>
>Many thanks in advance for any ideas.
>
>
>Andrey
>
>
>--
>____________________________________________________________________
>NAME: Andrey Y. Shevel (Chevel) :  EMAIL: [log in to unmask]  \
>Computing Systems Department    :     http://hepd.pnpi.spb.ru/CSD     |
>TEL : +7(81371)36040 | POST ADDRESS: Petersburg Nuclear Physics Inst. |
>FAX : +7(81371)36040 | 188300, Gatchina, Leningrad district, Russia.  |
>______+7(81371)46256________________________________________________ /
>=========================================================================
