SCIENTIFIC-LINUX-USERS Archives

August 2007

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Subject:
From:
Miles O'Neal <[log in to unmask]>
Reply To:
Miles O'Neal <[log in to unmask]>
Date:
Mon, 6 Aug 2007 21:56:32 -0500
We recently migrated from PBS to torque, and most of our
systems are now running 4.4.  The torque server (a Core2
Duo at 2.4GHz) is only handling about 3x the jobs our 300MHz
Sun Ultra 5 could handle before bogging down horribly.  This
seems a bit odd.

Watching the server logs, it seems there's a lot of time
spent waiting for replies on sockets, though it's not clear
whether it's on the same system between the scheduler and
batch server, or between the batch server and client node
processes (pbs_moms).

We're beginning to wonder if it's OS-related.  Torque uses
a lot of sockets, and sets them up and tears them down at a
hefty rate.  We have the descriptor limit set to 16K for the
scheduler and server processes via ulimit, but we aren't getting
much above 1400 sockets between the two processes.

Is anyone aware of an issue in 4.4 that might affect this?

Thanks,
Miles
