-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Kinzel, David wrote:
> Have you checked out the ktune package? It does basically what you
> are suggesting, unless I'm missing something.

I had not looked at the ktune package.

Stepping back a bit, the goal as I see it is to provide a supported,
well-documented way for SL users to effectively use wide area networks
for bulk data transfer. We find that when we start working with users
to troubleshoot network problems, the first thing we need to do is
teach them how to change their network stack defaults to something
reasonable. This is true in almost all cases.

If telling users to install and use the ktune package is the Right
Thing, that's great - we'll add that recommendation to our network
tuning knowledge base. If the Scientific Linux user community would
prefer a package created and maintained by the Scientific Linux
development team, perhaps that might be better - that's your call,
since you know your user community and it would take your cycles to
create and maintain the package.

However, the high-order bit from my perspective is to remove the
barrier to high-performance network utilization that default stack
tuning parameters represent. That is why I was asking about changing
the defaults. I respect the desire to leave defaults the way they are,
but in my experience those defaults are a significant barrier to
productivity.

As it stands now, each and every research group that has large data
movement needs has to independently figure out that they need to
change their network stack parameters, find a resource that tells them
what to do, make the changes, and so on. Knowledge of this solution
space does not appear to be widespread, so getting to the point where
sysadmins are actually modifying host configs to improve performance
is hard without a concerted outreach effort. We are doing that
outreach, but it is difficult to know where all the users with tuning
difficulties are.

I am open to ideas on how to help users use the network more
effectively. If nothing else, we will continue as we are now and help
all the users we can engage. However, it is my perception that the
problem of network underutilization due to network stack defaults is
sufficiently widespread that adding some scale to the solution would
be a big help. It is also my impression that a lot of LHC Tier2 and
Tier3 sites use Scientific Linux, so there is the potential for a big
win here.

Many thanks,

--eli

> -----Original Message-----
> From: [log in to unmask] [mailto:[log in to unmask]] On Behalf Of Eli Dart
> Sent: Friday, July 10, 2009 1:58 PM
> To: Troy Dawson
> Cc: [log in to unmask]; Brian Tierney; Joe Metzger
> Subject: Re: Default network tuning parameters in SL
>
> Troy Dawson wrote:
>> Hello,
>> No, I won't change the default SL network tuning parameters.
>> But I'm not against making that an SL rpm. I think that would be
>> a good idea if everyone could agree on a batch of settings to put
>> into an rpm.
>
> OK - that seems perfectly reasonable.
>
> I'm guessing that all that would be required is to put some stuff in
> /etc/sysctl.conf and run sysctl -p after.
>
> Are you able to make conditional decisions when installing an RPM?
> For example, are you able to make a list of Linux versions that have
> known bugs for certain things and change parameters based on that?
> For example:
>
> $congestion_alg = "default";
> if ($KERNEL_VERSION eq "2.6.18") {
>     $congestion_alg = "htcp";
> }
> (.... other tests ...)
> if ($congestion_alg ne "default") {
>     InstallCongestionAlg($congestion_alg);
> }
>
> where InstallCongestionAlg inserts the line
> "net.ipv4.tcp_congestion_control = $congestion_alg" into
> /etc/sysctl.conf
>
> Linux TCP autotuning is pretty good these days. All that is really
> needed is to give it enough room to move.
> So, for most stuff the following are fine:
>
> # increase TCP max buffer size
> net.core.rmem_max = 4194304
> net.core.wmem_max = 4194304
> # increase Linux autotuning TCP buffer limits
> net.ipv4.tcp_rmem = 4096 87380 4194304
> net.ipv4.tcp_wmem = 4096 65536 4194304
>
> This allows autotuning to go up to 4MB of window. Some thoughts on
> this...these are some approximate latencies of interest:
>
> 80msec = SF Bay Area to New York (US continental width)
> 170msec = SF Bay Area to CERN across the Atlantic
> 280msec = US East Coast (NY) to China (Beijing) across the Pacific
>
> With a 4MB TCP window, we get the following per-flow bandwidth
> limits:
>
> 80msec with 4MB max window = ~420Mbps max per-flow bandwidth
> 170msec with 4MB max window = ~200Mbps max per-flow bandwidth
> 280msec with 4MB max window = ~120Mbps max per-flow bandwidth
>
> Given that GridFTP and friends routinely use parallel streams, I
> think that these numbers are reasonable for 1G-attached hosts (which
> are largely the norm in environments without lots of network tuning
> expertise, which are the target market for such an RPM). Going 4-way
> to 8-way parallel will fill a 1G pipe with the above config (as far
> as network stack tuning is concerned).
>
> --eli
>
>> Troy
>
>> Eli Dart wrote:
>> Hi all,
>
>> I hope this is the right list for discussion of this topic...
>
>> It appears that Scientific Linux is used by many science
>> communities, but in particular by the HEP community. The science
>> community often has significant bulk data movement requirements
>> that are outside the capabilities of the default network tuning
>> parameters of most Linux distributions.
>
>> Would the Scientific Linux community consider changing the network
>> tuning defaults for future releases?
>
>> ESnet maintains a site that explains network performance tuning and
>> how to increase network performance - please see
>> http://fasterdata.es.net/
>
>> However, we have recently seen several sites where the first thing
>> that is needed is to change the network stack parameters so that
>> high-performance wide-area data transfers are possible.
>
>> Note that with today's TCP autotuning and modern congestion
>> recovery algorithms, one need not set up particular TCP parameters
>> on a per-destination basis. One need only give TCP autotuning
>> enough buffer space to do its work and ensure that a modern
>> congestion recovery algorithm is used (the default in Linux 2.6 has
>> been cubic for a while, though 2.6.18 has bugs in cubic that
>> significantly damage performance so for 2.6.18 one should use htcp
>> instead).
>
>> Please see http://fasterdata.es.net/tuning.html and
>> http://fasterdata.es.net/TCP-tuning/linux.html for Linux-specific
>> information.
>
>> Thoughts? Comments welcome....
>
>> Many thanks,
>
>> --eli
>
> This email communication is intended as a private communication for
> the sole use of the primary addressee and those individuals listed
> for copies in the original message. The information contained in
> this email is private and confidential, and if you are not an
> intended recipient you are hereby notified that copying, forwarding
> or other dissemination or distribution of this communication by any
> means is prohibited. If you are not specifically authorized to
> receive this email and if you believe that you received it in error
> please notify the original sender immediately. We honour similar
> requests relating to the privacy of email communications.

- --
Eli Dart  NOC: (510) 486-7600
ESnet Network Engineering Group  (800) 333-7638
Lawrence Berkeley National Laboratory
PGP Key fingerprint = C970 F8D3 CFDD 8FFF 5486 343A 2D31 4478 5F82 B2B3
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (FreeBSD)

iEYEARECAAYFAkpbni0ACgkQLTFEeF+CsrO+RgCdEaao46hCGFZSTk690EVm5N/g
OkEAnREfzMmW2ojcJtALeCWWuLFOu8VK
=Ykj+
-----END PGP SIGNATURE-----
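
For reference, the per-flow bandwidth limits quoted in the thread appear
to follow from dividing the maximum TCP window by the round-trip time.
The Python sketch below reproduces that arithmetic. The function names
and the 1000 Mbps link rate are illustrative assumptions and not part of
the original messages, and the calculation ignores packet loss, protocol
overhead, and host limits.

```python
import math

# 4MB maximum window, as in the net.ipv4.tcp_rmem / tcp_wmem settings
# quoted in the thread.
MAX_WINDOW_BYTES = 4194304


def per_flow_limit_mbps(window_bytes, rtt_msec):
    """Upper bound on one TCP flow: at most one full window per round trip."""
    bits_per_second = window_bytes * 8 / (rtt_msec / 1000.0)
    return bits_per_second / 1e6


def streams_to_fill(link_mbps, window_bytes, rtt_msec):
    """Parallel streams needed before the window limit alone covers the link.

    Hypothetical helper for illustration only.
    """
    return math.ceil(link_mbps / per_flow_limit_mbps(window_bytes, rtt_msec))


if __name__ == "__main__":
    link_mbps = 1000  # assumed 1G-attached host
    for rtt_msec in (80, 170, 280):
        limit = per_flow_limit_mbps(MAX_WINDOW_BYTES, rtt_msec)
        streams = streams_to_fill(link_mbps, MAX_WINDOW_BYTES, rtt_msec)
        print(f"{rtt_msec} msec: {limit:.0f} Mbps per flow, "
              f"{streams} streams to reach {link_mbps} Mbps")
```

These figures are ceilings set by window size alone; measured transfer
rates may be lower for reasons outside network stack tuning.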