SCIENTIFIC-LINUX-USERS Archives

June 2017

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Subject:
From:
Nico Kadel-Garcia <[log in to unmask]>
Reply To:
Nico Kadel-Garcia <[log in to unmask]>
Date:
Sun, 25 Jun 2017 23:04:38 -0400
The advice is, of course, useless if your Linux host is virtualized
and you have no direct way of inspecting the processor of your virtual
server. Moreover, turning off hyperthreading on a virtual server can
cut its capacity and performance quite profoundly if the
individual virtualization guests are not well configured to
survive being tied to a single CPU. This is most common when hosting
more virtual guests than the server has cores available, which is
not extremely common but certainly not unheard of.

On Sun, Jun 25, 2017 at 10:34 PM, Steven Haigh <[log in to unmask]> wrote:
> *****
> [Forwarded from the debian-users / debian-devel mailing list]
> https://lists.debian.org/debian-devel/2017/06/msg00308.html
> *****
>
> This warning advisory is relevant for users of systems with the Intel
> processors code-named "Skylake" and "Kaby Lake".  These are: the 6th and
> 7th generation Intel Core processors (desktop, embedded, mobile and
> HEDT), their related server processors (such as Xeon v5 and Xeon v6), as
> well as select Intel Pentium processor models.
>
> TL;DR: unfixed Skylake and Kaby Lake processors could, in some
> situations, dangerously misbehave when hyper-threading is enabled.
> Disable hyper-threading immediately in BIOS/UEFI to work around the
> problem.  Read this advisory for instructions about an Intel-provided
> fix.
>
>
> SO, WHAT IS THIS ALL ABOUT?
> ---------------------------
>
> This advisory is about a processor/microcode defect recently identified
> on Intel Skylake and Intel Kaby Lake processors with hyper-threading
> enabled.  This defect can, when triggered, cause unpredictable system
> behavior: it could cause spurious errors, such as application and system
> misbehavior, data corruption, and data loss.
>
> It was brought to the attention of the Debian project that this defect
> is known to directly affect some Debian stable users (refer to the end
> of this advisory for details), thus this advisory.
>
> Please note that the defect can potentially affect any operating system
> (it is not restricted to Debian, and it is not restricted to Linux-based
> systems).  It can be either avoided (by disabling hyper-threading), or
> fixed (by updating the processor microcode).
>
> Because potentially affected software is difficult to detect, and the
> defect is unpredictable in nature, all users of the affected Intel
> processors are strongly urged to take action as recommended by this
> advisory.
>
>
> DO I HAVE AN INTEL SKYLAKE OR KABY LAKE PROCESSOR WITH HYPER-THREADING?
> -----------------------------------------------------------------------
>
> The earliest of these Intel processor models were launched in September
> 2015.  If your processor is older than that, it will not be a Skylake
> or Kaby Lake processor and you can just ignore this advisory.
>
> If you don't know the model name of your processor(s), the command below
> will tell you their model names.  Run it in a command line shell (e.g.
> xterm):
>
>     grep name /proc/cpuinfo | sort -u
>
> Once you know your processor model name, you can check the two lists
> below:
>
>   * List of Intel processors code-named "Skylake":
>     http://ark.intel.com/products/codename/37572/Skylake
>
>   * List of Intel processors code-named "Kaby Lake":
>     http://ark.intel.com/products/codename/82879/Kaby-Lake
>
> Some of the processors in these two lists are not affected because they
> lack hyper-threading support.  Run the command below in a command line
> shell (e.g. xterm), and it will output a message if hyper-threading is
> supported/enabled:
>
>   grep -q '^flags.*[[:space:]]ht[[:space:]]' /proc/cpuinfo && \
>         echo "Hyper-threading is supported"
>
> Alternatively, use the processor lists above to go to that processor's
> information page, and the information on hyper-threading will be there.
>
> If your processor does not support hyper-threading, you can ignore this
> advisory.
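
The `ht` flag tested above indicates that the processor supports hyper-threading, not necessarily that it is currently enabled. A minimal sketch of a complementary check follows, assuming that a `siblings` count greater than the `cpu cores` count in /proc/cpuinfo means hyper-threading is active. The function name `ht_status` is hypothetical, and inside a virtual guest these fields may describe the virtual topology rather than the physical host.

```shell
# ht_status [FILE]: print "enabled", "disabled" or "unknown" for a
# cpuinfo-style file (defaults to /proc/cpuinfo).
# Assumption: siblings > cpu cores means hyper-threading is active.
ht_status() {
    awk -F': *' '
        /^siblings/  { siblings = $2 }
        /^cpu cores/ { cores = $2 }
        END {
            if (siblings == "" || cores == "") print "unknown"
            else if (siblings + 0 > cores + 0) print "enabled"
            else print "disabled"
        }' "${1:-/proc/cpuinfo}"
}

# Only query the running system when the file is actually readable.
if [ -r /proc/cpuinfo ]; then
    echo "Hyper-threading appears to be: $(ht_status)"
fi
```

A result of "unknown" means the two fields were not found, which can happen on non-x86 systems.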
>
>
> WHAT SHOULD I DO IF I DO HAVE SUCH PROCESSORS?
> ----------------------------------------------
>
> Kaby Lake:
>
> Users of systems with Intel Kaby Lake processors should immediately
> *disable* hyper-threading in the BIOS/UEFI configuration.  Please
> consult your computer/motherboard's manual for instructions, or maybe
> contact your system vendor's support line.
>
> The Kaby Lake microcode updates that fix this issue are currently only
> available to system vendors, so you will need a BIOS/UEFI update to get
> it.  Contact your system vendor: if you are lucky, such a BIOS/UEFI
> update might already be available, or undergoing beta testing.
>
> You want your system vendor to provide a BIOS/UEFI update that fixes
> "Intel processor errata KBL095, KBW095 or the similar one for my Kaby
> Lake processor".
>
> We strongly recommend that you do not re-enable hyper-threading
> until you install a BIOS/UEFI update with this fix.
>
>
> Skylake:
>
> Users of systems with Intel Skylake processors may have two choices:
>
> 1. If your processor model (listed in /proc/cpuinfo) is 78 or 94, and
>    the stepping is 3, install the non-free "intel-microcode" package
>    with base version 3.20170511.1, and reboot the system.  THIS IS
>    THE RECOMMENDED SOLUTION FOR THESE SYSTEMS, AS IT FIXES OTHER
>    PROCESSOR ISSUES AS WELL.
>
>    Run this command in a command line shell (e.g. xterm) to know the
>    model numbers and steppings of your processor.  All processors must
>    be either model 78 or 94, and stepping 3, for the intel-microcode fix
>    to work:
>
>          grep -E 'model|stepping' /proc/cpuinfo | sort -u
>
>    If you get any lines with a model number that is neither 78 nor 94, or
>    the stepping is not 3, you will have to disable hyper-threading as
>    described in choice 2, below.
>
>    Refer to the section "INSTALLING THE MICROCODE UPDATES FROM NON-FREE"
>    for instructions on how to install the intel-microcode package.
>
> 2. For other processor models, disable hyper-threading in BIOS/UEFI
>    configuration.  Please consult your computer/motherboard's manual for
>    instructions on how to do this.  Contact your system vendor for a
>    BIOS/UEFI update that fixes "Intel erratum SKW144, SKL150, SKX150,
>    SKZ7, or the similar one for my Skylake processor".
>
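
The model and stepping condition from choice 1 can be checked mechanically. The following is a minimal sketch, not part of the advisory: the function name `skylake_fix_applicable` is hypothetical, and it assumes an Intel family 6 system whose /proc/cpuinfo uses the usual `model` and `stepping` field names. It succeeds only when every listed processor is model 78 or 94 with stepping 3.

```shell
# skylake_fix_applicable [FILE]: exit status 0 only if every processor in a
# cpuinfo-style file (default /proc/cpuinfo) is model 78 or 94, stepping 3.
skylake_fix_applicable() {
    awk -F': *' '
        # Match the numeric "model" field, not the "model name" field.
        /^model[ \t]*:/    { n++; if ($2 != 78 && $2 != 94) bad = 1 }
        /^stepping[ \t]*:/ { if ($2 != 3) bad = 1 }
        # Fail when no model line was seen or any value did not match.
        END { if (n == 0 || bad) exit 1 }
    ' "${1:-/proc/cpuinfo}"
}

if [ -r /proc/cpuinfo ]; then
    if skylake_fix_applicable; then
        echo "All processors are model 78/94 stepping 3: choice 1 applies"
    else
        echo "Model/stepping does not match: see choice 2"
    fi
fi
```
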
> NOTE: If you did not have the intel-microcode package installed on your
> Skylake system before, it is best if you check for (and install) any
> BIOS/UEFI updates *first*.  Read the wiki page mentioned below.
>
>
> INSTALLING THE MICROCODE UPDATES FROM NON-FREE:
> -----------------------------------------------
>
> Instructions are available at:
>
>     https://wiki.debian.org/Microcode
>
> Updated intel-microcode packages are already available in non-free for:
> unstable, testing, Debian 9 "stretch" (stable), and Debian 8 *backports*
> (jessie-backports).
>
> THE MICROCODE PACKAGES FROM THE RECENT STABLE RELEASE (June 17th, 2017)
> ALREADY HAVE THE SKYLAKE FIX, BUT YOU MAY HAVE TO INSTALL THEM.
>
> Updated intel-microcode packages in non-free for Debian 8 "jessie"
> (oldstable) are waiting for approval and will likely be released in the
> next non-free oldstable point release.  They are the same as the
> packages in non-free jessie-backports, with a change to the version
> number.
>
> The wiki page above has instructions on how to enable "contrib" and
> "non-free", so that the intel-microcode package can be installed.
>
> Users of "jessie" (oldstable) might want to enable jessie-backports to
> get *this* intel-microcode update faster.  This is also explained in the
> wiki page above.
>
>
> MORE DETAILS ABOUT THE PROCESSOR DEFECT:
> ----------------------------------------
>
> On 2017-05-29, Mark Shinwell, a core OCaml toolchain developer,
> contacted the Debian developer responsible for the intel-microcode
> package with key information about an Intel processor issue that could be
> easily triggered by the OCaml compiler.
>
> The issue had been under investigation by the OCaml community since
> 2017-01-06, with reports of malfunctions going at least as far back as
> Q2 2016.  It was narrowed down to Skylake with hyper-threading, which is
> a strong indication of a processor defect.  Intel was contacted about
> it, but did not provide further feedback as far as we know.
>
> Fast-forward a few months, and Mark Shinwell noticed the mention of a
> possible fix for a microcode defect with unknown hit-ratio in the
> intel-microcode package changelog.  He matched it to the issues the
> OCaml community were observing, verified that the microcode fix indeed
> solved the OCaml issue, and contacted the Debian maintainer about it.
>
> Apparently, Intel had indeed found the issue, *documented it* (see
> below) and *fixed it*.  There was no direct feedback to the OCaml
> people, so they only found out about it later.
>
> The defect is described by the SKZ7/SKW144/SKL150/SKX150/KBL095/KBW095
> Intel processor errata.  As described in official public Intel
> documentation (processor specification updates):
>
>   Errata:   SKZ7/SKW144/SKL150/SKX150/KBL095/KBW095
>             Short Loops Which Use AH/BH/CH/DH Registers May Cause
>             Unpredictable System Behavior.
>
>   Problem:  Under complex micro-architectural conditions, short loops
>             of less than 64 instructions that use AH, BH, CH or DH
>             registers as well as their corresponding wider register
>             (e.g. RAX, EAX or AX for AH) may cause unpredictable
>             system behavior. This can only happen when both logical
>             processors on the same physical processor are active.
>
>   Implication: Due to this erratum, the system may experience
>             unpredictable system behavior.
>
> We do not have enough information at this time to know how much software
> out there will trigger this specific defect.
>
> One important point is that the code pattern that triggered the issue in
> OCaml was present in gcc-generated code.  There were extra constraints
> being placed on gcc by OCaml, which would explain why gcc apparently
> rarely generates this pattern.
>
> The reported effects of the processor defect were compiler and
> application crashes and incorrect program behavior, including incorrect
> program output.
>
>
> What we know about the microcode updates issued by Intel related to
> these specific errata:
>
> Fixes for processors with signatures[1] 0x406E3 and 0x506E3 are
> available in the Intel public Linux microcode release 20170511.  This
> will fix only Skylake processors with model 78 stepping 3, and model 94
> stepping 3.  The fixed microcode for these two processor models reports
> revision 0xb9/0xba, or higher.
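
Whether the running microcode is at or above a given revision can be read from the `microcode` field of /proc/cpuinfo. The sketch below is illustrative only: the function name `microcode_at_least` is hypothetical, and 0xb9 is used as the threshold on the assumption that it is the lower of the two revisions quoted above. Inside a virtual guest the reported revision may not reflect the host's real microcode.

```shell
# microcode_at_least MIN [FILE]: exit status 0 if every "microcode" line in a
# cpuinfo-style file (default /proc/cpuinfo) reports a revision >= MIN.
microcode_at_least() {
    min=$(( $1 ))
    revs=$(awk -F': *' '/^microcode/ { print $2 }' "${2:-/proc/cpuinfo}" | sort -u)
    # No microcode field at all: treat as "cannot confirm".
    [ -n "$revs" ] || return 1
    for rev in $revs; do
        [ $(( $rev )) -ge "$min" ] || return 1
    done
}

if [ -r /proc/cpuinfo ]; then
    if microcode_at_least 0xb9; then
        echo "Microcode revision is 0xb9 or higher"
    else
        echo "Microcode revision is below 0xb9, or could not be determined"
    fi
fi
```
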
>
> Apparently, these errata were fixed by microcode updates issued in early
> April 2017.  Based on this date range, microcode revision 0x5d/0x5e (and
> higher) for Kaby Lake processors with signatures 0x806e9 and 0x906e9
> *might* fix the issue.  We do not have confirmation about which
> microcode revision fixes Kaby Lake at this time.
>
> Related processor signatures and microcode revisions:
> Skylake   : 0x406e3, 0x506e3 (fixed in revision 0xb9/0xba and later,
>                               public fix in linux microcode 20170511)
> Skylake   : 0x50654          (no information, erratum listed)
> Kaby Lake : 0x806e9, 0x906e9 (defect still exists in revision 0x48,
>                               fix available as a BIOS/UEFI update)
>
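
The signatures in the table map onto the model and stepping numbers shown in /proc/cpuinfo. A minimal sketch of that decoding follows, using the standard CPUID signature layout and assuming family 6 processors, where the extended model bits form the high nibble of the model number. The function name `decode_signature` is hypothetical.

```shell
# decode_signature SIG: print family, model and stepping for a CPUID
# signature such as 0x406e3. Assumes a family 6 processor.
decode_signature() {
    sig=$(( $1 ))
    stepping=$(( sig & 0xf ))            # bits 3..0
    base_model=$(( (sig >> 4) & 0xf ))   # bits 7..4
    family=$(( (sig >> 8) & 0xf ))       # bits 11..8
    ext_model=$(( (sig >> 16) & 0xf ))   # bits 19..16
    model=$(( (ext_model << 4) | base_model ))
    echo "family=$family model=$model stepping=$stepping"
}

for s in 0x406e3 0x506e3 0x50654 0x806e9 0x906e9; do
    printf '%s -> %s\n' "$s" "$(decode_signature "$s")"
done
```

For 0x406e3 and 0x506e3 this yields models 78 and 94 with stepping 3, matching the values the advisory gives for the Skylake fix.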
>
> References:
> https://caml.inria.fr/mantis/view.php?id=7452
> http://metadata.ftp-master.debian.org/changelogs/non-free/i/intel-microcode/unstable_changelog
> https://www.intel.com/content/www/us/en/processors/core/desktop-6th-gen-core-family-spec-update.html
> https://www.intel.com/content/www/us/en/processors/core/7th-gen-core-family-spec-update.html
> https://www.intel.com/content/www/us/en/processors/xeon/xeon-e3-1200v6-spec-update.html
> https://www.intel.com/content/www/us/en/processors/xeon/xeon-e3-1200v5-spec-update.html
> https://www.intel.com/content/www/us/en/products/processors/core/6th-gen-x-series-spec-update.html
>
> [1] iucode_tool -S will output your processor signature.  This tool is
>     available in the *contrib* repository, package "iucode-tool".
>
> --
> Steven Haigh
>
> [log in to unmask]     http://www.crc.id.au
> +61 (3) 9001 6090      0412 935 897
