On 15/03/2018, Andrew C Aitchison wrote:
> On Wed, 14 Mar 2018, Jason Vas Dias wrote:
>
>> gcc-4.8.5-16 (20150623) + binutils as 2.25.1-32.base.el7_4.2 here
>> that is causing references to __x86_indirect_thunk_rax to be
>> inserted in the above switch with more than 4 clauses whenever
>> 'gtod' is referenced. The patch I sent avoids the problem, but
>> why does the problem arise with the new GCC version and
>> not the old one?
>
> Is this the gcc with the retpoline fixes for Spectre?
>

Yes, this is the gcc that supports retpoline: -DRETPOLINE.

> I understand that firefox reduced the precision of several clocks
> https://blog.mozilla.org/security/2018/01/03/mitigations-landing-new-class-timing-attack/
> to mitigate Meltdown and Spectre.
> Could gcc and/or the kernel be doing the same thing?
>

On Intel, no, I don't think so - the Intel SDM says:

   The time stamp disable (TSD) flag in register CR4 restricts the
   use of the RDTSC instruction as follows. When the flag is clear,
   the RDTSC instruction can be executed at any privilege level;
   when the flag is set, the instruction can only be executed at
   privilege level 0.

AFAICS, Linux makes no attempt to set this flag, so user-space code
can read the TSC if the CONFIG_X86_TSC Linux kernel configuration
option is enabled.

On ARM, the Linux default is to disable user-space access to the
CNTPCT & CNTFRQ registers in arch/arm/kernel/arm_arch_timer.c - it is
easy to edit that file and enable user-space access to them.

I do think the system call latency has increased 20-30% since the
Spectre + Meltdown fixes - the kernel and user space now really do
not share any pages, including the call gate and VDSO pages. The VDSO
page is now "faked" and is updated regularly by the kernel rather
than being a live mapping into kernel address space. I will test with
the relevant patches disabled.

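The per-call cost of clock_gettime() can be checked from user space
with a small timing loop. The following is only a minimal sketch, not
the benchmark behind any figures in this thread: the helper names
(ts_to_ns, avg_latency_ns, report_latencies) are made up for
illustration, and it assumes Linux with glibc, where
CLOCK_MONOTONIC_RAW is available.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* expose CLOCK_MONOTONIC_RAW on glibc */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Convert a struct timespec to a flat nanosecond count. */
static uint64_t ts_to_ns(const struct timespec *ts)
{
    return (uint64_t)ts->tv_sec * 1000000000ULL + (uint64_t)ts->tv_nsec;
}

/*
 * Average cost, in nanoseconds, of one clock_gettime(clk, ...) call,
 * measured over `iters` back-to-back calls.
 * Returns -1.0 if the clock is not supported on this system.
 */
static double avg_latency_ns(clockid_t clk, unsigned long iters)
{
    struct timespec start, end, tmp;

    assert(iters > 0);
    if (clock_gettime(clk, &tmp) != 0)
        return -1.0;

    /* CLOCK_MONOTONIC is the reference clock for the whole loop. */
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (unsigned long i = 0; i < iters; i++)
        clock_gettime(clk, &tmp);
    clock_gettime(CLOCK_MONOTONIC, &end);

    return (double)(ts_to_ns(&end) - ts_to_ns(&start)) / (double)iters;
}

/* Print the average latency of the two clocks compared in this thread. */
static void report_latencies(unsigned long iters)
{
    printf("CLOCK_MONOTONIC:     %.1f ns/call\n",
           avg_latency_ns(CLOCK_MONOTONIC, iters));
#ifdef CLOCK_MONOTONIC_RAW
    printf("CLOCK_MONOTONIC_RAW: %.1f ns/call\n",
           avg_latency_ns(CLOCK_MONOTONIC_RAW, iters));
#endif
}
```

Calling report_latencies(1000000) from main() prints one line per
clock. The average includes loop overhead, so it is best read as a
relative comparison between clocks or between kernels rather than as
an absolute cost.
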
My patch was just to make the CLOCK_MONOTONIC_RAW latency about the
same as that of clock_gettime(CLOCK_MONOTONIC, ...) calls, instead of
at least 10 times slower (with the patch about 20ns, versus about
200-1000ns without it), and also to enable user-space TSC readers to
discover the kernel's calibrated TSC frequency, so they are free to
implement TSC readers with even lower latency.

As I understand it, Spectre / Meltdown relied mainly on Performance
Counter cycle counts (which are typically derived from TSC), which
are still always available in user-space.

The updated patch I sent today is the best one to use for
CLOCK_MONOTONIC_RAW.

All the best,
Regards,
Jason
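
A user-space TSC reader of the kind mentioned above could look
roughly like the sketch below. It is an illustration under stated
assumptions, not code from the patch: read_tsc and tsc_ticks_to_ns
are hypothetical helper names, and tsc_khz stands for the kernel's
calibrated TSC frequency in kHz, which the caller is assumed to have
obtained by some means not shown here. The TSC read itself is x86
only and uses the compiler's __rdtsc() intrinsic.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>          /* __rdtsc() on GCC and Clang */

/*
 * Read the raw time stamp counter. This works at user privilege
 * only while CR4.TSD is clear. RDTSC is not a serializing
 * instruction, so the CPU may reorder it relative to nearby code.
 */
static inline uint64_t read_tsc(void)
{
    return __rdtsc();
}
#endif

/*
 * Convert a TSC tick count to nanoseconds.
 * tsc_khz is an ASSUMED input: the calibrated TSC frequency in kHz.
 * Quotient and remainder are scaled separately so that the
 * intermediate product cannot overflow for large tick counts.
 */
static inline uint64_t tsc_ticks_to_ns(uint64_t ticks, uint64_t tsc_khz)
{
    assert(tsc_khz != 0);
    uint64_t q = ticks / tsc_khz;   /* whole milliseconds */
    uint64_t r = ticks % tsc_khz;   /* leftover ticks, less than 1 ms */
    return q * 1000000ULL + (r * 1000000ULL) / tsc_khz;
}
```

An interval would then be measured by taking two read_tsc() values
and passing their difference to tsc_ticks_to_ns(). The division makes
this conversion simple rather than fast; a reader tuned for latency
would likely precompute a multiplier instead.
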