- From 1ef06119163f106fc0de4990e7ae559e9a5a8169 Mon Sep 17 00:00:00 2001
- From: Andy Lutomirski <[email protected]>
- Date: Sat, 4 Nov 2017 04:16:12 -0700
- Subject: [PATCH 046/242] Revert "x86/mm: Stop calling leave_mm() in idle code"
- MIME-Version: 1.0
- Content-Type: text/plain; charset=UTF-8
- Content-Transfer-Encoding: 8bit
- CVE-2017-5754
- This reverts commit 43858b4f25cf0adc5c2ca9cf5ce5fdf2532941e5.
- The reason I removed the leave_mm() calls in question is because the
- heuristic wasn't needed after that patch. With the original version
- of my PCID series, we never flushed a "lazy cpu" (i.e. a CPU running a
- kernel thread) due to a flush on the loaded mm.
- Unfortunately, that caused architectural issues, so now I've
- reinstated these flushes on non-PCID systems in:
- commit b956575bed91 ("x86/mm: Flush more aggressively in lazy TLB mode").
- That, in turn, gives us a power management and occasionally
- performance regression as compared to old kernels: a process that
- goes into a deep idle state on a given CPU and gets its mm flushed
- due to activity on a different CPU will wake the idle CPU.
- Reinstate the old ugly heuristic: a CPU that goes into ACPI C3 or an
- intel_idle state that is likely to cause a TLB flush gets its mm
- switched to init_mm before going idle.
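The heuristic described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel's implementation: `struct cpu_model`, `struct idle_state`, `leave_mm_model()`, `enter_idle_model()` and `needs_remote_flush()` are hypothetical names invented for illustration, and the `flushes_tlb` flag stands in for whatever the idle driver knows about the chosen state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the kernel's real types are far richer. */
struct mm { int id; };

struct cpu_model {
	struct mm *loaded_mm;	/* mm whose page tables this CPU has loaded */
};

struct idle_state {
	const char *name;
	bool flushes_tlb;	/* state is expected to lose TLB contents */
};

/* Models init_mm, the kernel-only address space. */
static struct mm init_mm_model = { .id = 0 };

/* Model of leave_mm(): drop the current mm and fall back to init_mm. */
static void leave_mm_model(struct cpu_model *cpu)
{
	assert(cpu != NULL);
	if (cpu->loaded_mm == &init_mm_model)
		return;		/* already on init_mm, nothing to do */
	cpu->loaded_mm = &init_mm_model;
}

/* The heuristic: leave the mm only for states that flush the TLB anyway. */
static void enter_idle_model(struct cpu_model *cpu, const struct idle_state *st)
{
	assert(cpu != NULL && st != NULL);
	if (st->flushes_tlb)
		leave_mm_model(cpu);
	/* ... real code would now enter the hardware idle state ... */
}

/* Would a flush of 'mm' issued on another CPU have to wake this CPU? */
static bool needs_remote_flush(const struct cpu_model *cpu, const struct mm *mm)
{
	assert(cpu != NULL);
	return cpu->loaded_mm == mm;
}
```

In this model, a CPU that idles in a shallow state keeps its mm loaded and would still be woken by a remote flush of that mm, while a CPU that idles in a state flagged as flushing the TLB has already moved to `init_mm_model` and is left alone.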
- FWIW, this heuristic is lousy. Whether we should change CR3 before
- idle isn't a good hint except insofar as the performance hit is a bit
- lower if the TLB is getting flushed by the idle code anyway. What we
- really want to know is whether we anticipate being idle long enough
- that the mm is likely to be flushed before we wake up. This is more a
- matter of the expected latency than the idle state that gets chosen.
- This heuristic also completely fails on systems that don't know
- whether the TLB will be flushed (e.g. AMD systems?). OTOH it may be a
- bit obsolete anyway -- PCID systems don't presently benefit from this
- heuristic at all.
- We also shouldn't do this callback from the innermost bit of the idle
- code due to the RCU nastiness it causes. All the information needed is
- available before rcu_idle_enter() needs to happen.
- Signed-off-by: Andy Lutomirski <[email protected]>
- Cc: Borislav Petkov <[email protected]>
- Cc: Borislav Petkov <[email protected]>
- Cc: Brian Gerst <[email protected]>
- Cc: Denys Vlasenko <[email protected]>
- Cc: H. Peter Anvin <[email protected]>
- Cc: Josh Poimboeuf <[email protected]>
- Cc: Linus Torvalds <[email protected]>
- Cc: Peter Zijlstra <[email protected]>
- Cc: Thomas Gleixner <[email protected]>
- Fixes: 43858b4f25cf "x86/mm: Stop calling leave_mm() in idle code"
- Link: http://lkml.kernel.org/r/c513bbd4e653747213e05bc7062de000bf0202a5.1509793738.git.luto@kernel.org
- Signed-off-by: Ingo Molnar <[email protected]>
- (cherry picked from commit 675357362aeba19688440eb1aaa7991067f73b12)
- Signed-off-by: Andy Whitcroft <[email protected]>
- Signed-off-by: Kleber Sacilotto de Souza <[email protected]>
- (cherry picked from commit b607843145fd0593fcd87e2596d1dc5a1d5f79a5)
- Signed-off-by: Fabian Grünbichler <[email protected]>
- ---
- arch/x86/mm/tlb.c | 16 +++++++++++++---
- 1 file changed, 13 insertions(+), 3 deletions(-)
- diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
- index b27aceaf7ed1..ed06f1593390 100644
- --- a/arch/x86/mm/tlb.c
- +++ b/arch/x86/mm/tlb.c
- @@ -194,12 +194,22 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
-  			this_cpu_write(cpu_tlbstate.ctxs[new_asid].ctx_id, next->context.ctx_id);
-  			this_cpu_write(cpu_tlbstate.ctxs[new_asid].tlb_gen, next_tlb_gen);
-  			write_cr3(build_cr3(next, new_asid));
- -			trace_tlb_flush(TLB_FLUSH_ON_TASK_SWITCH,
- -					TLB_FLUSH_ALL);
- +
- +			/*
- +			 * NB: This gets called via leave_mm() in the idle path
- +			 * where RCU functions differently. Tracing normally
- +			 * uses RCU, so we need to use the _rcuidle variant.
- +			 *
- +			 * (There is no good reason for this. The idle code should
- +			 * be rearranged to call this before rcu_idle_enter().)
- +			 */
- +			trace_tlb_flush_rcuidle(TLB_FLUSH_ON_TASK_SWITCH, TLB_FLUSH_ALL);
-  		} else {
-  			/* The new ASID is already up to date. */
-  			write_cr3(build_cr3_noflush(next, new_asid));
- -			trace_tlb_flush(TLB_FLUSH_ON_TASK_SWITCH, 0);
- +
- +			/* See above wrt _rcuidle. */
- +			trace_tlb_flush_rcuidle(TLB_FLUSH_ON_TASK_SWITCH, 0);
-  		}
-
-  		this_cpu_write(cpu_tlbstate.loaded_mm, next);
- --
- 2.14.2