d2be00c0fb5ae0794deffcdb0425cd5a8d823db0
19516 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
b436069059 |
wait: use EXIT_TRACE only if thread_group_leader(zombie)
wait_task_zombie() always uses EXIT_TRACE/ptrace_unlink() if ptrace_reparented(). This is suboptimal and a bit confusing: we do not need do_notify_parent(p) if !thread_group_leader(p) and in this case we also do not need ptrace_unlink(), we can rely on ptrace_release_task(). Change wait_task_zombie() to check thread_group_leader() along with ptrace_reparented() and simplify the final p->exit_state transition. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Tested-by: Michal Schmidt <mschmidt@redhat.com> Cc: Jan Kratochvil <jan.kratochvil@redhat.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Lennart Poettering <lpoetter@redhat.com> Cc: Roland McGrath <roland@hack.frob.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
abd50b39e7 |
wait: introduce EXIT_TRACE to avoid the racy EXIT_DEAD->EXIT_ZOMBIE transition
wait_task_zombie() first does EXIT_ZOMBIE->EXIT_DEAD transition and
drops tasklist_lock. If this task is not the natural child and it is
traced, we change its state back to EXIT_ZOMBIE for ->real_parent.
The last transition is racy, this is even documented in
|
||
|
|
dfccbb5e49 |
wait: fix reparent_leader() vs EXIT_DEAD->EXIT_ZOMBIE race
wait_task_zombie() first does EXIT_ZOMBIE->EXIT_DEAD transition and
drops tasklist_lock. If this task is not the natural child and it is
traced, we change its state back to EXIT_ZOMBIE for ->real_parent.
The last transition is racy, this is even documented in
|
||
|
|
ef9823939e |
kernel/exit.c: call proc_exit_connector() after exit_state is set
The process events connector delivers a notification when a process
exits. This is really convenient for a process that spawns and wants to
monitor its children through an epoll-able() interface.
Unfortunately, there is a small window between when the event is
delivered and the child become wait()-able.
This is creates a race if the parent wants to make sure that it knows
about the exit, e.g
pid_t pid = fork();
if (pid > 0) {
register_interest_for_pid(pid);
if (waitpid(pid, NULL, WNOHANG) > 0)
{
/* We might have raced with exit() */
}
return;
}
/* Child */
execve(...)
register_interest_for_pid() would be telling the the connector socket
reader to pay attention to events related to pid.
Though this is not a bug, I think it would make the connector a bit more
usable if this race was closed by simply moving the call to
proc_exit_connector() from just before exit_notify() to right after.
Oleg said:
: Even with this patch the code above is still "racy" if the child is
: multi-threaded. Plus it should obviously filter-out subthreads. And
: afaics there is no way to make it reliable, even if you change the code
: above so that waitpid() is called only after the last thread exits WNOHANG
: still can fail.
Signed-off-by: Guillaume Morin <guillaume@morinfr.org>
Cc: Matt Helsley <matt.helsley@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
4bcb8232cf |
exit: move check_stack_usage() to the end of do_exit()
It is not clear why check_stack_usage() is called so early and thus it never checks the stack usage in, say, exit_notify() or flush_ptrace_hw_breakpoint() or other functions which are only called by do_exit(). Move the callsite down to the last preempt_disable/schedule. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
c39df5fa37 |
exit: call disassociate_ctty() before exit_task_namespaces()
Commit
|
||
|
|
539a13b47e |
res_counter: remove interface for locked charging and uncharging
The res_counter_{charge,uncharge}_locked() variants are not used in the
kernel outside of the resource counter code itself, so remove the
interface.
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Tim Hockin <thockin@google.com>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
f0432d1596 |
mm, mempolicy: remove per-process flag
PF_MEMPOLICY is an unnecessary optimization for CONFIG_SLAB users. There's no significant performance degradation to checking current->mempolicy rather than current->flags & PF_MEMPOLICY in the allocation path, especially since this is considered unlikely(). Running TCP_RR with netperf-2.4.5 through localhost on 16 cpu machine with 64GB of memory and without a mempolicy: threads before after 16 1249409 1244487 32 1281786 1246783 48 1239175 1239138 64 1244642 1241841 80 1244346 1248918 96 1266436 1254316 112 1307398 1312135 128 1327607 1326502 Per-process flags are a scarce resource so we should free them up whenever possible and make them available. We'll be using it shortly for memcg oom reserves. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Jianguo Wu <wujianguo@huawei.com> Cc: Tim Hockin <thockin@google.com> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
514ddb446c |
fork: collapse copy_flags into copy_process
copy_flags() does not use the clone_flags formal and can be collapsed into copy_process() for cleaner code. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Jianguo Wu <wujianguo@huawei.com> Cc: Tim Hockin <thockin@google.com> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
615d6e8756 |
mm: per-thread vma caching
This patch is a continuation of efforts trying to optimize find_vma(), avoiding potentially expensive rbtree walks to locate a vma upon faults. The original approach (https://lkml.org/lkml/2013/11/1/410), where the largest vma was also cached, ended up being too specific and random, thus further comparison with other approaches were needed. There are two things to consider when dealing with this, the cache hit rate and the latency of find_vma(). Improving the hit-rate does not necessarily translate in finding the vma any faster, as the overhead of any fancy caching schemes can be too high to consider. We currently cache the last used vma for the whole address space, which provides a nice optimization, reducing the total cycles in find_vma() by up to 250%, for workloads with good locality. On the other hand, this simple scheme is pretty much useless for workloads with poor locality. Analyzing ebizzy runs shows that, no matter how many threads are running, the mmap_cache hit rate is less than 2%, and in many situations below 1%. The proposed approach is to replace this scheme with a small per-thread cache, maximizing hit rates at a very low maintenance cost. Invalidations are performed by simply bumping up a 32-bit sequence number. The only expensive operation is in the rare case of a seq number overflow, where all caches that share the same address space are flushed. Upon a miss, the proposed replacement policy is based on the page number that contains the virtual address in question. Concretely, the following results are seen on an 80 core, 8 socket x86-64 box: 1) System bootup: Most programs are single threaded, so the per-thread scheme does improve ~50% hit rate by just adding a few more slots to the cache. +----------------+----------+------------------+ | caching scheme | hit-rate | cycles (billion) | +----------------+----------+------------------+ | baseline | 50.61% | 19.90 | | patched | 73.45% | 13.58 | +----------------+----------+------------------+ 2) Kernel build: This one is already pretty good with the current approach as we're dealing with good locality. +----------------+----------+------------------+ | caching scheme | hit-rate | cycles (billion) | +----------------+----------+------------------+ | baseline | 75.28% | 11.03 | | patched | 88.09% | 9.31 | +----------------+----------+------------------+ 3) Oracle 11g Data Mining (4k pages): Similar to the kernel build workload. +----------------+----------+------------------+ | caching scheme | hit-rate | cycles (billion) | +----------------+----------+------------------+ | baseline | 70.66% | 17.14 | | patched | 91.15% | 12.57 | +----------------+----------+------------------+ 4) Ebizzy: There's a fair amount of variation from run to run, but this approach always shows nearly perfect hit rates, while baseline is just about non-existent. The amounts of cycles can fluctuate between anywhere from ~60 to ~116 for the baseline scheme, but this approach reduces it considerably. For instance, with 80 threads: +----------------+----------+------------------+ | caching scheme | hit-rate | cycles (billion) | +----------------+----------+------------------+ | baseline | 1.06% | 91.54 | | patched | 99.97% | 14.18 | +----------------+----------+------------------+ [akpm@linux-foundation.org: fix nommu build, per Davidlohr] [akpm@linux-foundation.org: document vmacache_valid() logic] [akpm@linux-foundation.org: attempt to untangle header files] [akpm@linux-foundation.org: add vmacache_find() BUG_ON] [hughd@google.com: add vmacache_valid_mm() (from Oleg)] [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: adjust and enhance comments] Signed-off-by: Davidlohr Bueso <davidlohr@hp.com> Reviewed-by: Rik van Riel <riel@redhat.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Michel Lespinasse <walken@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Tested-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
a0715cc226 |
mm, thp: add VM_INIT_DEF_MASK and PRCTL_THP_DISABLE
Add VM_INIT_DEF_MASK, to allow us to set the default flags for VMs. It also adds a prctl control which allows us to set the THP disable bit in mm->def_flags so that VMs will pick up the setting as they are created. Signed-off-by: Alex Thorlton <athorlton@sgi.com> Suggested-by: Oleg Nesterov <oleg@redhat.com> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Mel Gorman <mgorman@suse.de> Acked-by: Rik van Riel <riel@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: David Rientjes <rientjes@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
dc5ed40686 |
Merge branch 'for-3.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo: "Two patches to fix fallouts from the kernfs conversion: Li's patch to stop leaking cgroup_root refs across multiple mounts and the other fixes the 90s hang during shutdown caused by always using root's uid/gid for new cgroup dirs and files." * 'for-3.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: newly created dirs and files should be owned by the creator cgroup: fix top cgroup refcnt leak |
||
|
|
467a9e1633 |
Merge tag 'cpu-hotplug-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull CPU hotplug notifiers registration fixes from Rafael Wysocki:
"The purpose of this single series of commits from Srivatsa S Bhat
(with a small piece from Gautham R Shenoy) touching multiple
subsystems that use CPU hotplug notifiers is to provide a way to
register them that will not lead to deadlocks with CPU online/offline
operations as described in the changelog of commit
|
||
|
|
49957f8e2a |
cgroup: newly created dirs and files should be owned by the creator
While converting cgroup to kernfs,
|
||
|
|
b8780c363d |
sched: remove sleep_on() and friends
This is the final piece in the puzzle, as all patches to remove the last users of \(interruptible_\|\)sleep_on\(_timeout\|\) have made it into the 3.15 merge window. The work was long overdue, and this interface in particular should not have survived the BKL removal that was done a couple of years ago. Citing Jon Corbet from http://lwn.net/2001/0201/kernel.php3": "[...] it was suggested that the janitors look for and fix all code that calls sleep_on() [...] since (1) almost all such code is incorrect, and (2) Linus has agreed that those functions should be removed in the 2.5 development series". We haven't quite made it for 2.5, but maybe we can merge this for 3.15. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
6f4c98e1c2 |
Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull module updates from Rusty Russell: "Nothing major: the stricter permissions checking for sysfs broke a staging driver; fix included. Greg KH said he'd take the patch but hadn't as the merge window opened, so it's included here to avoid breaking build" * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: staging: fix up speakup kobject mode Use 'E' instead of 'X' for unsigned module taint flag. VERIFY_OCTAL_PERMISSIONS: stricter checking for sysfs perms. kallsyms: fix percpu vars on x86-64 with relocation. kallsyms: generalize address range checking module: LLVMLinux: Remove unused function warning from __param_check macro Fix: module signature vs tracepoints: add new TAINT_UNSIGNED_MODULE module: remove MODULE_GENERIC_TABLE module: allow multiple calls to MODULE_DEVICE_TABLE() per module module: use pr_cont |
||
|
|
2d1eb87ae1 |
Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM changes from Russell King:
- Perf updates from Will Deacon:
- Support for Qualcomm Krait processors (run perf on your phone!)
- Support for Cortex-A12 (run perf stat on your FPGA!)
- Support for perf_sample_event_took, allowing us to automatically decrease
the sample rate if we can't handle the PMU interrupts quickly enough
(run perf record on your FPGA!).
- Basic uprobes support from David Long:
This patch series adds basic uprobes support to ARM. It is based on
patches developed earlier by Rabin Vincent. That approach of adding
hooks into the kprobes instruction parsing code was not well received.
This approach separates the ARM instruction parsing code in kprobes out
into a separate set of functions which can be used by both kprobes and
uprobes. Both kprobes and uprobes then provide their own semantic action
tables to process the results of the parsing.
- ARMv7M (microcontroller) updates from Uwe Kleine-König
- OMAP DMA updates (recently added Vinod's Ack even though they've been
sitting in linux-next for a few months) to reduce the reliance of
omap-dma on the code in arch/arm.
- SA11x0 changes from Dmitry Eremin-Solenikov and Alexander Shiyan
- Support for Cortex-A12 CPU
- Align support for ARMv6 with ARMv7 so they can cooperate better in a
single zImage.
- Addition of first AT_HWCAP2 feature bits for ARMv8 crypto support.
- Removal of IRQ_DISABLED from various ARM files
- Improved efficiency of virt_to_page() for single zImage
- Patch from Ulf Hansson to permit runtime PM callbacks to be available for
AMBA devices for suspend/resume as well.
- Finally kill asm/system.h on ARM.
* 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (89 commits)
dmaengine: omap-dma: more consolidation of CCR register setup
dmaengine: omap-dma: move IRQ handling to omap-dma
dmaengine: omap-dma: move register read/writes into omap-dma.c
ARM: omap: dma: get rid of 'p' allocation and clean up
ARM: omap: move dma channel allocation into plat-omap code
ARM: omap: dma: get rid of errata global
ARM: omap: clean up DMA register accesses
ARM: omap: remove almost-const variables
ARM: omap: remove references to disable_irq_lch
dmaengine: omap-dma: cleanup errata 3.3 handling
dmaengine: omap-dma: provide register read/write functions
dmaengine: omap-dma: use cached CCR value when enabling DMA
dmaengine: omap-dma: move barrier to omap_dma_start_desc()
dmaengine: omap-dma: move clnk_ctrl setting to preparation functions
dmaengine: omap-dma: improve efficiency loading C.SA/C.EI/C.FI registers
dmaengine: omap-dma: consolidate clearing channel status register
dmaengine: omap-dma: move CCR buffering disable errata out of the fast path
dmaengine: omap-dma: provide register definitions
dmaengine: omap-dma: consolidate setup of CCR
dmaengine: omap-dma: consolidate setup of CSDP
...
|
||
|
|
c6b3d5bcd6 |
cgroup: fix top cgroup refcnt leak
As mount() and kill_sb() is not a one-to-one match, If we mount the same
cgroupfs in serveral mount points, and then umount all of them, kill_sb()
will be called only once.
Try:
# mount -t cgroup -o cpuacct xxx /cgroup
# mount -t cgroup -o cpuacct xxx /cgroup2
# cat /proc/cgroups | grep cpuacct
cpuacct 2 1 1
# umount /cgroup
# umount /cgroup2
# cat /proc/cgroups | grep cpuacct
cpuacct 2 1 1
You'll see cgroupfs will never be freed.
Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
||
|
|
76ca7d1cca |
Merge branch 'akpm' (incoming from Andrew)
Merge first patch-bomb from Andrew Morton: - Various misc bits - kmemleak fixes - small befs, codafs, cifs, efs, freexxfs, hfsplus, minixfs, reiserfs things - fanotify - I appear to have become SuperH maintainer - ocfs2 updates - direct-io tweaks - a bit of the MM queue - printk updates - MAINTAINERS maintenance - some backlight things - lib/ updates - checkpatch updates - the rtc queue - nilfs2 updates - Small Documentation/ updates * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (237 commits) Documentation/SubmittingPatches: remove references to patch-scripts Documentation/SubmittingPatches: update some dead URLs Documentation/filesystems/ntfs.txt: remove changelog reference Documentation/kmemleak.txt: updates fs/reiserfs/super.c: add __init to init_inodecache fs/reiserfs: move prototype declaration to header file fs/hfsplus/attributes.c: add __init to hfsplus_create_attr_tree_cache() fs/hfsplus/extents.c: fix concurrent acess of alloc_blocks fs/hfsplus/extents.c: remove unused variable in hfsplus_get_block nilfs2: update project's web site in nilfs2.txt nilfs2: update MAINTAINERS file entries fix nilfs2: verify metadata sizes read from disk nilfs2: add FITRIM ioctl support for nilfs2 nilfs2: add nilfs_sufile_trim_fs to trim clean segs nilfs2: implementation of NILFS_IOCTL_SET_SUINFO ioctl nilfs2: add nilfs_sufile_set_suinfo to update segment usage nilfs2: add struct nilfs_suinfo_update and flags nilfs2: update MAINTAINERS file entries fs/coda/inode.c: add __init to init_inodecache() BEFS: logging cleanup ... |
||
|
|
72581487a6 |
printk: fix one circular lockdep warning about console_lock
Fix a warning about possible circular locking dependency.
If do in following sequence:
enter suspend -> resume -> plug-out CPUx (echo 0 > cpux/online)
lockdep will show warning as following:
======================================================
[ INFO: possible circular locking dependency detected ]
3.10.0 #2 Tainted: G O
-------------------------------------------------------
sh/1271 is trying to acquire lock:
(console_lock){+.+.+.}, at: console_cpu_notify+0x20/0x2c
but task is already holding lock:
(cpu_hotplug.lock){+.+.+.}, at: cpu_hotplug_begin+0x2c/0x58
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (cpu_hotplug.lock){+.+.+.}:
lock_acquire+0x98/0x12c
mutex_lock_nested+0x50/0x3d8
cpu_hotplug_begin+0x2c/0x58
_cpu_up+0x24/0x154
cpu_up+0x64/0x84
smp_init+0x9c/0xd4
kernel_init_freeable+0x78/0x1c8
kernel_init+0x8/0xe4
ret_from_fork+0x14/0x2c
-> #1 (cpu_add_remove_lock){+.+.+.}:
lock_acquire+0x98/0x12c
mutex_lock_nested+0x50/0x3d8
disable_nonboot_cpus+0x8/0xe8
suspend_devices_and_enter+0x214/0x448
pm_suspend+0x1e4/0x284
try_to_suspend+0xa4/0xbc
process_one_work+0x1c4/0x4fc
worker_thread+0x138/0x37c
kthread+0xa4/0xb0
ret_from_fork+0x14/0x2c
-> #0 (console_lock){+.+.+.}:
__lock_acquire+0x1b38/0x1b80
lock_acquire+0x98/0x12c
console_lock+0x54/0x68
console_cpu_notify+0x20/0x2c
notifier_call_chain+0x44/0x84
__cpu_notify+0x2c/0x48
cpu_notify_nofail+0x8/0x14
_cpu_down+0xf4/0x258
cpu_down+0x24/0x40
store_online+0x30/0x74
dev_attr_store+0x18/0x24
sysfs_write_file+0x16c/0x19c
vfs_write+0xb4/0x190
SyS_write+0x3c/0x70
ret_fast_syscall+0x0/0x48
Chain exists of:
console_lock --> cpu_add_remove_lock --> cpu_hotplug.lock
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(cpu_hotplug.lock);
lock(cpu_add_remove_lock);
lock(cpu_hotplug.lock);
lock(console_lock);
*** DEADLOCK ***
There are three locks involved in two sequence:
a) pm suspend:
console_lock (@suspend_console())
cpu_add_remove_lock (@disable_nonboot_cpus())
cpu_hotplug.lock (@_cpu_down())
b) Plug-out CPUx:
cpu_add_remove_lock (@(cpu_down())
cpu_hotplug.lock (@_cpu_down())
console_lock (@console_cpu_notify()) => Lockdeps prints warning log.
There should be not real deadlock, as flag of console_suspended can
protect this.
Although console_suspend() releases console_sem, it doesn't tell lockdep
about it. That results in the lockdep warning about circular locking
when doing the following: enter suspend -> resume -> plug-out CPUx (echo
0 > cpux/online)
Fix the problem by telling lockdep we actually released the semaphore in
console_suspend() and acquired it again in console_resume().
Signed-off-by: Jane Li <jiel@marvell.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
fce6e0338a |
printk: do not compute the size of the message twice
This is just a tiny optimization. It removes duplicate computation of the message size. Signed-off-by: Petr Mladek <pmladek@suse.cz> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jan Kara <jack@suse.cz> Cc: Michal Hocko <mhocko@suse.cz> Cc: Kay Sievers <kay@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
39b25109b4 |
printk: use also the last bytes in the ring buffer
It seems that we have newer used the last byte in the ring buffer. In fact, we have newer used the last 4 bytes because of padding. First problem is in the check for free space. The exact number of free bytes is enough to store the length of data. Second problem is in the check where the ring buffer is rotated. The left side counts the first unused index. It is unused, so it might be the same as the size of the buffer. Note that the first problem has to be fixed together with the second one. Otherwise, the buffer is rotated even when there is enough space on the end of the buffer. Then the beginning of the buffer is rewritten and valid entries get corrupted. Signed-off-by: Petr Mladek <pmladek@suse.cz> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jan Kara <jack@suse.cz> Cc: Michal Hocko <mhocko@suse.cz> Cc: Kay Sievers <kay@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
e8c42d36ab |
printk: add comment about tricky check for text buffer size
There is no check for potential "text_len" overflow. It is not needed because only valid level is detected. It took me some time to understand why. It would deserve a comment ;-) Signed-off-by: Petr Mladek <pmladek@suse.cz> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jan Kara <jack@suse.cz> Cc: Michal Hocko <mhocko@suse.cz> Cc: Kay Sievers <kay@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
c64730b26f |
printk: remove obsolete check for log level "c"
The kernel log level "c" was removed in commit
|
||
|
|
28ab49ff7f |
kernel/resource.c: make reallocate_resource() static
sparse says: kernel/resource.c:518:5: warning: symbol 'reallocate_resource' was not declared. Should it be static? Signed-off-by: Daeseok Youn <daeseok.youn@gmail.com> Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
c96d6660dc |
kernel: audit/fix non-modular users of module_init in core code
Code that is obj-y (always built-in) or dependent on a bool Kconfig (built-in or absent) can never be modular. So using module_init as an alias for __initcall can be somewhat misleading. Fix these up now, so that we can relocate module_init from init.h into module.h in the future. If we don't do this, we'd have to add module.h to obviously non-modular code, and that would be a worse thing. The audit targets the following module_init users for change: kernel/user.c obj-y kernel/kexec.c bool KEXEC (one instance per arch) kernel/profile.c bool PROFILING kernel/hung_task.c bool DETECT_HUNG_TASK kernel/sched/stats.c bool SCHEDSTATS kernel/user_namespace.c bool USER_NS Note that direct use of __initcall is discouraged, vs. one of the priority categorized subgroups. As __initcall gets mapped onto device_initcall, our use of subsys_initcall (which makes sense for these files) will thus change this registration from level 6-device to level 4-subsys (i.e. slightly earlier). However no observable impact of that difference has been observed during testing. Also, two instances of missing ";" at EOL are fixed in kexec. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
69369a7003 |
fs, kernel: permit disabling the uselib syscall
uselib hasn't been used since libc5; glibc does not use it. Support turning it off. When disabled, also omit the load_elf_library implementation from binfmt_elf.c, which only uselib invokes. bloat-o-meter: add/remove: 0/4 grow/shrink: 0/1 up/down: 0/-785 (-785) function old new delta padzero 39 36 -3 uselib_flags 20 - -20 sys_uselib 168 - -168 SyS_uselib 168 - -168 load_elf_library 426 - -426 The new CONFIG_USELIB defaults to `y'. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
8f6c5ffc89 |
kernel/groups.c: remove return value of set_groups
After commit
|
||
|
|
6af9f7bf3c |
sys_sysfs: Add CONFIG_SYSFS_SYSCALL
sys_sysfs is an obsolete system call no longer supported by libc. - This patch adds a default CONFIG_SYSFS_SYSCALL=y - Option can be turned off in expert mode. - cond_syscall added to kernel/sys_ni.c [akpm@linux-foundation.org: tweak Kconfig help text] Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
5509a5d27b |
drop_caches: add some documentation and info message
There is plenty of anecdotal evidence and a load of blog posts suggesting that using "drop_caches" periodically keeps your system running in "tip top shape". Perhaps adding some kernel documentation will increase the amount of accurate data on its use. If we are not shrinking caches effectively, then we have real bugs. Using drop_caches will simply mask the bugs and make them harder to find, but certainly does not fix them, nor is it an appropriate "workaround" to limit the size of the caches. On the contrary, there have been bug reports on issues that turned out to be misguided use of cache dropping. Dropping caches is a very drastic and disruptive operation that is good for debugging and running tests, but if it creates bug reports from production use, kernel developers should be aware of its use. Add a bit more documentation about it, a syslog message to track down abusers, and vmstat drop counters to help analyze problem reports. [akpm@linux-foundation.org: checkpatch fixes] [hannes@cmpxchg.org: add runtime suppression control] Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Michal Hocko <mhocko@suse.cz> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
d26914d117 |
mm: optimize put_mems_allowed() usage
Since put_mems_allowed() is strictly optional, its a seqcount retry, we don't need to evaluate the function if the allocation was in fact successful, saving a smp_rmb some loads and comparisons on some relative fast-paths. Since the naming, get/put_mems_allowed() does suggest a mandatory pairing, rename the interface, as suggested by Mel, to resemble the seqcount interface. This gives us: read_mems_allowed_begin() and read_mems_allowed_retry(), where it is important to note that the return value of the latter call is inverted from its previous incarnation. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
62572e29bc |
kernel/watchdog.c: touch_nmi_watchdog should only touch local cpu not every one
I ran into a scenario where while one cpu was stuck and should have panic'd because of the NMI watchdog, it didn't. The reason was another cpu was spewing stack dumps on to the console. Upon investigation, I noticed that when writing to the console and also when dumping the stack, the watchdog is touched. This causes all the cpus to reset their NMI watchdog flags and the 'stuck' cpu just spins forever. This change causes the semantics of touch_nmi_watchdog to be changed slightly. Previously, I accidentally changed the semantics and we noticed there was a codepath in which touch_nmi_watchdog could be touched from a preemtible area. That caused a BUG() to happen when CONFIG_DEBUG_PREEMPT was enabled. I believe it was the acpi code. My attempt here re-introduces the change to have the touch_nmi_watchdog() code only touch the local cpu instead of all of the cpus. But instead of using __get_cpu_var(), I use the __raw_get_cpu_var() version. This avoids the preemption problem. However my reasoning wasn't because I was trying to be lazy. Instead I rationalized it as, well if preemption is enabled then interrupts should be enabled to and the NMI watchdog will have no reason to trigger. So it won't matter if the wrong cpu is touched because the percpu interrupt counters the NMI watchdog uses should still be incrementing. Don said: : I'm ok with this patch, though it does alter the behaviour of how : touch_nmi_watchdog works. For the most part I don't think most callers : need to touch all of the watchdogs (on each cpu). Perhaps a corner case : will pop up (the scheduler?? to mimic touch_all_softlockup_watchdogs() ). : : But this does address an issue where if a system is locked up and one cpu : is spewing out useful debug messages (or error messages), the hard lockup : will fail to go off. We have seen this on RHEL also. Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Ben Zhang <benzh@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
81c98869fa |
kthread: ensure locality of task_struct allocations
In the presence of memoryless nodes, numa_node_id() will return the current CPU's NUMA node, but that may not be where we expect to allocate from memory from. Instead, we should rely on the fallback code in the memory allocator itself, by using NUMA_NO_NODE. Also, when calling kthread_create_on_node(), use the nearest node with memory to the cpu in question, rather than the node it is running on. Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Reviewed-by: Christoph Lameter <cl@linux.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Anton Blanchard <anton@samba.org> Cc: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Ben Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
32d01dc7be |
Merge branch 'for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
"A lot updates for cgroup:
- The biggest one is cgroup's conversion to kernfs. cgroup took
after the long abandoned vfs-entangled sysfs implementation and
made it even more convoluted over time. cgroup's internal objects
were fused with vfs objects which also brought in vfs locking and
object lifetime rules. Naturally, there are places where vfs rules
don't fit and nasty hacks, such as credential switching or lock
dance interleaving inode mutex and cgroup_mutex with object serial
number comparison thrown in to decide whether the operation is
actually necessary, needed to be employed.
After conversion to kernfs, internal object lifetime and locking
rules are mostly isolated from vfs interactions allowing shedding
of several nasty hacks and overall simplification. This will also
allow implmentation of operations which may affect multiple cgroups
which weren't possible before as it would have required nesting
i_mutexes.
- Various simplifications including dropping of module support,
easier cgroup name/path handling, simplified cgroup file type
handling and task_cg_lists optimization.
- Prepatory changes for the planned unified hierarchy, which is still
a patchset away from being actually operational. The dummy
hierarchy is updated to serve as the default unified hierarchy.
Controllers which aren't claimed by other hierarchies are
associated with it, which BTW was what the dummy hierarchy was for
anyway.
- Various fixes from Li and others. This pull request includes some
patches to add missing slab.h to various subsystems. This was
triggered xattr.h include removal from cgroup.h. cgroup.h
indirectly got included a lot of files which brought in xattr.h
which brought in slab.h.
There are several merge commits - one to pull in kernfs updates
necessary for converting cgroup (already in upstream through
driver-core), others for interfering changes in the fixes branch"
* 'for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (74 commits)
cgroup: remove useless argument from cgroup_exit()
cgroup: fix spurious lockdep warning in cgroup_exit()
cgroup: Use RCU_INIT_POINTER(x, NULL) in cgroup.c
cgroup: break kernfs active_ref protection in cgroup directory operations
cgroup: fix cgroup_taskset walking order
cgroup: implement CFTYPE_ONLY_ON_DFL
cgroup: make cgrp_dfl_root mountable
cgroup: drop const from @buffer of cftype->write_string()
cgroup: rename cgroup_dummy_root and related names
cgroup: move ->subsys_mask from cgroupfs_root to cgroup
cgroup: treat cgroup_dummy_root as an equivalent hierarchy during rebinding
cgroup: remove NULL checks from [pr_cont_]cgroup_{name|path}()
cgroup: use cgroup_setup_root() to initialize cgroup_dummy_root
cgroup: reorganize cgroup bootstrapping
cgroup: relocate setting of CGRP_DEAD
cpuset: use rcu_read_lock() to protect task_cs()
cgroup_freezer: document freezer_fork() subtleties
cgroup: update cgroup_transfer_tasks() to either succeed or fail
cgroup: drop task_lock() protection around task->cgroups
cgroup: update how a newly forked task gets associated with css_set
...
|
||
|
|
68114e5eb8 |
Merge tag 'trace-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt: "Most of the changes were largely clean ups, and some documentation. But there were a few features that were added: Uprobes now work with event triggers and multi buffers and have support under ftrace and perf. The big feature is that the function tracer can now be used within the multi buffer instances. That is, you can now trace some functions in one buffer, others in another buffer, all functions in a third buffer and so on. They are basically agnostic from each other. This only works for the function tracer and not for the function graph trace, although you can have the function graph tracer running in the top level buffer (or any tracer for that matter) and have different function tracing going on in the sub buffers" * tag 'trace-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (45 commits) tracing: Add BUG_ON when stack end location is over written tracepoint: Remove unused API functions Revert "tracing: Move event storage for array from macro to standalone function" ftrace: Constify ftrace_text_reserved tracepoints: API doc update to tracepoint_probe_register() return value tracepoints: API doc update to data argument ftrace: Fix compilation warning about control_ops_free ftrace/x86: BUG when ftrace recovery fails ftrace: Warn on error when modifying ftrace function ftrace: Remove freelist from struct dyn_ftrace ftrace: Do not pass data to ftrace_dyn_arch_init ftrace: Pass retval through return in ftrace_dyn_arch_init() ftrace: Inline the code from ftrace_dyn_table_alloc() ftrace: Cleanup of global variables ftrace_new_pgs and ftrace_update_cnt tracing: Evaluate len expression only once in __dynamic_array macro tracing: Correctly expand len expressions from __dynamic_array macro tracing/module: Replace include of tracepoint.h with jump_label.h in module.h tracing: Fix event header migrate.h to include tracepoint.h tracing: Fix event header writeback.h to include tracepoint.h tracing: Warn if a tracepoint is not set via debugfs ... |
||
|
|
bea803183e |
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris: "Apart from reordering the SELinux mmap code to ensure DAC is called before MAC, these are minor maintenance updates" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (23 commits) selinux: correctly label /proc inodes in use before the policy is loaded selinux: put the mmap() DAC controls before the MAC controls selinux: fix the output of ./scripts/get_maintainer.pl for SELinux evm: enable key retention service automatically ima: skip memory allocation for empty files evm: EVM does not use MD5 ima: return d_name.name if d_path fails integrity: fix checkpatch errors ima: fix erroneous removal of security.ima xattr security: integrity: Use a more current logging style MAINTAINERS: email updates and other misc. changes ima: reduce memory usage when a template containing the n field is used ima: restore the original behavior for sending data with ima template Integrity: Pass commname via get_task_comm() fs: move i_readcount ima: use static const char array definitions security: have cap_dentry_init_security return error ima: new helper: file_inode(file) kernel: Mark function as static in kernel/seccomp.c capability: Use current logging styles ... |
||
|
|
cd6362befe |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
"Here is my initial pull request for the networking subsystem during
this merge window:
1) Support for ESN in AH (RFC 4302) from Fan Du.
2) Add full kernel doc for ethtool command structures, from Ben
Hutchings.
3) Add BCM7xxx PHY driver, from Florian Fainelli.
4) Export computed TCP rate information in netlink socket dumps, from
Eric Dumazet.
5) Allow IPSEC SA to be dumped partially using a filter, from Nicolas
Dichtel.
6) Convert many drivers to pci_enable_msix_range(), from Alexander
Gordeev.
7) Record SKB timestamps more efficiently, from Eric Dumazet.
8) Switch to microsecond resolution for TCP round trip times, also
from Eric Dumazet.
9) Clean up and fix 6lowpan fragmentation handling by making use of
the existing inet_frag api for it's implementation.
10) Add TX grant mapping to xen-netback driver, from Zoltan Kiss.
11) Auto size SKB lengths when composing netlink messages based upon
past message sizes used, from Eric Dumazet.
12) qdisc dumps can take a long time, add a cond_resched(), From Eric
Dumazet.
13) Sanitize netpoll core and drivers wrt. SKB handling semantics.
Get rid of never-used-in-tree netpoll RX handling. From Eric W
Biederman.
14) Support inter-address-family and namespace changing in VTI tunnel
driver(s). From Steffen Klassert.
15) Add Altera TSE driver, from Vince Bridgers.
16) Optimizing csum_replace2() so that it doesn't adjust the checksum
by checksumming the entire header, from Eric Dumazet.
17) Expand BPF internal implementation for faster interpreting, more
direct translations into JIT'd code, and much cleaner uses of BPF
filtering in non-socket ocntexts. From Daniel Borkmann and Alexei
Starovoitov"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1976 commits)
netpoll: Use skb_irq_freeable to make zap_completion_queue safe.
net: Add a test to see if a skb is freeable in irq context
qlcnic: Fix build failure due to undefined reference to `vxlan_get_rx_port'
net: ptp: move PTP classifier in its own file
net: sxgbe: make "core_ops" static
net: sxgbe: fix logical vs bitwise operation
net: sxgbe: sxgbe_mdio_register() frees the bus
Call efx_set_channels() before efx->type->dimension_resources()
xen-netback: disable rogue vif in kthread context
net/mlx4: Set proper build dependancy with vxlan
be2net: fix build dependency on VxLAN
mac802154: make csma/cca parameters per-wpan
mac802154: allow only one WPAN to be up at any given time
net: filter: minor: fix kdoc in __sk_run_filter
netlink: don't compare the nul-termination in nla_strcmp
can: c_can: Avoid led toggling for every packet.
can: c_can: Simplify TX interrupt cleanup
can: c_can: Store dlc private
can: c_can: Reduce register access
can: c_can: Make the code readable
...
|
||
|
|
159d8133d0 |
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina: "Usual rocket science -- mostly documentation and comment updates" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: sparse: fix comment doc: fix double words isdn: capi: fix "CAPI_VERSION" comment doc: DocBook: Fix typos in xml and template file Bluetooth: add module name for btwilink driver core: unexport static function create_syslog_header mmc: core: typo fix in printk specifier ARM: spear: clean up editing mistake net-sysfs: fix comment typo 'CONFIG_SYFS' doc: Insert MODULE_ in module-signing macros Documentation: update URL to hfsplus Technote 1150 gpio: update path to documentation ixgbe: Fix format string in ixgbe_fcoe. Kconfig: Remove useless "default N" lines user_namespace.c: Remove duplicated word in comment CREDITS: fix formatting treewide: Fix typo in Documentation/DocBook mm: Fix warning on make htmldocs caused by slab.c ata: ata-samsung_cf: cleanup in header file idr: remove unused prototype of idr_free() |
||
|
|
05bf58ca4b |
Merge branch 'sched-idle-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull sched/idle changes from Ingo Molnar: "More idle code reorganization, to prepare for more integration. (Sent separately because it depended on pending timer work, which is now upstream)" * 'sched-idle-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/idle: Add more comments to the code sched/idle: Move idle conditions in cpuidle_idle main function sched/idle: Reorganize the idle loop cpuidle/idle: Move the cpuidle_idle_call function to idle.c idle/cpuidle: Split cpuidle_idle_call main function into smaller functions |
||
|
|
d23082257d |
pid_namespace: pidns_get() should check task_active_pid_ns() != NULL
pidns_get()->get_pid_ns() can hit ns == NULL. This task_struct can't go away, but task_active_pid_ns(task) is NULL if release_task(task) was already called. Alternatively we could change get_pid_ns(ns) to check ns != NULL, but it seems that other callers are fine. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Eric W. Biederman ebiederm@xmission.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
56c4911aed |
audit: do not cast audit_rule_data pointers pointlesly
For some sort of legacy support audit_rule is a subset of (and first entry in) audit_rule_data. We don't actually need or use audit_rule. We just do a cast from one to the other for no gain what so ever. Stop the crazy casting. Signed-off-by: Eric Paris <eparis@redhat.com> |
||
|
|
7125764c5d |
Merge branch 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull compat time conversion changes from Peter Anvin: "Despite the branch name this is really neither an x86 nor an x32-specific patchset, although it the implementation of the discussions that followed the x32 security hole a few months ago. This removes get/put_compat_timespec/val() and replaces them with compat_get/put_timespec/val() which are savvy as to the current status of COMPAT_USE_64BIT_TIME. It removes several unused and/or incorrect/misleading functions (like compat_put_timeval_convert which doesn't in fact do any conversion) and also replaces several open-coded implementations what is now called compat_convert_timespec() with that function" * 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: compat: Fix sparse address space warnings compat: Get rid of (get|put)_compat_time(val|spec) |
||
|
|
fbb32750a6 |
pipe: kill ->map() and ->unmap()
all pipe_buffer_operations have the same instances of those... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> |
||
|
|
7a48837732 |
Merge branch 'for-3.15/core' of git://git.kernel.dk/linux-block
Pull core block layer updates from Jens Axboe:
"This is the pull request for the core block IO bits for the 3.15
kernel. It's a smaller round this time, it contains:
- Various little blk-mq fixes and additions from Christoph and
myself.
- Cleanup of the IPI usage from the block layer, and associated
helper code. From Frederic Weisbecker and Jan Kara.
- Duplicate code cleanup in bio-integrity from Gu Zheng. This will
give you a merge conflict, but that should be easy to resolve.
- blk-mq notify spinlock fix for RT from Mike Galbraith.
- A blktrace partial accounting bug fix from Roman Pen.
- Missing REQ_SYNC detection fix for blk-mq from Shaohua Li"
* 'for-3.15/core' of git://git.kernel.dk/linux-block: (25 commits)
blk-mq: add REQ_SYNC early
rt,blk,mq: Make blk_mq_cpu_notify_lock a raw spinlock
blk-mq: support partial I/O completions
blk-mq: merge blk_mq_insert_request and blk_mq_run_request
blk-mq: remove blk_mq_alloc_rq
blk-mq: don't dump CPU -> hw queue map on driver load
blk-mq: fix wrong usage of hctx->state vs hctx->flags
blk-mq: allow blk_mq_init_commands() to return failure
block: remove old blk_iopoll_enabled variable
blktrace: fix accounting of partially completed requests
smp: Rename __smp_call_function_single() to smp_call_function_single_async()
smp: Remove wait argument from __smp_call_function_single()
watchdog: Simplify a little the IPI call
smp: Move __smp_call_function_single() below its safe version
smp: Consolidate the various smp_call_function_single() declensions
smp: Teach __smp_call_function_single() to check for offline cpus
smp: Remove unused list_head from csd
smp: Iterate functions through llist_for_each_entry_safe()
block: Stop abusing rq->csd.list in blk-softirq
block: Remove useless IPI struct initialization
...
|
||
|
|
4b1779c2cf |
Merge tag 'pci-v3.15-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI changes from Bjorn Helgaas: "Enumeration - Increment max correctly in pci_scan_bridge() (Andreas Noever) - Clarify the "scan anyway" comment in pci_scan_bridge() (Andreas Noever) - Assign CardBus bus number only during the second pass (Andreas Noever) - Use request_resource_conflict() instead of insert_ for bus numbers (Andreas Noever) - Make sure bus number resources stay within their parents bounds (Andreas Noever) - Remove pci_fixup_parent_subordinate_busnr() (Andreas Noever) - Check for child busses which use more bus numbers than allocated (Andreas Noever) - Don't scan random busses in pci_scan_bridge() (Andreas Noever) - x86: Drop pcibios_scan_root() check for bus already scanned (Bjorn Helgaas) - x86: Use pcibios_scan_root() instead of pci_scan_bus_with_sysdata() (Bjorn Helgaas) - x86: Use pcibios_scan_root() instead of pci_scan_bus_on_node() (Bjorn Helgaas) - x86: Merge pci_scan_bus_on_node() into pcibios_scan_root() (Bjorn Helgaas) - x86: Drop return value of pcibios_scan_root() (Bjorn Helgaas) NUMA - x86: Add x86_pci_root_bus_node() to look up NUMA node from PCI bus (Bjorn Helgaas) - x86: Use x86_pci_root_bus_node() instead of get_mp_bus_to_node() (Bjorn Helgaas) - x86: Remove mp_bus_to_node[], set_mp_bus_to_node(), get_mp_bus_to_node() (Bjorn Helgaas) - x86: Use NUMA_NO_NODE, not -1, for unknown node (Bjorn Helgaas) - x86: Remove acpi_get_pxm() usage (Bjorn Helgaas) - ia64: Use NUMA_NO_NODE, not MAX_NUMNODES, for unknown node (Bjorn Helgaas) - ia64: Remove acpi_get_pxm() usage (Bjorn Helgaas) - ACPI: Fix acpi_get_node() prototype (Bjorn Helgaas) Resource management - i2o: Fix and refactor PCI space allocation (Bjorn Helgaas) - Add resource_contains() (Bjorn Helgaas) - Add %pR support for IORESOURCE_UNSET (Bjorn Helgaas) - Mark resources as IORESOURCE_UNSET if we can't assign them (Bjorn Helgaas) - Don't clear IORESOURCE_UNSET when updating BAR (Bjorn Helgaas) - Check IORESOURCE_UNSET before updating BAR (Bjorn Helgaas) - Don't try to claim IORESOURCE_UNSET resources (Bjorn Helgaas) - Mark 64-bit resource as IORESOURCE_UNSET if we only support 32-bit (Bjorn Helgaas) - Don't enable decoding if BAR hasn't been assigned an address (Bjorn Helgaas) - Add "weak" generic pcibios_enable_device() implementation (Bjorn Helgaas) - alpha, microblaze, sh, sparc, tile: Use default pcibios_enable_device() (Bjorn Helgaas) - s390: Use generic pci_enable_resources() (Bjorn Helgaas) - Don't check resource_size() in pci_bus_alloc_resource() (Bjorn Helgaas) - Set type in __request_region() (Bjorn Helgaas) - Check all IORESOURCE_TYPE_BITS in pci_bus_alloc_from_region() (Bjorn Helgaas) - Change pci_bus_alloc_resource() type_mask to unsigned long (Bjorn Helgaas) - Log IDE resource quirk in dmesg (Bjorn Helgaas) - Revert "[PATCH] Insert GART region into resource map" (Bjorn Helgaas) PCI device hotplug - Make check_link_active() non-static (Rajat Jain) - Use link change notifications for hot-plug and removal (Rajat Jain) - Enable link state change notifications (Rajat Jain) - Don't disable the link permanently during removal (Rajat Jain) - Don't check adapter or latch status while disabling (Rajat Jain) - Disable link notification across slot reset (Rajat Jain) - Ensure very fast hotplug events are also processed (Rajat Jain) - Add hotplug_lock to serialize hotplug events (Rajat Jain) - Remove a non-existent card, regardless of "surprise" capability (Rajat Jain) - Don't turn slot off when hot-added device already exists (Yijing Wang) MSI - Keep pci_enable_msi() documentation (Alexander Gordeev) - ahci: Fix broken single MSI fallback (Alexander Gordeev) - ahci, vfio: Use pci_enable_msi_range() (Alexander Gordeev) - Check kmalloc() return value, fix leak of name (Greg Kroah-Hartman) - Fix leak of msi_attrs (Greg Kroah-Hartman) - Fix pci_msix_vec_count() htmldocs failure (Masanari Iida) Virtualization - Device-specific ACS support (Alex Williamson) Freescale i.MX6 - Wait for retraining (Marek Vasut) Marvell MVEBU - Use Device ID and revision from underlying endpoint (Andrew Lunn) - Fix incorrect size for PCI aperture resources (Jason Gunthorpe) - Call request_resource() on the apertures (Jason Gunthorpe) - Fix potential issue in range parsing (Jean-Jacques Hiblot) Renesas R-Car - Check platform_get_irq() return code (Ben Dooks) - Add error interrupt handling (Ben Dooks) - Fix bridge logic configuration accesses (Ben Dooks) - Register each instance independently (Magnus Damm) - Break out window size handling (Magnus Damm) - Make the Kconfig dependencies more generic (Magnus Damm) Synopsys DesignWare - Fix RC BAR to be single 64-bit non-prefetchable memory (Mohit Kumar) Miscellaneous - Remove unused SR-IOV VF Migration support (Bjorn Helgaas) - Enable INTx if BIOS left them disabled (Bjorn Helgaas) - Fix hex vs decimal typo in cpqhpc_probe() (Dan Carpenter) - Clean up par-arch object file list (Liviu Dudau) - Set IORESOURCE_ROM_SHADOW only for the default VGA device (Sander Eikelenboom) - ACPI, ARM, drm, powerpc, pcmcia, PCI: Use list_for_each_entry() for bus traversal (Yijing Wang) - Fix pci_bus_b() build failure (Paul Gortmaker)" * tag 'pci-v3.15-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (108 commits) Revert "[PATCH] Insert GART region into resource map" PCI: Log IDE resource quirk in dmesg PCI: Change pci_bus_alloc_resource() type_mask to unsigned long PCI: Check all IORESOURCE_TYPE_BITS in pci_bus_alloc_from_region() resources: Set type in __request_region() PCI: Don't check resource_size() in pci_bus_alloc_resource() s390/PCI: Use generic pci_enable_resources() tile PCI RC: Use default pcibios_enable_device() sparc/PCI: Use default pcibios_enable_device() (Leon only) sh/PCI: Use default pcibios_enable_device() microblaze/PCI: Use default pcibios_enable_device() alpha/PCI: Use default pcibios_enable_device() PCI: Add "weak" generic pcibios_enable_device() implementation PCI: Don't enable decoding if BAR hasn't been assigned an address PCI: Enable INTx in pci_reenable_device() only when MSI/MSI-X not enabled PCI: Mark 64-bit resource as IORESOURCE_UNSET if we only support 32-bit PCI: Don't try to claim IORESOURCE_UNSET resources PCI: Check IORESOURCE_UNSET before updating BAR PCI: Don't clear IORESOURCE_UNSET when updating BAR PCI: Mark resources as IORESOURCE_UNSET if we can't assign them ... Conflicts: arch/x86/include/asm/topology.h drivers/ata/ahci.c |
||
|
|
4dedde7c7a |
Merge tag 'pm+acpi-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management updates from Rafael Wysocki:
"The majority of this material spent some time in linux-next, some of
it even several weeks. There are a few relatively fresh commits in
it, but they are mostly fixes and simple cleanups.
ACPI took the lead this time, both in terms of the number of commits
and the number of modified lines of code, cpufreq follows and there
are a few changes in the PM core and in cpuidle too.
A new feature that already got some LWN.net's attention is the device
PM QoS extension allowing latency tolerance requirements to be
propagated from leaf devices to their ancestors with hardware
interfaces for specifying latency tolerance. That should help systems
with hardware-driven power management to avoid going too far with it
in cases when there are latency tolerance constraints.
There also are some significant changes in the ACPI core related to
the way in which hotplug notifications are handled. They affect PCI
hotplug (ACPIPHP) and the ACPI dock station code too. The bottom line
is that all those notification now go through the root notify handler
and are propagated to the interested subsystems by means of callbacks
instead of having to install a notify handler for each device object
that we can potentially get hotplug notifications for.
In addition to that ACPICA will now advertise "Windows 2013"
compatibility for _OSI, because some systems out there don't work
correctly if that is not done (some of them don't even boot).
On the system suspend side of things, all of the device suspend and
resume callbacks, except for ->prepare() and ->complete(), are now
going to be executed asynchronously as that turns out to speed up
system suspend and resume on some platforms quite significantly and we
have a few more optimizations in that area.
Apart from that, there are some new device IDs and fixes and cleanups
all over. In particular, the system suspend and resume handling by
cpufreq should be improved and the cpuidle menu governor should be a
bit more robust now.
Specifics:
- Device PM QoS support for latency tolerance constraints on systems
with hardware interfaces allowing such constraints to be specified.
That is necessary to prevent hardware-driven power management from
becoming overly aggressive on some systems and to prevent power
management features leading to excessive latencies from being used
in some cases.
- Consolidation of the handling of ACPI hotplug notifications for
device objects. This causes all device hotplug notifications to go
through the root notify handler (that was executed for all of them
anyway before) that propagates them to individual subsystems, if
necessary, by executing callbacks provided by those subsystems
(those callbacks are associated with struct acpi_device objects
during device enumeration). As a result, the code in question
becomes both smaller in size and more straightforward and all of
those changes should not affect users.
- ACPICA update, including fixes related to the handling of _PRT in
cases when it is broken and the addition of "Windows 2013" to the
list of supported "features" for _OSI (which is necessary to
support systems that work incorrectly or don't even boot without
it). Changes from Bob Moore and Lv Zheng.
- Consolidation of ACPI _OST handling from Jiang Liu.
- ACPI battery and AC fixes allowing unusual system configurations to
be handled by that code from Alexander Mezin.
- New device IDs for the ACPI LPSS driver from Chiau Ee Chew.
- ACPI fan and thermal optimizations related to system suspend and
resume from Aaron Lu.
- Cleanups related to ACPI video from Jean Delvare.
- Assorted ACPI fixes and cleanups from Al Stone, Hanjun Guo, Lan
Tianyu, Paul Bolle, Tomasz Nowicki.
- Intel RAPL (Running Average Power Limits) driver cleanups from
Jacob Pan.
- intel_pstate fixes and cleanups from Dirk Brandewie.
- cpufreq fixes related to system suspend/resume handling from Viresh
Kumar.
- cpufreq core fixes and cleanups from Viresh Kumar, Stratos
Karafotis, Saravana Kannan, Rashika Kheria, Joe Perches.
- cpufreq drivers updates from Viresh Kumar, Zhuoyu Zhang, Rob
Herring.
- cpuidle fixes related to the menu governor from Tuukka Tikkanen.
- cpuidle fix related to coupled CPUs handling from Paul Burton.
- Asynchronous execution of all device suspend and resume callbacks,
except for ->prepare and ->complete, during system suspend and
resume from Chuansheng Liu.
- Delayed resuming of runtime-suspended devices during system suspend
for the PCI bus type and ACPI PM domain.
- New set of PM helper routines to allow device runtime PM callbacks
to be used during system suspend and resume more easily from Ulf
Hansson.
- Assorted fixes and cleanups in the PM core from Geert Uytterhoeven,
Prabhakar Lad, Philipp Zabel, Rashika Kheria, Sebastian Capella.
- devfreq fix from Saravana Kannan"
* tag 'pm+acpi-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (162 commits)
PM / devfreq: Rewrite devfreq_update_status() to fix multiple bugs
PM / sleep: Correct whitespace errors in <linux/pm.h>
intel_pstate: Set core to min P state during core offline
cpufreq: Add stop CPU callback to cpufreq_driver interface
cpufreq: Remove unnecessary braces
cpufreq: Fix checkpatch errors and warnings
cpufreq: powerpc: add cpufreq transition latency for FSL e500mc SoCs
MAINTAINERS: Reorder maintainer addresses for PM and ACPI
PM / Runtime: Update runtime_idle() documentation for return value meaning
video / output: Drop display output class support
fujitsu-laptop: Drop unneeded include
acer-wmi: Stop selecting VIDEO_OUTPUT_CONTROL
ACPI / gpu / drm: Stop selecting VIDEO_OUTPUT_CONTROL
ACPI / video: fix ACPI_VIDEO dependencies
cpufreq: remove unused notifier: CPUFREQ_{SUSPENDCHANGE|RESUMECHANGE}
cpufreq: Do not allow ->setpolicy drivers to provide ->target
cpufreq: arm_big_little: set 'physical_cluster' for each CPU
cpufreq: arm_big_little: make vexpress driver depend on bL core driver
ACPI / button: Add ACPI Button event via netlink routine
ACPI: Remove duplicate definitions of PREFIX
...
|
||
|
|
683b6c6f82 |
Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq code updates from Thomas Gleixner:
"The irq department proudly presents:
- Another tree wide sweep of irq infrastructure abuse. Clear winner
of the trainwreck engineering contest was:
#include "../../../kernel/irq/settings.h"
- Tree wide update of irq_set_affinity() callbacks which miss a cpu
online check when picking a single cpu out of the affinity mask.
- Tree wide consolidation of interrupt statistics.
- Updates to the threaded interrupt infrastructure to allow explicit
wakeup of the interrupt thread and a variant of synchronize_irq()
which synchronizes only the hard interrupt handler. Both are
needed to replace the homebrewn thread handling in the mmc/sdhci
code.
- New irq chip callbacks to allow proper support for GPIO based irqs.
The GPIO based interrupts need to request/release GPIO resources
from request/free_irq.
- A few new ARM interrupt chips. No revolutionary new hardware, just
differently wreckaged variations of the scheme.
- Small improvments, cleanups and updates all over the place"
I was hoping that that trainwreck engineering contest was a April Fools'
joke. But no.
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (68 commits)
irqchip: sun7i/sun6i: Disable NMI before registering the handler
ARM: sun7i/sun6i: dts: Fix IRQ number for sun6i NMI controller
ARM: sun7i/sun6i: irqchip: Update the documentation
ARM: sun7i/sun6i: dts: Add NMI irqchip support
ARM: sun7i/sun6i: irqchip: Add irqchip driver for NMI controller
genirq: Export symbol no_action()
arm: omap: Fix typo in ams-delta-fiq.c
m68k: atari: Fix the last kernel_stat.h fallout
irqchip: sun4i: Simplify sun4i_irq_ack
irqchip: sun4i: Use handle_fasteoi_irq for all interrupts
genirq: procfs: Make smp_affinity values go+r
softirq: Add linux/irq.h to make it compile again
m68k: amiga: Add linux/irq.h to make it compile again
irqchip: sun4i: Don't ack IRQs > 0, fix acking of IRQ 0
irqchip: sun4i: Fix a comment about mask register initialization
irqchip: sun4i: Fix irq 0 not working
genirq: Add a new IRQCHIP_EOI_THREADED flag
genirq: Document IRQCHIP_ONESHOT_SAFE flag
ARM: sunxi: dt: Convert to the new irq controller compatibles
irqchip: sunxi: Change compatibles
...
|
||
|
|
1ead658124 |
Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer changes from Thomas Gleixner:
"This assorted collection provides:
- A new timer based timer broadcast feature for systems which do not
provide a global accessible timer device. That allows those
systems to put CPUs into deep idle states where the per cpu timer
device stops.
- A few NOHZ_FULL related improvements to the timer wheel
- The usual updates to timer devices found in ARM SoCs
- Small improvements and updates all over the place"
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
tick: Remove code duplication in tick_handle_periodic()
tick: Fix spelling mistake in tick_handle_periodic()
x86: hpet: Use proper destructor for delayed work
workqueue: Provide destroy_delayed_work_on_stack()
clocksource: CMT, MTU2, TMU and STI should depend on GENERIC_CLOCKEVENTS
timer: Remove code redundancy while calling get_nohz_timer_target()
hrtimer: Rearrange comments in the order struct members are declared
timer: Use variable head instead of &work_list in __run_timers()
clocksource: exynos_mct: silence a static checker warning
arm: zynq: Add support for cpufreq
arm: zynq: Don't use arm_global_timer with cpufreq
clocksource/cadence_ttc: Overhaul clocksource frequency adjustment
clocksource/cadence_ttc: Call clockevents_update_freq() with IRQs enabled
clocksource: Add Kconfig entries for CMT, MTU2, TMU and STI
sh: Remove Kconfig entries for TMU, CMT and MTU2
ARM: shmobile: Remove CMT, TMU and STI Kconfig entries
clocksource: armada-370-xp: Use atomic access for shared registers
clocksource: orion: Use atomic access for shared registers
clocksource: timer-keystone: Delete unnecessary variable
clocksource: timer-keystone: introduce clocksource driver for Keystone
...
|
||
|
|
a21e40877a |
Merge branch 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Ingo Molnar: "The main purpose is to fix a full dynticks bug related to virtualization, where steal time accounting appears to be zero in /proc/stat even after a few seconds of competing guests running busy loops in a same host CPU. It's not a regression though as it was there since the beginning. The other commits are preparatory work to fix the bug and various cleanups" * 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: arch: Remove stub cputime.h headers sched: Remove needless round trip nsecs <-> tick conversion of steal time cputime: Fix jiffies based cputime assumption on steal accounting cputime: Bring cputime -> nsecs conversion cputime: Default implementation of nsecs -> cputime conversion cputime: Fix nsecs_to_cputime() return type cast |
||
|
|
1ce235faa8 |
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull ARM64 updates from Catalin Marinas: - KGDB support for arm64 - PCI I/O space extended to 16M (in preparation of PCIe support patches) - Dropping ZONE_DMA32 in favour of ZONE_DMA (we only need one for the time being), together with swiotlb late initialisation to correctly setup the bounce buffer - DMA API cache maintenance support (not all ARMv8 platforms have hardware cache coherency) - Crypto extensions advertising via ELF_HWCAP2 for compat user space - Perf support for dwarf unwinding in compat mode - asm/tlb.h converted to the generic mmu_gather code - asm-generic rwsem implementation - Code clean-up * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (42 commits) arm64: Remove pgprot_dmacoherent() arm64: Support DMA_ATTR_WRITE_COMBINE arm64: Implement custom mmap functions for dma mapping arm64: Fix __range_ok macro arm64: Fix duplicated Kconfig entries arm64: mm: Route pmd thp functions through pte equivalents arm64: rwsem: use asm-generic rwsem implementation asm-generic: rwsem: de-PPCify rwsem.h arm64: enable generic CPU feature modalias matching for this architecture arm64: smp: make local symbol static arm64: debug: make local symbols static ARM64: perf: support dwarf unwinding in compat mode ARM64: perf: add support for frame pointer unwinding in compat mode ARM64: perf: add support for perf registers API arm64: Add boot time configuration of Intermediate Physical Address size arm64: Do not synchronise I and D caches for special ptes arm64: Make DMA coherent and strongly ordered mappings not executable arm64: barriers: add dmb barrier arm64: topology: Implement basic CPU topology support arm64: advertise ARMv8 extensions to 32-bit compat ELF binaries ... |