| CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
| pypdf is a free and open-source pure-python PDF library. Prior to 6.12.0, an attacker who uses this vulnerability can craft a PDF which leads to long runtimes. This requires cross-reference streams with /W [0 0 0] values and large /Size values. This vulnerability is fixed in 6.12.0. |
| Mermaid is a JavaScript tool that uses Markdown-inspired text to create and modify diagrams and charts. Prior to 10.9.6 and 11.15.0, there is a denial-of-service attack when rendering gantt charts, if they use the excludes attribute to exclude all dates. mermaid.parse is unaffected, unless you then call the ganttDb.getTasks() (which is called when rendering a diagram). This vulnerability is fixed in 10.9.6 and 11.15.0. |
| In the Linux kernel, the following vulnerability has been resolved:
can: ucan: Fix infinite loop from zero-length messages
If a broken ucan device gets a message with the message length field set
to 0, then the driver will loop for forever in
ucan_read_bulk_callback(), hanging the system. If the length is 0, just
skip the message and go on to the next one.
This has been fixed in the kvaser_usb driver in the past in commit
0c73772cd2b8 ("can: kvaser_usb: leaf: Fix potential infinite loop in
command parsers"), so there must be some broken devices out there like
this somewhere. |
| In the Linux kernel, the following vulnerability has been resolved:
cgroup: Defer css percpu_ref kill on rmdir until cgroup is depopulated
A chain of commits going back to v7.0 reworked rmdir to satisfy the
controller invariant that a subsystem's ->css_offline() must not run while
tasks are still doing kernel-side work in the cgroup.
[1] d245698d727a ("cgroup: Defer task cgroup unlink until after the task is done switching out")
[2] a72f73c4dd9b ("cgroup: Don't expose dead tasks in cgroup")
[3] 1b164b876c36 ("cgroup: Wait for dying tasks to leave on rmdir")
[4] 4c56a8ac6869 ("cgroup: Fix cgroup_drain_dying() testing the wrong condition")
[5] 13e786b64bd3 ("cgroup: Increment nr_dying_subsys_* from rmdir context")
[1] moved task cset unlink from do_exit() to finish_task_switch() so a
task's cset link drops only after the task has fully stopped scheduling.
That made tasks past exit_signals() linger on cset->tasks until their final
context switch, which led to a series of problems as what userspace expected
to see after rmdir diverged from what the kernel needs to wait for. [2]-[5]
tried to bridge that divergence: [2] filtered the exiting tasks from
cgroup.procs; [3] had rmdir(2) sleep in TASK_UNINTERRUPTIBLE for them; [4]
fixed the wait's condition; [5] made nr_dying_subsys_* visible
synchronously.
The cgroup_drain_dying() wait in [3] turned out to be a dead end. When the
rmdir caller is also the reaper of a zombie that pins a pidns teardown (e.g.
host PID 1 systemd reaping orphan pids that were re-parented to it during
the same teardown), rmdir blocks in TASK_UNINTERRUPTIBLE waiting for those
pids to free, the pids can't free because PID 1 is the reaper and it's stuck
in rmdir, and the system A-A deadlocks. No internal lock ordering breaks
this; the wait itself is the bug.
The css killing side that drove the original reorder, however, can be made
cleanly asynchronous: ->css_offline() is already async, run from
css_killed_work_fn() driven by percpu_ref_kill_and_confirm(). The fix is to
make that chain start only after all tasks have left the cgroup. rmdir's
user-visible side then returns as soon as cgroup.procs and friends are
empty, while ->css_offline() still runs only after the cgroup is fully
drained.
Verified by the original reproducer (pidns teardown + zombie reaper, runs
under vng) which hangs vanilla and succeeds here, and by per-commit
deterministic repros for [2], [3], [4], [5] with a boot parameter that
widens the post-exit_signals() window so each state is reliably reachable.
Some stress tests on top of that.
cgroup_apply_control_disable() has the same shape of pre-existing race:
when a controller is disabled via subtree_control, kill_css() ran
synchronously while tasks past exit_signals() could still be linked to
the cgroup's csets, and ->css_offline() could fire before they drained.
This patch preserves the existing synchronous behavior at that call site
(kill_css_sync() + kill_css_finish() back-to-back) and a follow-up patch
will defer kill_css_finish() there using a per-css trigger.
This seems like the right approach and I don't see problems with it. The
changes are somewhat invasive but not excessively so, so backporting to
-stable should be okay. If something does turn out to be wrong, the fallback
is to revert the entire chain ([1]-[5]) and rework in the development branch
instead.
v2: Pin cgrp across the deferred destroy work with explicit
cgroup_get()/cgroup_put() around queue_work() and the work_fn. v1
wasn't actually broken (ordered cgroup_offline_wq + queue_work order
in cgroup_task_dead() saved it) but the explicit ref removes the
dependency on those non-obvious invariants. Also note the
pre-existing cgroup_apply_control_disable() race in the description;
a follow-up will defer kill_css_finish() there. |
| In the Linux kernel, the following vulnerability has been resolved:
ipmi: Add limits to event and receive message requests
The driver would just fetch events and receive messages until the
BMC said it was done. To avoid issues with BMCs that never say they are
done, add a limit of 10 fetches at a time.
In addition, an si interface has an attn state it can return from the
hardware which is supposed to cause a flag fetch to see if the driver
needs to fetch events or message or a few other things. If the attn
bit gets stuck, it's a similar problem. So allow messages in between
flag fetches so the driver itself doesn't get stuck.
This is a more general fix than the previous fix for the specific bad
BMC, but should fix the more general issue of a BMC that won't stop
saying it has data.
This has been there from the beginning of the driver. It's not a bug
per-se, but it is accounting for bugs in BMCs. |
| A flaw was found in glib-networking. A remote attacker can exploit this vulnerability by presenting a specially crafted certificate chain to an application that uses glib-networking with the GnuTLS backend enabled and performs certificate verification. This crafted chain, which contains circular issuer relationships, can cause an infinite loop during certificate verification. The unbounded traversal consumes excessive CPU resources, leading to a denial of service for the affected process or worker. |
| Ubuntu Linux 6.8, 6.17 and 7.0 contain AppArmor SAUCE patches which incorrectly sleep while holding a spinlock in notification handling code. The bug can be triggered by an unprivileged local user and can result in kernel panic or deadlock. |
| In the Linux kernel, the following vulnerability has been resolved:
openvswitch: vport: fix self-deadlock on release of tunnel ports
vports are used concurrently and protected by RCU, so netdev_put()
must happen after the RCU grace period. So, either in an RCU call or
after the synchronize_net(). The rtnl_delete_link() must happen under
RTNL and so can't be executed in RCU context. Calling synchronize_net()
while holding RTNL is not a good idea for performance and system
stability under load in general, so calling netdev_put() in RCU call
is the right solution here.
However,
when the device is deleted, rtnl_unlock() will call netdev_run_todo()
and block until all the references are gone. In the current code this
means that we never reach the call_rcu() and the vport is never freed
and the reference is never released, causing a self-deadlock on device
removal.
Fix that by moving the rcu_call() before the rtnl_unlock(), so the
scheduled RCU callback will be executed when synchronize_net() is
called from the rtnl_unlock()->netdev_run_todo() while the RTNL itself
is already released. |
| Apache POI in versions prior to release 3.17 are vulnerable to Denial of Service Attacks: 1) Infinite Loops while parsing crafted WMF, EMF, MSG and macros (POI bugs 61338 and 61294), and 2) Out of Memory Exceptions while parsing crafted DOC, PPT and XLS (POI bugs 52372 and 61295). |
| Vulnerability in the Oracle Java SE, Oracle GraalVM Enterprise Edition, Oracle GraalVM for JDK product of Oracle Java SE (component: Utility). Supported versions that are affected are Oracle Java SE: 11.0.19, 17.0.7, 20.0.1; Oracle GraalVM Enterprise Edition: 20.3.10, 21.3.6, 22.3.2; Oracle GraalVM for JDK: 17.0.7 and 20.0.1. Difficult to exploit vulnerability allows unauthenticated attacker with network access via multiple protocols to compromise Oracle Java SE, Oracle GraalVM Enterprise Edition, Oracle GraalVM for JDK. Successful attacks of this vulnerability can result in unauthorized ability to cause a partial denial of service (partial DOS) of Oracle Java SE, Oracle GraalVM Enterprise Edition, Oracle GraalVM for JDK. Note: This vulnerability can be exploited by using APIs in the specified Component, e.g., through a web service which supplies data to the APIs. This vulnerability also applies to Java deployments, typically in clients running sandboxed Java Web Start applications or sandboxed Java applets, that load and run untrusted code (e.g., code that comes from the internet) and rely on the Java sandbox for security. CVSS 3.1 Base Score 3.7 (Availability impacts). CVSS Vector: (CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:N/I:N/A:L). |
| In the Linux kernel, the following vulnerability has been resolved:
md/raid5: fix IO hang with degraded array with llbitmap
When llbitmap bit state is still unwritten, any new write should force
rcw, as bitmap_ops->blocks_synced() is checked in handle_stripe_dirtying().
However, later the same check is missing in need_this_block(), causing
stripe to deadloop during handling because handle_stripe() will decide
to go to handle_stripe_fill(), meanwhile need_this_block() always return
0 and nothing is handled. |
| In the Linux kernel, the following vulnerability has been resolved:
crypto: inside-secure/eip93 - unregister only available algorithm
EIP93 has an options register. This register indicates which crypto
algorithms are implemented in silicon. Supported algorithms are
registered on this basis. Unregister algorithms on the same basis.
Currently, all algorithms are unregistered, even those not supported
by HW. This results in panic on platforms that don't have all options
implemented in silicon. |
| In the Linux kernel, the following vulnerability has been resolved:
powerpc/eeh: fix recursive pci_lock_rescan_remove locking in EEH event handling
The recent commit 1010b4c012b0 ("powerpc/eeh: Make EEH driver device
hotplug safe") restructured the EEH driver to improve synchronization
with the PCI hotplug layer.
However, it inadvertently moved pci_lock_rescan_remove() outside its
intended scope in eeh_handle_normal_event(), leading to broken PCI
error reporting and improper EEH event triggering. Specifically,
eeh_handle_normal_event() acquired pci_lock_rescan_remove() before
calling eeh_pe_bus_get(), but eeh_pe_bus_get() itself attempts to
acquire the same lock internally, causing nested locking and disrupting
normal EEH event handling paths.
This patch adds a boolean parameter do_lock to _eeh_pe_bus_get(),
with two public wrappers:
eeh_pe_bus_get() with locking enabled.
eeh_pe_bus_get_nolock() that skips locking.
Callers that already hold pci_lock_rescan_remove() now use
eeh_pe_bus_get_nolock() to avoid recursive lock acquisition.
Additionally, pci_lock_rescan_remove() calls are restored to the correct
position—after eeh_pe_bus_get() and immediately before iterating affected
PEs and devices. This ensures EEH-triggered PCI removes occur under proper
bus rescan locking without recursive lock contention.
The eeh_pe_loc_get() function has been split into two functions:
eeh_pe_loc_get(struct eeh_pe *pe) which retrieves the loc for given PE.
eeh_pe_loc_get_bus(struct pci_bus *bus) which retrieves the location
code for given bus.
This resolves lockdep warnings such as:
<snip>
[ 84.964298] [ T928] ============================================
[ 84.964304] [ T928] WARNING: possible recursive locking detected
[ 84.964311] [ T928] 6.18.0-rc3 #51 Not tainted
[ 84.964315] [ T928] --------------------------------------------
[ 84.964320] [ T928] eehd/928 is trying to acquire lock:
[ 84.964324] [ T928] c000000003b29d58 (pci_rescan_remove_lock){+.+.}-{3:3}, at: pci_lock_rescan_remove+0x28/0x40
[ 84.964342] [ T928]
but task is already holding lock:
[ 84.964347] [ T928] c000000003b29d58 (pci_rescan_remove_lock){+.+.}-{3:3}, at: pci_lock_rescan_remove+0x28/0x40
[ 84.964357] [ T928]
other info that might help us debug this:
[ 84.964363] [ T928] Possible unsafe locking scenario:
[ 84.964367] [ T928] CPU0
[ 84.964370] [ T928] ----
[ 84.964373] [ T928] lock(pci_rescan_remove_lock);
[ 84.964378] [ T928] lock(pci_rescan_remove_lock);
[ 84.964383] [ T928]
*** DEADLOCK ***
[ 84.964388] [ T928] May be due to missing lock nesting notation
[ 84.964393] [ T928] 1 lock held by eehd/928:
[ 84.964397] [ T928] #0: c000000003b29d58 (pci_rescan_remove_lock){+.+.}-{3:3}, at: pci_lock_rescan_remove+0x28/0x40
[ 84.964408] [ T928]
stack backtrace:
[ 84.964414] [ T928] CPU: 2 UID: 0 PID: 928 Comm: eehd Not tainted 6.18.0-rc3 #51 VOLUNTARY
[ 84.964417] [ T928] Hardware name: IBM,9080-HEX POWER10 (architected) 0x800200 0xf000006 of:IBM,FW1060.00 (NH1060_022) hv:phyp pSeries
[ 84.964419] [ T928] Call Trace:
[ 84.964420] [ T928] [c0000011a7157990] [c000000001705de4] dump_stack_lvl+0xc8/0x130 (unreliable)
[ 84.964424] [ T928] [c0000011a71579d0] [c0000000002f66e0] print_deadlock_bug+0x430/0x440
[ 84.964428] [ T928] [c0000011a7157a70] [c0000000002fd0c0] __lock_acquire+0x1530/0x2d80
[ 84.964431] [ T928] [c0000011a7157ba0] [c0000000002fea54] lock_acquire+0x144/0x410
[ 84.964433] [ T928] [c0000011a7157cb0] [c0000011a7157cb0] __mutex_lock+0xf4/0x1050
[ 84.964436] [ T928] [c0000011a7157e00] [c000000000de21d8] pci_lock_rescan_remove+0x28/0x40
[ 84.964439] [ T928] [c0000011a7157e20] [c00000000004ed98] eeh_pe_bus_get+0x48/0xc0
[ 84.964442] [ T928] [c0000011a7157e50] [c00000
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
sched/rt: Skip currently executing CPU in rto_next_cpu()
CPU0 becomes overloaded when hosting a CPU-bound RT task, a non-CPU-bound
RT task, and a CFS task stuck in kernel space. When other CPUs switch from
RT to non-RT tasks, RT load balancing (LB) is triggered; with
HAVE_RT_PUSH_IPI enabled, they send IPIs to CPU0 to drive the execution
of rto_push_irq_work_func. During push_rt_task on CPU0,
if next_task->prio < rq->donor->prio, resched_curr() sets NEED_RESCHED
and after the push operation completes, CPU0 calls rto_next_cpu().
Since only CPU0 is overloaded in this scenario, rto_next_cpu() should
ideally return -1 (no further IPI needed).
However, multiple CPUs invoking tell_cpu_to_push() during LB increments
rd->rto_loop_next. Even when rd->rto_cpu is set to -1, the mismatch between
rd->rto_loop and rd->rto_loop_next forces rto_next_cpu() to restart its
search from -1. With CPU0 remaining overloaded (satisfying rt_nr_migratory
&& rt_nr_total > 1), it gets reselected, causing CPU0 to queue irq_work to
itself and send self-IPIs repeatedly. As long as CPU0 stays overloaded and
other CPUs run pull_rt_tasks(), it falls into an infinite self-IPI loop,
which triggers a CPU hardlockup due to continuous self-interrupts.
The trigging scenario is as follows:
cpu0 cpu1 cpu2
pull_rt_task
tell_cpu_to_push
<------------irq_work_queue_on
rto_push_irq_work_func
push_rt_task
resched_curr(rq) pull_rt_task
rto_next_cpu tell_cpu_to_push
<-------------------------- atomic_inc(rto_loop_next)
rd->rto_loop != next
rto_next_cpu
irq_work_queue_on
rto_push_irq_work_func
Fix redundant self-IPI by filtering the initiating CPU in rto_next_cpu().
This solution has been verified to effectively eliminate spurious self-IPIs
and prevent CPU hardlockup scenarios. |
| In the Linux kernel, the following vulnerability has been resolved:
net/mlx5e: Fix deadlocks between devlink and netdev instance locks
In the mentioned "Fixes" commit, various work tasks triggering devlink
health reporter recovery were switched to use netdev_trylock to protect
against concurrent tear down of the channels being recovered. But this
had the side effect of introducing potential deadlocks because of
incorrect lock ordering.
The correct lock order is described by the init flow:
probe_one -> mlx5_init_one (acquires devlink lock)
-> mlx5_init_one_devl_locked -> mlx5_register_device
-> mlx5_rescan_drivers_locked -...-> mlx5e_probe -> _mlx5e_probe
-> register_netdev (acquires rtnl lock)
-> register_netdevice (acquires netdev lock)
=> devlink lock -> rtnl lock -> netdev lock.
But in the current recovery flow, the order is wrong:
mlx5e_tx_err_cqe_work (acquires netdev lock)
-> mlx5e_reporter_tx_err_cqe -> mlx5e_health_report
-> devlink_health_report (acquires devlink lock => boom!)
-> devlink_health_reporter_recover
-> mlx5e_tx_reporter_recover -> mlx5e_tx_reporter_recover_from_ctx
-> mlx5e_tx_reporter_err_cqe_recover
The same pattern exists in:
mlx5e_reporter_rx_timeout
mlx5e_reporter_tx_ptpsq_unhealthy
mlx5e_reporter_tx_timeout
Fix these by moving the netdev_trylock calls from the work handlers
lower in the call stack, in the respective recovery functions, where
they are actually necessary. |
| In the Linux kernel, the following vulnerability has been resolved:
quota: fix livelock between quotactl and freeze_super
When a filesystem is frozen, quotactl_block() enters a retry loop
waiting for the filesystem to thaw. It acquires s_umount, checks the
freeze state, drops s_umount and uses sb_start_write() - sb_end_write()
pair to wait for the unfreeze.
However, this retry loop can trigger a livelock issue, specifically on
kernels with preemption disabled.
The mechanism is as follows:
1. freeze_super() sets SB_FREEZE_WRITE and calls sb_wait_write().
2. sb_wait_write() calls percpu_down_write(), which initiates
synchronize_rcu().
3. Simultaneously, quotactl_block() spins in its retry loop, immediately
executing the sb_start_write() - sb_end_write() pair.
4. Because the kernel is non-preemptible and the loop contains no
scheduling points, quotactl_block() never yields the CPU. This
prevents that CPU from reaching an RCU quiescent state.
5. synchronize_rcu() in the freezer thread waits indefinitely for the
quotactl_block() CPU to report a quiescent state.
6. quotactl_block() spins indefinitely waiting for the freezer to
advance, which it cannot do as it is blocked on the RCU sync.
This results in a hang of the freezer process and 100% CPU usage by the
quota process.
While this can occur intermittently on multi-core systems, it is
reliably reproducing on a node with the following script, running both
the freezer and the quota toggle on the same CPU:
# mkfs.ext4 -O quota /dev/sda 2g && mkdir a_mount
# mount /dev/sda -o quota,usrquota,grpquota a_mount
# taskset -c 3 bash -c "while true; do xfs_freeze -f a_mount; \
xfs_freeze -u a_mount; done" &
# taskset -c 3 bash -c "while true; do quotaon a_mount; \
quotaoff a_mount; done" &
Adding cond_resched() to the retry loop fixes the issue. It acts as an
RCU quiescent state, allowing synchronize_rcu() in percpu_down_write()
to complete. |
| In the Linux kernel, the following vulnerability has been resolved:
netfilter: nf_tables: revert commit_mutex usage in reset path
It causes circular lock dependency between commit_mutex, nfnl_subsys_ipset
and nlk_cb_mutex when nft reset, ipset list, and iptables-nft with '-m set'
rule run at the same time.
Previous patches made it safe to run individual reset handlers concurrently
so commit_mutex is no longer required to prevent this. |
| In the Linux kernel, the following vulnerability has been resolved:
jbd2: fix deadlock in jbd2_journal_cancel_revoke()
Commit f76d4c28a46a ("fs/jbd2: use sleeping version of
__find_get_block()") changed jbd2_journal_cancel_revoke() to use
__find_get_block_nonatomic() which holds the folio lock instead of
i_private_lock. This breaks the lock ordering (folio -> buffer) and
causes an ABBA deadlock when the filesystem blocksize < pagesize:
T1 T2
ext4_mkdir()
ext4_init_new_dir()
ext4_append()
ext4_getblk()
lock_buffer() <- A
sync_blockdev()
blkdev_writepages()
writeback_iter()
writeback_get_folio()
folio_lock() <- B
ext4_journal_get_create_access()
jbd2_journal_cancel_revoke()
__find_get_block_nonatomic()
folio_lock() <- B
block_write_full_folio()
lock_buffer() <- A
This can occasionally cause generic/013 to hang.
Fix by only calling __find_get_block_nonatomic() when the passed
buffer_head doesn't belong to the bdev, which is the only case that we
need to look up its bdev alias. Otherwise, the lookup is redundant since
the found buffer_head is equal to the one we passed in. |
| In the Linux kernel, the following vulnerability has been resolved:
RDMA/mlx5: Fix UMR hang in LAG error state unload
During firmware reset in LAG mode, a race condition causes the driver
to hang indefinitely while waiting for UMR completion during device
unload. See [1].
In LAG mode the bond device is only registered on the master, so it
never sees sys_error events from the slave.
During firmware reset this causes UMR waits to hang forever on unload
as the slave is dead but the master hasn't entered error state yet, so
UMR posts succeed but completions never arrive.
Fix this by adding a sys_error notifier that gets registered before
MLX5_IB_STAGE_IB_REG and stays alive until after ib_unregister_device().
This ensures error events reach the bond device throughout teardown.
[1]
Call Trace:
__schedule+0x2bd/0x760
schedule+0x37/0xa0
schedule_preempt_disabled+0xa/0x10
__mutex_lock.isra.6+0x2b5/0x4a0
__mlx5_ib_dereg_mr+0x606/0x870 [mlx5_ib]
? __xa_erase+0x4a/0xa0
? _cond_resched+0x15/0x30
? wait_for_completion+0x31/0x100
ib_dereg_mr_user+0x48/0xc0 [ib_core]
? rdmacg_uncharge_hierarchy+0xa0/0x100
destroy_hw_idr_uobject+0x20/0x50 [ib_uverbs]
uverbs_destroy_uobject+0x37/0x150 [ib_uverbs]
__uverbs_cleanup_ufile+0xda/0x140 [ib_uverbs]
uverbs_destroy_ufile_hw+0x3a/0xf0 [ib_uverbs]
ib_uverbs_remove_one+0xc3/0x140 [ib_uverbs]
remove_client_context+0x8b/0xd0 [ib_core]
disable_device+0x8c/0x130 [ib_core]
__ib_unregister_device+0x10d/0x180 [ib_core]
ib_unregister_device+0x21/0x30 [ib_core]
__mlx5_ib_remove+0x1e4/0x1f0 [mlx5_ib]
auxiliary_bus_remove+0x1e/0x30
device_release_driver_internal+0x103/0x1f0
bus_remove_device+0xf7/0x170
device_del+0x181/0x410
mlx5_rescan_drivers_locked.part.10+0xa9/0x1d0 [mlx5_core]
mlx5_disable_lag+0x253/0x260 [mlx5_core]
mlx5_lag_disable_change+0x89/0xc0 [mlx5_core]
mlx5_eswitch_disable+0x67/0xa0 [mlx5_core]
mlx5_unload+0x15/0xd0 [mlx5_core]
mlx5_unload_one+0x71/0xc0 [mlx5_core]
mlx5_sync_reset_reload_work+0x83/0x100 [mlx5_core]
process_one_work+0x1a7/0x360
worker_thread+0x30/0x390
? create_worker+0x1a0/0x1a0
kthread+0x116/0x130
? kthread_flush_work_fn+0x10/0x10
ret_from_fork+0x22/0x40 |
| In the Linux kernel, the following vulnerability has been resolved:
rcu: Fix rcu_read_unlock() deadloop due to softirq
Commit 5f5fa7ea89dc ("rcu: Don't use negative nesting depth in
__rcu_read_unlock()") removes the recursion-protection code from
__rcu_read_unlock(). Therefore, we could invoke the deadloop in
raise_softirq_irqoff() with ftrace enabled as follows:
WARNING: CPU: 0 PID: 0 at kernel/trace/trace.c:3021 __ftrace_trace_stack.constprop.0+0x172/0x180
Modules linked in: my_irq_work(O)
CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Tainted: G O 6.18.0-rc7-dirty #23 PREEMPT(full)
Tainted: [O]=OOT_MODULE
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
RIP: 0010:__ftrace_trace_stack.constprop.0+0x172/0x180
RSP: 0018:ffffc900000034a8 EFLAGS: 00010002
RAX: 0000000000000000 RBX: 0000000000000004 RCX: 0000000000000000
RDX: 0000000000000003 RSI: ffffffff826d7b87 RDI: ffffffff826e9329
RBP: 0000000000090009 R08: 0000000000000005 R09: ffffffff82afbc4c
R10: 0000000000000008 R11: 0000000000011d7a R12: 0000000000000000
R13: ffff888003874100 R14: 0000000000000003 R15: ffff8880038c1054
FS: 0000000000000000(0000) GS:ffff8880fa8ea000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055b31fa7f540 CR3: 00000000078f4005 CR4: 0000000000770ef0
PKRU: 55555554
Call Trace:
<IRQ>
trace_buffer_unlock_commit_regs+0x6d/0x220
trace_event_buffer_commit+0x5c/0x260
trace_event_raw_event_softirq+0x47/0x80
raise_softirq_irqoff+0x6e/0xa0
rcu_read_unlock_special+0xb1/0x160
unwind_next_frame+0x203/0x9b0
__unwind_start+0x15d/0x1c0
arch_stack_walk+0x62/0xf0
stack_trace_save+0x48/0x70
__ftrace_trace_stack.constprop.0+0x144/0x180
trace_buffer_unlock_commit_regs+0x6d/0x220
trace_event_buffer_commit+0x5c/0x260
trace_event_raw_event_softirq+0x47/0x80
raise_softirq_irqoff+0x6e/0xa0
rcu_read_unlock_special+0xb1/0x160
unwind_next_frame+0x203/0x9b0
__unwind_start+0x15d/0x1c0
arch_stack_walk+0x62/0xf0
stack_trace_save+0x48/0x70
__ftrace_trace_stack.constprop.0+0x144/0x180
trace_buffer_unlock_commit_regs+0x6d/0x220
trace_event_buffer_commit+0x5c/0x260
trace_event_raw_event_softirq+0x47/0x80
raise_softirq_irqoff+0x6e/0xa0
rcu_read_unlock_special+0xb1/0x160
unwind_next_frame+0x203/0x9b0
__unwind_start+0x15d/0x1c0
arch_stack_walk+0x62/0xf0
stack_trace_save+0x48/0x70
__ftrace_trace_stack.constprop.0+0x144/0x180
trace_buffer_unlock_commit_regs+0x6d/0x220
trace_event_buffer_commit+0x5c/0x260
trace_event_raw_event_softirq+0x47/0x80
raise_softirq_irqoff+0x6e/0xa0
rcu_read_unlock_special+0xb1/0x160
__is_insn_slot_addr+0x54/0x70
kernel_text_address+0x48/0xc0
__kernel_text_address+0xd/0x40
unwind_get_return_address+0x1e/0x40
arch_stack_walk+0x9c/0xf0
stack_trace_save+0x48/0x70
__ftrace_trace_stack.constprop.0+0x144/0x180
trace_buffer_unlock_commit_regs+0x6d/0x220
trace_event_buffer_commit+0x5c/0x260
trace_event_raw_event_softirq+0x47/0x80
__raise_softirq_irqoff+0x61/0x80
__flush_smp_call_function_queue+0x115/0x420
__sysvec_call_function_single+0x17/0xb0
sysvec_call_function_single+0x8c/0xc0
</IRQ>
Commit b41642c87716 ("rcu: Fix rcu_read_unlock() deadloop due to IRQ work")
fixed the infinite loop in rcu_read_unlock_special() for IRQ work by
setting a flag before calling irq_work_queue_on(). We fix this issue by
setting the same flag before calling raise_softirq_irqoff() and rename the
flag to defer_qs_pending for more common. |