https://github.com/torvalds/linux
Revision 688f3d1ebedffa310b6591bd1b63fa0770d945fe authored by Lyude Paul on 20 June 2019, 23:21:26 UTC, committed by Alex Deucher on 01 July 2019, 14:15:00 UTC
I'm not entirely sure why this is, but for some reason:

921935dc6404 ("drm/amd/powerplay: enforce display related settings only on needed")

Breaks runtime PM resume on the Radeon PRO WX 3100 (Lexa) in one of the
pre-production laptops I have. The issue manifests as the following
messages in dmesg:

[drm] UVD and UVD ENC initialized successfully.
amdgpu 0000:3b:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring vce1 test failed (-110)
[drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <vce_v3_0> failed -110
[drm:amdgpu_device_resume [amdgpu]] *ERROR* amdgpu_device_ip_resume failed (-110).

And happens after about 6-10 runtime PM suspend/resume cycles (sometimes
sooner, if you're lucky!). Unfortunately I can't seem to pin down
precisely which part of psm_adjust_power_state_dynamic is causing
the issue, but not skipping the display setting setup seems to fix it.
Hopefully if there is a better fix for this, this patch will spark
discussion around it.

Fixes: 921935dc6404 ("drm/amd/powerplay: enforce display related settings only on needed")
Cc: Evan Quan <evan.quan@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Rex Zhu <Rex.Zhu@amd.com>
Cc: Likun Gao <Likun.Gao@amd.com>
Cc: <stable@vger.kernel.org> # v5.1+
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
1 parent f78c581
drm/amdgpu: Don't skip display settings in hwmgr_resume()
mlx-wdt.txt
		Mellanox watchdog drivers
		for x86 based system switches

This driver provides watchdog functionality for various Mellanox
Ethernet and Infiniband switch systems.

Mellanox watchdog device is implemented in a programmable logic device.

There are 2 types of HW watchdog implementations.

Type 1:
Actual HW timeout can be defined only as a power of 2 msec,
e.g. a timeout of 20 sec will be rounded up to 32768 msec.
The maximum timeout period is 32 sec (32768 msec).
Get time-left isn't supported.
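
The type 1 rounding can be illustrated with a small user space sketch.
The helper name and the cap constant below are hypothetical and are not
taken from the driver source; the sketch only assumes the behaviour
described above (round up to a power of 2 in msec, capped at 32768 msec).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed cap: the 32 sec (32768 msec) type 1 maximum described above. */
#define TYPE1_MAX_TIMEOUT_MS 32768u

/*
 * Round a requested timeout in seconds up to the next power of 2 in msec,
 * never exceeding the type 1 maximum. Hypothetical helper, not driver code.
 */
static uint32_t type1_hw_timeout_ms(uint32_t timeout_sec)
{
	uint64_t want_ms = (uint64_t)timeout_sec * 1000u;
	uint32_t hw_ms = 1;

	/* Double until the request is covered or the cap is reached. */
	while (hw_ms < want_ms && hw_ms < TYPE1_MAX_TIMEOUT_MS)
		hw_ms <<= 1;

	assert((hw_ms & (hw_ms - 1)) == 0);	/* result is a power of 2 */
	return hw_ms;
}
```

With this sketch a request of 20 sec yields 32768 msec, and a request of
1 sec yields 1024 msec.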

Type 2:
Actual HW timeout is defined in sec. and it's the same as
a user-defined timeout.
Maximum timeout is 255 sec.
Get time-left is supported.

Type 1 HW watchdog implementation exists in old systems and
all new systems have type 2 HW watchdog.
The two types of HW implementation also have different register maps.

Mellanox system can have 2 watchdogs: main and auxiliary.
Main and auxiliary watchdog devices can be enabled together
on the same system.
There are several actions that can be defined in the watchdog:
system reset, start fans at full speed and increase register counter.
The last 2 actions are performed without a system reset.
Actions without reset are provided for auxiliary watchdog device,
which is optional.
Watchdog can be started during a probe; in this case it will be
pinged by the watchdog core until the watchdog device is opened by
a user space application.
Watchdog can be initialised in nowayout mode, i.e. once started
it can't be stopped.
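
A user space application takes over pinging through the standard Linux
watchdog character device interface (linux/watchdog.h). The following is
a minimal sketch; the function names are hypothetical, and which device
node (e.g. /dev/watchdog0) belongs to the Mellanox watchdog is an
assumption that depends on the system.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/watchdog.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Open a watchdog device node, request a timeout and ping it once.
 * Returns the open descriptor, or -1 on failure. Hypothetical helper.
 */
static int wdt_open_and_ping(const char *path, int timeout_sec)
{
	int fd;

	assert(path != NULL);
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	/* The driver may adjust the value; the result is written back. */
	if (ioctl(fd, WDIOC_SETTIMEOUT, &timeout_sec) < 0 ||
	    ioctl(fd, WDIOC_KEEPALIVE, 0) < 0) {
		/* Note: a plain close leaves a started watchdog running. */
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Ask the driver to stop the watchdog on close via the magic 'V' write.
 * With nowayout set the watchdog keeps running regardless.
 */
static int wdt_stop(int fd)
{
	ssize_t n = write(fd, "V", 1);

	close(fd);
	return n == 1 ? 0 : -1;
}
```

The caller is expected to repeat the WDIOC_KEEPALIVE ioctl more often
than the configured timeout for as long as the descriptor stays open.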

This mlx-wdt driver supports both HW watchdog implementations.

Watchdog driver is probed from the common mlx_platform driver.
Mlx_platform driver provides an appropriate set of registers for
Mellanox watchdog device, identity name (mlx-wdt-main or mlx-wdt-aux),
initial timeout, the action performed on expiration and configuration flags.
Watchdog configuration flags: nowayout and start_at_boot; HW watchdog
version: type1 or type2.
The driver checks during initialization if the previous system reset
was done by the watchdog. If yes, it makes a notification about this event.
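
The text above only says that the driver makes a notification. Assuming
the event is also reported through the standard watchdog boot status
(WDIOF_CARDRESET), a user space check could look like the sketch below.
The function names are hypothetical. Note that opening the device node
starts the watchdog, so the sketch attempts a magic close afterwards.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/watchdog.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Return 1 if the boot status flags report a watchdog-triggered reset. */
static int wdt_caused_last_reset(int bootstatus)
{
	return (bootstatus & WDIOF_CARDRESET) != 0;
}

/*
 * Query the boot status of the watchdog at `path`.
 * Returns 1 or 0 for the reset cause, or -1 if the query fails.
 */
static int wdt_query_last_reset(const char *path)
{
	int flags = 0;
	int ret = -1;
	ssize_t n;
	int fd;

	assert(path != NULL);
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	if (ioctl(fd, WDIOC_GETBOOTSTATUS, &flags) == 0)
		ret = wdt_caused_last_reset(flags);

	/* Magic close; has no effect when nowayout is set. */
	n = write(fd, "V", 1);
	(void)n;
	close(fd);
	return ret;
}
```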

Access to HW registers is performed through a generic regmap interface.