https://github.com/torvalds/linux
Revision 9282012fc0aa248b77a69f5eb802b67c5a16bb13 authored by Jaewon Kim on 25 July 2022, 09:52:12 UTC, committed by Andrew Morton on 29 July 2022, 18:33:37 UTC
There was a report that a task is waiting at the
throttle_direct_reclaim. The pgscan_direct_throttle in vmstat was
increasing.

This is a bug where zone_watermark_fast returns true even when the free
is very low. The commit f27ce0e14088 ("page_alloc: consider highatomic
reserve in watermark fast") changed the watermark fast to consider
highatomic reserve. But it did not handle a negative value case which
can be happened when reserved_highatomic pageblock is bigger than the
actual free.

If watermark is considered as ok for the negative value, allocating
contexts for order-0 will consume all free pages without direct reclaim,
and finally free page may become depleted except highatomic free.

Then allocating contexts may fall into throttle_direct_reclaim. This
symptom may easily happen in a system where wmark min is low and other
reclaimers like kswapd does not make free pages quickly.

Handle the negative case by using MIN.

Link: https://lkml.kernel.org/r/20220725095212.25388-1-jaewon31.kim@samsung.com
Fixes: f27ce0e14088 ("page_alloc: consider highatomic reserve in watermark fast")
Signed-off-by: Jaewon Kim <jaewon31.kim@samsung.com>
Reported-by: GyeongHwan Hong <gh21.hong@samsung.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Yong-Taek Lee <ytk.lee@samsung.com>
Cc: <stable@vger.kerenl.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
1 parent 1f7ea54
Raw File
Tip revision: 9282012fc0aa248b77a69f5eb802b67c5a16bb13 authored by Jaewon Kim on 25 July 2022, 09:52:12 UTC
page_alloc: fix invalid watermark check on a negative value
Tip revision: 9282012
extract-ikconfig
#!/bin/sh
# ----------------------------------------------------------------------
# extract-ikconfig - Extract the .config file from a kernel image
#
# This will only work when the kernel was compiled with CONFIG_IKCONFIG.
#
# The obscure use of the "tr" filter is to work around older versions of
# "grep" that report the byte offset of the line instead of the pattern.
#
# (c) 2009,2010 Dick Streefland <dick@streefland.net>
# Licensed under the terms of the GNU General Public License.
# ----------------------------------------------------------------------

cf1='IKCFG_ST\037\213\010'
cf2='0123456789'

dump_config()
{
	if	pos=`tr "$cf1\n$cf2" "\n$cf2=" < "$1" | grep -abo "^$cf2"`
	then
		pos=${pos%%:*}
		tail -c+$(($pos+8)) "$1" | zcat > $tmp1 2> /dev/null
		if	[ $? != 1 ]
		then	# exit status must be 0 or 2 (trailing garbage warning)
			cat $tmp1
			exit 0
		fi
	fi
}

try_decompress()
{
	for	pos in `tr "$1\n$2" "\n$2=" < "$img" | grep -abo "^$2"`
	do
		pos=${pos%%:*}
		tail -c+$pos "$img" | $3 > $tmp2 2> /dev/null
		dump_config $tmp2
	done
}

# Check invocation:
me=${0##*/}
img=$1
if	[ $# -ne 1 -o ! -s "$img" ]
then
	echo "Usage: $me <kernel-image>" >&2
	exit 2
fi

# Prepare temp files:
tmp1=/tmp/ikconfig$$.1
tmp2=/tmp/ikconfig$$.2
trap "rm -f $tmp1 $tmp2" 0

# Initial attempt for uncompressed images or objects:
dump_config "$img"

# That didn't work, so retry after decompression.
try_decompress '\037\213\010' xy    gunzip
try_decompress '\3757zXZ\000' abcde unxz
try_decompress 'BZh'          xy    bunzip2
try_decompress '\135\0\0\0'   xxx   unlzma
try_decompress '\211\114\132' xy    'lzop -d'
try_decompress '\002\041\114\030' xyy 'lz4 -d -l'

# Bail out:
echo "$me: Cannot find kernel config." >&2
exit 1
back to top