https://github.com/torvalds/linux
Revision 5018c59607a511cdee743b629c76206d9c9e6d7b authored by Wei Wang on 16 October 2019, 19:03:15 UTC, committed by David S. Miller on 17 October 2019, 23:44:03 UTC
Jesse and Ido reported the following race condition:
<CPU A, t0> - Received packet A is forwarded and cached dst entry is
taken from the nexthop ('nhc->nhc_rth_input'). Calls skb_dst_set()

<t1> - Given Jesse has busy routers ("ingesting full BGP routing tables
from multiple ISPs"), route is added / deleted and rt_cache_flush() is
called

<CPU B, t2> - Received packet B tries to use the same cached dst entry
from t0, but rt_cache_valid() is no longer true and it is replaced in
rt_cache_route() by the newer one. This calls dst_dev_put() on the
original dst entry which assigns the blackhole netdev to 'dst->dev'

<CPU A, t3> - dst_input(skb) is called on packet A and it is dropped due
to 'dst->dev' being the blackhole netdev

There are 2 issues in the v4 routing code:
1. A per-netns counter is used to do the validation of the route. That
means whenever a route is changed in the netns, users of all routes in
the netns needs to redo lookup. v6 has an implementation of only
updating fn_sernum for routes that are affected.
2. When rt_cache_valid() returns false, rt_cache_route() is called to
throw away the current cache, and create a new one. This seems
unnecessary because as long as this route does not change, the route
cache does not need to be recreated.

To fully solve the above 2 issues, it probably needs quite some code
changes and requires careful testing, and does not suite for net branch.

So this patch only tries to add the deleted cached rt into the uncached
list, so user could still be able to use it to receive packets until
it's done.

Fixes: 95c47f9cf5e0 ("ipv4: call dst_dev_put() properly")
Signed-off-by: Wei Wang <weiwan@google.com>
Reported-by: Ido Schimmel <idosch@idosch.org>
Reported-by: Jesse Hathaway <jesse@mbuki-mvuki.org>
Tested-by: Jesse Hathaway <jesse@mbuki-mvuki.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Cc: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1 parent 595e065
History
Tip revision: 5018c59607a511cdee743b629c76206d9c9e6d7b authored by Wei Wang on 16 October 2019, 19:03:15 UTC
ipv4: fix race condition between route lookup and invalidation
Tip revision: 5018c59
File Mode Size
Documentation
LICENSES
arch
block
certs
crypto
drivers
fs
include
init
ipc
kernel
lib
mm
net
samples
scripts
security
sound
tools
usr
virt
.clang-format -rw-r--r-- 15.0 KB
.cocciconfig -rw-r--r-- 59 bytes
.get_maintainer.ignore -rw-r--r-- 71 bytes
.gitattributes -rw-r--r-- 30 bytes
.gitignore -rw-r--r-- 1.7 KB
.mailmap -rw-r--r-- 13.1 KB
COPYING -rw-r--r-- 423 bytes
CREDITS -rw-r--r-- 97.1 KB
Kbuild -rw-r--r-- 1.3 KB
Kconfig -rw-r--r-- 595 bytes
MAINTAINERS -rw-r--r-- 516.8 KB
Makefile -rw-r--r-- 59.4 KB
README -rw-r--r-- 727 bytes

README

back to top