https://github.com/torvalds/linux

Revision Author Date Message Commit Date
1ba3752 perf pmu-events: Hide the pmu_events Hide that the pmu_event structs are an array with a new wrapper struct. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220812230949.683239-12-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 13 August 2022, 18:02:08 UTC
660842e perf pmu-events: Don't assume pmu_event is an array The current code assumes that a struct pmu_event can be iterated over forward until a NULL pmu_event is encountered. This makes it difficult to refactor pmu_event. Add a loop function taking a callback function that's passed the struct pmu_event. This way the pmu_event is only needed for one element and not an entire array. Switch existing code iterating over the pmu_event arrays to use the new loop function pmu_events_table_for_each_event. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220812230949.683239-11-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 13 August 2022, 18:01:31 UTC
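The callback-based loop described in commit 660842e can be sketched roughly as follows. This is a Python illustration of the pattern only, since the perf code itself is C. The names `PmuEvent`, `EventsTable`, and `for_each_event`, the sample event names, and the "stop on a non-zero return" convention are assumptions made for the sketch and are not taken from the commit message.

```python
from typing import Callable, Iterable, NamedTuple


class PmuEvent(NamedTuple):
    # Hypothetical, minimal stand-in for struct pmu_event.
    name: str
    desc: str


class EventsTable:
    """Hypothetical wrapper that hides how the events are stored."""

    def __init__(self, events: Iterable[PmuEvent]) -> None:
        self._events = tuple(events)

    def for_each_event(self, fn: Callable[[PmuEvent], int]) -> int:
        # Callers see one event at a time and never index the storage.
        # Assumption: a non-zero return from the callback stops the loop.
        for event in self._events:
            ret = fn(event)
            if ret:
                return ret
        return 0


# Illustrative data, not real PMU events.
table = EventsTable([PmuEvent("event_a", "first"), PmuEvent("event_b", "second")])

seen = []


def collect(event: PmuEvent) -> int:
    seen.append(event.name)
    return 0


table.for_each_event(collect)
print(seen)
```

Because callers only ever receive a single event through the callback, the storage behind the table can change without touching them, which is the refactoring freedom the commit message is after.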
7ae5c03 perf pmu-events: Move test events/metrics to JSON Move arrays of pmu_events into the JSON code so that it may be regenerated and modified by the jevents.py script. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220812230949.683239-10-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 13 August 2022, 18:01:15 UTC
64234c1 perf test: Use full metric resolution The simple metric resolution doesn't handle recursion properly, switch to use the full resolution as with the parse-metric tests which also increases coverage. Don't set the values for the metric backward as failures to generate a result are ignored. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220812230949.683239-9-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 13 August 2022, 18:01:03 UTC
29be2fe perf pmu-events: Hide pmu_events_map Move usage of the table to pmu-events.c so it may be hidden. By abstracting the table the implementation can later be changed. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220812230949.683239-8-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 13 August 2022, 18:00:47 UTC
eeac773 perf pmu-events: Avoid passing pmu_events_map Preparation for hiding pmu_events_map as an implementation detail. While the map is passed, the table of events is all that is normally wanted. While modifying the function's types, rename pmu_events_map__find to pmu_events_table__find to match later encapsulation. Similarly rename pmu_add_cpu_aliases_map to pmu_add_cpu_aliases_table. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220812230949.683239-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 13 August 2022, 18:00:32 UTC
2519db2 perf pmu-events: Hide pmu_sys_event_tables Move usage of the table to pmu-events.c so it may be hidden. By abstracting the table the implementation can later be changed. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220812230949.683239-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 13 August 2022, 18:00:16 UTC
7b2f844 perf jevents: Sort JSON files entries Sort the JSON files entries on conversion to C. The sort order tries to replicate cmp_sevent from pmu.c so that the input there is already sorted except for sysfs events. Specifically, the sort order is given by the tuple: (not j.desc is None, fix_none(j.topic), fix_none(j.name), fix_none(j.pmu), fix_none(j.metric_name)) which puts events with descriptions and topics before those without, then sorts by name, then pmu and finally metric_name. Add the topic to JsonEvent on reading to simplify. Remove an unnecessary lambda in the JSON reading. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220812230949.683239-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 13 August 2022, 17:59:44 UTC
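The sort key quoted in commit 7b2f844 can be tried out with a small sketch like the one below. The tuple and the `fix_none` name come from the commit message. The `Event` dataclass is a hypothetical stand-in for the script's `JsonEvent`, the body of `fix_none` is an assumed implementation (mapping `None` to an empty string), and the sample entries are made up.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Event:
    # Hypothetical stand-in for JsonEvent; only the fields used by the key.
    name: Optional[str] = None
    desc: Optional[str] = None
    topic: Optional[str] = None
    pmu: Optional[str] = None
    metric_name: Optional[str] = None


def fix_none(s: Optional[str]) -> str:
    # Assumed behaviour: make None comparable by mapping it to ''.
    return "" if s is None else s


def event_sort_key(j: Event) -> Tuple[bool, str, str, str, str]:
    # The tuple as quoted in the commit message.
    return (not j.desc is None, fix_none(j.topic), fix_none(j.name),
            fix_none(j.pmu), fix_none(j.metric_name))


# Made-up entries; every one carries a description.
events = [
    Event(name="b_event", desc="B", topic="cache"),
    Event(name="a_event", desc="A", topic="cache"),
    Event(name="c_event", desc="C", topic="branch"),
]

ordered = sorted(events, key=event_sort_key)
print([e.name for e in ordered])
```

All sample entries carry a description, so the first tuple element is the same for each of them and the order is decided by topic and then by name.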
ee2ce6f perf jevents: Provide path to JSON file on error If a JSONDecoderError or similar is raised then it is useful to know the path. Print this and then raise the exception again. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220812230949.683239-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 13 August 2022, 17:59:26 UTC
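The "print the path, then re-raise" behaviour described in commit ee2ce6f might look roughly like this minimal sketch. The function name `load_json_events` and the wording of the message are assumptions; only the standard-library `json` and `sys` modules are used.

```python
import json
import sys


def load_json_events(path: str):
    """Hypothetical helper: parse one JSON file, naming it on failure."""
    try:
        with open(path) as f:
            return json.load(f)
    except Exception:
        # Report which file failed, then let the original exception
        # propagate unchanged with its traceback.
        print(f"Exception processing {path}", file=sys.stderr)
        raise
```

A bare `raise` inside the handler re-raises the exception that is currently being handled, so the caller still sees the original `json.JSONDecodeError` (or other error) rather than a wrapped one.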
f793ae1 perf jevents: Remove the type/version variables pmu_events_map has a type variable that is always initialized to "core" and a version variable that is never read. Remove these from the API as it is straightforward to add them back when necessary. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220812230949.683239-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 13 August 2022, 17:58:40 UTC
099b157 perf jevent: Add an 'all' architecture argument When 'all' is passed as the architecture generate a mapping table for all architectures. This simplifies testing. To identify the table for an architecture add an arch variable to the pmu_events_map. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220812230949.683239-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 13 August 2022, 17:58:20 UTC
8d33834 perf stat: Remove duplicated include in builtin-stat.c util/topdown.h is included twice in builtin-stat.c, remove one of them. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Tested-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=1818 Link: https://lore.kernel.org/r/20220804005213.71990-1-yang.lee@linux.alibaba.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 12 August 2022, 19:51:31 UTC
0029e8a perf scripting python: Delete repeated word in comments Delete the repeated word "into" in comments. Signed-off-by: shaomin Deng <dengshaomin@cdjrlc.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20220807160239.474-1-dengshaomin@cdjrlc.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 12 August 2022, 19:45:53 UTC
987f5cb perf tools: Fix double word in comments Delete the repeated word "to" in comments. Signed-off-by: shaomin Deng <dengshaomin@cdjrlc.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20220807155549.30953-1-dengshaomin@cdjrlc.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 12 August 2022, 19:45:24 UTC
632f5c2 perf trace: Fix double word in comments Delete repeated word "and" in comments. Signed-off-by: shaomin Deng <dengshaomin@cdjrlc.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20220807084629.23121-1-dengshaomin@cdjrlc.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 12 August 2022, 19:44:56 UTC
ae4e4a0 perf script: Delete repeated word "from" Delete the repeated word "from" in code. Signed-off-by: shaomin Deng <dengshaomin@cdjrlc.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20220807080642.13004-1-dengshaomin@cdjrlc.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 12 August 2022, 19:44:30 UTC
f3c96be perf test: Fix double word in comments Delete the redundant word "then" in comments. Signed-off-by: shaomin Deng <dengshaomin@cdjrlc.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20220807074753.7857-1-dengshaomin@cdjrlc.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 12 August 2022, 19:42:55 UTC
1bf7d83 perf record: Improve error message of -p not_existing_pid When one uses -p $not_existing_pid, the output of --help is printed: $ perf record -p 123456789 2>&1 | head -n3 Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] Let's change it to something similar to what perf top -p $not_existing_pid prints: $ ./perf top -p 123456789 --stdio Error: Couldn't create thread/CPU maps: No such process Newly suggested error message: $ ./perf record -p 123456789 Couldn't create thread/CPU maps: No such process Signed-off-by: Martin Liška <mliska@suse.cz> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lore.kernel.org/lkml/8e00eda1-4de0-2c44-ce67-d4df48ac1f7c@suse.cz Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 12 August 2022, 13:27:21 UTC
a072a7a perf build-id: Print debuginfod queries if -v option is used When ending a 'perf record' session, the querying of a debuginfod server can take quite some time. Inform the user about it when the -v option is used. Signed-off-by: Martin Liška <mliska@suse.cz> Link: http://lore.kernel.org/lkml/325871cf-b71f-6237-8793-82182272ece8@suse.cz Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 12 August 2022, 13:27:20 UTC
34575de perf build-id: Fix coding style, replace 8 spaces by tabs Use tabs instead of 8 spaces for the indentation. Signed-off-by: Martin Liška <mliska@suse.cz> Link: http://lore.kernel.org/lkml/2983e2e0-6850-ad59-79d8-efe83b22cffe@suse.cz Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 12 August 2022, 13:22:49 UTC
e754dd7 perf c2c: Update documentation for new display option 'peer' Since the new display option 'peer' is introduced, this patch is to update the documentation to reflect it. Reviewed-by: Ali Saidi <alisaidi@amazon.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Like Xu <likexu@tencent.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Timothy Hayes <timothy.hayes@arm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220811062451.435810-16-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 11 August 2022, 22:12:32 UTC
ead42a0 perf c2c: Use 'peer' as default display for Arm64 Since Arm64 arch doesn't support HITMs flags, this patch changes to use 'peer' as default display if user doesn't specify any type; for other arches, it still uses 'tot' as default display type if user doesn't specify it. This patch changes to call perf_session__new() in an earlier place, so session environment can be initialized ahead and arch info can be used for setting display type. Suggested-by: Ali Saidi <alisaidi@amazon.com> Reviewed-by: Ali Saidi <alisaidi@amazon.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Tested-by: Ali Saidi <alisaidi@amazon.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Like Xu <likexu@tencent.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Timothy Hayes <timothy.hayes@arm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220811062451.435810-15-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 11 August 2022, 22:12:31 UTC
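The default-selection rule described in commit ead42a0 amounts to a small decision, sketched below in Python for illustration (perf itself is C). The function name `pick_display` and the architecture strings `"arm64"` and `"x86"` are assumptions. The display names 'peer' and 'tot' are taken from the commit message.

```python
from typing import Optional


def pick_display(arch: str, requested: Optional[str] = None) -> str:
    """Hypothetical helper mirroring the rule in the commit message."""
    # An explicit user choice always wins.
    if requested is not None:
        return requested
    # Assumption: the session environment reports Arm64 as "arm64".
    # Arm64 lacks HITM flags, so it defaults to the 'peer' display.
    return "peer" if arch == "arm64" else "tot"


print(pick_display("arm64"))
print(pick_display("x86"))
print(pick_display("arm64", requested="tot"))
```

The architecture has to be known before the default can be chosen, which is consistent with the commit moving the perf_session__new() call earlier so the session environment is initialized first.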
f37c5d9 perf c2c: Sort on peer snooping for load operations This patch adds a new option 'peer' so can sort on the cache hit for peer snooping. For displaying with option 'peer', the "Shared Data Cache Line Table" and "Shared Cache Line Distribution Pareto" both sort with the metrics "tot_peer". As result, we can get the 'peer' display: # perf c2c report -d peer --coalesce tid,pid,iaddr,dso -N --stdio ================================================= Shared Data Cache Line Table ================================================= # # ----------- Cacheline ---------- Peer ------- Load Peer ------- Total Total Total --------- Stores -------- ----- Core Load Hit ----- - LLC Load Hit -- - RMT Load Hit -- --- Load Dram ---- # Index Address Node PA cnt Snoop Total Local Remote records Loads Stores L1Hit L1Miss N/A FB L1 L2 LclHit LclHitm RmtHit RmtHitm Lcl Rmt # ..... .................. .... ...... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ........ ....... ........ ....... ........ ........ # 0 0xaaaac17d6000 N/A 0 100.00% 99 99 0 18851 18851 0 0 0 0 0 18752 0 99 0 0 0 0 0 ================================================= Shared Cache Line Distribution Pareto ================================================= # # -- Peer Snoop -- ------- Store Refs ------ --------- Data address --------- ---------- cycles ---------- Total cpu Shared # Num Rmt Lcl L1 Hit L1 Miss N/A Offset Node PA cnt Pid Tid Code address rmt peer lcl peer load records cnt Symbol Object Source:Line Node{cpus %peers %stores} # ..... ....... ....... ....... ....... ....... .................. .... ...... ....... ................. .................. ........ ........ ........ ....... ........ ...................... ................ ............... .... 
# ---------------------------------------------------------------------- 0 0 99 0 0 0 0xaaaac17d6000 ---------------------------------------------------------------------- 0.00% 3.03% 0.00% 0.00% 0.00% 0x20 N/A 0 3603 3603:memstress 0xaaaac17c25ac 0 376 41 9314 2 [.] 0x00000000000025ac memstress memstress[25ac] 0{ 2 100.0% n/a} 0.00% 3.03% 0.00% 0.00% 0.00% 0x20 N/A 0 3603 3606:memstress 0xaaaac17c25ac 0 375 44 9155 1 [.] 0x00000000000025ac memstress memstress[25ac] 0{ 1 100.0% n/a} 0.00% 48.48% 0.00% 0.00% 0.00% 0x29 N/A 0 3603 3606:memstress 0xaaaac17c3e88 0 180 170 65 1 [.] 0x0000000000003e88 memstress memstress[3e88] 0{ 1 100.0% n/a} 0.00% 45.45% 0.00% 0.00% 0.00% 0x29 N/A 0 3603 3603:memstress 0xaaaac17c3e88 0 180 175 70 2 [.] 0x0000000000003e88 memstress memstress[3e88] 0{ 2 100.0% n/a} Reviewed-by: Ali Saidi <alisaidi@amazon.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Tested-by: Ali Saidi <alisaidi@amazon.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Like Xu <likexu@tencent.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Timothy Hayes <timothy.hayes@arm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220811062451.435810-14-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 11 August 2022, 22:12:25 UTC
faa30df perf c2c: Refactor display string The display type is shown by combination the display string array and a suffix string "HITMs", which is not friendly to extend display for other sorting type (e.g. extension for peer operations). This patch moves the suffix string "HITMs" into display string array for HITM types, so it can allow us to not necessarily to output string "HITMs" for new incoming display type. Reviewed-by: Ali Saidi <alisaidi@amazon.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Tested-by: Ali Saidi <alisaidi@amazon.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Like Xu <likexu@tencent.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Timothy Hayes <timothy.hayes@arm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220811062451.435810-13-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 11 August 2022, 22:12:25 UTC
7c10b65 perf c2c: Refactor node header The node header array contains 3 items, each item is used for one of the 3 flavors for node accessing info. To extend sorting on other snooping type and not always stick to HITMs, the second header string "Node{cpus %hitms %stores}" should be adjusted (e.g. it's changed as "Node{cpus %peer %stores}"). For this reason, this patch changes the node header array to three flat variables and uses switch-case in function setup_nodes_header(), thus it is easier for altering the header string. Reviewed-by: Ali Saidi <alisaidi@amazon.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Tested-by: Ali Saidi <alisaidi@amazon.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Like Xu <likexu@tencent.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Timothy Hayes <timothy.hayes@arm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220811062451.435810-12-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 11 August 2022, 22:12:24 UTC
2be0bc7 perf c2c: Rename dimension from 'percent_hitm' to 'percent_costly_snoop' Use more general naming for the main sort dimension, this can allow us not to sort only on HITM snoop type, so it can be extended to support other costly snooping operations. So rename the dimension to the prefix 'percent_costly_". Reviewed-by: Ali Saidi <alisaidi@amazon.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Tested-by: Ali Saidi <alisaidi@amazon.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Like Xu <likexu@tencent.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Timothy Hayes <timothy.hayes@arm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220811062451.435810-11-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 11 August 2022, 22:12:22 UTC
c82ccc3 perf c2c: Use explicit names for display macros Perf c2c tool has an assumption that it heavily depends on HITM snoop type to detect cache false sharing, unfortunately, HITM is not supported on some architectures. Essentially, perf c2c tool wants to find some very costly snooping operations for false cache sharing, this means it's not necessarily to stick using HITM tags and we can explore other snooping types (e.g. SNOOPX_PEER). For this reason, this patch renames HITM related display macros with suffix '_HITM', so it can be distinct if later add more display types for on other snooping type. Reviewed-by: Ali Saidi <alisaidi@amazon.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Tested-by: Ali Saidi <alisaidi@amazon.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Like Xu <likexu@tencent.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Timothy Hayes <timothy.hayes@arm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220811062451.435810-10-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 11 August 2022, 22:12:21 UTC
682352e perf c2c: Add mean dimensions for peer operations This patch adds two dimensions for the mean value of peer operations. Reviewed-by: Ali Saidi <alisaidi@amazon.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Tested-by: Ali Saidi <alisaidi@amazon.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Like Xu <likexu@tencent.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Timothy Hayes <timothy.hayes@arm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220811062451.435810-9-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 11 August 2022, 22:12:20 UTC
9082282 perf c2c: Add dimensions of peer metrics for cache line view This patch adds dimensions of peer ops, which will be used for Shared cache line distribution pareto. It adds the percentage dimensions for local and remote peer operations, and the dimensions for accounting operation numbers which is used for stdio mode. Reviewed-by: Ali Saidi <alisaidi@amazon.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Tested-by: Ali Saidi <alisaidi@amazon.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Like Xu <likexu@tencent.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Timothy Hayes <timothy.hayes@arm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220811062451.435810-8-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 11 August 2022, 22:12:19 UTC
63e74ab perf c2c: Add dimensions for peer load operations This patch adds three dimensions for peer load operations of 'lcl_peer', 'rmt_peer' and 'tot_peer'. These three dimensions will be used in the shared data cache line table. Reviewed-by: Ali Saidi <alisaidi@amazon.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Tested-by: Ali Saidi <alisaidi@amazon.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Like Xu <likexu@tencent.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Timothy Hayes <timothy.hayes@arm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220811062451.435810-7-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 11 August 2022, 22:12:18 UTC
3ef1fc1 perf c2c: Output statistics for peer snooping This patch outputs statistics for peer snooping for whole trace events and global shared cache line. Reviewed-by: Ali Saidi <alisaidi@amazon.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Tested-by: Ali Saidi <alisaidi@amazon.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Like Xu <likexu@tencent.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Timothy Hayes <timothy.hayes@arm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220811062451.435810-6-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 11 August 2022, 22:12:16 UTC
e843dec perf mem: Add statistics for peer snooping Since the flag PERF_MEM_SNOOPX_PEER is added to support cache snooping from peer cache line, it can come from a peer core, a peer cluster, or a remote NUMA node. This patch adds statistics for the flag PERF_MEM_SNOOPX_PEER. Note, we take PERF_MEM_SNOOPX_PEER as an affiliated info, it needs to cooperate with cache level statistics. Therefore, we account the load operations for both the cache level's metrics (e.g. ld_l2hit, ld_llchit, etc.) and peer related metrics when flag PERF_MEM_SNOOPX_PEER is set. So three new metrics are introduced: 'lcl_peer' is for local cache access, the metric 'rmt_peer' is for remote access (includes remote DRAM and any caches in remote node), and the metric 'tot_peer' is accounting the sum value of 'lcl_peer' and 'rmt_peer'. Reviewed-by: Ali Saidi <alisaidi@amazon.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Tested-by: Ali Saidi <alisaidi@amazon.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Like Xu <likexu@tencent.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Timothy Hayes <timothy.hayes@arm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220811062451.435810-5-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 11 August 2022, 22:12:12 UTC
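The accounting rule described in e843dec can be sketched as follows. This is a simplified model, not the perf implementation: the `PeerStats` class, the `account_load` function, and the string cache levels are hypothetical names introduced for illustration. Only the metric names (`ld_l2hit`, `ld_llchit`, `lcl_peer`, `rmt_peer`, `tot_peer`) come from the commit message.

```python
from dataclasses import dataclass


@dataclass
class PeerStats:
    # Cache-level metrics named in the commit message.
    ld_l2hit: int = 0
    ld_llchit: int = 0
    # Peer metrics introduced by the commit.
    lcl_peer: int = 0
    rmt_peer: int = 0
    tot_peer: int = 0


def account_load(stats: PeerStats, level: str, is_remote: bool, snoop_peer: bool) -> None:
    """Account one load sample (hypothetical, simplified model)."""
    # The cache-level metric is counted regardless of the peer flag.
    if not is_remote:
        if level == "L2":
            stats.ld_l2hit += 1
        elif level == "LLC":
            stats.ld_llchit += 1

    # The peer flag is affiliated info: it adds to the peer metrics
    # on top of the cache-level accounting above.
    if snoop_peer:
        if is_remote:
            stats.rmt_peer += 1
        else:
            stats.lcl_peer += 1
        stats.tot_peer = stats.lcl_peer + stats.rmt_peer
```

A local L2 load with the peer flag set therefore increments both `ld_l2hit` and `lcl_peer`, which matches the commit's statement that loads are accounted in both groups of metrics.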
4e6430c perf arm-spe: Use SPE data source for neoverse cores When synthesizing data from SPE, augment the type with source information for Arm Neoverse cores. The field is IMPLDEF but the Neoverse cores all use the same encoding. I can't find encoding information for any other SPE implementations to unify their choices with Arm's thus that is left for future work. This change populates the mem_lvl_num for Neoverse cores as well as the deprecated mem_lvl namespace. Reviewed-by: German Gomez <german.gomez@arm.com> Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Ali Saidi <alisaidi@amazon.com> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Like Xu <likexu@tencent.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Timothy Hayes <timothy.hayes@arm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220811062451.435810-4-leo.yan@linaro.org Signed-off-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 11 August 2022, 22:12:01 UTC
f78d625 perf mem: Print snoop peer flag Since PERF_MEM_SNOOPX_PEER flag is a new snoop type, print this flag if it is set. Before: memstress 3603 [020] 122.463754: 1 l1d-miss: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP N/A|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress) memstress 3603 [020] 122.463754: 1 l1d-access: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP N/A|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress) memstress 3603 [020] 122.463754: 1 llc-miss: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP N/A|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress) memstress 3603 [020] 122.463754: 1 llc-access: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP N/A|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress) memstress 3603 [020] 122.463754: 1 tlb-access: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP N/A|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress) memstress 3603 [020] 122.463754: 1 memory: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP N/A|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress) After: memstress 3603 [020] 122.463754: 1 l1d-miss: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP Peer|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress) memstress 3603 [020] 122.463754: 1 l1d-access: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP Peer|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress) memstress 3603 [020] 122.463754: 1 llc-miss: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP Peer|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress) memstress 3603 [020] 122.463754: 1 llc-access: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP Peer|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress) memstress 3603 [020] 122.463754: 1 tlb-access: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP Peer|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] 
(/home/ubuntu/memstress) memstress 3603 [020] 122.463754: 1 memory: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP Peer|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress) Reviewed-by: Ali Saidi <alisaidi@amazon.com> Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Tested-by: Ali Saidi <alisaidi@amazon.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Like Xu <likexu@tencent.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Timothy Hayes <timothy.hayes@arm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220811062451.435810-3-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 11 August 2022, 22:11:36 UTC
2e21bcf perf tools: Sync addition of PERF_MEM_SNOOPX_PEER Add a flag to the 'perf mem' data struct to signal that a request caused a cache-to-cache transfer of a line from a peer of the requestor and wasn't sourced from a lower cache level. The line being moved from one peer cache to another has latency and performance implications. On Arm64 Neoverse systems the data source can indicate a cache-to-cache transfer but not if the line is dirty or clean, so instead of overloading HITM define a new flag that indicates this type of transfer. Committer notes: This really is not syncing with the kernel since the patch to the kernel wasn't merged. But we're going ahead of this as it seems trivial and is just a matter of the perf kernel maintainers to give their ack or for us to find another way of expressing this in the perf records synthesized in userspace from the ARM64 hardware traces. Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Ali Saidi <alisaidi@amazon.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Like Xu <likexu@tencent.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Timothy Hayes <timothy.hayes@arm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220811062451.435810-2-leo.yan@linaro.org Signed-off-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 11 August 2022, 22:11:36 UTC
4a88c4e perf arm64: Add missing -I for tools/arch/arm64/include/ to find asm/sysreg.h when building arm_spe.h This cures a current problem where tools/perf/util/arm-spe.c isn't finding an ARM64-specific asm header, so let's add it for now to make progress. Adding a .o-specific rule seems clunky; let's try to find out whether this is really the right solution. Signed-off-by: Leo Yan <leo.yan@linaro.org> Reported-by: Suzuki K Poulose <suzuki.poulose@arm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Will Deacon <will@kernel.org> Cc: James Morse <james.morse@arm.com> Link: https://lore.kernel.org/lkml/20220811124825.GA868014@leoy-huanghe.lan Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 11 August 2022, 22:11:36 UTC
53e76d3 perf tools: Tidy guest option documentation Move common guest options into include files. Use attribute substitution to customize an example, using "[verse]" to define the block instead of a "literal" block which does not permit substitution. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20220811170411.84154-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 11 August 2022, 21:50:17 UTC
d9ca43c perf inject: Fix missing guestmount option documentation The 'perf inject' documentation is missing the guestmount option. Add it. Fixes: 97406a7e4fa6e5ca ("perf inject: Add support for injecting guest sideband events") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20220811170411.84154-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 11 August 2022, 21:49:57 UTC
696d0a4 perf script: Fix missing guest option documentation The 'perf script' documentation is missing several options relating to guests. Add them. Fixes: 15a108af1a18b597 ("perf script: Allow specifying the files to process guest samples") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20220811170411.84154-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 11 August 2022, 21:49:38 UTC
ade1d03 perf offcpu: Update offcpu test for child process Record off-cpu data with perf bench sched messaging workload and count the number of offcpu-time events. Also update the test script not to run next tests if failed already and revise the error messages. $ sudo ./perf test offcpu -v 88: perf record offcpu profiling tests : --- start --- test child forked, pid 344780 Checking off-cpu privilege Basic off-cpu test Basic off-cpu test [Success] Child task off-cpu test Child task off-cpu test [Success] test child finished with 0 ---- end ---- perf record offcpu profiling tests: Ok Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Blake Jones <blakejones@google.com> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20220811185456.194721-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 11 August 2022, 20:57:45 UTC
d234776 perf offcpu: Track child processes When -p option used or a workload is given, it needs to handle child processes. The perf_event can inherit those task events automatically. We can add a new BPF program in task_newtask tracepoint to track child processes. Before: $ sudo perf record --off-cpu -- perf bench sched messaging $ sudo perf report --stat | grep -A1 offcpu offcpu-time stats: SAMPLE events: 1 After: $ sudo perf record -a --off-cpu -- perf bench sched messaging $ sudo perf report --stat | grep -A1 offcpu offcpu-time stats: SAMPLE events: 856 Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Blake Jones <blakejones@google.com> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20220811185456.194721-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 11 August 2022, 20:57:34 UTC
d6f415c perf offcpu: Parse process id separately The current target code uses thread id for tracking tasks because perf_events need to be opened for each task. But we can use tgid in BPF maps and check it easily. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Blake Jones <blakejones@google.com> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20220811185456.194721-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 11 August 2022, 20:57:11 UTC
07fc958 perf offcpu: Check process id for the given workload Current task filter checks task->pid which is different for each thread. But we want to profile all the threads in the process. So let's compare process id (or thread-group id: tgid) instead. Before: $ sudo perf record --off-cpu -- perf bench sched messaging -t $ sudo perf report --stat | grep -A1 offcpu offcpu-time stats: SAMPLE events: 2 After: $ sudo perf record --off-cpu -- perf bench sched messaging -t $ sudo perf report --stat | grep -A1 offcpu offcpu-time stats: SAMPLE events: 850 Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Blake Jones <blakejones@google.com> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20220811185456.194721-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 11 August 2022, 20:56:47 UTC
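The difference between filtering on the per-thread id and on the thread-group id in 07fc958 can be illustrated with a small model. This is a sketch under assumptions, not the BPF filter itself: `Task`, `filter_by_pid`, and `filter_by_tgid` are hypothetical names, and the ids are made up.

```python
from typing import List, NamedTuple


class Task(NamedTuple):
    pid: int   # per-thread id (what the kernel calls task->pid)
    tgid: int  # thread-group id, shared by every thread of a process


def filter_by_pid(tasks: List[Task], target: int) -> List[Task]:
    # Old behaviour as described: only the thread whose id equals the target matches.
    return [t for t in tasks if t.pid == target]


def filter_by_tgid(tasks: List[Task], target: int) -> List[Task]:
    # New behaviour as described: every thread of the target process matches.
    return [t for t in tasks if t.tgid == target]
```

For a process with tgid 100 and threads 100, 101 and 102, the pid filter keeps one task while the tgid filter keeps all three, which is consistent with the jump in sample counts reported in the commit message.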
806731a perf tools: Do not pass NULL to parse_events() Many cases do not use the extra error information provided by parse_events and instead pass NULL as the struct parse_events_error pointer. Add a wrapper for those cases so that the pointer is never NULL. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20220809080702.6921-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 10 August 2022, 17:30:09 UTC
1da1d60 perf tests: Fix Track with sched_switch test for hybrid case If cpu_core PMU event fails to parse, try also cpu_atom PMU event when parsing cycles event. Fixes: 43eb05d066795bdf ("perf tests: Support 'Track with sched_switch' test for hybrid") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20220809080702.6921-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 10 August 2022, 17:29:46 UTC
2e82858 perf parse-events: Fix segfault when event parser gets an error parse_events() is often called with parse_events_error set to NULL. Make parse_events_error__handle() not segfault in that case. A subsequent patch changes to avoid passing NULL in the first place. Fixes: 43eb05d066795bdf ("perf tests: Support 'Track with sched_switch' test for hybrid") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20220809080702.6921-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 10 August 2022, 17:29:23 UTC
b39c9e1 perf machine: Fix missing free of machine->kallsyms_filename Add missing free of machine->kallsyms_filename to machine__exit(). Fixes: a5367ecb5353fbf2 ("perf tools: Automatically use guest kcore_dir if present") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20220809130758.12800-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 10 August 2022, 13:44:02 UTC
0c39f14 perf script: Fix reference to perf insert instead of perf inject Amend "perf insert" to "perf inject". Fixes: e28fb159f1163e76 ("perf script: Add machine_pid and vcpu") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20220809123258.9086-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 10 August 2022, 13:44:02 UTC
628881e perf sched latency: Fix subcommand matching error 'perf sched latency' uses strncmp() to match subcommands, so the matching does not meet expectations. Before: # perf sched lat1234 >/dev/null # echo $? 0 # Solution: Use strstarts() to match subcommands. After: # perf sched lat1234 Usage: perf sched [<options>] {record|latency|map|replay|script|timehist} -D, --dump-raw-trace dump raw trace in ASCII -f, --force don't complain, do it -i, --input <file> input file name -v, --verbose be more verbose (show symbol address, etc) # echo $? 129 # # perf sched lat >/dev/null # echo $? 0 # Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220808092408.107399-3-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 10 August 2022, 13:44:02 UTC
d2f30b7 perf kvm: Fix subcommand matching error Currently the 'diff', 'top', 'buildid-list' and 'stat' perf commands use strncmp() to match subcommands. As a result, matching does not meet expectation. For example: # perf kvm diff1234 # Event 'cycles' # # Baseline Delta Abs Shared Object Symbol # ........ ......... ............. ...... # # Event 'dummy:HG' # # Baseline Delta Abs Shared Object Symbol # ........ ......... ............. ...... # # echo $? 0 # Invalid information should be returned, but success is actually returned. Solution: Use strstarts() to match subcommands. After: # perf kvm diff1234 Usage: perf kvm [<options>] {top|record|report|diff|buildid-list|stat} -i, --input <file> Input file name -o, --output <file> Output file name -v, --verbose be more verbose (show counter open errors, etc) --guest Collect guest os data --guest-code Guest code can be found in hypervisor process --guestkallsyms <file> file saving guest os /proc/kallsyms --guestmodules <file> file saving guest os /proc/modules --guestmount <directory> guest mount directory under which every guest os instance has a subdir --guestvmlinux <file> file saving guest os vmlinux --host Collect host os data # echo $? 129 # Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220808092408.107399-2-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 10 August 2022, 13:44:02 UTC
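The two subcommand-matching fixes above (628881e and d2f30b7) share one idea, which can be modelled in a few lines. This is a Python sketch of the C logic, not the perf source: `match_old` and `match_new` are hypothetical names, and both the three-character comparison in the old check and the minimum abbreviation length in the new one are assumptions made for illustration.

```python
def match_old(arg: str) -> bool:
    # Models a fixed-length comparison such as strncmp(arg, "lat", 3) == 0:
    # only the first three characters are compared, so trailing junk is accepted.
    return arg[:3] == "lat"


def match_new(arg: str, full: str = "latency", min_len: int = 3) -> bool:
    # Models strstarts(full, arg): the argument must be a prefix of the full
    # subcommand name. min_len is an assumed lower bound on abbreviations.
    return len(arg) >= min_len and full.startswith(arg)
```

With this model, `lat` and `latency` are accepted by both checks, while `lat1234` passes the old check and is rejected by the new one, mirroring the before and after transcripts in the commit messages.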
4bf6dca perf probe: Fix an error handling path in 'parse_perf_probe_command()' If a memory allocation fails, we should branch to the error handling path in order to free some resources allocated a few lines above. Fixes: 15354d54698648e2 ("perf probe: Generate event name with line number") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: kernel-janitors@vger.kernel.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/b71bcb01fa0c7b9778647235c3ab490f699ba278.1659797452.git.christophe.jaillet@wanadoo.fr Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 10 August 2022, 13:44:02 UTC
46f7bd5 perf inject jit: Ignore memfd and anonymous mmap events if jitdump present Some processes store jitted code in memfd mappings to avoid having rwx mappings. These processes map the code with a writeable mapping and a read-execute mapping. They write the code using the writeable mapping and then unmap the writeable mapping. All subsequent execution is through the read-execute mapping. perf inject --jit ignores //anon* mappings for each process where a jitdump is present because it expects to inject mmap events for each jitted code range, and said jitted code ranges will overlap with the //anon* mappings. Ignore /memfd: and [anon:* mappings so that jitted code contained in /memfd: and [anon:* mappings is treated the same way as jitted code contained in //anon* mappings. Signed-off-by: Brian Robbins <brianrob@linux.microsoft.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220805220645.95855-1-brianrob@linux.microsoft.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 10 August 2022, 13:44:02 UTC
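The mapping filter described in 46f7bd5 amounts to a prefix test on the mmap filename. A minimal sketch, assuming the three prefixes quoted in the commit message are the complete set; `JIT_IGNORED_PREFIXES`, `should_ignore_mmap`, and the `has_jitdump` parameter are hypothetical names, not perf identifiers.

```python
# Prefixes taken from the commit message: //anon*, /memfd: and [anon:*
JIT_IGNORED_PREFIXES = ("//anon", "/memfd:", "[anon:")


def should_ignore_mmap(filename: str, has_jitdump: bool) -> bool:
    """Return True if an mmap event should be dropped (illustrative model)."""
    # Mappings are only ignored for processes where a jitdump is present,
    # because mmap events for the jitted code ranges are injected instead.
    if not has_jitdump:
        return False
    return filename.startswith(JIT_IGNORED_PREFIXES)
```

Passing a tuple to `str.startswith` tests all prefixes in one call, so adding another mapping type in this model would only mean extending the tuple.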
e0b23af perf list: Add PMU pai_crypto event description for IBM z16 Add the event description for the IBM z16 pai_crypto PMU released with commit 1bf54f32f525 ("s390/pai: Add support for cryptography counters"). The document SA22-7832-13 "z/Architecture Principles of Operation", published May 2022, contains the description of the Processor Activity Instrumentation Facility and the cryptography counter set. See pages 5-110 to 5-113. Patch reworked to fit the converted jevents processing. Committer notes: Couldn't find 1bf54f32f525 ("s390/pai: Add support for cryptography counters") in torvalds/master, in what tree is that cset? Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/20220804075221.1132849-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 10 August 2022, 13:44:02 UTC
b48ddbb perf vendor events: Remove bad jaketown uncore events The event converter scripts at: https://github.com/intel/event-converter-for-linux-perf pass Filter values from data on 01.org that are bogus in a perf command line and can cause perf to recurse infinitely in parse events. Remove such events or filters using the updated patch: https://github.com/intel/event-converter-for-linux-perf/pull/15/commits/afd779df99ee41aac646eae1ae5ae651cda3394d Fixes: 376d8b581b7639c9 ("perf vendor events: Update Intel jaketown") Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220805013856.1842878-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 10 August 2022, 13:44:02 UTC
22de36f perf vendor events: Remove bad ivytown uncore events The event converter scripts at: https://github.com/intel/event-converter-for-linux-perf pass Filter values from data on 01.org that are bogus in a perf command line and can cause perf to recurse infinitely in parse events. Remove such events or filters using the updated patch: https://github.com/intel/event-converter-for-linux-perf/pull/15/commits/afd779df99ee41aac646eae1ae5ae651cda3394d Fixes: 6220136831e34615 ("perf vendor events: Update Intel ivytown") Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220805013856.1842878-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 10 August 2022, 13:44:02 UTC
2c98bac perf vendor events: Remove bad broadwellde uncore events The event converter scripts at: https://github.com/intel/event-converter-for-linux-perf pass Filter values from data on 01.org that are bogus in a perf command line and can cause perf to recurse infinitely in parse events. Remove such events or filters using the updated patch: https://github.com/intel/event-converter-for-linux-perf/pull/15/commits/afd779df99ee41aac646eae1ae5ae651cda3394d Fixes: ef908a192512bf45 ("perf vendor events: Update Intel broadwellde") Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220805013856.1842878-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 10 August 2022, 13:44:02 UTC
b4f0466 perf jevents: Add JEVENTS_ARCH make option Allow the architecture built into pmu-events.c to be set on the make command line with JEVENTS_ARCH. Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lore.kernel.org/lkml/20220804221816.1802790-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 10 August 2022, 13:44:02 UTC
46acb31 perf jevents: Simplify generation of C-string Previous implementation wanted variable order and '(null)' string output to match the C implementation. The '(null)' string output was a quirk/bug and so there is no need to carry it forward. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lore.kernel.org/lkml/20220804221816.1802790-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 10 August 2022, 13:44:02 UTC
e1e19d0 perf jevents: Clean up pytype warnings Improve type hints to clean up pytype warnings. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lore.kernel.org/lkml/20220804221816.1802790-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 10 August 2022, 13:44:02 UTC
5b24598 tools build: Switch to new openssl API for test-libcrypto Switch to new EVP API for detecting libcrypto, as Fedora 36 returns an error when it encounters the deprecated function MD5_Init() and the others. The error would be interpreted as missing libcrypto, while in reality it is not. Fixes: 6e8ccb4f624a73c5 ("tools/bpf: properly account for libbfd variations") Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: bpf@vger.kernel.org Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@kernel.org> Cc: llvm@lists.linux.dev Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Monnet <quentin@isovalent.com> Cc: Song Liu <song@kernel.org> Cc: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20220719170555.2576993-4-roberto.sassu@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 10 August 2022, 13:44:02 UTC
73f8ec5 Revert "perf build: Suppress openssl v3 deprecation warnings in libcrypto feature test" This reverts commit 10fef869a58e37ec649b61eddab545f2da57a79b. Because a proper fix was submitted. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 10 August 2022, 13:44:02 UTC
dd6775f perf build: Remove FEATURE_CHECK_LDFLAGS-disassembler-{four-args,init-styled} setting As the building mechanism is now able to retry detection with different combinations of linking flags, setting FEATURE_CHECK_LDFLAGS-disassembler-four-args and FEATURE_CHECK_LDFLAGS-disassembler-init-styled is not necessary anymore, so remove it. Committer notes: Use the same technique to find the set of bfd-related libraries to link as in: 3308ffc5016e6136 ("tools, build: Retry detection of bfd-related features") Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andres Freund <andres@anarazel.de> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Monnet <quentin@isovalent.com> Cc: Song Liu <song@kernel.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: bpf@vger.kernel.org Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20220719170555.2576993-3-roberto.sassu@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 10 August 2022, 13:44:01 UTC
13e6f53 bpftool: Complete libbfd feature detection Commit 6e8ccb4f624a7 ("tools/bpf: properly account for libbfd variations") sets the linking flags depending on which flavor of the libbfd feature was detected. However, the flavors except libbfd cannot be detected, as they are not in the feature list. Complete the list of features to detect by adding libbfd-liberty and libbfd-liberty-z. Committer notes: Adjust conflict with: 1e1613f64cc8a09d ("tools bpftool: Don't display disassembler-four-args feature test") 600b7b26c07a070d ("tools bpftool: Fix compilation error with new binutils") Fixes: 6e8ccb4f624a73c5 ("tools/bpf: properly account for libbfd variations") Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andres Freund <andres@anarazel.de> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: bpf@vger.kernel.org Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@kernel.org> Cc: llvm@lists.linux.dev Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Monnet <quentin@isovalent.com> Cc: Song Liu <song@kernel.org> Cc: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20220719170555.2576993-2-roberto.sassu@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 10 August 2022, 13:44:01 UTC
629b98e tools, build: Retry detection of bfd-related features While separate features have been defined to determine which linking flags are required to use libbfd depending on the distribution (libbfd, libbfd-liberty and libbfd-liberty-z), the same has not been done for other features requiring linking to libbfd. For example, disassembler-four-args requires linking to libbfd too, but it should use the right linking flags. If not all the required ones are specified, e.g. -liberty, detection will always fail even if the feature is available. Instead of creating new features, similarly to libbfd, simply retry detection with the different set of flags until detection succeeds (or fails, if the libraries are missing). In this way, feature detection is transparent for the users of this building mechanism (e.g. perf), and those users don't have for example to set an appropriate value for the FEATURE_CHECK_LDFLAGS-disassembler-four-args variable. The number of retries and features for which the retry mechanism is implemented is low enough to make the increase in the complexity of Makefile negligible. Tested with perf and bpftool on Ubuntu 20.04.4 LTS, Fedora 36 and openSUSE Tumbleweed. Committer notes: Do the retry for disassembler-init-styled as well. 
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andres Freund <andres@anarazel.de> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Monnet <quentin@isovalent.com> Cc: Song Liu <song@kernel.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: bpf@vger.kernel.org Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20220719170555.2576993-1-roberto.sassu@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 10 August 2022, 13:44:01 UTC
0c343af perf test: JSON format checking Add field checking tests for perf stat JSON output. Sanity checks the expected number of fields are present, that the expected keys are present and they have the correct values. Committer notes: Had to fix this: - $(INSTALL) tests/shell/lib/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/lib' \ + $(INSTALL) tests/shell/lib/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/lib'; \ Committer testing: [root@quaco ~]# perf test json 90: perf stat JSON output linter : Ok [root@quaco ~]# set -o vi [root@quaco ~]# perf test -v json 90: perf stat JSON output linter : --- start --- test child forked, pid 560794 Checking json output: no args [Success] Checking json output: system wide [Success] Checking json output: system wide Checking json output: system wide no aggregation [Success] Checking json output: interval [Success] Checking json output: event [Success] Checking json output: per core [Success] Checking json output: per thread [Success] Checking json output: per die [Success] Checking json output: per node [Success] Checking json output: per socket [Success] test child finished with 0 ---- end ---- perf stat JSON output linter: Ok [root@quaco ~]# Signed-off-by: Claire Jensen <cjense@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alyssa Ross <hi@alyssa.is> Cc: Claire Jensen <clairej735@gmail.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Like Xu <likexu@tencent.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: 
https://lore.kernel.org/r/20220805200105.2020995-3-irogers@google.com Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 10 August 2022, 13:44:01 UTC
df936ca perf stat: Add JSON output option CSV output is tricky to format and column layout changes are susceptible to breaking parsers. New JSON-formatted output has variable names to identify fields that are consistent and informative, making the output parseable. CSV output example: 1.20,msec,task-clock:u,1204272,100.00,0.697,CPUs utilized 0,,context-switches:u,1204272,100.00,0.000,/sec 0,,cpu-migrations:u,1204272,100.00,0.000,/sec 70,,page-faults:u,1204272,100.00,58.126,K/sec JSON output example: {"counter-value" : "3805.723968", "unit" : "msec", "event" : "cpu-clock", "event-runtime" : 3805731510100.00, "pcnt-running" : 100.00, "metric-value" : 4.007571, "metric-unit" : "CPUs utilized"} {"counter-value" : "6166.000000", "unit" : "", "event" : "context-switches", "event-runtime" : 3805723045100.00, "pcnt-running" : 100.00, "metric-value" : 1.620191, "metric-unit" : "K/sec"} {"counter-value" : "466.000000", "unit" : "", "event" : "cpu-migrations", "event-runtime" : 3805727613100.00, "pcnt-running" : 100.00, "metric-value" : 122.447136, "metric-unit" : "/sec"} {"counter-value" : "208.000000", "unit" : "", "event" : "page-faults", "event-runtime" : 3805726799100.00, "pcnt-running" : 100.00, "metric-value" : 54.654516, "metric-unit" : "/sec"} Also added documentation for JSON option. There is some tidy up of CSV code including a potential memory over run in the os.nfields set up. To facilitate this an AGGR_MAX value is added. Committer notes: Fixed up using PRIu64 to format u64 values, not %lu. 
Committer testing: ⬢[acme@toolbox perf]$ perf stat -j sleep 1 {"counter-value" : "0.731750", "unit" : "msec", "event" : "task-clock:u", "event-runtime" : 731750, "pcnt-running" : 100.00, "metric-value" : 0.000731, "metric-unit" : "CPUs utilized"} {"counter-value" : "0.000000", "unit" : "", "event" : "context-switches:u", "event-runtime" : 731750, "pcnt-running" : 100.00, "metric-value" : 0.000000, "metric-unit" : "/sec"} {"counter-value" : "0.000000", "unit" : "", "event" : "cpu-migrations:u", "event-runtime" : 731750, "pcnt-running" : 100.00, "metric-value" : 0.000000, "metric-unit" : "/sec"} {"counter-value" : "75.000000", "unit" : "", "event" : "page-faults:u", "event-runtime" : 731750, "pcnt-running" : 100.00, "metric-value" : 102.494021, "metric-unit" : "K/sec"} {"counter-value" : "578765.000000", "unit" : "", "event" : "cycles:u", "event-runtime" : 379366, "pcnt-running" : 49.00, "metric-value" : 0.790933, "metric-unit" : "GHz"} {"counter-value" : "1298.000000", "unit" : "", "event" : "stalled-cycles-frontend:u", "event-runtime" : 768020, "pcnt-running" : 100.00, "metric-value" : 0.224271, "metric-unit" : "frontend cycles idle"} {"counter-value" : "21984.000000", "unit" : "", "event" : "stalled-cycles-backend:u", "event-runtime" : 768020, "pcnt-running" : 100.00, "metric-value" : 3.798433, "metric-unit" : "backend cycles idle"} {"counter-value" : "468197.000000", "unit" : "", "event" : "instructions:u", "event-runtime" : 768020, "pcnt-running" : 100.00, "metric-value" : 0.808959, "metric-unit" : "insn per cycle"} {"metric-value" : 0.046955, "metric-unit" : "stalled cycles per insn"} {"counter-value" : "103335.000000", "unit" : "", "event" : "branches:u", "event-runtime" : 768020, "pcnt-running" : 100.00, "metric-value" : 141.216262, "metric-unit" : "M/sec"} {"counter-value" : "2381.000000", "unit" : "", "event" : "branch-misses:u", "event-runtime" : 388654, "pcnt-running" : 50.00, "metric-value" : 2.304156, "metric-unit" : "of all branches"} ⬢[acme@toolbox 
perf]$ Signed-off-by: Claire Jensen <cjense@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alyssa Ross <hi@alyssa.is> Cc: Claire Jensen <clairej735@gmail.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Like Xu <likexu@tencent.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220805200105.2020995-2-irogers@google.com Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 10 August 2022, 13:43:29 UTC
eb555cb Merge tag '5.20-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd Pull ksmbd updates from Steve French: - fixes for memory access bugs (out of bounds access, oops, leak) - multichannel fixes - session disconnect performance improvement, and session register improvement - cleanup * tag '5.20-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd: ksmbd: fix heap-based overflow in set_ntacl_dacl() ksmbd: prevent out of bound read for SMB2_TREE_CONNNECT ksmbd: prevent out of bound read for SMB2_WRITE ksmbd: fix use-after-free bug in smb2_tree_disconect ksmbd: fix memory leak in smb2_handle_negotiate ksmbd: fix racy issue while destroying session on multichannel ksmbd: use wait_event instead of schedule_timeout() ksmbd: fix kernel oops from idr_remove() ksmbd: add channel rwlock ksmbd: replace sessions list in connection with xarray MAINTAINERS: ksmbd: add entry for documentation ksmbd: remove unused ksmbd_share_configs_cleanup function 09 August 2022, 03:15:13 UTC
f30adc0 Merge tag 'pull-work.iov_iter-rebased' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull more iov_iter updates from Al Viro: - more new_sync_{read,write}() speedups - ITER_UBUF introduction - ITER_PIPE cleanups - unification of iov_iter_get_pages/iov_iter_get_pages_alloc and switching them to advancing semantics - making ITER_PIPE take high-order pages without splitting them - handling copy_page_from_iter() for high-order pages properly * tag 'pull-work.iov_iter-rebased' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (32 commits) fix copy_page_from_iter() for compound destinations hugetlbfs: copy_page_to_iter() can deal with compound pages copy_page_to_iter(): don't split high-order page in case of ITER_PIPE expand those iov_iter_advance()... pipe_get_pages(): switch to append_pipe() get rid of non-advancing variants ceph: switch the last caller of iov_iter_get_pages_alloc() 9p: convert to advancing variant of iov_iter_get_pages_alloc() af_alg_make_sg(): switch to advancing variant of iov_iter_get_pages() iter_to_pipe(): switch to advancing variant of iov_iter_get_pages() block: convert to advancing variants of iov_iter_get_pages{,_alloc}() iov_iter: advancing variants of iov_iter_get_pages{,_alloc}() iov_iter: saner helper for page array allocation fold __pipe_get_pages() into pipe_get_pages() ITER_XARRAY: don't open-code DIV_ROUND_UP() unify the rest of iov_iter_get_pages()/iov_iter_get_pages_alloc() guts unify xarray_get_pages() and xarray_get_pages_alloc() unify pipe_get_pages() and pipe_get_pages_alloc() iov_iter_get_pages(): sanity-check arguments iov_iter_get_pages_alloc(): lift freeing pages array on failure exits into wrapper ... 09 August 2022, 03:04:35 UTC
c03f05f fix copy_page_from_iter() for compound destinations had been broken for ITER_BVEC et.al. since ever (OK, v3.17 when ITER_BVEC had first appeared)... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:26 UTC
c7d57ab hugetlbfs: copy_page_to_iter() can deal with compound pages ... since April 2021 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:26 UTC
f0f6b61 copy_page_to_iter(): don't split high-order page in case of ITER_PIPE ... just shove it into one pipe_buffer. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:25 UTC
310d9d5 expand those iov_iter_advance()... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:25 UTC
746de1f pipe_get_pages(): switch to append_pipe() now that we are advancing the iterator, there's no need to treat the first page separately - just call append_pipe() in a loop. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:25 UTC
eba2d3d get rid of non-advancing variants mechanical change; will be further massaged in subsequent commits Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:24 UTC
b535899 ceph: switch the last caller of iov_iter_get_pages_alloc() here nothing even looks at the iov_iter after the call, so we couldn't care less whether it advances or not. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:24 UTC
7f02464 9p: convert to advancing variant of iov_iter_get_pages_alloc() that one is somewhat clumsier than usual and needs serious testing. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:23 UTC
dc5801f af_alg_make_sg(): switch to advancing variant of iov_iter_get_pages() ... and adjust the callers Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:23 UTC
7d690c1 iter_to_pipe(): switch to advancing variant of iov_iter_get_pages() ... and untangle the cleanup on failure to add into pipe. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:23 UTC
480cb84 block: convert to advancing variants of iov_iter_get_pages{,_alloc}() ... doing revert if we end up not using some pages Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:22 UTC
1ef255e iov_iter: advancing variants of iov_iter_get_pages{,_alloc}() Most of the users immediately follow successful iov_iter_get_pages() with advancing by the amount it had returned. Provide inline wrappers doing that, convert trivial open-coded uses of those. BTW, iov_iter_get_pages() never returns more than it had been asked to; such checks in cifs ought to be removed someday... Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:22 UTC
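The "get, then advance by the amount returned" pattern described in the entry above can be sketched in plain C. This is only an illustration over a byte buffer, not the kernel's iov_iter code; `demo_iter`, `demo_get()`, `demo_advance()` and `demo_get_advancing()` are hypothetical names made up for the sketch.

```c
#include <stddef.h>

/* Hypothetical stand-in for an iterator: a cursor plus bytes remaining. */
struct demo_iter {
	const char *data;
	size_t count;
};

/* Non-advancing variant: reports where the data starts and how much is
 * available (never more than maxsize), but leaves the iterator alone. */
static size_t demo_get(const struct demo_iter *it, const char **start,
		       size_t maxsize)
{
	size_t n = it->count < maxsize ? it->count : maxsize;

	*start = it->data;
	return n;
}

/* Move the iterator forward by n bytes; caller guarantees n <= count. */
static void demo_advance(struct demo_iter *it, size_t n)
{
	it->data += n;
	it->count -= n;
}

/* Advancing wrapper: does what most callers open-coded, i.e. a successful
 * get immediately followed by an advance by the returned amount. */
static size_t demo_get_advancing(struct demo_iter *it, const char **start,
				 size_t maxsize)
{
	size_t n = demo_get(it, start, maxsize);

	if (n > 0)
		demo_advance(it, n);
	return n;
}
```

A caller that ends up not using everything it was handed would have to move the iterator back explicitly, which is the "doing revert" mentioned for the block layer conversion.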
3cf42da iov_iter: saner helper for page array allocation All call sites of get_pages_array() are essentially identical now. Replace with common helper... Returns number of slots available in resulting array or 0 on OOM; it's up to the caller to make sure it doesn't ask for a zero-entry array (i.e. neither maxpages nor size are allowed to be zero). Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:22 UTC
8520008 fold __pipe_get_pages() into pipe_get_pages() ... and don't mangle maxsize there - turn the loop into counting one instead. Easier to see that we won't run out of array that way. Note that special treatment of the partial buffer in that thing is an artifact of the non-advancing semantics of iov_iter_get_pages() - if not for that, it would be append_pipe(), same as the body of the loop that follows it. IOW, once we make iov_iter_get_pages() advancing, the whole thing will turn into calculate how many pages do we want allocate an array (if needed) call append_pipe() that many times. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:21 UTC
0aa4fc3 ITER_XARRAY: don't open-code DIV_ROUND_UP() Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:21 UTC
451c0ba unify the rest of iov_iter_get_pages()/iov_iter_get_pages_alloc() guts same as for pipes and xarrays; after that iov_iter_get_pages() becomes a wrapper for __iov_iter_get_pages_alloc(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:21 UTC
68fe506 unify xarray_get_pages() and xarray_get_pages_alloc() same as for pipes Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:20 UTC
acbdeb8 unify pipe_get_pages() and pipe_get_pages_alloc() The differences between those two are * pipe_get_pages() gets a non-NULL struct page ** value pointing to preallocated array + array size. * pipe_get_pages_alloc() gets an address of struct page ** variable that contains NULL, allocates the array and (on success) stores its address in that variable. Not hard to combine - always pass struct page ***, have the previous pipe_get_pages_alloc() caller pass ~0U as cap for array size. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:20 UTC
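The combined calling convention from the entry above (always pass a triple pointer; a preallocated array comes with a real size cap, an allocating caller passes NULL and ~0U) might look roughly like the following. This is a userspace sketch under stated assumptions: `demo_want_array()` is a hypothetical name, `void *` stands in for `struct page *`, and `calloc()` stands in for the kernel allocator.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper: make sure *res points to an array able to hold
 * min(needed, maxslots) entries.
 *
 *  - *res non-NULL: the caller preallocated the array; maxslots is its size.
 *  - *res NULL:     allocate here; such callers pass ~0U as maxslots.
 *
 * Returns the number of usable slots, or 0 if allocation failed.
 * Neither needed nor maxslots may be zero.
 */
static unsigned int demo_want_array(void ***res, unsigned int needed,
				    unsigned int maxslots)
{
	unsigned int count = needed < maxslots ? needed : maxslots;

	assert(needed != 0 && maxslots != 0);

	if (!*res) {
		*res = calloc(count, sizeof(void *));
		if (!*res)
			return 0;
	}
	return count;
}
```

With this shape the two former entry points differ only in what they pass in, so a single body can serve both.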
c81ce28 iov_iter_get_pages(): sanity-check arguments zero maxpages is bogus, but best treated as "just return 0"; NULL pages, OTOH, should be treated as a hard bug. get rid of now completely useless checks in xarray_get_pages{,_alloc}(). Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:20 UTC
9132955 iov_iter_get_pages_alloc(): lift freeing pages array on failure exits into wrapper Incidentally, ITER_XARRAY did *not* free the sucker in case when iter_xarray_populate_pages() returned 0... Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:19 UTC
12d426a ITER_PIPE: fold data_start() and pipe_space_for_user() together All their callers are next to each other; all of them want the total amount of pages and, possibly, the offset in the partial final buffer. Combine into a new helper (pipe_npages()), fix the bogosity in pipe_space_for_user(), while we are at it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:19 UTC
10f525a ITER_PIPE: cache the type of last buffer We often need to find whether the last buffer is anon or not, and currently it's rather clumsy: check if ->iov_offset is non-zero (i.e. that pipe is not empty) if so, get the corresponding pipe_buffer and check its ->ops if it's &default_pipe_buf_ops, we have an anon buffer. Let's replace the use of ->iov_offset (which is nowhere near similar to its role for other flavours) with signed field (->last_offset), with the following rules: empty, no buffers occupied: 0 anon, with bytes up to N-1 filled: N zero-copy, with bytes up to N-1 filled: -N That way abs(i->last_offset) is equal to what used to be in i->iov_offset and empty vs. anon vs. zero-copy can be distinguished by the sign of i->last_offset. Checks for "should we extend the last buffer or should we start a new one?" become easier to follow that way. Note that most of the operations can only be done in a sane state - i.e. when the pipe has nothing past the current position of iterator. About the only thing that could be done outside of that state is iov_iter_advance(), which transitions to the sane state by truncating the pipe. There are only two cases where we leave the sane state: 1) iov_iter_get_pages()/iov_iter_get_pages_alloc(). Will be dealt with later, when we make get_pages advancing - the callers are actually happier that way. 2) iov_iter copied, then something is put into the copy. Since they share the underlying pipe, the original gets behind. When we decide that we are done with the copy (original is not usable until then) we advance the original. direct_io used to be done that way; nowadays it operates on the original and we do iov_iter_revert() to discard the excessive data. At the moment there's nothing in the kernel that could do that to ITER_PIPE iterators, so this reason for insane state is theoretical right now. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:18 UTC
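The sign convention for ->last_offset described in the entry above can be illustrated with a small decoder. The enum and function names here are hypothetical and exist only for the sketch; only the three rules (0, N, -N) come from the commit message.

```c
#include <stdlib.h>

/* Hypothetical labels for the three states of the last pipe buffer. */
enum demo_last_state {
	DEMO_EMPTY,	/* no buffers occupied:             last_offset == 0  */
	DEMO_ANON,	/* anon, bytes up to N-1 filled:     last_offset == N  */
	DEMO_ZERO_COPY	/* zero-copy, bytes up to N-1 filled: last_offset == -N */
};

/* The sign alone distinguishes empty vs. anon vs. zero-copy. */
static enum demo_last_state demo_classify(int last_offset)
{
	if (last_offset == 0)
		return DEMO_EMPTY;
	return last_offset > 0 ? DEMO_ANON : DEMO_ZERO_COPY;
}

/* The magnitude is what the old ->iov_offset used to hold. */
static unsigned int demo_filled_bytes(int last_offset)
{
	return (unsigned int)abs(last_offset);
}
```

A check such as "can the last buffer be extended?" then reduces to testing for a positive value, with no need to look up the buffer and compare its ops pointer.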
92acdc4 ITER_PIPE: clean iov_iter_revert() Fold pipe_truncate() into it, clean up. We can release buffers in the same loop where we walk backwards to the iterator beginning looking for the place where the new position will be. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:18 UTC
2c855de ITER_PIPE: clean pipe_advance() up instead of setting ->iov_offset for new position and calling pipe_truncate() to adjust ->len of the last buffer and discard everything after it, adjust ->len at the same time we set ->iov_offset and use pipe_discard_from() to deal with buffers past that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:18 UTC
ca59196 ITER_PIPE: lose iter_head argument of __pipe_get_pages() it's only used to get to the partial buffer we can add to, and that's always the last one, i.e. pipe->head - 1. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:17 UTC
e3b4296 ITER_PIPE: fold push_pipe() into __pipe_get_pages() Expand the only remaining call of push_pipe() (in __pipe_get_pages()), combine it with the page-collecting loop there. Note that the only reason it's not a loop doing append_pipe() is that append_pipe() is advancing, while iov_iter_get_pages() is not. As soon as it switches to saner semantics, this thing will switch to using append_pipe(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:17 UTC
8fad776 ITER_PIPE: allocate buffers as we go in copy-to-pipe primitives New helper: append_pipe(). Extends the last buffer if possible, allocates a new one otherwise. Returns page and offset in it on success, NULL on failure. iov_iter is advanced past the data we've got. Use that instead of push_pipe() in copy-to-pipe primitives; they get simpler that way. Handling of short copy (in "mc" one) is done simply by iov_iter_revert() - iov_iter is in consistent state after that one, so we can use that. [Fix for braino caught by Liu Xinpeng <liuxp11@chinatelecom.cn> folded in] [another braino fix, this time in copy_pipe_to_iter() and pipe_zero(); caught by testcase from Hugh Dickins] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:17 UTC
47b7fca ITER_PIPE: helpers for adding pipe buffers There are only two kinds of pipe_buffer in the area used by ITER_PIPE. 1) anonymous - copy_to_iter() et.al. end up creating those and copying data there. They have zero ->offset, and their ->ops points to default_pipe_page_ops. 2) zero-copy ones - those come from copy_page_to_iter(), and page comes from caller. ->offset is also caller-supplied - it might be non-zero. ->ops points to page_cache_pipe_buf_ops. Move creation and insertion of those into helpers - push_anon(pipe, size) and push_page(pipe, page, offset, size) resp., separating them from the "could we avoid creating a new buffer by merging with the current head?" logics. Acked-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:16 UTC
2dcedb2 ITER_PIPE: helper for getting pipe buffer by index pipe_buffer instances of a pipe are organized as a ring buffer, with power-of-2 size. Indices are kept *not* reduced modulo ring size, so the buffer referred to by index N is pipe->bufs[N & (pipe->ring_size - 1)]. Ring size can change over the lifetime of a pipe, but not while the pipe is locked. So for any iov_iter primitives it's a constant. Original conversion of pipes to this layout went overboard trying to microoptimize that - calculating pipe->ring_size - 1, storing it in a local variable and using it throughout the function. In some cases it might be warranted, but most of the time it only obfuscates what's going on in there. Introduce a helper (pipe_buf(pipe, N)) that would encapsulate that and use it in the obvious cases. More will follow... Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:16 UTC
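The index arithmetic from the entry above, pipe->bufs[N & (pipe->ring_size - 1)], can be tried out in isolation. The structures below are simplified, hypothetical stand-ins (`demo_pipe`, `demo_buf`, `demo_pipe_buf()`), not the kernel definitions; the sketch only shows why an unreduced index plus a power-of-2 mask is enough.

```c
#include <assert.h>

/* Hypothetical, stripped-down stand-ins for the kernel structures. */
struct demo_buf {
	unsigned int len;
};

struct demo_pipe {
	unsigned int head;	/* kept *not* reduced modulo ring_size */
	unsigned int tail;
	unsigned int ring_size;	/* always a power of two */
	struct demo_buf *bufs;
};

/* Map an unreduced slot index to its buffer by masking with ring_size - 1. */
static struct demo_buf *demo_pipe_buf(const struct demo_pipe *pipe,
				      unsigned int slot)
{
	/* The mask trick is only valid for a power-of-two ring size. */
	assert(pipe->ring_size != 0 &&
	       (pipe->ring_size & (pipe->ring_size - 1)) == 0);

	return &pipe->bufs[slot & (pipe->ring_size - 1)];
}
```

Because unsigned arithmetic wraps modulo a power of two, the mapping stays consistent even after the index overflows, so head and tail never need to be reduced when they are updated.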
0d96493 splice: stop abusing iov_iter_advance() to flush a pipe Use pipe_discard_from() explicitly in generic_file_read_iter(); don't bother with rather non-obvious use of iov_iter_advance() in there. Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:16 UTC
3e20a75 switch new_sync_{read,write}() to ITER_UBUF Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:15 UTC
fcb14cb new iov_iter flavour - ITER_UBUF Equivalent of single-segment iovec. Initialized by iov_iter_ubuf(), checked for by iter_is_ubuf(), otherwise behaves like ITER_IOVEC ones. We are going to expose the things like ->write_iter() et.al. to those in subsequent commits. New predicate (user_backed_iter()) that is true for ITER_IOVEC and ITER_UBUF; places like direct-IO handling should use that for checking that pages we modify after getting them from iov_iter_get_pages() would need to be dirtied. DO NOT assume that replacing iter_is_iovec() with user_backed_iter() will solve all problems - there's code that uses iter_is_iovec() to decide how to poke around in iov_iter guts and for that the predicate replacement obviously won't suffice. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 09 August 2022, 02:37:15 UTC