sort by:
Revision Author Date Message Commit Date
7f453c2 perf_counter: PERF_SAMPLE_ID and inherited counters Anton noted that for inherited counters the counter-id as provided by PERF_SAMPLE_ID isn't mappable to the id found through PERF_RECORD_ID because each inherited counter gets its own id. His suggestion was to always return the parent counter id, since that is the primary counter id as exposed. However, these inherited counters have a unique identifier so that events like PERF_EVENT_PERIOD and PERF_EVENT_THROTTLE can be specific about which counter gets modified, which is important when trying to normalize the sample streams. This patch removes PERF_EVENT_PERIOD in favour of PERF_SAMPLE_PERIOD, which is more useful anyway, since changing periods became a lot more common than initially thought -- rendering PERF_EVENT_PERIOD the less useful solution (also, PERF_SAMPLE_PERIOD reports the more accurate value, since it reports the value used to trigger the overflow, whereas PERF_EVENT_PERIOD simply reports the requested period changed, which might only take effect on the next cycle). This still leaves us PERF_EVENT_THROTTLE to consider, but since that _should_ be a rare occurrence, and linking it to a primary id is the most useful bit to diagnose the problem, we introduce a PERF_SAMPLE_STREAM_ID, for those few cases where the full reconstruction is important. [Does change the ABI a little, but I see no other way out] Suggested-by: Anton Blanchard <anton@samba.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1248095846.15751.8781.camel@twins> 22 July 2009, 16:05:56 UTC
573402d perf_counter: Plug more stack leaks Per example of Arjan's patch, I went through and found a few more. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> 22 July 2009, 16:05:55 UTC
c9f73a3 perf: Fix stack data leak the "reserved" field was not initialized to zero, resulting in 4 bytes of stack data leaking to userspace.... Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> 22 July 2009, 16:05:55 UTC
9b7019a perf_counter: Remove unused variables Fix a gcc unused variables warning. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> 22 July 2009, 16:05:55 UTC
1d2f379 Merge commit 'tip/perfcounters/core' into perf-counters-for-linus 22 July 2009, 16:05:48 UTC
1483b19 perf_counter: Make call graph option consistent perf record uses -g for logging call graph data but perf report uses -c to print call graph data. Be consistent and use -g everywhere for call graph data. Also update the help text to reflect the current default - fractal,0.5 Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20090716104817.803604373@samba.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> 18 July 2009, 09:21:33 UTC
4bba828 perf_counter: Add perf record option to log addresses Add the -d or --data option to log event addresses (eg page faults). Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20090716104817.697698033@samba.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> 18 July 2009, 09:21:32 UTC
ed900c0 perf_counter: Log vfork as a fork event Right now we don't output vfork events. Even though we should always see an exec after a vfork, we may get perfcounter samples between the vfork and exec. These samples can lead to some confusion when parsing perfcounter data. To keep things consistent we should always log a fork event. It will result in a little more log data, but is less confusing to trace parsing tools. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20090716104817.589309391@samba.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> 18 July 2009, 09:21:31 UTC
11b5f81 perf_counter: Synthesize VDSO mmap event perf record synthesizes mmap events for the running process. Right now it just catches file mappings, but we can check for the vdso symbol and add that too. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20090716104817.517264409@samba.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> 18 July 2009, 09:21:30 UTC
413ee3b perf_counter: Make sure we dont leak kernel memory to userspace There are a few places we are leaking tiny amounts of kernel memory to userspace. This happens when writing out strings because we always align the end to 64 bits. To avoid this we should always use an appropriately sized temporary buffer and ensure it is zeroed. Since d_path assembles the string from the end of the buffer backwards, we need to add 64 bits after the buffer to allow for alignment. We also need to copy arch_vma_name to the temporary buffer, because if we use it directly we may end up copying to userspace a number of bytes after the end of the string constant. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20090716104817.273972048@samba.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> 18 July 2009, 09:21:29 UTC
23cdb5d perf_counter tools: Fix index boundary check Keep index within event_type_descriptors[] Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <4A5A7F0B.4070106@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> 13 July 2009, 08:58:28 UTC
d4d7d0b perf_counter: Fix the tracepoint channel to perfcounters Fix a missed rename in EVENT_PROFILE support so that it gets built and allows tracepoint tracing from the 'perf' tool. Fix a typo in the (never before built & enabled) portion in perf_counter.c as well, and update that code to the attr.config changes as well. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Gamari <bgamari.foss@gmail.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1246869094-21237-1-git-send-email-chris@chris-wilson.co.uk> Signed-off-by: Ingo Molnar <mingo@elte.hu> 13 July 2009, 07:23:10 UTC
f1c6a58 perf_counter, x86: Extend perf_counter Pentium M support I've attached a patch to remove the Pentium M special casing of EMON and as noticed at least with my Pentium M the hardware PMU now works: Performance counter stats for '/bin/ls /var/tmp': 1.809988 task-clock-msecs # 0.125 CPUs 1 context-switches # 0.001 M/sec 0 CPU-migrations # 0.000 M/sec 224 page-faults # 0.124 M/sec 1425648 cycles # 787.656 M/sec 912755 instructions # 0.640 IPC Vince suggested that this code was trying to address erratum Y17 in Pentium-M's: http://download.intel.com/support/processors/mobile/pm/sb/25266532.pdf But that erratum (related to IA32_MISC_ENABLES.7) does not affect perfcounters as we dont use this toggle to disable RDPMC and WRMSR/RDMSR access to performance counters. We keep cr4's bit 8 (X86_CR4_PCE) clear so unprivileged RDPMC access is not allowed anyway. Cc: Vince Weaver <vince@deater.net> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@googlemail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu> 13 July 2009, 06:46:51 UTC
e3d7e18 perf report: Introduce -n/--show-nr-samples [acme@doppio pahole]$ perf report -ns comm,dso,symbol -d /lib64/libc-2.10.1.so -C pahole | head -17 21.94% 32101 [.] _int_malloc 20.10% 29402 [.] __GI_strcmp 16.77% 24533 [.] __tsearch 12.61% 18450 [.] malloc_consolidate 6.42% 9394 [.] _int_free 6.28% 9191 [.] __tfind 4.56% 6678 [.] __GI___libc_free 4.46% 6520 [.] _IO_vfprintf_internal 2.59% 3786 [.] __malloc 1.17% 1716 [.] __GI_memcpy [acme@doppio pahole]$ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1247325517-12272-5-git-send-email-acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> 11 July 2009, 17:20:27 UTC
a25e46c perf_counter tools: PLT info is stripped in -debuginfo packages So we need to get the richer .symtab from the debuginfo packages but the PLT info from the original DSO where we have just the leaner .dynsym symtab. Example: | [acme@doppio pahole]$ perf report --sort comm,dso,symbol > before | [acme@doppio pahole]$ perf report --sort comm,dso,symbol > after | [acme@doppio pahole]$ diff -U1 before after | --- before 2009-07-11 11:04:22.688595741 -0300 | +++ after 2009-07-11 11:04:33.380595676 -0300 | @@ -80,3 +80,2 @@ | 0.07% pahole ./build/pahole [.] pahole_stealer | - 0.06% pahole /usr/lib64/libdw-0.141.so [.] 0x00000000007140 | 0.06% pahole /usr/lib64/libdw-0.141.so [.] __libdw_getabbrev | @@ -91,2 +90,3 @@ | 0.06% pahole [kernel] [k] free_hot_cold_page | + 0.06% pahole /usr/lib64/libdw-0.141.so [.] tfind@plt | 0.05% pahole ./build/libdwarves.so.1.0.0 [.] ftype__add_parameter | @@ -242,2 +242,3 @@ | 0.01% pahole [kernel] [k] account_group_user_time | + 0.01% pahole /usr/lib64/libdw-0.141.so [.] strlen@plt | 0.01% pahole ./build/pahole [.] strcmp@plt | [acme@doppio pahole]$ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1247325517-12272-4-git-send-email-acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> 11 July 2009, 17:20:26 UTC
021191b perf report: Make the output more compact When we filter by column content we may end up with a column that has the same value for all the lines. So remove that column and tell its unique value on the top, as a comment. Example: [acme@doppio pahole]$ perf report --sort comm,dso,symbol -d ./build/libdwarves.so.1.0.0 -C pahole | head -15 # dso: ./build/libdwarves.so.1.0.0 # comm: pahole # Samples: 58409 # # Overhead Symbol # ........ ...... # 20.93% [.] tag__recode_dwarf_type 14.94% [.] namespace__recode_dwarf_types 10.38% [.] cu__table_add_tag 6.69% [.] __die__process_tag 5.05% [.] die__process_function 4.70% [.] list__for_all_tags 3.68% [.] tag__init 3.48% [.] die__create_new_parameter [acme@doppio pahole]$ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1247325517-12272-3-git-send-email-acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> 11 July 2009, 17:20:26 UTC
27d0fd4 strlist: Introduce strlist__entry and strlist__nr_entries methods The strlist__entry method allows accessing strlists like an array, will be used in the 'perf report' to access the first entry. We now keep the nr_entries so that we can check if we have just one entry, will be used in 'perf report' to improve the output by showing just at the top when we have just, say, one DSO. While at it use nr_entries to optimize strlist__is_empty by not using the far more costly rb_first based implementation. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1247325517-12272-2-git-send-email-acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> 11 July 2009, 17:20:25 UTC
60c1baf perf report: Tidy up reporting of symbols not found Always printing the level info about if it is in the kernel, hypervisor or userspace as that is in the hist_entry. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1247325517-12272-1-git-send-email-acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> 11 July 2009, 17:20:25 UTC
52d422d perf report: Adjust column width to the values sampled Auto-adjust column width of perf report output to the longest occuring string length. Example: [acme@doppio pahole]$ perf report --sort comm,dso,symbol | head -13 12.79% pahole /usr/lib64/libdw-0.141.so [.] __libdw_find_attr 8.90% pahole /lib64/libc-2.10.1.so [.] _int_malloc 8.68% pahole /usr/lib64/libdw-0.141.so [.] __libdw_form_val_len 8.15% pahole /lib64/libc-2.10.1.so [.] __GI_strcmp 6.80% pahole /lib64/libc-2.10.1.so [.] __tsearch 5.54% pahole ./build/libdwarves.so.1.0.0 [.] tag__recode_dwarf_type [acme@doppio pahole]$ [acme@doppio pahole]$ perf report --sort comm,dso,symbol -d /lib64/libc-2.10.1.so | head -10 21.92% pahole /lib64/libc-2.10.1.so [.] _int_malloc 20.08% pahole /lib64/libc-2.10.1.so [.] __GI_strcmp 16.75% pahole /lib64/libc-2.10.1.so [.] __tsearch [acme@doppio pahole]$ Also add these extra options to control the new behaviour: -w, --field-width Force each column width to the provided list, for large terminal readability. -t, --field-separator: Use a special separator character and don't pad with spaces, replacing all occurances of this separator in symbol names (and other output) with a '.' character, that thus it's the only non valid separator. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20090711014728.GH3452@ghostprotocols.net> Signed-off-by: Ingo Molnar <mingo@elte.hu> 11 July 2009, 08:24:11 UTC
71a851b perf_counter: Stop open coding unclone_ctx Instead of open coding the unclone context thingy, put it in a common function. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu> 10 July 2009, 08:28:40 UTC
984b838 perf_counter: Clean up global vs counter enable Ingo noticed that both AMD and P6 call x86_pmu_disable_counter() on *_pmu_enable_counter(). This is because we rely on the side effect of that call to program the event config but not touch the EN bit. We change that for AMD by having enable_all() simply write the full config in, and for P6 by explicitly coding it. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu> 10 July 2009, 08:28:29 UTC
9c74fb5 perf_counter: Fix up P6 PMU details The P6 doesn't seem to support cache ref/hit/miss counts, so we extend the generic hardware event codes to have 0 and -1 mean the same thing as for the generic cache events. Furthermore, it turns out the 0 event does not count (that is, its reported that on PPro it actually does count something), therefore use a event configuration that's specified not to count to disable the counters. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu> 10 July 2009, 08:28:27 UTC
11d1578 perf_counter: Add P6 PMU support Add basic P6 PMU support. The P6 uses the EVNTSEL0 EN bit to enable/disable both its counters. We use this for the global enable/disable, and clear all config bits (except EN) to disable individual counters. Actual ia32 hardware doesn't support lfence, so use a locked op without side-effect to implement a full barrier. perf stat and perf record seem to function correctly. [a.p.zijlstra@chello.nl: cleanups and complete the enable/disable code] Signed-off-by: Vince Weaver <vince@deater.net> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <Pine.LNX.4.64.0907081718450.2715@pianoman.cluster.toy> Signed-off-by: Ingo Molnar <mingo@elte.hu> 10 July 2009, 08:28:26 UTC
9590b7b perf_counter tools: Rename cache events to remove $ The cache events contain '$' which will hit shell variable expansion. To avoid confusion change this to 'cache', ie L1-d$-loads becomes L1-dcache-loads. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: Roland Dreier <rdreier@cisco.com> Cc: Jaswinder Singh Rajput <jaswinder@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20090706120131.GB4391@kryten> Signed-off-by: Ingo Molnar <mingo@elte.hu> 10 July 2009, 08:04:06 UTC
805d127 perf report: Add "Fractal" mode output - support callchains with relative overhead rate The current callchain displays the overhead rates as absolute: relative to the total overhead. This patch provides relative overhead percentage, in which each branch of the callchain tree is a independant instrumentated object. This provides a 'fractal' view of the call-chain profile: each sub-graph looks like a profile in itself - relative to its parent. You can produce such output by using the "fractal" mode that you can abbreviate via f, fr, fra, frac, etc... ./perf report -s sym -c fractal Example: 8.46% [k] copy_user_generic_string | |--52.01%-- generic_file_aio_read | do_sync_read | vfs_read | | | |--97.20%-- sys_pread64 | | system_call_fastpath | | pread64 | | | --2.81%-- sys_read | system_call_fastpath | __read | |--39.85%-- generic_file_buffered_write | __generic_file_aio_write_nolock | generic_file_aio_write | do_sync_write | reiserfs_file_write | vfs_write | | | |--97.05%-- sys_pwrite64 | | system_call_fastpath | | __pwrite64 | | | --2.95%-- sys_write | system_call_fastpath | __write_nocancel [...] Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Anton Blanchard <anton@samba.org> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1246772361-9960-5-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> 05 July 2009, 08:30:23 UTC
e05b876 perf_counter tools: callchains: Manage the cumul hits on the fly The cumul hits are the number of hits of every childs of a node plus the hits of the current nodes, required for percentage computing of a branch. Theses numbers are calculated during the sorting of the branches of the callchain tree using a depth first postfix traversal, so that cumulative hits are propagated in the right order. But if we plan to implement percentages relative to the parent and not absolute percentages (relative to the whole overhead), we need to know the cumulative hits of the parent before computing the children because the relative minimum acceptable number of entries (ie: minimum rate against the cumulative hits from the parent) is the basis to filter the children against a given rate. Then we need to handle the cumul hits on the fly to prepare the implementation of relative overhead rates. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Anton Blanchard <anton@samba.org> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1246772361-9960-4-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> 05 July 2009, 08:30:22 UTC
94a8eb0 perf report: Change default callchain parameters The default callchain parameters are set to use the flat mode and never filter any overhead threshold of backtrace. But flat mode is boring compared to graph mode. Also the number of callchains may be very high if none is filtered. Let's change this to set the graph view and a minimum overhead of 0.5% as default parameters. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Anton Blanchard <anton@samba.org> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1246772361-9960-3-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> 05 July 2009, 08:30:22 UTC
be90388 perf report: Use a modifiable string for default callchain options If the user doesn't provide options to tune his callchain output (ie: if he uses -c without arguments) then the default value passed in the OPT_CALLBACK_DEFAULT() macro is used. But it's parsed later by strtok() which will replace comma separators to a zero. This may segfault as we are using a read-only string. Use a modifiable one instead, and also fix the "100%" default minimum threshold value by turning it into a 0 (output every callchains) as it was intended in the origin. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Anton Blanchard <anton@samba.org> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1246772361-9960-2-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> 05 July 2009, 08:30:21 UTC
91b4eae perf report: Warn on callchain output request from non-callchain file perf report segfaults while trying to handle callchains from a non callchain data file. Instead of a segfault, print a useful message to the user. Reported-by: Jens Axboe <jens.axboe@oracle.com> Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Anton Blanchard <anton@samba.org> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1246772361-9960-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> 05 July 2009, 08:30:21 UTC
a79f0da x86: atomic64: Inline atomic64_read() again Now atomic64_read() is light weight (no register pressure and small icache), we can inline it again. Also use "=&A" constraint instead of "+A" to avoid warning about unitialized 'res' variable. (gcc had to force 0 in eax/edx) $ size vmlinux.prev vmlinux.after text data bss dec hex filename 4908667 451676 1684868 7045211 6b805b vmlinux.prev 4908651 451676 1684868 7045195 6b804b vmlinux.after Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> LKML-Reference: <4A4E1AA2.30002@gmail.com> [ Also fix typo in atomic64_set() export ] Signed-off-by: Ingo Molnar <mingo@elte.hu> 04 July 2009, 09:45:00 UTC
ddf9a00 x86: atomic64: Clean up atomic64_sub_and_test() and atomic64_add_negative() Linus noticed that the variable name 'old_val' is confusingly named in these functions - the correct naming is 'new_val'. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> LKML-Reference: <alpine.LFD.2.01.0907030942260.3210@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu> 03 July 2009, 19:15:08 UTC
3a8d178 x86: atomic64: Improve atomic64_xchg() Remove the read-first logic from atomic64_xchg() and simplify the loop. This function was the last user of __atomic64_read() - remove it. Also, change the 'real_val' assumption from the somewhat quirky 1ULL << 32 value to the (just as arbitrary, but simpler) value of 0. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> LKML-Reference: <tip-05118ab8859492ac9ddda0154cf90e37b0a4a0b0@git.kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> 03 July 2009, 18:23:55 UTC
1fde902 x86: atomic64: Export APIs to modules atomic64_t primitives are used by a handful of drivers, so export the APIs consistently. These were inlined before. Also mark atomic64_32.o a core object, so that the symbols are available even if not linked to core kernel pieces. Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> LKML-Reference: <tip-05118ab8859492ac9ddda0154cf90e37b0a4a0b0@git.kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> 03 July 2009, 18:23:52 UTC
67d7178 x86: atomic64: Improve atomic64_read() Optimize atomic64_read() as a special open-coded cmpxchg8b variant. This generates nicer code: arch/x86/lib/atomic64_32.o: text data bss dec hex filename 435 0 0 435 1b3 atomic64_32.o.before 431 0 0 431 1af atomic64_32.o.after md5: bd8ab95e69c93518578bfaf0ea3be4d9 atomic64_32.o.before.asm 2bdfd4bd1f6b7b61b7fc127aef90ce3b atomic64_32.o.after.asm Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu> 03 July 2009, 12:42:59 UTC
8e049ef x86: atomic64: Code atomic(64)_read and atomic(64)_set in C not CPP Occasionally we get bugs where atomic_read or atomic_set are used on atomic64_t variables or vice versa. These bugs don't generate warnings on x86 because atomic_read and atomic_set are coded as macros rather than C functions, so we don't get any type-checking on their arguments; similarly for atomic64_read and atomic64_set in 64-bit kernels. This converts them to C functions so that the arguments are type-checked and bugs like this will get caught more easily. It also converts atomic_cmpxchg and atomic_xchg, and atomic64_cmpxchg and atomic64_xchg on 64-bit, so we get type-checking on their arguments too. Compiling a typical 64-bit x86 config, this generates no new warnings, and the vmlinux text is 86 bytes smaller. Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu> 03 July 2009, 12:42:39 UTC
199e237 x86: atomic64: Fix unclean type use in atomic64_xchg() Linus noticed that atomic64_xchg() uses atomic_read(), which happens to work because atomic_read() is a macro so the .counter value gets u64-read on 32-bit too - but this is really bogus and serious bugs are waiting to happen. Fix atomic64_xchg() to use __atomic64_read() instead. No code changed: arch/x86/lib/atomic64_32.o: text data bss dec hex filename 435 0 0 435 1b3 atomic64_32.o.before 435 0 0 435 1b3 atomic64_32.o.after md5: bd8ab95e69c93518578bfaf0ea3be4d9 atomic64_32.o.before.asm bd8ab95e69c93518578bfaf0ea3be4d9 atomic64_32.o.after.asm Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu> 03 July 2009, 11:26:46 UTC
3217120 x86: atomic64: Make atomic_read() type-safe Linus noticed that atomic64_xchg() uses atomic_read(), which happens to work because atomic_read() is a macro so the .counter value gets u64-read on 32-bit too - but this is really bogus and serious bugs are waiting to happen. Change atomic_read() to be a type-safe inline, and this exposes the atomic64 bogosity as well: arch/x86/lib/atomic64_32.c: In function ‘atomic64_xchg’: arch/x86/lib/atomic64_32.c:39: warning: passing argument 1 of ‘atomic_read’ from incompatible pointer type Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu> 03 July 2009, 11:26:45 UTC
3ac805d x86: atomic64: Reduce size of functions cmpxchg8b is a huge instruction in terms of register footprint, we almost never want to inline it, not even within the same code module. GCC 4.3 still messes up for two functions, under-judging the true cost of this instruction - so annotate two key functions to reduce the bloat: arch/x86/lib/atomic64_32.o: text data bss dec hex filename 1763 0 0 1763 6e3 atomic64_32.o.before 435 0 0 435 1b3 atomic64_32.o.after Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu> 03 July 2009, 11:26:43 UTC
824975e x86: atomic64: Improve atomic64_add_return() Linus noted (based on Eric Dumazet's numbers) that we would probably be better off not trying an atomic_read() in atomic64_add_return() but intead intentionally let the first cmpxchg8b fail - to get a cache-friendly 'give me ownership of this cacheline' transaction. That can then be followed by the real cmpxchg8b which sets the value local to the CPU. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu> 03 July 2009, 11:26:42 UTC
69237f9 x86: atomic64: Improve cmpxchg8b() Rewrite cmpxchg8b() to not use %edi register but a generic "+m" constraint, to increase compiler freedom in code generation and possibly better code. Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu> 03 July 2009, 11:26:41 UTC
aacf682 x86: atomic64: Improve atomic64_read() Linus noticed that the 32-bit version of atomic64_read() was being overly complex with re-reading the value and doing a retry loop over that. Instead we can just rely on cmpxchg8b returning either the new value or returning the current value. We can use any 'old' value, which will be faster as it can be loaded via immediates. Using some value that is not equal to the real value in memory the instruction gets faster. This also has the advantage that the CPU could avoid dirtying the cacheline. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu> 03 July 2009, 11:26:40 UTC
b7882b7 x86: atomic64: Move the 32-bit atomic64_t implementation to a .c file Linus noted that the atomic64_t primitives are all inlines currently which is crazy because these functions have a large register footprint anyway. Move them to a separate file: arch/x86/lib/atomic64_32.c Also, while at it, rename all uses of 'unsigned long long' to the much shorter u64. This makes the appearance of the prototypes a lot nicer - and it also uncovered a few bugs where (yet unused) API variants had 'long' as their return type instead of u64. [ More intrusive changes are not yet done in this patch. ] Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu> 03 July 2009, 11:26:39 UTC
bbf2a33 x86: atomic64: The atomic64_t data type should be 8 bytes aligned on 32-bit too Locked instructions on two cache lines at once are painful. If atomic64_t uses two cache lines, my test program is 10x slower. The chance for that is significant: 4/32 or 12.5%. Make sure an atomic64_t is 8 bytes aligned. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain> [ changed it to __aligned(8) as per Andrew's suggestion ] Signed-off-by: Ingo Molnar <mingo@elte.hu> 03 July 2009, 11:26:38 UTC
029e5b1 perf report: Annotate variable initialization Certain versions of GCC dont see the initialization that is done here: builtin-report.c: In function ‘__cmd_report’: builtin-report.c:1038: warning: ‘syms’ may be used uninitialized in this function So annotate it with a NULL initialization. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu> 03 July 2009, 11:17:28 UTC
30d7a77 perf_counter tools: Adjust symbols in ET_EXEC files too Ingo Molnar wrote: > i just bisected a 'perf report' bug that would cause us to not > resolve all user-space symbols in a 'git gc' run to: > > f5812a7a336fb952d819e4427b9a2dce02368e82 is first bad commit > commit f5812a7a336fb952d819e4427b9a2dce02368e82 > Author: Arnaldo Carvalho de Melo <acme@redhat.com> > Date: Tue Jun 30 11:43:17 2009 -0300 > > perf_counter tools: Adjust only prelinked symbol's addresses Rename ->prelinked to ->adjust_symbols and making what was done only for prelinked libraries also to ET_EXEC binaries, such as /usr/bin/git: [acme@doppio pahole]$ readelf -h /usr/bin/git | grep Type Type: EXEC (Executable file) [acme@doppio pahole]$ And after installing the 'git-debuginfo' package, I get correct results: [acme@doppio linux-2.6-tip]$ perf report --sort comm,dso,symbol -d /usr/bin/git | head -20 # # (1139614 samples) # # Overhead Command Shared Object Symbol # ........ ................ ......................... ...... # 34.98% git /usr/bin/git [.] send_sideband 33.39% git /usr/bin/git [.] enter_repo 6.81% git /usr/bin/git [.] diff_opt_parse 4.95% git /usr/bin/git [.] is_repository_shallow 3.24% git /usr/bin/git [.] odb_mkstemp 1.39% git /usr/bin/git [.] output 1.34% git /usr/bin/git [.] xmmap 1.25% git /usr/bin/git [.] receive_pack_config 1.16% git /usr/bin/git [.] git_pathdup 0.90% git /usr/bin/git [.] read_object_with_reference 0.86% git /usr/bin/git [.] show_patch_diff 0.85% git /usr/bin/git 0x00000000095e2e 0.69% git /usr/bin/git [.] display [acme@doppio linux-2.6-tip]$ I'll check what are the last cases where we can't resolve symbols, like this 0x00000000095e2e later. And I guess this will fix the problems Mike were seeing too: [acme@doppio linux-2.6-tip]$ readelf -h ../build/perf/vmlinux | grep Type Type: EXEC (Executable file) [acme@doppio linux-2.6-tip]$ Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu> 03 July 2009, 06:24:13 UTC
24b57c6 perf_counter tools: Display percents of hits in callchain with overhead colors This adds the use of colors to signal at a glance the important overhead thresholds in callchains hit rates. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Anton Blanchard <anton@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1246558475-10624-3-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> 02 July 2009, 19:38:38 UTC
1e11fd8 perf_counter tools: Provide helper to print percents color Among perf annotate, perf report and perf top, we can find the common colored printing of percents according to the following rules: High overhead = > 5%, colored in red Mid overhead = > 0.5%, colored in green Low overhead = < 0.5%, default color Factorize these multiple checks in a single function named percent_color_fprintf() and also provide a get_percent_color() for sites which print percentages and other things at the same time. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Anton Blanchard <anton@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1246558475-10624-2-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> 02 July 2009, 19:38:37 UTC
c20ab37 perf_counter tools: Set the minimum percent for callchains to be displayed Callchains output may become a burden on a trace because even rarely hit site are exposed. This can be too much information. Let the user set a threshold as a minimum percent of hits using the new pattern for the -c option: -c mode,min_percent Example: $ perf report -s sym -c flat,4 8.25% [k] copy_user_generic_string 4.19% copy_user_generic_string generic_file_aio_read do_sync_read vfs_read sys_pread64 system_call_fastpath pread64 5.39% [k] search_by_key 4.63% 0x00000000009e0a 2.36% [k] memcpy_c [...] $ perf report -s sym -c graph,2 8.25% [k] copy_user_generic_string | |--4.31%-- generic_file_aio_read | do_sync_read | vfs_read | | | --4.19%-- sys_pread64 | system_call_fastpath | pread64 | --3.24%-- generic_file_buffered_write __generic_file_aio_write_nolock generic_file_aio_write do_sync_write reiserfs_file_write vfs_write | --3.14%-- sys_pwrite64 system_call_fastpath __pwrite64 5.39% [k] search_by_key | --2.23%-- reiserfs_update_sd_size 4.63% 0x00000000009e0a 2.36% [k] memcpy_c [...] You can also omit it and it will default to 0. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Anton Blanchard <anton@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1246558475-10624-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> 02 July 2009, 19:38:37 UTC
4eb3e47 perf report: Add support for callchain graph output Currently, the printing of callchains is done in a single vertical level, this is the "flat" mode: 8.25% [k] copy_user_generic_string 4.19% copy_user_generic_string generic_file_aio_read do_sync_read vfs_read sys_pread64 system_call_fastpath pread64 This patch introduces a new "graph" mode which provides a hierarchical output of factorized paths recursively sorted: 8.25% [k] copy_user_generic_string | |--4.31%-- generic_file_aio_read | do_sync_read | vfs_read | | | |--4.19%-- sys_pread64 | | system_call_fastpath | | pread64 | | | --0.12%-- sys_read | system_call_fastpath | __read | |--3.24%-- generic_file_buffered_write | __generic_file_aio_write_nolock | generic_file_aio_write | do_sync_write | reiserfs_file_write | vfs_write | | | |--3.14%-- sys_pwrite64 | | system_call_fastpath | | __pwrite64 | | | --0.10%-- sys_write [...] The command line has then changed. By providing the -c option, the callchain will output in the flat mode by default. But you can override it: perf report -c graph or perf report -c flat You can also pass the abreviated mode: perf report -c g or perf report -c gra will both make use of the graph mode. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Anton Blanchard <anton@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1246550301-8954-3-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> 02 July 2009, 18:47:15 UTC
5a4b181 perf_counter tools: Add new OPT_CALLBACK_DEFAULT option There is no predefined macro to create an option that can have a custom value or a default one if none is given. This patch provides a new helper OPT_CALLBACK_DEFAULT() which defines such kind of option. For example, considering an option -c, we want to get the default value in the following cases: perf command -c -d perf command -d -c And the foo value when it's given: perf command -c foo -d perf command -d -c foo That's also why PARSE_OPT_LASTARG_DEFAULT is extended here to support default values whatever the position of the option, not only in the end. Should it now be renamed to PARSE_OPT_ARG_DEFAULT ? Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Anton Blanchard <anton@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: git@vger.kernel.org LKML-Reference: <1246550301-8954-2-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> 02 July 2009, 18:47:14 UTC
14f4654 perf_counter tools: Create new chain_for_each_child() iterator Iterating through children of a node in the callchain tree shows something that may be quite confusing at a first glance. The head is the children field of the parent and the list nodes are in the brothers field of the children. This is because the childs are linked to the parent as a list of "brothers" using the "children" list of the parent as a head: --------------- | Parent (head) |------------------------------------- --------------- | | | children | | | ----------- ----------- | | 1st child |---brother---| 2nd child |---brother----- ----------- ----------- This makes the following strange pattern often occuring: list_for_each_entry(child, &parent->children, brothers) { // do something with children } Abstract it to chain_for_each_child() to factorize and simplify this pattern. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Anton Blanchard <anton@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1246550301-8954-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> 02 July 2009, 18:47:14 UTC
4297648 perf_counter tools: Enable kernel module symbol loading in tools Add the -m/--modules option to perf report and perf annotate, which enables live module symbol/image loading. To be used with -k/--vmlinux. (Also give perf annotate a -P/--full-paths option.) Signed-off-by: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1246514986.13293.48.camel@marge.simson.net> Signed-off-by: Ingo Molnar <mingo@elte.hu> 02 July 2009, 06:42:21 UTC
6cfcc53 perf_counter tools: Connect module support infrastructure to symbol loading infrastructure Signed-off-by: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1246514916.13293.46.camel@marge.simson.net> Signed-off-by: Ingo Molnar <mingo@elte.hu> 02 July 2009, 06:42:21 UTC
208b4b4 perf_counter tools: Add infrastructure to support loading of kernel module symbols Add infrastructure for module path discovery and section load addresses. Signed-off-by: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1246514830.13293.44.camel@marge.simson.net> Signed-off-by: Ingo Molnar <mingo@elte.hu> 02 July 2009, 06:42:20 UTC
9974f49 perf_counter tools: Make symbol loading consistently return number of loaded symbols perf_counter tools: Make symbol loading consistently return number of loaded symbols. Signed-off-by: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1246514758.13293.42.camel@marge.simson.net> Signed-off-by: Ingo Molnar <mingo@elte.hu> 02 July 2009, 06:42:20 UTC
a92bef0 perf stat: Handle pipe read failures in perf stat Building builtin-stat.c reports the following errors: cc1: warnings being treated as errors builtin-stat.c: In function ‘run_perf_stat’: builtin-stat.c:242: erreur: ignoring return value of ‘read’, declared with attribute warn_unused_result builtin-stat.c:255: erreur: ignoring return value of ‘read’, declared with attribute warn_unused_result make: *** [builtin-stat.o] Erreur 1 This patch handles the possible pipe read failures. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Anton Blanchard <anton@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1246474930-6088-2-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> 01 July 2009, 20:37:24 UTC
0406ca6 perf_counter: Ignore the nmi call frames in the x86-64 backtraces About every callchains recorded with perf record are filled up including the internal perfcounter nmi frame: perf_callchain perf_counter_overflow intel_pmu_handle_irq perf_counter_nmi_handler notifier_call_chain atomic_notifier_call_chain notify_die do_nmi nmi We want ignore this frame as it's not interesting for instrumentation. To solve this, we simply ignore every frames from nmi context. New example of "perf report -s sym -c" after this patch: 9.59% [k] search_by_key 4.88% search_by_key reiserfs_read_locked_inode reiserfs_iget reiserfs_lookup do_lookup __link_path_walk path_walk do_path_lookup user_path_at vfs_fstatat vfs_lstat sys_newlstat system_call_fastpath __lxstat 0x406fb1 3.19% search_by_key search_by_entry_key reiserfs_find_entry reiserfs_lookup do_lookup __link_path_walk path_walk do_path_lookup user_path_at vfs_fstatat vfs_lstat sys_newlstat system_call_fastpath __lxstat 0x406fb1 [...] For now this patch only solves the problem in x86-64. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Anton Blanchard <anton@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1246474930-6088-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> 01 July 2009, 20:37:23 UTC
5da5025 perf_counter tools: Share list.h with the kernel The copy we were using came from another copy I did for the dwarves (pahole) package, that came from the kernel years ago. The only function that is used by the perf tools and that isn't in the kernel is list_del_range, that I'm leaving in the perf tools only for now. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20090701174608.GA5823@ghostprotocols.net> Signed-off-by: Ingo Molnar <mingo@elte.hu> 01 July 2009, 20:37:23 UTC
43cbcd8 perf_counter tools: Share rbtree.with the kernel The tools/perf/util/rbtree.c copy already drifted by three csets: 4b324126e0c6c3a5080ca3ec0981e8766ed6f1ee 4c60117811171d867d4f27f17ea07d7419d45dae 16c047add3ceaf0ab882e3e094d1ec904d02312d So remove the copy and use the lib/rbtree.c directly, sharing the source code while still generating a separate object file, since tools/perf uses a far more agressive -O6 switch. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20090701152837.GG15682@ghostprotocols.net> Signed-off-by: Ingo Molnar <mingo@elte.hu> 01 July 2009, 20:37:22 UTC
73c24cb perf list: Add cache events After: $ ./perf list List of pre-defined events (to be used in -e): cpu-cycles OR cycles [Hardware event] instructions [Hardware event] cache-references [Hardware event] cache-misses [Hardware event] branch-instructions OR branches [Hardware event] branch-misses [Hardware event] bus-cycles [Hardware event] cpu-clock [Software event] task-clock [Software event] page-faults OR faults [Software event] minor-faults [Software event] major-faults [Software event] context-switches OR cs [Software event] cpu-migrations OR migrations [Software event] L1-d$-loads [Hardware cache event] L1-d$-load-misses [Hardware cache event] L1-d$-stores [Hardware cache event] L1-d$-store-misses [Hardware cache event] L1-d$-prefetches [Hardware cache event] L1-d$-prefetch-misses [Hardware cache event] L1-i$-loads [Hardware cache event] L1-i$-load-misses [Hardware cache event] L1-i$-prefetches [Hardware cache event] L1-i$-prefetch-misses [Hardware cache event] LLC-loads [Hardware cache event] LLC-load-misses [Hardware cache event] LLC-stores [Hardware cache event] LLC-store-misses [Hardware cache event] LLC-prefetches [Hardware cache event] LLC-prefetch-misses [Hardware cache event] dTLB-loads [Hardware cache event] dTLB-load-misses [Hardware cache event] dTLB-stores [Hardware cache event] dTLB-store-misses [Hardware cache event] dTLB-prefetches [Hardware cache event] dTLB-prefetch-misses [Hardware cache event] iTLB-loads [Hardware cache event] iTLB-load-misses [Hardware cache event] branch-loads [Hardware cache event] branch-load-misses [Hardware cache event] rNNN [raw hardware event descriptor] Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1246453578.3072.1.camel@ht.satnam> Signed-off-by: Ingo Molnar <mingo@elte.hu> 01 July 2009, 13:25:03 UTC
b9ebdcc perf stat: Define MATCH_EVENT for easy attr checking MATCH_EVENT is useful: 1. for multiple attrs checking 2. avoid repetition of PERF_TYPE_ and PERF_COUNT_ and save space 3. avoids line breakage Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <1246440909.3403.5.camel@hpdv5.satnam> Signed-off-by: Ingo Molnar <mingo@elte.hu> 01 July 2009, 11:28:38 UTC
f37a291 perf_counter tools: Add more warnings and fix/annotate them Enable -Wextra. This found a few real bugs plus a number of signed/unsigned type mismatches/uncleanlinesses. It also required a few annotations All things considered it was still worth it so lets try with this enabled for now. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu> 01 July 2009, 10:49:48 UTC
88a69df perf report: Fix HV bit mismerge Fix: builtin-report.c: In function ‘hist_entry__add’: builtin-report.c:1015: error: case label not within a switch statement builtin-report.c:1017: error: break statement not within loop or switch Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu> 01 July 2009, 09:17:40 UTC
61c4598 perf_counter tools: Rework event string parsing/syntax This reworks the parser for event descriptors to make it more consistent in what it accepts. It is now structured as a recursive descent parser for the following grammar: events ::= event ( ("," | space) space* event )* event ::= ( raw_event | numeric_event | symbolic_event | generic_hw_event ) [ event_modifier ] raw_event ::= "r" hex_number numeric_event ::= number ":" number number ::= decimal_number | "0x" hex_number | "0" octal_number symbolic_event ::= string_from_event_symbols_array generic_hw_event::= cache_type ( "-" ( cache_op | cache_result ) )* event_modifier ::= ":" ( "u" | "k" | "h" )+ with the extra restriction that you can have at most one cache_op and at most one cache_result. We pass the current string pointer by reference (i.e. as a const char **) to the various parsing functions so that they can advance the pointer to indicate how much they consumed. They return 0 if they didn't recognize the thing at the pointer or 1 if they did (and advance the pointer past it). This also fixes parse_aliases to take the longest matching alias from the table, not the first one. Otherwise "l1-data" would match the "l1-d" alias and the "ata" would not be consumed. This allows event modifiers indicating what processor modes to count in to be applied to any event, not just numeric events, and adds a ":h" modifier to indicate counting in hypervisor mode. Specifying ":u" now sets both exclude_kernel and exclude_hv, and so on. Multiple modes can be specified, e.g. ":uk" will count in user or hypervisor mode (i.e. only exclude_kernel will be set). Signed-off-by: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <19018.53826.843815.189847@cargo.ozlabs.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> 01 July 2009, 08:23:17 UTC
0a456fc powerpc/perf_counter: Enable alternate PR/HV bits for POWER7 POWER7 has the same PR/HV bit layout as POWER6, so set the flag. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Paul Mackerras <paulus@samba.org> Cc: a.p.zijlstra@chello.nl Cc: benh@kernel.crashing.org LKML-Reference: <20090701030701.GI3563@kryten> Signed-off-by: Ingo Molnar <mingo@elte.hu> 01 July 2009, 08:20:28 UTC
deac911 perf_counter tools: Various fixes for callchains The symbol resolving has of course revealed some bugs in the callchain tree handling. This patch fixes some of them, including: - inherit the children from the parents while splitting a node - fix list range moving - fix indexes setting in callchains - create a child on the current node if the path doesn't match in the existent children (was only done on the root) - compare using symbols when possible so that we can match a function using any ip inside by referring to its start address. The practical effects are: - remove double callchains - fix upside down or any random order of callchains - fix wrong paths - fix bad hits and percentage accounts Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Anton Blanchard <anton@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1246419315-9968-4-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> 01 July 2009, 07:58:52 UTC
4424961 perf_counter tools: Resolve symbols in callchains This patch resolves the names, when possible, of each ip present in the callchains while using the -c option with perf report. Example: 5.40% [k] __d_lookup 5.37% perf_callchain perf_counter_overflow intel_pmu_handle_irq perf_counter_nmi_handler notifier_call_chain atomic_notifier_call_chain notify_die do_nmi nmi do_lookup __link_path_walk path_walk do_path_lookup user_path_at sys_faccessat sys_access system_call_fastpath 0x7fb609846f77 0.01% perf_callchain perf_counter_overflow intel_pmu_handle_irq perf_counter_nmi_handler notifier_call_chain atomic_notifier_call_chain notify_die do_nmi nmi do_lookup __link_path_walk path_walk do_path_lookup user_path_at sys_faccessat Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Anton Blanchard <anton@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1246419315-9968-3-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> 01 July 2009, 07:58:26 UTC
9198aa7 perf_counter tools: Fix storage size allocation of callchain list Fix a confusion while giving the size of a callchain list during its allocation. We are using the wrong structure size. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Anton Blanchard <anton@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1246419315-9968-2-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> 01 July 2009, 07:58:23 UTC
087021b Merge branch 'linus' into perfcounters/urgent Merge reason: this branch was on a .30-ish base before, update it to an almost-.31-rc2 upstream base to pick up fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu> 01 July 2009, 07:56:26 UTC
e83c2b0 Merge branch 'kmemleak' of git://linux-arm.org/linux-2.6 * 'kmemleak' of git://linux-arm.org/linux-2.6: kmemleak: Inform kmemleak about pid_hash kmemleak: Do not warn if an unknown object is freed kmemleak: Do not report new leaked objects if the scanning was stopped kmemleak: Slightly change the policy on newly allocated objects kmemleak: Do not trigger a scan when reading the debug/kmemleak file kmemleak: Simplify the reports logged by the scanning thread kmemleak: Enable task stacks scanning by default kmemleak: Allow the early log buffer to be configurable. 01 July 2009, 02:04:53 UTC
9fcfc91 Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm * git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm: dm table: fix blk_stack_limits arg to use bytes not sectors dm exception store: really fix type lookup 01 July 2009, 02:04:14 UTC
55bcab4 Merge branch 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (47 commits) perf report: Add --symbols parameter perf report: Add --comms parameter perf report: Add --dsos parameter perf_counter tools: Adjust only prelinked symbol's addresses perf_counter: Provide a way to enable counters on exec perf_counter tools: Reduce perf stat measurement overhead/skew perf stat: Use percentages for scaling output perf_counter, x86: Update x86_pmu after WARN() perf stat: Micro-optimize the code: memcpy is only required if no event is selected and !null_run perf stat: Improve output perf stat: Fix multi-run stats perf stat: Add -n/--null option to run without counters perf_counter tools: Remove dead code perf_counter: Complete counter swap perf report: Print sorted callchains per histogram entries perf_counter tools: Prepare a small callchain framework perf record: Fix unhandled io return value perf_counter tools: Add alias for 'l1d' and 'l1i' perf-report: Add bare minimum PERF_EVENT_READ parsing perf-report: Add modes for inherited stats and no-samples ... 01 July 2009, 02:02:59 UTC
58580c8 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: Add Fenghua Yu as temporary co-maintainer for ia64 [IA64] address compiler warnings perfmon.c/salinfo.c [IA64] Remove unnecessary semicolons [IA64] sprintf should not be used with same source & destination address 01 July 2009, 02:01:52 UTC
6086071 MN10300: Wire up new syscalls Wire up new syscalls rt_tgsigqueueinfo and perf_counter_open. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 01 July 2009, 01:58:37 UTC
aee3ff1 FRV: Wire up new syscalls Wire up new syscalls rt_tgsigqueueinfo and perf_counter_open. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 01 July 2009, 01:58:37 UTC
752fa51 hostfs: set maximum filesize in superblock for proper LFS support Maximum file size for hostfs mounts defaults to 2GB, so bigger files cannot be read/written through hostfs. This patch initializes the maximum file size to MAX_LFS_SIZE. Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13531 Signed-off-by: Wolfgang Illmeyer <wolfgang@illmeyer.com> Cc: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 01 July 2009, 01:56:03 UTC
8516a50 floppy: fix lock imbalance A crappy macro prevents us unlocking on a fail path. Expand the macro and unlock appropriatelly. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 01 July 2009, 01:56:01 UTC
9980060 bfin: delay IRQ registration until driver is ready Make sure we do not actually request the RTC IRQ until the device driver is fully ready to handle and process any interrupt. This way a spurious interrupt won't crash the system (which may happen if the bootloader was poking the RTC right before booting Linux). Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 01 July 2009, 01:56:01 UTC
ee905d0 atyfb: fix alignment for block writes Block writes require 64 byte alignment. Since block writes could be used with SGRAM or WRAM also refine the memory type detection to check for either type before deciding to use the 64 byte alignment. Signed-off-by: Ville Syrjala <syrjala@sci.fi> Tested-by: Mikulas Patocka <mpatocka@redhat.com> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 01 July 2009, 01:56:01 UTC
eafad22 atyfb: fix HP OmniBook 500 reboot hang Apparently HP OmniBook 500's BIOS doesn't like the way atyfb reprograms the hardware. The BIOS will simply hang after a reboot. Fix the problem by restoring the hardware to it's original state on reboot. Signed-off-by: Ville Syrjala <syrjala@sci.fi> Cc: Mikulas Patocka <mpatocka@redhat.com> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 01 July 2009, 01:56:01 UTC
50efacf gpio: pl061: fix IRQ handling for GPIOs >= PL061_GPIO_NR IRQ handling is wrong for any GPIO >= PL061_GPIO_NR. Fix this by implementing and using a proper .to_irq method. Signed-off-by: Baruch Siach <baruch@tkos.co.il> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 01 July 2009, 01:56:01 UTC
79d7f4e gpio: pl061: fix probe error handling code Note that IRQ has not been initialized when kmalloc() fails. Also, use DECLARE_BITMAP() to make the code clearer. Signed-off-by: Baruch Siach <baruch@tkos.co.il> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 01 July 2009, 01:56:01 UTC
66918dc x86: only clear node_states for 64bit Nathan reported that | commit 73d60b7f747176dbdff826c4127d22e1fd3f9f74 | Author: Yinghai Lu <yinghai@kernel.org> | Date: Tue Jun 16 15:33:00 2009 -0700 | | page-allocator: clear N_HIGH_MEMORY map before we set it again | | SRAT tables may contains nodes of very small size. The arch code may | decide to not activate such a node. However, currently the early boot | code sets N_HIGH_MEMORY for such nodes. These nodes therefore seem to be | active although these nodes have no present pages. | | For 64bit N_HIGH_MEMORY == N_NORMAL_MEMORY, so that works for 64 bit too unintentionally and incorrectly clears the cpuset.mems cgroup attribute on an i386 kvm guest, meaning that cpuset.mems can not be used. Fix this by only clearing node_states[N_NORMAL_MEMORY] for 64bit only. and need to do save/restore for that in find_zone_movable_pfn Reported-by: Nathan Lynch <ntl@pobox.com> Tested-by: Nathan Lynch <ntl@pobox.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@elte.hu>, Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 01 July 2009, 01:56:01 UTC
b37f2d4 cpusets: document adding/removing cpus to cpuset elaborately By writing a tasks's pid to the file, a process adds that task to that cgroup/cpuset. But to add a cpu/mem to a cpuset, the new list of cpus should be written to the cpuset.mems file which would replace the old list of cpus. Make this clearer in the documentation. Signed-off-by: Nikanth Karthikesan <knikanth@suse.de> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 01 July 2009, 01:56:01 UTC
d7831a0 mm: prevent balance_dirty_pages() from doing too much work balance_dirty_pages can overreact and move all of the dirty pages to writeback unnecessarily. balance_dirty_pages makes its decision to throttle based on the number of dirty plus writeback pages that are over the calculated limit,so it will continue to move pages even when there are plenty of pages in writeback and less than the threshold still dirty. This allows it to overshoot its limits and move all the dirty pages to writeback while waiting for the drives to catch up and empty the writeback list. A simple fio test easily demonstrates this problem. fio --name=f1 --directory=/disk1 --size=2G -rw=write --name=f2 --directory=/disk2 --size=1G --rw=write --startdelay=10 This is the simplest fix I could find, but I'm not entirely sure that it alone will be enough for all cases. But it certainly is an improvement on my desktop machine writing to 2 disks. Do we need something more for machines with large arrays where bdi_threshold * number_of_drives is greater than the dirty_ratio ? Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 01 July 2009, 01:56:01 UTC
df279ca bsdacct: fix access to invalid filp in acct_on() The file opened in acct_on and freshly stored in the ns->bacct struct can be closed in acct_file_reopen by a concurrent call after we release acct_lock and before we call mntput(file->f_path.mnt). Record file->f_path.mnt in a local variable and use this variable only. Signed-off-by: Renaud Lottiaux <renaud.lottiaux@kerlabs.com> Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 01 July 2009, 01:56:00 UTC
b4f9018 MAINTAINERS: STARFIRE/DURALAN update Ion's cs.columbia.edu email address no longer works. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Ion Badulescu <ionut@badula.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 01 July 2009, 01:56:00 UTC
8bc1ad7 kernel/resource.c: fix sign extension in reserve_setup() When the 32-bit signed quantities get assigned to the u64 resource_size_t, they are incorrectly sign-extended. Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13253 Addresses http://bugzilla.kernel.org/show_bug.cgi?id=9905 Signed-off-by: Zhang Rui <rui.zhang@intel.com> Reported-by: Leann Ogasawara <leann@ubuntu.com> Cc: Pierre Ossman <drzeus@drzeus.cx> Reported-by: <pablomme@googlemail.com> Tested-by: <pablomme@googlemail.com> Cc: <stable@kernel.org> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 01 July 2009, 01:56:00 UTC
529ba0d spi: bitbang bugfix in message setup Bugfix to spi_bitbang infrastructure: make sure to always set transfer parameters on the first pass through the message's per-transfer loop. This can matter with drivers that replace the per-word or per-buffer transfer primitives, on busses with multiple SPI devices. Previously, this could have started messages using the settings left after previous messages. The problem was observed when a high speed chip (m25p80 type flash) was running very slowly because a low speed device (avr8 microcontroller) had previously used the bus. Similar faults could have driven the low speed device too fast, or used an unexpected word size. Acked-by: Steven A. Falco <sfalco@harris.com> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 01 July 2009, 01:56:00 UTC
537a1bf fbdev: add mutex for fb_mmap locking Add a mutex to avoid a circular locking problem between the mm layer semaphore and fbdev ioctl mutex through the fb_mmap() call. Also, add mutex to all places where smem_start and smem_len fields change so the mutex inside the fb_mmap() is actually used. Changing of these fields before calling the framebuffer_register() are not mutexed. This is 2.6.31 material. It removes one lockdep (fb_mmap() and register_framebuffer()) but there is still another one (fb_release() and register_framebuffer()). It also cleans up handling of the smem_start and smem_len fields used by mutexed section of the fb_mmap(). Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 01 July 2009, 01:56:00 UTC
70d6027 spi: add spi_master flag word Add a new spi_master.flags word listing constraints relevant to that controller. Define the first constraint bit: a half duplex restriction. Include that constraint in the OMAP1 MicroWire controller driver. Have the mmc_spi host be the first customer of this flag. Its coding relies heavily on full duplex transfers, so it must fail when the underlying controller driver won't perform them. (The spi_write_then_read routine could use it too: use the temporarily-withdrawn full-duplex speedup unless this flag is set, in which case the existing code applies. Similarly, any spi_master implementing only SPI_3WIRE should set the flag.) Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 01 July 2009, 01:56:00 UTC
b55f627 spi: new spi->mode bits Add two new spi_device.mode bits to accomodate more protocol options, and pass them through to usermode drivers: * SPI_NO_CS ... a second 3-wire variant, where the chipselect line is removed instead of a data line; transfers are still full duplex. This obviously has STRONG protocol implications since the chipselect transitions can't be used to synchronize state transitions with the SPI master. * SPI_READY ... defines open drain signal that's pulled low to pause the clock. This defines a 5-wire variant (normal 4-wire SPI plus READY) and two 4-wire variants (READY plus each of the 3-wire flavors). Such hardware flow control can be a big win. There are ADC converters and flash chips that expose READY signals, but not many host controllers support it today. The spi_bitbang code should be changed to use SPI_NO_CS instead of its current nonportable hack. That's a mode most hardware can easily support (unlike SPI_READY). Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Cc: "Paulraj, Sandeep" <s-paulraj@ti.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 01 July 2009, 01:56:00 UTC
c495682 dmapools: protect page_list walk in show_pools() show_pools() walks the page_list of a pool w/o protection against the list modifications in alloc/free. Take pool->lock to avoid stomping into nirvana. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 01 July 2009, 01:56:00 UTC
4d6c13f ext2: return -EIO not -ESTALE on directory traversal through deleted inode ext2_iget() returns -ESTALE if invoked on a deleted inode, in order to report errors to NFS properly. However, in ext[234]_lookup(), this -ESTALE can be propagated to userspace if the filesystem is corrupted such that a directory entry references a deleted inode. This leads to a misleading error message - "Stale NFS file handle" - and confusion on the part of the admin. The bug can be easily reproduced by creating a new filesystem, making a link to an unused inode using debugfs, then mounting and attempting to ls -l said link. This patch thus changes ext2_lookup to return -EIO if it receives -ESTALE from ext2_iget(), as ext2 does for other filesystem metadata corruption; and also invokes the appropriate ext*_error functions when this case is detected. Signed-off-by: Bryan Donlan <bdonlan@gmail.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 01 July 2009, 01:56:00 UTC
341c87b elf: limit max map count to safe value With ELF, at generating coredump, some more headers other than used vmas are added. When max_map_count == 65536, a core generated by following kinds of code can be unreadable because the number of ELF's program header is written in 16bit in Ehdr (please see elf.h) and the number overflows. == ... = mmap(); (munmap, mprotect, etc...) if (failed) abort(); == This can happen in mmap/munmap/mprotect/etc...which calls split_vma(). I think 65536 is not safe as _default_ and reduce it to 65530 is good for avoiding unexpected corrupted core. Anyway, max_map_count can be enlarged by sysctl if a user is brave.. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: Jakub Jelinek <jakub@redhat.com> Acked-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 01 July 2009, 01:55:59 UTC
b1cfebc edac: add DDR3 memory type for MPC85xx EDAC Since some new MPC85xx SOCs support DDR3 memory now, so add DDR3 memory type for MPC85xx EDAC. Signed-off-by: Yang Shi <yang.shi@windriver.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 01 July 2009, 01:55:59 UTC
c4285b4 parport/serial: add support for NetMos 9901 Multi-IO card Add support for the PCI-Express NetMos 9901 Multi-IO card. 0001:06:00.0 Serial controller [0700]: NetMos Technology Device [9710:9901] (prog-if 02 [16550]) Subsystem: Device [a000:1000] Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 0, Cache Line Size: 64 bytes Interrupt: pin A routed to IRQ 65 Region 0: I/O ports at 0030 [size=8] Region 1: Memory at 80105000 (32-bit, non-prefetchable) [size=4K] Region 4: Memory at 80104000 (32-bit, non-prefetchable) [size=4K] Capabilities: <access denied> Kernel driver in use: serial Kernel modules: 8250_pci 0001:06:00.1 Serial controller [0700]: NetMos Technology Device [9710:9901] (prog-if 02 [16550]) Subsystem: Device [a000:1000] Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 0, Cache Line Size: 64 bytes Interrupt: pin B routed to IRQ 65 Region 0: I/O ports at 0020 [size=8] Region 1: Memory at 80103000 (32-bit, non-prefetchable) [size=4K] Region 4: Memory at 80102000 (32-bit, non-prefetchable) [size=4K] Capabilities: <access denied> Kernel driver in use: serial Kernel modules: 8250_pci 0001:06:00.2 Parallel controller [0701]: NetMos Technology Device [9710:9901] (prog-if 03 [IEEE1284]) Subsystem: Device [a000:2000] Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 0, Cache Line Size: 64 bytes Interrupt: pin C routed to IRQ 65 Region 0: I/O ports at 0010 [size=8] Region 1: I/O ports at <unassigned> Region 2: Memory at 80101000 (32-bit, non-prefetchable) [size=4K] Region 4: Memory at 80100000 (32-bit, non-prefetchable) [size=4K] Capabilities: <access denied> Kernel driver in use: parport_pc Kernel modules: parport_pc [ 16.760181] PCI parallel port detected: 416c:0100, I/O at 0x812010(0x0), IRQ 65 [ 16.760225] parport0: PC-style at 0x812010, irq 65 [PCSPP,TRISTATE,EPP] [ 16.851842] serial 0001:06:00.0: enabling device (0004 -> 0007) [ 16.883776] 0001:06:00.0: ttyS0 at I/O 0x812030 (irq = 65) is a ST16650V2 [ 16.893832] serial 0001:06:00.1: enabling device (0004 -> 0007) [ 16.926537] 0001:06:00.1: ttyS1 at I/O 0x812020 (irq = 65) is a ST16650V2 Signed-off-by: Michael Buesch <mb@bu3sch.de> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 01 July 2009, 01:55:59 UTC
972c71a gcov: fix documentation Commonly available versions of cp and tar don't work well with special files created using seq_file. Mention this problem in the gcov documentation and update the helper script example to work around these problems. Signed-off-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 01 July 2009, 01:55:59 UTC
b01e8dc alpha: fix percpu build breakage alpha percpu access requires custom SHIFT_PERCPU_PTR() definition for modules to work around addressing range limitation. This is done via generating inline assembly using C preprocessing which forces the assembler to generate external reference. This happens behind the compiler's back and makes the compiler think that static percpu variables in modules are unused. This used to be worked around by using __unused attribute for percpu variables which prevent the compiler from omitting the variable; however, recent declare/definition attribute unification change broke this as __used can't be used for declaration. Also, in the process, PER_CPU_ATTRIBUTES definition in alpha percpu.h got broken. This patch adds PER_CPU_DEF_ATTRIBUTES which is only used for definitions and make alpha use it to add __used for percpu variables in modules. This also fixes the PER_CPU_ATTRIBUTES double definition bug. Signed-off-by: Tejun Heo <tj@kernel.org> Tested-by: maximilian attems <max@stro.at> Acked-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Richard Henderson <rth@twiddle.net> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 01 July 2009, 01:55:59 UTC
15e3252 fbdev: work around old compiler bug When building with a 4.1.x compiler on powerpc64 (at least) we get this error: drivers/video/logo/logo_linux_mono.c:81: error: logo_linux_mono causes a section type conflict This was introduced by commit ae52bb2384f721562f15f719de1acb8e934733cb ("fbdev: move logo externs to header file"). This is a partial revert of that commit sufficient to not hit the compiler bug. Also convert _clut arrays from __initconst to __initdata. Sam said: Al analysed this some time ago. When we say something is const then _sometimes_ gcc annotate the section as const(?) - sometimes not. So if we have two variables/functions annotated __*const and gcc decides to annotate the section const only in one case we get a section type conflict. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 01 July 2009, 01:55:59 UTC
back to top