Revision - 218aa3a - reuse cached commit buffer when parsing signatures

Revision 218aa3a6162b80696a82b8745daa38fa826985ae authored by Jeff King on 13 June 2014, 06:32:11 UTC, committed by Junio C Hamano on 13 June 2014, 19:10:13 UTC

reuse cached commit buffer when parsing signatures

When we call show_signature or show_mergetag, we read the
commit object fresh via read_sha1_file and reparse its
headers. However, in most cases we already have the object
data available, attached to the "struct commit". This is
partially laziness in dealing with the memory allocation
issues, but partially defensive programming, in that we
would always want to verify a clean version of the buffer
(not one that might have been munged by other users of the
commit).

However, we do not currently ever munge the commit buffer,
and not using the already-available buffer carries a fairly
big performance penalty when we are looking at a large
number of commits. Here are timings on linux.git:

  [baseline, no signatures]
  $ time git log >/dev/null
  real    0m4.902s
  user    0m4.784s
  sys     0m0.120s

  [before]
  $ time git log --show-signature >/dev/null
  real    0m14.735s
  user    0m9.964s
  sys     0m0.944s

  [after]
  $ time git log --show-signature >/dev/null
  real    0m9.981s
  user    0m5.260s
  sys     0m0.936s

Note that our user CPU time drops almost in half, close to
the non-signature case, but we do still spend more
wall-clock and system time, presumably from dealing with
gpg.

An alternative to this is to note that most commits do not
have signatures (less than 1% in this repo), yet we pay the
re-parsing cost for every commit just to find out if it has
a mergetag or signature. If we checked that when parsing the
commit initially, we could avoid re-examining most commits
later on. Even if we did pursue that direction, however,
this would still speed up the cases where we _do_ have
signatures. So it's probably worth doing either way.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

1 parent 8597ea3

Files
Changes

Permalinks

prio-queue.h

#ifndef PRIO_QUEUE_H
#define PRIO_QUEUE_H

/*
 * A priority queue implementation, primarily for keeping track of
 * commits in the 'date-order' so that we process them from new to old
 * as they are discovered, but can be used to hold any pointer to
 * struct.  The caller is responsible for supplying a function to
 * compare two "things".
 *
 * Alternatively, this data structure can also be used as a LIFO stack
 * by specifying NULL as the comparison function.
 */

/*
 * Compare two "things", one and two; the third parameter is cb_data
 * in the prio_queue structure.  The result is returned as a sign of
 * the return value, being the same as the sign of the result of
 * subtracting "two" from "one" (i.e. negative if "one" sorts earlier
 * than "two").
 */
typedef int (*prio_queue_compare_fn)(const void *one, const void *two, void *cb_data);

struct prio_queue {
	prio_queue_compare_fn compare;
	void *cb_data;
	int alloc, nr;
	void **array;
};

/*
 * Add the "thing" to the queue.
 */
extern void prio_queue_put(struct prio_queue *, void *thing);

/*
 * Extract the "thing" that compares the smallest out of the queue,
 * or NULL.  If compare function is NULL, the queue acts as a LIFO
 * stack.
 */
extern void *prio_queue_get(struct prio_queue *);

extern void clear_prio_queue(struct prio_queue *);

/* Reverse the LIFO elements */
extern void prio_queue_reverse(struct prio_queue *);

#endif /* PRIO_QUEUE_H */

Showing with 0 additions and 0 deletions (0 / 0 diffs computed)

Computing file changes ...