swh:1:snp:bb8853bfef8fcf2b1d37fd6404912c7606c98e48
Revision ab7cd7bb8c02dc40ca3a909653e8f56226f9e440 authored by Junio C Hamano on 16 February 2006, 19:55:51 UTC, committed by Junio C Hamano on 22 February 2006, 21:14:57 UTC
This introduces the --no-reuse-delta option to disable reuse of
existing deltas, which is a large part of the optimization
introduced by this series.  The option may become necessary if
repeated repacking makes delta chains too long.  With it, the
output of the command becomes identical to that of the older
implementation, but performance suffers greatly.

It still allows reusing non-deltified representations; there is
no point uncompressing and recompressing the whole text.

It also adds a couple more statistics to the output, and
squelches them under the -q flag, which the last round forgot
to do.

  $ time old-git-pack-objects --stdout >/dev/null <RL
  Generating pack...
  Done counting 184141 objects.
  Packing 184141 objects....................
  real    12m8.530s       user    11m1.450s       sys     0m57.920s
  $ time git-pack-objects --stdout >/dev/null <RL
  Generating pack...
  Done counting 184141 objects.
  Packing 184141 objects.....................
  Total 184141, written 184141 (delta 138297), reused 178833 (delta 134081)
  real    0m59.549s       user    0m56.670s       sys     0m2.400s
  $ time git-pack-objects --stdout --no-reuse-delta >/dev/null <RL
  Generating pack...
  Done counting 184141 objects.
  Packing 184141 objects.....................
  Total 184141, written 184141 (delta 134833), reused 47904 (delta 0)
  real    11m13.830s      user    9m45.240s       sys     0m44.330s
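
To put the three runs above side by side, the wall-clock figures
can be converted to seconds and compared.  This is only a small
sketch: the helper name to_seconds is made up for illustration,
and the numbers are copied from the transcript.

```python
def to_seconds(stamp):
    """Convert a `time`-style stamp such as '12m8.530s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


# "real" times copied from the three runs in the transcript.
old = to_seconds("12m8.530s")          # old-git-pack-objects
reuse = to_seconds("0m59.549s")        # git-pack-objects (delta reuse on)
no_reuse = to_seconds("11m13.830s")    # git-pack-objects --no-reuse-delta

speedup_vs_old = old / reuse           # gain from reusing existing data
slowdown_no_reuse = no_reuse / reuse   # cost of turning delta reuse off

# Share of written objects that the second run reused as-is.
reused_share = 178833 / 184141

print(f"reuse is {speedup_vs_old:.1f}x faster than the old implementation")
print(f"--no-reuse-delta is {slowdown_no_reuse:.1f}x slower than reuse")
print(f"{reused_share:.1%} of the objects were reused")
```

With these figures the run that reuses deltas is roughly twelve
times faster than the old implementation, and --no-reuse-delta
gives most of that gain back.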

One issue remains when the --no-reuse-delta option is not used:
the command can create delta chains that are deeper than
specified.

    A<--B<--C<--D   E   F   G

Suppose we have a delta chain from A to D (A is stored in full,
either in a pack or as a loose object; B is a depth-1 delta
relative to A, C is a depth-2 delta relative to B, and so on),
along with loose objects E, F, and G, and we are going to pack
all of them.

B, C and D are left as deltas against A, B and C respectively.
So only A, E, F, and G are examined for deltification.  Let's
say we decide to keep E expanded and store the rest as deltas,
like this:

    E<--F<--G<--A

Oops.  We ended up making D a bit too deep, didn't we?  B, C and
D form a chain on top of A, which is now itself at depth 3, so D
ends up at depth 6!
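
The depths in this example can be checked with a small sketch.
It models each delta as an entry in a "base of" mapping; the
names chain_depth, reused and fresh, and the limit of 3, are
assumptions made for illustration and do not come from
pack-objects.c.

```python
def chain_depth(obj, base_of):
    """Count how many deltas must be applied to reconstruct obj."""
    depth = 0
    while obj in base_of:
        obj = base_of[obj]
        depth += 1
    return depth


# Existing deltas that are kept as they are: A<--B<--C<--D
reused = {"B": "A", "C": "B", "D": "C"}
# Deltas chosen by this packing run: E<--F<--G<--A
fresh = {"F": "E", "G": "F", "A": "G"}

final = {**reused, **fresh}
depths = {obj: chain_depth(obj, final) for obj in "ABCDEFG"}

max_depth = 3  # hypothetical limit, picked so that A just fits
too_deep = sorted(obj for obj, d in depths.items() if d > max_depth)

print(depths)
print("deeper than the limit:", too_deep)
```

A itself stays within the limit, so nothing looks wrong when A is
deltified; the objects that exceed it are the ones whose deltas
were never examined.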

This happens because, when we checked the objects and decided to
keep the existing deltas, we did not know what the final depth
of A would be.  Unfortunately, deferring the decision until just
before the deltification is not an option.  To make B, C, and D
candidates for deltification along with the rest, we would need
to know their type and final unexpanded size, but the major part
of the optimization comes from not reading the delta data to
find that out -- getting the final size is quite an expensive
operation.

To prevent this from happening, we should keep A from being
deltified.  But how would we tell that, cheaply?

To do this most precisely, after check_object() runs, each
object that is used as the base object of some existing delta
needs to be marked with the maximum depth, relative to that
base, of the objects we decided to keep deltified (in this case,
D is at depth 3 relative to A, so if no delta chain longer than
3 is based on A, mark A with 3).  Then, when attempting to
deltify A, we would take that number into account to see whether
the final delta chain that leads to D becomes too deep.
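
The precise marking described above could look roughly like the
following.  This is a sketch under stated assumptions, not what
the commit implements: mark_bases, may_deltify and the mapping
of reused deltas are hypothetical names, and the real code works
on pack entries rather than on dictionaries.

```python
def mark_bases(reused_base_of):
    """For every object that serves as the base of a reused delta,
    record the depth of the deepest reused chain hanging off it."""
    below = {}
    for obj in reused_base_of:
        steps = 0
        cur = obj
        # Walk from obj up to the root of its reused chain.
        while cur in reused_base_of:
            cur = reused_base_of[cur]
            steps += 1
            below[cur] = max(below.get(cur, 0), steps)
    return below


def may_deltify(obj, base_depth, below, max_depth):
    """obj would land at base_depth + 1, and the reused deltas that
    depend on it would land below.get(obj, 0) levels deeper."""
    return base_depth + 1 + below.get(obj, 0) <= max_depth


reused = {"B": "A", "C": "B", "D": "C"}   # A<--B<--C<--D
below = mark_bases(reused)                # A is marked with 3

# Deltifying A against G (depth 2) would put D at depth 6.
print(may_deltify("A", 2, below, max_depth=10))
print(may_deltify("A", 2, below, max_depth=5))
```

The walk up each reused chain is the part that is cumbersome in
practice: it needs the base of every reused delta to be known
before any new delta is attempted.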

However, this is a bit cumbersome to compute, so this
implementation cheats and arbitrarily reduces the maximum depth
allowed for A to depth/4.
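
As a rough illustration of the shortcut, assuming that depth/4
means integer division of the configured limit and that it is
applied only to objects serving as the base of a reused delta
(both are assumptions of this sketch, as are the names used):

```python
def effective_max_depth(max_depth, is_base_of_reused_delta):
    """Depth limit to use when trying to deltify one object."""
    if is_base_of_reused_delta:
        # Leave headroom for the reused deltas that sit on top.
        return max_depth // 4
    return max_depth


def final_depth_of_dependents(own_depth, reused_chain_length):
    """Depth reached by the deepest reused delta on top of an object."""
    return own_depth + reused_chain_length


limit = 10  # hypothetical value for the maximum delta depth
limit_for_a = effective_max_depth(limit, is_base_of_reused_delta=True)
limit_for_e = effective_max_depth(limit, is_base_of_reused_delta=False)

# With A held to limit_for_a, the A<--B<--C<--D chain stays in bounds.
depth_of_d = final_depth_of_dependents(limit_for_a, 3)
print(limit_for_a, limit_for_e, depth_of_d)
```

Unlike the precise marking, this check needs no knowledge of how
long the reused chain on top of A is, which is what makes it
cheap; the price is that the bound is a heuristic rather than a
guarantee.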

Signed-off-by: Junio C Hamano <junkio@cox.net>
1 parent 3f9ac8d
Tip revision: 19981daefd7c147444462739375462b49412ce33 authored by Junio C Hamano on 05 April 2024, 17:49:37 UTC
The fifteenth batch
File Mode Size
Documentation
arm
compat
mozilla-sha1
ppc
t
templates
.gitignore -rw-r--r-- 1.7 KB
COPYING -rw-r--r-- 18.3 KB
GIT-VERSION-GEN -rwxr-xr-x 624 bytes
INSTALL -rw-r--r-- 4.5 KB
Makefile -rw-r--r-- 15.6 KB
README -rw-r--r-- 24.7 KB
apply.c -rw-r--r-- 42.6 KB
blob.c -rw-r--r-- 1.3 KB
blob.h -rw-r--r-- 311 bytes
cache.h -rw-r--r-- 12.3 KB
cat-file.c -rw-r--r-- 1.2 KB
check-ref-format.c -rw-r--r-- 248 bytes
checkout-index.c -rw-r--r-- 4.9 KB
clone-pack.c -rw-r--r-- 3.9 KB
combine-diff.c -rw-r--r-- 22.4 KB
commit-tree.c -rw-r--r-- 3.1 KB
commit.c -rw-r--r-- 14.9 KB
commit.h -rw-r--r-- 2.0 KB
config.c -rw-r--r-- 12.3 KB
connect.c -rw-r--r-- 14.2 KB
convert-objects.c -rw-r--r-- 7.1 KB
copy.c -rw-r--r-- 692 bytes
count-delta.c -rw-r--r-- 1.9 KB
count-delta.h -rw-r--r-- 196 bytes
csum-file.c -rw-r--r-- 2.9 KB
csum-file.h -rw-r--r-- 566 bytes
ctype.c -rw-r--r-- 890 bytes
daemon.c -rw-r--r-- 16.2 KB
date.c -rw-r--r-- 13.5 KB
delta.h -rw-r--r-- 903 bytes
describe.c -rw-r--r-- 3.6 KB
diff-delta.c -rw-r--r-- 6.1 KB
diff-files.c -rw-r--r-- 5.1 KB
diff-index.c -rw-r--r-- 5.7 KB
diff-stages.c -rw-r--r-- 2.3 KB
diff-tree.c -rw-r--r-- 7.3 KB
diff.c -rw-r--r-- 34.9 KB
diff.h -rw-r--r-- 5.0 KB
diffcore-break.c -rw-r--r-- 8.5 KB
diffcore-order.c -rw-r--r-- 2.2 KB
diffcore-pathspec.c -rw-r--r-- 1.2 KB
diffcore-pickaxe.c -rw-r--r-- 2.5 KB
diffcore-rename.c -rw-r--r-- 12.7 KB
diffcore.h -rw-r--r-- 3.4 KB
entry.c -rw-r--r-- 3.6 KB
environment.c -rw-r--r-- 1.8 KB
epoch.c -rw-r--r-- 16.9 KB
epoch.h -rw-r--r-- 476 bytes
exec_cmd.c -rw-r--r-- 2.3 KB
exec_cmd.h -rw-r--r-- 283 bytes
fetch-clone.c -rw-r--r-- 5.5 KB
fetch-pack.c -rw-r--r-- 10.0 KB
fetch.c -rw-r--r-- 4.7 KB
fetch.h -rw-r--r-- 1.4 KB
fsck-objects.c -rw-r--r-- 12.5 KB
get-tar-commit-id.c -rw-r--r-- 514 bytes
git-add.sh -rwxr-xr-x 604 bytes
git-am.sh -rwxr-xr-x 9.6 KB
git-applymbox.sh -rwxr-xr-x 2.8 KB
git-applypatch.sh -rwxr-xr-x 5.4 KB
git-archimport.perl -rwxr-xr-x 34.2 KB
git-bisect.sh -rwxr-xr-x 5.3 KB
git-branch.sh -rwxr-xr-x 2.2 KB
git-checkout.sh -rwxr-xr-x 4.7 KB
git-cherry.sh -rwxr-xr-x 1.9 KB
git-clone.sh -rwxr-xr-x 5.9 KB
git-commit.sh -rwxr-xr-x 13.7 KB
git-compat-util.h -rw-r--r-- 3.3 KB
git-count-objects.sh -rwxr-xr-x 706 bytes
git-cvsexportcommit.perl -rwxr-xr-x 6.5 KB
git-cvsimport.perl -rwxr-xr-x 22.6 KB
git-diff.sh -rwxr-xr-x 1.5 KB
git-fetch.sh -rwxr-xr-x 9.1 KB
git-fmt-merge-msg.perl -rwxr-xr-x 3.2 KB
git-format-patch.sh -rwxr-xr-x 6.8 KB
git-grep.sh -rwxr-xr-x 1000 bytes
git-log.sh -rwxr-xr-x 372 bytes
git-lost-found.sh -rwxr-xr-x 461 bytes
git-ls-remote.sh -rwxr-xr-x 1.8 KB
git-merge-octopus.sh -rwxr-xr-x 2.4 KB
git-merge-one-file.sh -rwxr-xr-x 2.7 KB
git-merge-ours.sh -rwxr-xr-x 356 bytes
git-merge-recursive.py -rwxr-xr-x 30.8 KB
git-merge-resolve.sh -rwxr-xr-x 955 bytes
git-merge-stupid.sh -rwxr-xr-x 1.4 KB
git-merge.sh -rwxr-xr-x 6.6 KB
git-mv.perl -rwxr-xr-x 4.7 KB
git-parse-remote.sh -rwxr-xr-x 4.1 KB
git-prune.sh -rwxr-xr-x 783 bytes
git-pull.sh -rwxr-xr-x 2.4 KB
git-push.sh -rwxr-xr-x 1.6 KB
git-rebase.sh -rwxr-xr-x 1.5 KB
git-relink.perl -rwxr-xr-x 4.0 KB
git-repack.sh -rwxr-xr-x 1.6 KB
git-request-pull.sh -rwxr-xr-x 856 bytes
git-rerere.perl -rwxr-xr-x 4.8 KB
git-reset.sh -rwxr-xr-x 2.2 KB
git-resolve.sh -rwxr-xr-x 2.3 KB
git-revert.sh -rwxr-xr-x 3.9 KB
git-send-email.perl -rwxr-xr-x 8.7 KB
git-sh-setup.sh -rwxr-xr-x 1.0 KB
git-shortlog.perl -rwxr-xr-x 4.0 KB
git-svnimport.perl -rwxr-xr-x 19.5 KB
git-tag.sh -rwxr-xr-x 2.1 KB
git-verify-tag.sh -rwxr-xr-x 434 bytes
git-whatchanged.sh -rwxr-xr-x 934 bytes
git.c -rw-r--r-- 5.7 KB
git.spec.in -rw-r--r-- 6.3 KB
gitMergeCommon.py -rw-r--r-- 6.9 KB
gitk -rwxr-xr-x 102.1 KB
hash-object.c -rw-r--r-- 1.8 KB
http-fetch.c -rw-r--r-- 23.8 KB
http-push.c -rw-r--r-- 36.2 KB
http.c -rw-r--r-- 10.6 KB
http.h -rw-r--r-- 2.3 KB
ident.c -rw-r--r-- 4.5 KB
index-pack.c -rw-r--r-- 11.6 KB
index.c -rw-r--r-- 1.1 KB
init-db.c -rw-r--r-- 7.4 KB
local-fetch.c -rw-r--r-- 5.7 KB
ls-files.c -rw-r--r-- 16.3 KB
ls-tree.c -rw-r--r-- 3.0 KB
mailinfo.c -rw-r--r-- 16.1 KB
mailsplit.c -rw-r--r-- 3.9 KB
merge-base.c -rw-r--r-- 6.0 KB
merge-index.c -rw-r--r-- 2.6 KB
mktag.c -rw-r--r-- 3.1 KB
name-rev.c -rw-r--r-- 5.2 KB
object.c -rw-r--r-- 5.8 KB
object.h -rw-r--r-- 1.5 KB
pack-check.c -rw-r--r-- 3.7 KB
pack-objects.c -rw-r--r-- 22.3 KB
pack-redundant.c -rw-r--r-- 14.2 KB
pack.h -rw-r--r-- 818 bytes
patch-delta.c -rw-r--r-- 1.7 KB
patch-id.c -rw-r--r-- 1.5 KB
path.c -rw-r--r-- 5.3 KB
peek-remote.c -rw-r--r-- 1.0 KB
pkt-line.c -rw-r--r-- 2.4 KB
pkt-line.h -rw-r--r-- 270 bytes
prune-packed.c -rw-r--r-- 1.5 KB
quote.c -rw-r--r-- 5.6 KB
quote.h -rw-r--r-- 1.5 KB
read-cache.c -rw-r--r-- 16.6 KB
read-tree.c -rw-r--r-- 16.4 KB
receive-pack.c -rw-r--r-- 7.6 KB
refs.c -rw-r--r-- 8.3 KB
refs.h -rw-r--r-- 1.0 KB
repo-config.c -rw-r--r-- 3.1 KB
rev-list.c -rw-r--r-- 21.6 KB
rev-parse.c -rw-r--r-- 6.7 KB
rsh.c -rw-r--r-- 2.2 KB
rsh.h -rw-r--r-- 159 bytes
run-command.c -rw-r--r-- 1.5 KB
run-command.h -rw-r--r-- 511 bytes
send-pack.c -rw-r--r-- 8.5 KB
server-info.c -rw-r--r-- 5.1 KB
setup.c -rw-r--r-- 4.4 KB
sha1_file.c -rw-r--r-- 37.7 KB
sha1_name.c -rw-r--r-- 9.7 KB
shell.c -rw-r--r-- 1.2 KB
show-branch.c -rw-r--r-- 17.3 KB
show-index.c -rw-r--r-- 593 bytes
ssh-fetch.c -rw-r--r-- 3.5 KB
ssh-pull.c -rw-r--r-- 154 bytes
ssh-push.c -rw-r--r-- 155 bytes
ssh-upload.c -rw-r--r-- 2.8 KB
strbuf.c -rw-r--r-- 807 bytes
strbuf.h -rw-r--r-- 216 bytes
stripspace.c -rw-r--r-- 786 bytes
symbolic-ref.c -rw-r--r-- 785 bytes
tag.c -rw-r--r-- 2.6 KB
tag.h -rw-r--r-- 471 bytes
tar-tree.c -rw-r--r-- 10.0 KB
test-date.c -rw-r--r-- 416 bytes
test-delta.c -rw-r--r-- 1.8 KB
tree-diff.c -rw-r--r-- 6.2 KB
tree.c -rw-r--r-- 5.6 KB
tree.h -rw-r--r-- 1.1 KB
unpack-file.c -rw-r--r-- 709 bytes
unpack-objects.c -rw-r--r-- 6.6 KB
update-index.c -rw-r--r-- 13.1 KB
update-ref.c -rw-r--r-- 2.0 KB
update-server-info.c -rw-r--r-- 457 bytes
upload-pack.c -rw-r--r-- 6.1 KB
usage.c -rw-r--r-- 639 bytes
var.c -rw-r--r-- 1.3 KB
verify-pack.c -rw-r--r-- 1.1 KB
write-tree.c -rw-r--r-- 3.9 KB
