git.99rst.org Git - git.git/log

git-gui: convert new/amend commit radiobutton to checkbutton

Its a bi-state anyway and also saves one line in the menu.

Signed-off-by: Bert Wesarg <redacted>
Signed-off-by: Pratyush Yadav <redacted>

completion: teach archive to use __gitcomp_builtin

Currently, _git_archive() uses a hardcoded list of options for its
completion. However, we can use __gitcomp_builtin() to get a dynamically
generated list of completions instead.

Teach _git_archive() to use __gitcomp_builtin() so that newly
implemented options in archive will be automatically completed without
any mucking around in git-completion.bash. While we're at it, teach it
to complete the missing `--worktree-attributes` option as well.

Unfortunately, since some args are passed through from cmd_archive() to
write_archive() (which calls parse_archive_args()), there's no way that a
`--git-completion-helper` arg can end up reaching parse_archive_args()
since the first call to parse_options() will end up calling exit(0). As
a result, we have to carry the options supported by write_archive() in
the hardcoded string.

Signed-off-by: Denton Liu <redacted>
Signed-off-by: Junio C Hamano <redacted>

completion: teach rebase to use __gitcomp_builtin

Currently, _git_rebase() uses a hardcoded list of options for its
completion. However, we can use __gitcomp_builtin() to get a dynamically
generated list of completions instead.

Teach _git_rebase() to use __gitcomp_builtin() so that newly implemented
options in rebase will be automatically completed without any mucking
around in git-completion.bash.

Signed-off-by: Denton Liu <redacted>
Signed-off-by: Junio C Hamano <redacted>

completion: add missing completions for log, diff, show

The bash completion script knows some options to "git log" and
"git show" only in the positive form, (e.g. "--abbrev-commit"), but not
in their negative form (e.g. "--no-abbrev-commit"). Add them.

Also, the bash completion script is missing some other options to
"git diff", and "git show" (and thus, all other commands that take
"git diff"'s options). Add them. Of note, since "--indent-heuristic" is
no longer experimental, add that too.

Signed-off-by: Max Rothman <redacted>
Acked-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>

upload-pack: disable commit graph more gently for shallow traversal

When the client has asked for certain shallow options like
"deepen-since", we do a custom rev-list walk that pretends to be
shallow. Before doing so, we have to disable the commit-graph, since it
is not compatible with the shallow view of the repository. That's
handled by 829a321569 (commit-graph: close_commit_graph before shallow
walk, 2018-08-20). That commit literally closes and frees our
repo->objects->commit_graph struct.

That creates an interesting problem for commits that have _already_ been
parsed using the commit graph. Their commit->object.parsed flag is set,
their commit->graph_pos is set, but their commit->maybe_tree may still
be NULL. When somebody later calls repo_get_commit_tree(), we see that
we haven't loaded the tree oid yet and try to get it from the commit
graph. But since it has been freed, we segfault!

So the root of the issue is a data dependency between the commit's
lazy-load of the tree oid and the fact that the commit graph can go
away mid-process. How can we resolve it?

There are a couple of general approaches:

  1. The obvious answer is to avoid loading the tree from the graph when
     we see that it's NULL. But then what do we return for the tree oid?
     If we return NULL, our caller in do_traverse() will rightly
     complain that we have no tree. We'd have to fallback to loading the
     actual commit object and re-parsing it. That requires teaching
     parse_commit_buffer() to understand re-parsing (i.e., not starting
     from a clean slate and not leaking any allocated bits like parent
     list pointers).

  2. When we close the commit graph, walk through the set of in-memory
     objects and clear any graph_pos pointers. But this means we also
     have to "unparse" any such commits so that we know they still need
     to open the commit object to fill in their trees. So it's no less
     complicated than (1), and is more expensive (since we clear objects
     we might not later need).

  3. Stop freeing the commit-graph struct. Continue to let it be used
     for lazy-loads of tree oids, but let upload-pack specify that it
     shouldn't be used for further commit parsing.

  4. Push the whole shallow rev-list out to its own sub-process, with
     the commit-graph disabled from the start, giving it a clean memory
     space to work from.

I've chosen (3) here. Options (1) and (2) would work, but are
non-trivial to implement. Option (4) is more expensive, and I'm not sure
how complicated it is (shelling out for the actual rev-list part is
easy, but we do then parse the resulting commits internally, and I'm not
clear which parts need to be handling shallow-ness).

The new test in t5500 triggers this segfault, but see the comments there
for how horribly intimate it has to be with how both upload-pack and
commit graphs work.

Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>

commit-graph: bump DIE_ON_LOAD check to actual load-time

Commit 43d3561805 (commit-graph write: don't die if the existing graph
is corrupt, 2019-03-25) added an environment variable we use only in the
test suite, $GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD. But it put the check for
this variable at the very top of prepare_commit_graph(), which is called
every time we want to use the commit graph. Most importantly, it comes
_before_ we check the fast-path "did we already try to load?", meaning
we end up calling getenv() for every single use of the commit graph,
rather than just when we load.

getenv() is allowed to have unexpected side effects, but that shouldn't
be a problem here; we're lazy-loading the graph so it's clear that at
least _one_ invocation of this function is going to call it.

But it is inefficient. getenv() typically has to do a linear search
through the environment space.

We could memoize the call, but it's simpler still to just bump the check
down to the actual loading step. That's fine for our sole user in t5318,
and produces this minor real-world speedup:

  [before]
  Benchmark #1: git -C linux rev-list HEAD >/dev/null
    Time (mean ± σ):      1.460 s ±  0.017 s    [User: 1.174 s, System: 0.285 s]
    Range (min … max):    1.440 s …  1.491 s    10 runs

  [after]
  Benchmark #1: git -C linux rev-list HEAD >/dev/null
    Time (mean ± σ):      1.391 s ±  0.005 s    [User: 1.118 s, System: 0.273 s]
    Range (min … max):    1.385 s …  1.399 s    10 runs

Of course that actual speedup depends on how big your environment is. We
can game it like this:

  for i in $(seq 10000); do
    export dummy$i=$i
  done

in which case I get:

  [before]
  Benchmark #1: git -C linux rev-list HEAD >/dev/null
    Time (mean ± σ):      6.257 s ±  0.061 s    [User: 6.005 s, System: 0.250 s]
    Range (min … max):    6.174 s …  6.337 s    10 runs

  [after]
  Benchmark #1: git -C linux rev-list HEAD >/dev/null
  Time (mean ± σ):      1.403 s ±  0.005 s    [User: 1.146 s, System: 0.256 s]
  Range (min … max):    1.396 s …  1.412 s    10 runs

So this is really more about avoiding the pathological case than
providing a big real-world speedup.

Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>

list-objects: don't queue root trees unless revs->tree_objects is set

When traverse_commit_list() processes each commit, it queues the
commit's root tree in the pending array. Then, after all commits are
processed, it calls traverse_trees_and_blobs() to walk over the pending
list, calling process_tree() on each. But if revs->tree_objects is not
set, process_tree() just exists immediately!

We can save ourselves some work by not even bothering to queue these
trees in the first place. There are a few subtle points to make:

  - we also detect commits with a NULL tree pointer here. But this isn't
    an interesting check for broken commits, since the lookup_tree()
    we'd have done during commit parsing doesn't actually check that we
    have the tree on disk. So we're not losing any robustness.

  - besides queueing, we also set the NOT_USER_GIVEN flag on the tree
    object. This is used by the traverse_commit_list_filtered() variant.
    But if we're not exploring trees, then we won't actually care about
    this flag, which is used only inside process_tree() code-paths.

  - queueing trees eventually leads to us queueing blobs, too. But we
    don't need to check revs->blob_objects here. Even in the current
    code, we still wouldn't find those blobs, because we'd never open up
    the tree objects to list their contents.

  - the user-visible impact to the caller is minimal. The pending trees
    are all cleared by the time the function returns anyway, by
    traverse_trees_and_blobs(). We do call a show_commit() callback,
    which technically could be looking at revs->pending during the
    callback. But it seems like a rather unlikely thing to do (if you
    want the tree of the current commit, then accessing the tree struct
    member is a lot simpler).

So this should be safe to do. Let's look at the benefits:

  [before]
  Benchmark #1: git -C linux rev-list HEAD >/dev/null
    Time (mean ± σ):      7.651 s ±  0.021 s    [User: 7.399 s, System: 0.252 s]
    Range (min … max):    7.607 s …  7.683 s    10 runs

  [after]
  Benchmark #1: git -C linux rev-list HEAD >/dev/null
    Time (mean ± σ):      7.593 s ±  0.023 s    [User: 7.329 s, System: 0.264 s]
    Range (min … max):    7.565 s …  7.634 s    10 runs

Not too impressive, but then we're really just avoiding sticking a
pointer into a growable array. But still, I'll take a free 0.75%
speedup.

Let's try it after running "git commit-graph write":

  [before]
  Benchmark #1: git -C linux rev-list HEAD >/dev/null
    Time (mean ± σ):      1.458 s ±  0.011 s    [User: 1.199 s, System: 0.259 s]
    Range (min … max):    1.447 s …  1.481 s    10 runs

  [after]
  Benchmark #1: git -C linux rev-list HEAD >/dev/null
    Time (mean ± σ):      1.126 s ±  0.023 s    [User: 896.5 ms, System: 229.0 ms]
    Range (min … max):    1.106 s …  1.181 s    10 runs

Now that's more like it. We saved over 22% of the total time. Part of
that is because the runtime is shorter overall, but the absolute
improvement is also much larger. What's going on?

When we fill in a commit struct using the commit graph, we don't bother
to set the tree pointer, and instead lazy-load it when somebody calls
get_commit_tree(). So we're not only skipping the pointer write to the
pending queue, but we're skipping the lazy-load of the tree entirely.

Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>

doc: minor formatting fix

Move a closing backtick that was placed one character too soon.

Signed-off-by: Cameron Steffen <redacted>
Signed-off-by: Junio C Hamano <redacted>

Quit passing 'now' to date code

Commit b841d4ff43 (Add `human` format to test-tool, 2019-01-28) added
a get_time() function which allows $GIT_TEST_DATE_NOW in the
environment to override the current time. So we no longer need to
interpret that variable in cmd__date().

Therefore, we can stop passing the "now" parameter down through the
date functions, since nobody uses them. Note that we do need to make
sure all of the previous callers that took a "now" parameter are
correctly using get_time().

Signed-off-by: Stephen P. Smith <redacted>
Signed-off-by: Junio C Hamano <redacted>

Merge branch 'py/revert-hunks-lines'

git-gui learned to revert selected lines and hunks, just like it can
stage selected lines and hunks. To provide a safety net for accidental
revert, the most recent revert can be undone.

* py/revert-hunks-lines:
  git-gui: allow undoing last revert
  git-gui: return early when patch fails to apply
  git-gui: allow reverting selected hunk
  git-gui: allow reverting selected lines

Merge branch 'bp/widget-focus-hotkeys'

git-gui learned to switch focus between widgets "unstaged commits",
"staged commits", "diff", and "commit message" using the keyboard
shortcuts Alt+1, Alt+2, Alt+3, and Alt+4 respectively.

* bp/widget-focus-hotkeys:
git-gui: add hotkeys to set widget focus

git-gui: add hotkeys to set widget focus

The user cannot change focus between the list of files, the diff view and
the commit message widgets without using the mouse (clicking either of
the four widgets).

With this patch, the user may set ui focus to the previously selected path
in either the "Unstaged Changes" or "Staged Changes" widgets, using
ALT+1 or ALT+2.

The user may also set the ui focus to the diff view widget with
ALT+3, or to the commit message widget with ALT+4.

This enables the user to select/unselect files, view the diff and create a
commit in git-gui using keyboard-only.

Signed-off-by: Birger Skogeng Pedersen <redacted>
Signed-off-by: Pratyush Yadav <redacted>

cache-tree: do not lazy-fetch tentative tree

The cache-tree datastructure is used to speed up the comparison
between the HEAD and the index, and when the index is updated by
a cherry-pick (for example), a tree object that would represent
the paths in the index in a directory is constructed in-core, to
see if such a tree object exists already in the object store.

When the lazy-fetch mechanism was introduced, we converted this
"does the tree exist?" check into an "if it does not, and if we
lazily cloned, see if the remote has it" call by mistake. Since
the whole point of this check is to repair the cache-tree by
recording an already existing tree object opportunistically, we
shouldn't even try to fetch one from the remote.

Pass the OBJECT_INFO_SKIP_FETCH_OBJECT flag to make sure we only
check for existence in the local object store without triggering the
lazy fetch mechanism.

Signed-off-by: Jonathan Tan <redacted>
[jc: rewritten the proposed log message]
Signed-off-by: Junio C Hamano <redacted>

Second batch

Signed-off-by: Junio C Hamano <redacted>

Merge branch 'bc/reread-attributes-during-rebase'

The "git am" based backend of "git rebase" ignored the result of
updating ".gitattributes" done in one step when replaying
subsequent steps.

* bc/reread-attributes-during-rebase:
am: reload .gitattributes after patching it
path: add a function to check for path suffix

Merge branch 'tg/t0021-racefix'

A test fix.

* tg/t0021-racefix:
t0021: make sure clean filter runs

Merge branch 'mp/for-each-ref-missing-name-or-email'

"for-each-ref" and friends that shows refs did not protect themselves
against ancient tags that did not record tagger names when asked to
show "%(taggername)", which have been corrected.

* mp/for-each-ref-missing-name-or-email:
ref-filter: initialize empty name or email fields

Merge branch 'sb/userdiff-dts'

Device-tree files learned their own userdiff patterns.

* sb/userdiff-dts:
userdiff: add a builtin pattern for dts files

Merge branch 'rs/sort-oid-array-thread-safe'

Prepare get_short_oid() codepath to be thread-safe.

* rs/sort-oid-array-thread-safe:
sha1-name: make sort_ambiguous_oid_array() thread-safe

Merge branch 'nd/diff-parseopt'

Compilation fix.

* nd/diff-parseopt:
parseopt: move definition of enum parse_opt_result up

Merge branch 'jt/diff-lazy-fetch-submodule-fix'

On-demand object fetching in lazy clone incorrectly tried to fetch
commits from submodule projects, while still working in the
superproject, which has been corrected.

* jt/diff-lazy-fetch-submodule-fix:
diff: skip GITLINK when lazy fetching missing objs

Merge branch 'ds/midx-expire-repack'

Code cleanup.

* ds/midx-expire-repack:
packfile.h: drop extern from function declaration

Merge branch 'cb/fetch-set-upstream'

"git fetch" learned "--set-upstream" option to help those who first
clone from their private fork they intend to push to, add the true
upstream via "git remote add" and then "git fetch" from it.

* cb/fetch-set-upstream:
pull, fetch: add --set-upstream option

Merge branch 'rs/pax-extended-header-length-fix'

"git archive" recorded incorrect length in extended pax header in
some corner cases, which has been corrected.

* rs/pax-extended-header-length-fix:
  archive-tar: turn length miscalculation warning into BUG
  archive-tar: use size_t in strbuf_append_ext_header()
  archive-tar: fix pax extended header length calculation
  archive-tar: report wrong pax extended header length

Merge branch 'bm/repository-layout-typofix'

Typofix.

* bm/repository-layout-typofix:
repository-layout.txt: correct pluralization of 'object'

Merge branch 'en/checkout-mismerge-fix'

Fix a mismerge that happened in 2.22 timeframe.

* en/checkout-mismerge-fix:
checkout: remove duplicate code

Merge branch 'sg/diff-indent-heuristic-non-experimental'

We promoted the "indent heuristics" that decides where to split
diff hunks from experimental to the default a few years ago, but
some stale documentation still marked it as experimental, which has
been corrected.

* sg/diff-indent-heuristic-non-experimental:
diff: 'diff.indentHeuristic' is no longer experimental

Merge branch 'ds/feature-macros'

A mechanism to affect the default setting for a (related) group of
configuration variables is introduced.

* ds/feature-macros:
  repo-settings: create feature.experimental setting
  repo-settings: create feature.manyFiles setting
  repo-settings: parse core.untrackedCache
  commit-graph: turn on commit-graph by default
  t6501: use 'git gc' in quiet mode
  repo-settings: consolidate some config settings

Merge branch 'jk/eoo'

The command line parser learned "--end-of-options" notation; the
standard convention for scripters to have hardcoded set of options
first on the command line, and force the command to treat end-user
input as non-options, has been to use "--" as the delimiter, but
that would not work for commands that use "--" as a delimiter
between revs and pathspec.

* jk/eoo:
  gitcli: document --end-of-options
  parse-options: allow --end-of-options as a synonym for "--"
  revision: allow --end-of-options to end option parsing

Merge branch 'jk/repo-init-cleanup'

Further clean-up of the initialization code.

* jk/repo-init-cleanup:
  config: stop checking whether the_repository is NULL
  common-main: delay trace2 initialization
  t1309: use short branch name in includeIf.onbranch test

Merge branch 'py/git-gui-do-quit'

"git gui" learned to call the clean-up procedure before exiting.

* py/git-gui-do-quit:
git-gui: call do_quit before destroying the main window

grep: skip UTF8 checks explicitly

18547aacf5 ("grep/pcre: support utf-8", 2016-06-25) that was released
with git 2.10 added the PCRE_UTF8 flag to PCRE1 matching including a
call to has_non_ascii() to try to avoid breakage if there was non-utf8
encoded content in the haystack.

Usually PCRE is compiled with JIT support (even if is not the default),
and therefore the codepath used includes calling pcre_jit_exec, which
skips UTF-8 validation by design (which might result in crashes or hangs)
but when JIT support wasn't compiled we use pcre_exec instead with the
posibility that grep might be aborted if invalid UTF-8 is found in the
haystack.

PCRE1 provides a flag since Mar 5, 2007 that could be used to skip the
checks explicitly so use that to make both codepaths equivalent (the
flag is ignored by pcre1_jit_exec)

this fix is only implemented for PCRE1 because PCRE2 is likely to have
a better solution (without the risks) instead in the future

Helped-by: Johannes Schindelin <redacted>
Helped-by: Eric Sunshine <redacted>
Helped-by: Ævar Arnfjörð Bjarmason <redacted>
Suggested-by: Junio C Hamano <redacted>
Signed-off-by: Carlo Marcelo Arenas Belón <redacted>
Signed-off-by: Junio C Hamano <redacted>

log-tree: call load_ref_decorations() in get_name_decoration()

Load a default set of ref name decorations at the first lookup.  This
frees direct and indirect callers from doing so.  They can still do it
if they want to use a filter or are interested in full decorations
instead of the default short ones -- the first load_ref_decorations()
call wins.

This means that the load in builtin/log.c::cmd_log_init_finish() is
respected even if --simplify-by-decoration is given, as the previously
dominating earlier load in handle_revision_opt() is gone.  So a filter
given with --decorate-refs-exclude is used for simplification in that
case, as expected.

Signed-off-by: René Scharfe <redacted>
Signed-off-by: Junio C Hamano <redacted>

log: test --decorate-refs-exclude with --simplify-by-decoration

Demonstrate that a decoration filter given with --decorate-refs-exclude
is inadvertently overruled by --simplify-by-decoration.

Reported-by: Étienne SERVAIS <redacted>
Signed-off-by: René Scharfe <redacted>
Signed-off-by: Junio C Hamano <redacted>

gitweb.conf.txt: switch pluses to backticks to help Asciidoctor

This paragraph uses a lot of +pluses+ to render text as monospace. That
works fine with AsciiDoc (8.6.10), and almost fine with Asciidoctor
(1.5.5), which renders the third of these literally ("+$projname+"). The
reason seems to be that Asciidoctor trips on the lone plus a bit
earlier, even though it is escaped.

Switch +$projname+ to `$projname`, and change the next, similar instance
too (+$projname/+), because otherwise, we'd trip on /that one/ instead.
If we would stop there, we would now start falling over on the escaped
plus ('\+') mentioned earlier, rendering /it/ literally. So change that
too...

In other words, unescape the lone '+' and change all the pluses that
follow it to backticks.

AsciiDoc renders this paragraph identically before and after this
commit, and Asciidoctor now renders this the same as AsciiDoc.

I did try to switch the whole paragraph to using backticks rather than
pluses. That worked great with Asciidoctor, but confused AsciiDoc...
Let's go with this rather surgical change instead.

Signed-off-by: Martin Ågren <redacted>
Signed-off-by: Junio C Hamano <redacted>

git-merge-index.txt: wrap shell listing in "----"

The example output of `git merge-index` has been enriched by a second
"column" of helpful comments. When Asciidoctor renders this, the cells
in that second column aren't aligned.

Fix this by marking the example shell session as a code listing by
wrapping it in "----". Also drop some of the horizontal space between
the two columns so that we fit into 80 columns. This changes the
rendering with both AsciiDoc and Asciidoctor. They now render this
identically, nicely aligned, and within 80 columns.

Signed-off-by: Martin Ågren <redacted>
Signed-off-by: Junio C Hamano <redacted>

git-receive-pack.txt: wrap shell [script] listing in "----"

The indented lines in the example shell script listing are indented
differently by AsciiDoc and Asciidoctor.

Fix this by marking the example shell script as a code listing by
wrapping it in "----". Because this gives us some extra indentation, we
can remove the one that we have been carrying explicitly. That is, drop
the first tab of indentation on each line. For consistency, make the
same change to the short example shell session further down.

With AsciiDoc, this results in identical rendering before and after this
commit. Asciidoctor now renders this the same as AsciiDoc does.

Signed-off-by: Martin Ågren <redacted>
Signed-off-by: Junio C Hamano <redacted>

git-ls-remote.txt: wrap shell listing in "----"

The second "column" in the output of `git ls-remote` is typeset
differently by AsciiDoc and Asciidoctor, similar to various examples
touched by the last few commits.

Fix this by marking the example shell session as a code listing by
wrapping it in "----". Because this gives us some extra indentation, we
can remove the one that we have been carrying explicitly. That is, drop
the first tab of indentation on each line. With AsciiDoc, this results
in identical rendering before and after this commit. Asciidoctor now
renders this the same as AsciiDoc does.

Signed-off-by: Martin Ågren <redacted>
Signed-off-by: Junio C Hamano <redacted>

Documentation: wrap config listings in "----"

The indented lines in these example config-file listings are indented
differently by AsciiDoc and Asciidoctor.

Fix this by marking the example config-files as code listings by
wrapping them in "----". Because this gives us some extra indentation,
we can remove the one that we have been carrying explicitly. That is,
drop the first tab of indentation on each line.

With AsciiDoc, this results in identical rendering before and after this
commit. Asciidoctor now renders this the same as AsciiDoc does.

git-config.txt pretty consistently uses twelve dashes rather than the
minimum four to spell "----". Let's stick to the file-local convention
there.

Signed-off-by: Martin Ågren <redacted>
Signed-off-by: Junio C Hamano <redacted>

git-merge-base.txt: render indentations correctly under Asciidoctor

There are several graphs in this document. For most of them, we use a
single leading tab to indent the whole graph, and then we use spaces
(possibly eight or more) to align things within the graph.

In the larger graph, we use a different strategy: We use 1-N tabs and
just a small number of spaces (<8). This is how we usually prefer to do
our indenting, but Asciidoctor ends up rendering this differently from
AsciiDoc. Same thing for the if-then-fi examples where the conditional
code is indented by two tabs, which renders differently under AsciiDoc
and Asciidoctor.

Similar to 379805051d ("Documentation: render revisions correctly under
Asciidoctor", 2018-05-06), use an explicit literal block to indicate
that we want to keep the leading whitespace in the tables. Change not
just the ones that render differently, but all of them for consistency.

Because this gives us some extra indentation, we can remove the one that
we have been carrying explicitly. That is, drop the first tab of
indentation on each line. With AsciiDoc, this results in identical
rendering before and after this commit, both for git-merge-base.1 and
git-merge-base.html.

A less intrusive change would be to replace tabs 2-N on each line with
eight spaces. But let's follow the example set by 379805051d, so that we
can use our preferred way of indenting.

Signed-off-by: Martin Ågren <redacted>
Signed-off-by: Junio C Hamano <redacted>

Documentation: wrap blocks with "--"

The documentation for each of these options contains a list. After the
list, AsciiDoc interprets the continuation as a continuation of the
*list*, not as a continution of the larger block. As a result, we get
too much indentation. Wrap the entire blocks in "--" to fix this. With
Asciidoctor, this commit is a no-op, and the two programs now render
these identically.

These two files share the same problem and indeed, they both document
`--untracked-files` in quite similar ways. I haven't checked to what
extent that is intentional or warranted, and to what extent they have
simply drifted apart. I consider such an investigation and possible
cleanup as out of scope for this commit and this patch series.

Signed-off-by: Martin Ågren <redacted>
Signed-off-by: Junio C Hamano <redacted>

commit-graph: turn off save_commit_buffer

The commit-graph tool may read a lot of commits, but it only cares about
parsing their metadata (parents, trees, etc) and doesn't ever show the
messages to the user. And so it should not need save_commit_buffer,
which is meant for holding onto the object data of parsed commits so
that we can show them later. In fact, it's quite harmful to do so.
According to massif, the max heap of "git commit-graph write
--reachable" in linux.git before/after this patch (removing the commit
graph file in between) goes from ~1.1GB to ~270MB.

Which isn't surprising, since the difference is about the sum of the
uncompressed sizes of all commits in the repository, and this was
equivalent to leaking them.

This obviously helps if you're under memory pressure, but even without
it, things go faster. My before/after times for that command (without
massif) went from 12.521s to 11.874s, a speedup of ~5%.

Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>

commit-graph: don't show progress percentages while expanding reachable commits

Commit 49bbc57a57 (commit-graph write: emit a percentage for all
progress, 2019-01-19) was a bit overeager when it added progress
percentages to the "Expanding reachable commits in commit graph" phase
as well, because most of the time the number of commits that phase has
to iterate over is not known in advance and grows significantly, and,
consequently, we end up with nonsensical numbers:

  $ git commit-graph write --reachable
  Expanding reachable commits in commit graph: 138606% (824706/595), done.
  [...]

  $ git rev-parse v5.0 | git commit-graph write --stdin-commits
  Expanding reachable commits in commit graph: 81264400% (812644/1), done.
  [...]

Even worse, because the percentage grows so quickly, the progress code
outputs much more often than it should (because it ticks every second,
or every 1%), slowing the whole process down. My time for "git
commit-graph write --reachable" on linux.git went from 13.463s to
12.521s with this patch, ~7% savings.

Therefore, don't show progress percentages in the "Expanding reachable
commits in commit graph" phase.

Note that the current code does sometimes do the right thing, if we
picked up all commits initially (e.g., omitting "--reachable" in a
fully-packed repository would get the correct count without any parent
traversal). So it may be possible to come up with a way to tell when we
could use a percentage here. But in the meantime, let's make sure we
robustly avoid printing nonsense.

Signed-off-by: SZEDER Gábor <redacted>
Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>

commit-graph.c: handle corrupt/missing trees

Apply similar treatment as in the previous commit to handle an unchecked
call to 'get_commit_tree_oid()'. Previously, a NULL return value from
this function would be immediately dereferenced with '->hash', and then
cause a segfault.

Before dereferencing to access the 'hash' member, check the return value
of 'get_commit_tree_oid()' to make sure that it is not NULL.

To make this check correct, a related change is also needed in
'commit.c', which is to check the return value of 'get_commit_tree'
before taking its address. If 'get_commit_tree' returns NULL, we
encounter an undefined behavior when taking the address of the return
value of 'get_commit_tree' and then taking '->object.oid'. (On my system,
this is memory address 0x8, which is obviously wrong).

Fix this by making sure that 'get_commit_tree' returns something
non-NULL before digging through a structure that is not there, thus
preventing a segfault down the line in the commit graph code.

Signed-off-by: Taylor Blau <redacted>
Signed-off-by: Junio C Hamano <redacted>

commit-graph.c: handle commit parsing errors

To write a commit graph chunk, 'write_graph_chunk_data()' takes a list
of commits to write and parses each one before writing the necessary
data, and continuing on to the next commit in the list.

Since the majority of these commits are not parsed ahead of time (an
exception is made for the *last* commit in the list, which is parsed
early within 'copy_oids_to_commits'), it is possible that calling
'parse_commit_no_graph()' on them may return an error. Failing to catch
these errors before de-referencing later calls can result in a undefined
memory access and a SIGSEGV.

One such example of this is 'get_commit_tree_oid()', which expects a
parsed object as its input (in this case, the commit-graph code passes
'*list'). If '*list' causes a parse error, the subsequent call will
fail.

Prevent such an issue by checking the return value of
'parse_commit_no_graph()' to avoid passing an unparsed object to a
function which expects a parsed object, thus preventing a segfault.

It is worth noting that this fix is really skirting around the issue in
object.c's 'parse_object()', which makes it difficult to tell how
corrupt an object is without digging into it. Presumably one could
change the meaning of 'parse_object' returns, but this would require
adjusting each callsite accordingly. Instead of that, add an additional
check to the object parsed.

Signed-off-by: Taylor Blau <redacted>
Signed-off-by: Junio C Hamano <redacted>

t/t5318: introduce failing 'git commit-graph write' tests

When invoking 'git commit-graph' in a corrupt repository, one can cause
a segfault when ancestral commits are corrupt in one way or another.
This is due to two function calls in the 'commit-graph.c' code that may
return NULL, but are not checked for NULL-ness before dereferencing.

Before fixing the bug, introduce two failing tests that demonstrate the
problem. The first test corrupts an ancestral commit's parent to point
to a non-existent object. The second test instead corrupts an ancestral
tree by removing the 'tree' information entirely from the commit. Both
of these cases cause segfaults, each at different lines.

Signed-off-by: Taylor Blau <redacted>
Acked-by: Derrick Stolee <redacted>
Signed-off-by: Junio C Hamano <redacted>

builtin/rebase.c: Remove pointless message

When doing 'git rebase --autostash <upstream> <master>' with a dirty worktree
a 'HEAD is now at ...' message is emitted, which is pointless as it refers to
the old active branch which isn't actually moved.

This commit removes the 'HEAD is now at...' message.

Signed-off-by: Ben Wijen <redacted>
Signed-off-by: Junio C Hamano <redacted>

builtin/rebase.c: make sure the active branch isn't moved when autostashing

Consider the following scenario:
    git checkout not-the-master
    work work work
    git rebase --autostash upstream master

Here 'rebase --autostash <upstream> <branch>' incorrectly moves the
active branch (not-the-master) to master (before the rebase).

The expected behavior: (58794775:/git-rebase.sh:526)
    AUTOSTASH=$(git stash create autostash)
    git reset --hard
    git checkout master
    git rebase upstream
    git stash apply $AUTOSTASH

The actual behavior: (6defce2b:/builtin/rebase.c:1062)
    AUTOSTASH=$(git stash create autostash)
    git reset --hard master
    git checkout master
    git rebase upstream
    git stash apply $AUTOSTASH

This commit reinstates the 'legacy script' behavior as introduced with
58794775: rebase: implement --[no-]autostash and rebase.autostash

Signed-off-by: Ben Wijen <redacted>
Signed-off-by: Junio C Hamano <redacted>

t: use common $SQ variable

In many test scripts, there are bespoke definitions of the single quote
that are some variation of this:

SQ="'"

Define a common $SQ variable in test-lib.sh and replace all usages of
these bespoke variables with the common one.

This change was done by running `git grep =\"\'\" t/` and
`git grep =\\\\\'` and manually changing the resulting definitions and
corresponding usages.

Signed-off-by: Denton Liu <redacted>
Signed-off-by: Junio C Hamano <redacted>

pack-objects: drop packlist index_pos optimization

Once upon a time, the code to add an object to our packing list in
pack-objects all lived in a single function. It computed the position
within the hash table once, then used it to check if the object was
already present, and if not, to add it.

Later, in 2834bc27c1 (pack-objects: refactor the packing list,
2013-10-24), this was split into two functions: packlist_find() and
packlist_alloc(). We ended up with an "index_pos" variable that gets
passed through several functions to make it from one to the other.

The resulting code is rather confusing to follow. The "index_pos"
variable is sometimes undefined, if we don't yet have a hash table. This
works out in practice because in that case packlist_alloc() won't use it
at all, since it will have to create/grow the hash table. But it's hard
to verify that, and it does cause gcc 9.2.1's -Wmaybe-uninitialized to
complain when compiled with "-flto -O3" (rightfully, since we do pass
the uninitialized value as a function parameter, even if nobody ends up
using it).

All of this is to save computing the hash index again when we're
inserting into the hash table, which I found doesn't make a measurable
difference in the program runtime (which is not surprising, since we're
doing all kinds of other heavyweight things for each object).

Let's just drop this index_pos variable entirely, simplifying the code
(and pleasing the compiler).

We might be better still refactoring this custom hash table to use one
of our existing implementations (an oidmap, or a kh_oid_map). I stopped
short of that here, but this would be the likely first step towards that
anyway.

Reported-by: Stephan Beyer <redacted>
Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>

test-read-cache: drop namelen variable

Early in the function we set "namelen = strlen(name)" if "name" is
non-NULL. Later, we use "namelen" only if "name" is non-NULL. However,
it's hard to immediately see this, and it seems to confuse gcc 9.2.1
(with "-flto" interestingly, though all of the involved logic is in
inline functions; it also triggers when building with ASan).

Let's simplify the code and remove the variable entirely. There's only
one use of namelen anyway, so we can just call strlen() then. It's true
this is in a loop, so we might execute strlen() more often. But:

  - this is test code that only ever loops twice in our test suite (we
    do loop 1000 times in a t/perf test, but without using this option).

  - a decent compiler ought to be able to hoist that out of the loop
    anyway (though I wouldn't count on gcc 9.2.1 doing so!)

Reported-by: Stephan Beyer <redacted>
Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>

diff-delta: set size out-parameter to 0 for NULL delta

When we cannot generate a delta, we return NULL but leave delta_size
untouched. This is generally OK, as callers rely on NULL to decide if
the output is usable or not. But it can confuse compilers; in
particular, gcc 9.2.1 with "-flto -O3" complains in fast-import's
store_object() that delta_len may be used uninitialized.

Let's change the diff-delta code to set the size explicitly to 0 for a
NULL return. That silences the compiler and makes it easier to reason
about the result.

Reported-by: Stephan Beyer <redacted>
Helped-by: Junio C Hamano <redacted>
Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>

bulk-checkin: zero-initialize hashfile_checkpoint

We declare a "struct hashfile_checkpoint" but only sometimes actually
call hashfile_checkpoint() on it. That makes it not immediately obvious
that it's valid when we later access its members.

In fact, the code is fine: we fill it in unconditionally in the while(1)
loop as long as "idx" is non-NULL. And then if "idx" is NULL, we exit
early from the function (because we're just computing the hash, not
actually writing), before we look at the struct.

However, this does seem to confuse gcc 9.2.1's -Wmaybe-uninitialized
when compiled with "-flto -O2" (probably because with LTO it can now
realize that our call to hashfile_truncate() does not set the members
either). Let's zero-initialize the struct to tell the compiler, as well
as any readers of the code, that all is well.

Reported-by: Stephan Beyer <redacted>
Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>

pack-objects: use object_id in packlist_alloc()

The only caller of packlist_alloc() already has a "struct object_id",
and we immediately copy the hash they pass us into our own object_id.
Let's avoid the unnecessary round-trip to a raw sha1 pointer.

Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>

git-am: handle missing "author" when parsing commit

We try to parse the "author" line out of a commit buffer. We handle the
case that split_ident_line() doesn't work, but we don't do any error
checking that we found an "author" line in the first place! This would
cause us to segfault on such a corrupt object.

Let's put in an explicit NULL check (we can just die(), which is what a
bogus split would do, too). As a bonus, this silences a warning when
compiling with gcc 9.2.1 using "-flto -O3", which claims that ident_len
may be uninitialized (it would only be if we had a NULL here).

Reported-by: Stephan Beyer <redacted>
Helped-by: René Scharfe <redacted>
Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>

ci: restore running httpd tests

Once upon a time GIT_TEST_HTTPD was a tristate variable and we
exported 'GIT_TEST_HTTPD=YesPlease' in our CI scripts to make sure
that we run the httpd tests in the Linux Clang and GCC build jobs, or
error out if they can't be run for any reason [1].

Then 3b072c577b (tests: replace test_tristate with "git env--helper",
2019-06-21) came along, turned GIT_TEST_HTTPD into a bool, but forgot
to update our CI scripts accordingly. So, since GIT_TEST_HTTPD is set
explicitly, but its value is not one of the standardized true values,
our CI jobs have been simply skipping the httpd tests in the last
couple of weeks.

Set 'GIT_TEST_HTTPD=true' to restore running httpd tests in our CI
jobs.

[1] a1157b76eb (travis-ci: set GIT_TEST_HTTPD in 'ci/lib-travisci.sh',
2017-12-12)

Signed-off-by: SZEDER Gábor <redacted>
Signed-off-by: Junio C Hamano <redacted>

t/lib-git-svn.sh: check GIT_TEST_SVN_HTTPD when running SVN HTTP tests

Once upon a time the GIT_SVN_TEST_HTTPD environment variable needed to
be set to enable SVN HTTP tests [1].

Then 3b072c577b (tests: replace test_tristate with "git env--helper",
2019-06-21) came along, and attempted to turn GIT_SVN_TEST_HTTPD into
a bool, but while doing so it mistyped the variable name, and started
to check GIT_TEST_HTTPD instead. Consequently, if someone explicitly
set GIT_TEST_HTTPD to true and has only the general 'git-svn'
dependencies installed but not the Subversion server modules for
Apache (libapache2-mod-svn), then a couple of 'git-svn' tests fail,
because they can't start httpd due to the missing module.

We could simply fix this by checking the GIT_SVN_TEST_HTTPD
variablewith 'git env--helper', but notice that the name of this
variable doesn't conform to our usual GIT_TEST_* convention.

So let's check the GIT_TEST_SVN_HTTPD instead.

[1] a8a5d25118 (git svn: migrate tests to use lib-httpd, 2016-07-23)

Signed-off-by: SZEDER Gábor <redacted>
Signed-off-by: Junio C Hamano <redacted>

use get_tagged_oid()

Avoid derefencing ->tagged without checking for NULL by using the
convenience wrapper for getting the ID of the tagged object. It die()s
when encountering a broken tag instead of segfaulting.

Signed-off-by: René Scharfe <redacted>
Signed-off-by: Junio C Hamano <redacted>

tag: factor out get_tagged_oid()

Add a function for accessing the ID of the object referenced by a tag
safely, i.e. without causing a segfault when encountering a broken tag
where ->tagged is NULL.

Signed-off-by: René Scharfe <redacted>
Signed-off-by: Junio C Hamano <redacted>

unpack-trees: rename 'is_excluded_from_list()'

The first consumer of pattern-matching filenames was the
.gitignore feature. In that context, storing a list of patterns
as a 'struct exclude_list' makes sense. However, the
sparse-checkout feature then adopted these structures and methods,
but with the opposite meaning: these patterns match the files
that should be included!

Now that this library is renamed to use 'struct pattern_list'
and 'struct pattern', we can now rename the method used by
the sparse-checkout feature to determine which paths should
appear in the working directory.

The method is_excluded_from_list() is only used by the
sparse-checkout logic in unpack-trees and list-objects-filter.
The confusing part is that it returned 1 for "excluded" (i.e.
it matches the list of exclusions) but that really manes that
the path matched the list of patterns for _inclusion_ in the
working directory.

Rename the method to be path_matches_pattern_list() and have
it return an explicit 'enum pattern_match_result'. Here, the
values MATCHED = 1, UNMATCHED = 0, and UNDECIDED = -1 agree
with the previous integer values. This shift allows future
consumers to better understand what the retur values mean,
and provides more type checking for handling those values.

Signed-off-by: Derrick Stolee <redacted>
Signed-off-by: Junio C Hamano <redacted>

treewide: rename 'exclude' methods to 'pattern'

The first consumer of pattern-matching filenames was the
.gitignore feature. In that context, storing a list of patterns
as a 'struct exclude_list' makes sense. However, the
sparse-checkout feature then adopted these structures and methods,
but with the opposite meaning: these patterns match the files
that should be included!

It would be clearer to rename this entire library as a "pattern
matching" library, and the callers apply exclusion/inclusion
logic accordingly based on their needs.

This commit renames several methods defined in dir.h to make
more sense with the renamed 'struct exclude_list' to 'struct
pattern_list' and 'struct exclude' to 'struct path_pattern':

* last_exclude_matching() -> last_matching_pattern()
* parse_exclude() -> parse_path_pattern()

In addition, the word 'exclude' was replaced with 'pattern'
in the methods below:

* add_exclude_list()
* add_excludes_from_file_to_list()
* add_excludes_from_file()
* add_excludes_from_blob_to_list()
* add_exclude()
* clear_exclude_list()

A few methods with the word "exclude" remain. These will
be handled seperately. In particular, the method
"is_excluded()" is concretely about the .gitignore file
relative to a specific directory. This is the important
boundary between library and consumer: is_excluded() cares
about .gitignore, but is_excluded() calls
last_matching_pattern() to make that decision.

Signed-off-by: Derrick Stolee <redacted>
Signed-off-by: Junio C Hamano <redacted>

treewide: rename 'EXCL_FLAG_' to 'PATTERN_FLAG_'

The first consumer of pattern-matching filenames was the
.gitignore feature. In that context, storing a list of patterns
as a 'struct exclude_list' makes sense. However, the
sparse-checkout feature then adopted these structures and methods,
but with the opposite meaning: these patterns match the files
that should be included!

It would be clearer to rename this entire library as a "pattern
matching" library, and the callers apply exclusion/inclusion
logic accordingly based on their needs.

This commit replaces 'EXCL_FLAG_' to 'PATTERN_FLAG_' in the
names of the flags used on 'struct path_pattern'.

Signed-off-by: Derrick Stolee <redacted>
Signed-off-by: Junio C Hamano <redacted>

treewide: rename 'struct exclude_list' to 'struct pattern_list'

The first consumer of pattern-matching filenames was the
.gitignore feature. In that context, storing a list of patterns
as a 'struct exclude_list' makes sense. However, the
sparse-checkout feature then adopted these structures and methods,
but with the opposite meaning: these patterns match the files
that should be included!

It would be clearer to rename this entire library as a "pattern
matching" library, and the callers apply exclusion/inclusion
logic accordingly based on their needs.

This commit renames 'struct exclude_list' to 'struct pattern_list'
and renames several variables called 'el' to 'pl'.

Signed-off-by: Derrick Stolee <redacted>
Signed-off-by: Junio C Hamano <redacted>

treewide: rename 'struct exclude' to 'struct path_pattern'

The first consumer of pattern-matching filenames was the
.gitignore feature. In that context, storing a list of patterns
as a list of 'struct exclude' items makes sense. However, the
sparse-checkout feature then adopted these structures and methods,
but with the opposite meaning: these patterns match the files
that should be included!

It would be clearer to rename this entire library as a "pattern
matching" library, and the callers apply exclusion/inclusion
logic accordingly based on their needs.

This commit renames 'struct exclude' to 'struct path_pattern'
and renames several variable names to match. 'struct pattern'
was already taken by attr.c, and this more completely describes
that the patterns are specific to file paths.

Signed-off-by: Derrick Stolee <redacted>
Signed-off-by: Junio C Hamano <redacted>

t9902: use a non-deprecated command for testing

t9902 had a list of three random porcelain commands as a sanity check,
one of which was filter-branch. Since we are recommending people not
use filter-branch, let's update this test to use rebase instead of
filter-branch.

Signed-off-by: Elijah Newren <redacted>
Signed-off-by: Junio C Hamano <redacted>

Recommend git-filter-repo instead of git-filter-branch

filter-branch suffers from a deluge of disguised dangers that disfigure
history rewrites (i.e. deviate from the deliberate changes).  Many of
these problems are unobtrusive and can easily go undiscovered until the
new repository is in use.  This can result in problems ranging from an
even messier history than what led folks to filter-branch in the first
place, to data loss or corruption.  These issues cannot be backward
compatibly fixed, so add a warning to both filter-branch and its manpage
recommending that another tool (such as filter-repo) be used instead.

Also, update other manpages that referenced filter-branch.  Several of
these needed updates even if we could continue recommending
filter-branch, either due to implying that something was unique to
filter-branch when it applied more generally to all history rewriting
tools (e.g. BFG, reposurgeon, fast-import, filter-repo), or because
something about filter-branch was used as an example despite other more
commonly known examples now existing.  Reword these sections to fix
these issues and to avoid recommending filter-branch.

Finally, remove the section explaining BFG Repo Cleaner as an
alternative to filter-branch.  I feel somewhat bad about this,
especially since I feel like I learned so much from BFG that I put to
good use in filter-repo (which is much more than I can say for
filter-branch), but keeping that section presented a few problems:
  * In order to recommend that people quit using filter-branch, we need
    to provide them a recomendation for something else to use that
    can handle all the same types of rewrites.  To my knowledge,
    filter-repo is the only such tool.  So it needs to be mentioned.
  * I don't want to give conflicting recommendations to users
  * If we recommend two tools, we shouldn't expect users to learn both
    and pick which one to use; we should explain which problems one
    can solve that the other can't or when one is much faster than
    the other.
  * BFG and filter-repo have similar performance
  * All filtering types that BFG can do, filter-repo can also do.  In
    fact, filter-repo comes with a reimplementation of BFG named
    bfg-ish which provides the same user-interface as BFG but with
    several bugfixes and new features that are hard to implement in
    BFG due to its technical underpinnings.
While I could still mention both tools, it seems like I would need to
provide some kind of comparison and I would ultimately just say that
filter-repo can do everything BFG can, so ultimately it seems that it
is just better to remove that section altogether.

Signed-off-by: Elijah Newren <redacted>
Signed-off-by: Junio C Hamano <redacted>

t6006: simplify, fix, and optimize empty message test

Test t6006.71 ("oneline with empty message") was creating two commits
with simple commit messages, and then running filter-branch to rewrite
the commit messages to be "empty". This test was introduced in commit
1fb5fdd25f0 ("rev-list: fix --pretty=oneline with empty message",
2010-03-21) and written this way because the --allow-empty-message
option to git commit did not exist at the time.

However, the filter-branch invocation used differed slightly from
--allow-empty-message in that it would have a commit message consisting
solely of a single newline, and as such was not testing what the
original commit intended to test. Since both a truly empty commit
message and a commit message with a single linefeed could trigger the
original bug, modify the test slightly to include an example of each.

Despite only being one piece of the 71st test and there being 73 tests
overall, this small change to just this one test speeds up the overall
execution time of t6006 (as measured by the best of 3 runs of `time
./t6006-rev-list-format.sh`) by about 11% on Linux, 13% on Mac, and
about 15% on Windows.

Signed-off-by: Elijah Newren <redacted>
Signed-off-by: Junio C Hamano <redacted>

config/format.txt: specify default value of format.coverLetter

Signed-off-by: Denton Liu <redacted>
Signed-off-by: Junio C Hamano <redacted>

Doc: add more detail for git-format-patch

In git-format-patch.txt, we were missing some key user information.
First of all, document the special value of `--base=auto`.

Next, while we're at it, surround option arguments with <> and change
existing names such as "Message-Id" to "message id", which conforms with
how existing documentation is written.

Finally, document the `format.outputDirectory` config and change
`format.coverletter` to use camel case.

Signed-off-by: Denton Liu <redacted>
Signed-off-by: Junio C Hamano <redacted>

t4014: stop losing return codes of git commands

Currently, there are two ways where the return codes of Git commands are
lost. The first way is when a command is in the upstream of a pipe. In a
pipe, only the return code of the last command is used. Thus, all other
commands will have their return codes masked. Rewrite pipes so that
there are no Git commands upstream.

The other way is when a command is in a non-assignment subshell. The
return code will be lost in favour of the surrounding command's. Rewrite
instances of this such that Git commands output to a file and
surrounding commands only call subshells with non-Git commands.

Signed-off-by: Denton Liu <redacted>
Signed-off-by: Junio C Hamano <redacted>

t4014: remove confusing pipe in check_threading()

In check_threading(), there was a Git command in the upstream of a pipe.
In order to not lose its status code, it was saved into a file. However,
this may be confusing so rewrite to redirect IO to file. This allows us
to directly use the conventional &&-chain.

Signed-off-by: Denton Liu <redacted>
Signed-off-by: Junio C Hamano <redacted>

t4014: use test_line_count() where possible

Convert all instances of `cnt=$(... | wc -l) && test $cnt = N` into uses
of `test_line_count()`.

While we're at it, convert one instance of a Git command upstream of a
pipe into two commands. This prevents a failure of a Git command from
being masked since only the return code of the last member of the pipe
is shown.

Signed-off-by: Denton Liu <redacted>
Signed-off-by: Junio C Hamano <redacted>

t4014: let sed open its own files

In some cases, we were using a redirection operator to feed input into
sed. However, since sed is capable of opening its own files, make sed
open its own files instead of redirecting input into it.

Signed-off-by: Denton Liu <redacted>
Signed-off-by: Junio C Hamano <redacted>

t4014: drop redirections to /dev/null

Since output is silenced when running without `-v` and debugging output
is useful with `-v`, remove redirections to /dev/null as it is not
useful.

Signed-off-by: Denton Liu <redacted>
Signed-off-by: Junio C Hamano <redacted>

t4014: use indentable here-docs

The convention is to use indentable here-docs within test cases so that
the here-docs line up with the rest of the code within the test case.
Change here-docs from `<<\EOF` to `<<-\EOF` so that they can be indented
along with the rest of the test case.

Signed-off-by: Denton Liu <redacted>
Signed-off-by: Junio C Hamano <redacted>

t4014: remove spaces after redirect operators

For shell scripts, the usual convention is for there to be no space
after redirection operators, (e.g. `>file`, not `> file`). Remove these
spaces wherever they appear.

Signed-off-by: Denton Liu <redacted>
Signed-off-by: Junio C Hamano <redacted>

t4014: use sq for test case names

The usual convention is for test case names to be written between
single-quotes. Change all double-quoted test case names to single-quotes
except for one test case name that uses a sq for a contraction.

Signed-off-by: Denton Liu <redacted>
Signed-off-by: Junio C Hamano <redacted>

t4014: move closing sq onto its own line

The usual convention for test cases is for the closing sq to be on its
own line. Move the sq onto its own line for cases that do not conform to
this style.

Signed-off-by: Denton Liu <redacted>
Signed-off-by: Junio C Hamano <redacted>

t4014: s/expected/expect/

For test cases, the usual convention is to name expected output files
"expect", not "expected". Replace all instances of "expected" with
"expect", except for one case where the "expected" is used as the name
of a test case.

Signed-off-by: Denton Liu <redacted>
Signed-off-by: Junio C Hamano <redacted>

t3005: remove unused variable

Since the beginning of the script, $new_line variable was never used.
Remove it.

Signed-off-by: Junio C Hamano <redacted>

t: use LF variable defined in the test harness

A few test scripts assign a single LF to $LF, but that is already
given by test-lib.sh to everybody.

Remove the unnecessary reassignment.

Signed-off-by: Junio C Hamano <redacted>

compat/*.[ch]: remove extern from function declarations using spatch

In 554544276a (*.[ch]: remove extern from function declarations using
spatch, 2019-04-29), we removed externs from function declarations using
spatch but we intentionally excluded files under compat/ since some are
directly copied from an upstream and we should avoid churning them so
that manually merging future updates will be simpler.

In the last commit, we determined the files which taken from an upstream
so we can exclude them and run spatch on the remainder.

This was the Coccinelle patch used:

@@
type T;
identifier f;
@@
- extern
T f(...);

and it was run with:

$ git ls-files compat/\*\*.{c,h} |
xargs spatch --sp-file contrib/coccinelle/noextern.cocci --in-place
$ git checkout -- \
compat/regex/ \
compat/inet_ntop.c \
compat/inet_pton.c \
compat/nedmalloc/ \
compat/obstack.{c,h} \
compat/poll/

Coccinelle has some trouble dealing with `__attribute__` and varargs so
we ran the following to ensure that no remaining changes were left
behind:

$ git ls-files compat/\*\*.{c,h} |
xargs sed -i'' -e 's/^$\s*$extern $[^(]*([^*]$/\1\2/'
$ git checkout -- \
compat/regex/ \
compat/inet_ntop.c \
compat/inet_pton.c \
compat/nedmalloc/ \
compat/obstack.{c,h} \
compat/poll/

Signed-off-by: Denton Liu <redacted>
Acked-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>

mingw: apply array.cocci rule

After running Coccinelle on all sources inside compat/ that were created
by us[1], it was found that compat/mingw.c violated an array.cocci rule
in two places and, thus, a patch was generated. Apply this patch so that
all compat/ sources created by us follows all cocci rules.

[1]: Do not run Coccinelle on files that are taken from some upstream
because in case we need to pull updates from them, we would like to have
diverged as little as possible in order to make merging updates simpler.

The following sources were determined to have been taken from some
upstream:

* compat/regex/
* compat/inet_ntop.c
* compat/inet_pton.c
* compat/nedmalloc/
* compat/obstack.{c,h}
* compat/poll/

Signed-off-by: Denton Liu <redacted>
Acked-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>

t3427: accelerate this test by using fast-export and fast-import

fast-export and fast-import can easily handle the simple rewrite that
was being done by filter-branch, and should be faster on systems with a
slow fork. Measuring the overall time taken for all of t3427 (not just
the difference between filter-branch and fast-export/fast-import) shows
a speedup of about 5% on Linux and 11% on Mac.

Signed-off-by: Elijah Newren <redacted>
Signed-off-by: Junio C Hamano <redacted>

.gitignore: stop ignoring `.manifest` files

On Windows, it is possible to embed additional metadata into an
executable by linking in a "manifest", i.e. an XML document that
describes capabilities and requirements (such as minimum or maximum
Windows version). These XML documents are expected to be stored in
`.manifest` files.

At least _some_ Visual Studio versions auto-generate `.manifest` files
when none is specified explicitly, therefore we used to ask Git to
ignore them.

However, we do have a beautiful `.manifest` file now:
`compat/win32/git.manifest`, so neither does Visual Studio auto-generate
a manifest for us, nor do we want Git to ignore the `.manifest` files
anymore.

Further reading on auto-generated `.manifest` files:
https://docs.microsoft.com/en-us/cpp/build/manifest-generation-in-visual-studio

Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>

am: reload .gitattributes after patching it

When applying multiple patches with git am, or when rebasing using the
am backend, it's possible that one of our patches has updated a
gitattributes file. Currently, we cache this information, so if a
file in a subsequent patch has attributes applied, the file will be
written out with the attributes in place as of the time we started the
rebase or am operation, not with the attributes applied by the previous
patch. This problem does not occur when using the -m or -i flags to
rebase.

To ensure we write the correct data into the working tree, expire the
cache after each patch that touches a path ending in ".gitattributes".
Since we load these attributes in multiple separate files, we must
expire them accordingly.

Verify that both the am and rebase code paths work correctly, including
the conflict marker size with am -3.

Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>

tree: simplify parse_tree_indirect()

Reduce code duplication by turning parse_tree_indirect() into a wrapper
of repo_peel_to_type(). This avoids a segfault when handling a broken
tag where ->tagged is NULL. The new version also checks the return
value of parse_object() that was ignored by the old one.

Initial-patch-by: Stefan Sperling <redacted>
Signed-off-by: René Scharfe <redacted>
Signed-off-by: Junio C Hamano <redacted>

fetch: add fetch.writeCommitGraph config setting

The commit-graph feature is now on by default, and is being
written during 'git gc' by default. Typically, Git only writes
a commit-graph when a 'git gc --auto' command passes the gc.auto
setting to actualy do work. This means that a commit-graph will
typically fall behind the commits that are being used every day.

To stay updated with the latest commits, add a step to 'git
fetch' to write a commit-graph after fetching new objects. The
fetch.writeCommitGraph config setting enables writing a split
commit-graph, so on average the cost of writing this file is
very small. Occasionally, the commit-graph chain will collapse
to a single level, and this could be slow for very large repos.

For additional use, adjust the default to be true when
feature.experimental is enabled.

Signed-off-by: Derrick Stolee <redacted>
Signed-off-by: Junio C Hamano <redacted>

push: disallow --all and refspecs when remote.<name>.mirror is set

Pushes with --all, or refspecs are disallowed when --mirror is given
to 'git push', or when 'remote.<name>.mirror' is set in the config of
the repository, because they can have surprising
effects. 800a4ab399 ("push: check for errors earlier", 2018-05-16)
refactored this code to do that check earlier, so we can explicitly
check for the presence of flags, instead of their sideeffects.

However when 'remote.<name>.mirror' is set in the config, the
TRANSPORT_PUSH_MIRROR flag would only be set after we calling
'do_push()', so the checks would miss it entirely.

This leads to surprises for users [*1*].

Fix this by making sure we set the flag (if appropriate) before
checking for compatibility of the various options.

*1*: https://twitter.com/FiloSottile/status/1163918701462249472

Reported-by: Filippo Valsorda <redacted>
Helped-by: Saleem Rashid
Signed-off-by: Thomas Gummerer <redacted>
Signed-off-by: Junio C Hamano <redacted>

merge-options.txt: clarify meaning of various ff-related options

As discovered on the mailing list, some of the descriptions of the
ff-related options were unclear. Try to be more precise with what these
options do.

Signed-off-by: Elijah Newren <redacted>
Signed-off-by: Junio C Hamano <redacted>

clarify documentation for remote helpers

Signed-off-by: David Turner <redacted>
Signed-off-by: Junio C Hamano <redacted>

help: make help_unknown_ref() NORETURN

Announce that calling help_unknown_ref() exits the program.

Signed-off-by: René Scharfe <redacted>
Signed-off-by: Junio C Hamano <redacted>

checkout: add simple check for 'git checkout -b'

The 'git switch' command was created to separate half of the
behavior of 'git checkout'. It specifically has the mode to
do nothing with the index and working directory if the user
only specifies to create a new branch and change HEAD to that
branch. This is also the behavior most users expect from
'git checkout -b', but for historical reasons it also performs
an index update by scanning the working directory. This can be
slow for even moderately-sized repos.

A performance fix for 'git checkout -b' was introduced by
fa655d8411 (checkout: optimize "git checkout -b <new_branch>"
2018-08-16). That change includes details about the config
setting checkout.optimizeNewBranch when the sparse-checkout
feature is required. The way this change detected if this
behavior change is safe was through the skip_merge_working_tree()
method. This method was complex and needed to be updated
as new options were introduced.

This behavior was essentially reverted by 65f099b ("switch:
no worktree status unless real branch switch happens"
2019-03-29). Instead, two members of the checkout_opts struct
were used to distinguish between 'git checkout' and 'git switch':

* switch_branch_doing_nothing_is_ok
* only_merge_on_switching_branches

These settings have opposite values depending on if we start
in cmd_checkout or cmd_switch.

The message for 64f099b includes "Users of big repos are
encouraged to move to switch." Making this change while
'git switch' is still experimental is too aggressive.

Create a happy medium between these two options by making
'git checkout -b <branch>' behave just like 'git switch',
but only if we read exactly those arguments. This must
be done in cmd_checkout to avoid the arguments being
consumed by the option parsing logic.

This differs from the previous change by fa644d8 in that
the config option checkout.optimizeNewBranch remains
deleted. This means that 'git checkout -b' will ignore
the index merge even if we have a sparse-checkout file.
While this is a behavior change for 'git checkout -b',
it matches the behavior of 'git switch -c'.

Signed-off-by: Derrick Stolee <redacted>
Acked-by: Elijah Newren <redacted>
Acked-by: Taylor Blau <redacted>
Signed-off-by: Junio C Hamano <redacted>

gitk: Make web links clickable

This makes gitk look for http or https URLs in the commit description
and make the URLs clickable.  Clicking on them will invoke an external
web browser with the URL.

The web browser command is by default "xdg-open" on Linux, "open" on
MacOS, and "cmd /c start" on Windows.  The command can be changed in
the preferences window, and it can include parameters as well as the
command name.  If it is set to the empty string then URLs will no
longer be made clickable.

Signed-off-by: Paul Mackerras <redacted>

git-gui: allow undoing last revert

Accidental clicks on the revert hunk/lines buttons can cause loss of
work, and can be frustrating. So, allow undoing the last revert.

Right now, a stack or deque are not being used for the sake of
simplicity, so only one undo is possible. Any reverts before the
previous one are lost.

Signed-off-by: Pratyush Yadav <redacted>

rebase: teach rebase --keep-base

A common scenario is if a user is working on a topic branch and they
wish to make some changes to intermediate commits or autosquash, they
would run something such as

git rebase -i --onto master... master

in order to preserve the merge base. This is useful when contributing a
patch series to the Git mailing list, one often starts on top of the
current 'master'. While developing the patches, 'master' is also
developed further and it is sometimes not the best idea to keep rebasing
on top of 'master', but to keep the base commit as-is.

In addition to this, a user wishing to test individual commits in a
topic branch without changing anything may run

git rebase -x ./test.sh master... master

Since rebasing onto the merge base of the branch and the upstream is
such a common case, introduce the --keep-base option as a shortcut.

This allows us to rewrite the above as

git rebase -i --keep-base master

and

git rebase -x ./test.sh --keep-base master

respectively.

Add tests to ensure --keep-base works correctly in the normal case and
fails when there are multiple merge bases, both in regular and
interactive mode. Also, test to make sure conflicting options cause
rebase to fail. While we're adding test cases, add a missing
set_fake_editor call to 'rebase -i --onto master...side'.

While we're documenting the --keep-base option, change an instance of
"merge-base" to "merge base", which is the consistent spelling.

Helped-by: Eric Sunshine <redacted>
Helped-by: Junio C Hamano <redacted>
Helped-by: Ævar Arnfjörð Bjarmason <redacted>
Helped-by: Johannes Schindelin <redacted>
Signed-off-by: Denton Liu <redacted>
Signed-off-by: Junio C Hamano <redacted>

rebase tests: test linear branch topology

Add tests rebasing a linear branch topology to linear rebase tests
added in 2aad7cace2 ("add simple tests of consistency across rebase
types", 2013-06-06).

These tests are duplicates of two surrounding tests that do the same
with tags pointing to the same objects. Right now there's no change in
behavior being introduced, but as we'll see in a subsequent change
rebase can have different behaviors when working implicitly with
remote tracking branches.

While I'm at it add a --fork-point test, strictly speaking this is
redundant to the existing '' test, as no argument to rebase implies
--fork-point. But now it's easier to grep for tests that explicitly
stress --fork-point.

Signed-off-by: Ævar Arnfjörð Bjarmason <redacted>
Signed-off-by: Denton Liu <redacted>
Signed-off-by: Junio C Hamano <redacted>

rebase: fast-forward --fork-point in more cases

Before, when we rebased with a --fork-point invocation where the
fork-point wasn't empty, we would be setting options.restrict_revision.
The fast-forward logic would automatically declare that the rebase was
not fast-forwardable if it was set. However, this was painting with a
very broad brush.

Refine the logic so that we can fast-forward in the case where the
restricted revision is equal to the merge base, since we stop rebasing
at the merge base anyway.

Helped-by: Ævar Arnfjörð Bjarmason <redacted>
Signed-off-by: Denton Liu <redacted>
Signed-off-by: Junio C Hamano <redacted>

rebase: fast-forward --onto in more cases

Before, when we had the following graph,

A---B---C (master)
     \
      D (side)

running 'git rebase --onto master... master side' would result in D
being always rebased, no matter what. However, the desired behavior is
that rebase should notice that this is fast-forwardable and do that
instead.

Add detection to `can_fast_forward` so that this case can be detected
and a fast-forward will be performed. First of all, rewrite the function
to use gotos which simplifies the logic. Next, since the

options.upstream &&
!oidcmp(&options.upstream->object.oid, &options.onto->object.oid)

conditions were removed in `cmd_rebase`, we reintroduce a substitute in
`can_fast_forward`. In particular, checking the merge bases of
`upstream` and `head` fixes a failing case in t3416.

The abbreviated graph for t3416 is as follows:

    F---G topic
   /
  A---B---C---D---E master

and the failing command was

git rebase --onto master...topic F topic

Before, Git would see that there was one merge base (C), and the merge
and onto were the same so it would incorrectly return 1, indicating that
we could fast-forward. This would cause the rebased graph to be 'ABCFG'
when we were expecting 'ABCG'.

With the additional logic, we detect that upstream and head's merge base
is F. Since onto isn't F, it means we're not rebasing the full set of
commits from master..topic. Since we're excluding some commits, a
fast-forward cannot be performed and so we correctly return 0.

Add '-f' to test cases that failed as a result of this change because
they were not expecting a fast-forward so that a rebase is forced.

Helped-by: Phillip Wood <redacted>
Signed-off-by: Denton Liu <redacted>
Signed-off-by: Junio C Hamano <redacted>

rebase: refactor can_fast_forward into goto tower

Before, can_fast_forward was written with an if-else statement. However,
in the future, we may be adding more termination cases which would lead
to deeply nested if statements.

Refactor to use a goto tower so that future cases can be easily
inserted.

Signed-off-by: Denton Liu <redacted>
Signed-off-by: Junio C Hamano <redacted>