git.99rst.org Git - git.git/log

t4018: drop "debugging" cat from hunk-header tests

We run a series of hunk-header tests in a loop, and each one does this:

test_when_finished 'cat actual' && # for debugging only

This is pretty pointless. When the test succeeds, we waste time running
a useless cat process. If you're debugging a failure with "-i", then we
won't run the when-finished part at all. So it helps only if you're
running with something like "--verbose-log".

Since we expect the tests to succeed most of the time, a better way to
do this would be a helper that checks the output and dumps "actual" only
when it fails. But it's probably not even worth the effort, as anyone
debugging a failure could just run with "-i" and investigate the
"actual" file themselves.

Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>

Makefile: use compat regex with SANITIZE=address

Recent versions of the gcc and clang Address Sanitizer produce test
failures related to regexec(). This triggers with gcc-10 and clang-8
(but not gcc-9 nor clang-7). Running:

  make CC=gcc-10 SANITIZE=address test

results in failures in t4018, t3206, and t4062.

The cause seems to be that when built with ASan, we use a different
version of regexec() than normal. And this version doesn't understand
the REG_STARTEND flag. Here's my evidence supporting that.

The failure in t4062 is an ASan warning:

  expecting success of 4062.2 '-G matches':
   git diff --name-only -G "^(0{64}){64}$" HEAD^ >out &&
   test 4096-zeroes.txt = "$(cat out)"

  =================================================================
  ==672994==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7fa76f672000 at pc 0x7fa7726f75b6 bp 0x7ffe41bdda70 sp 0x7ffe41bdd220
  READ of size 4097 at 0x7fa76f672000 thread T0
      #0 0x7fa7726f75b5  (/lib/x86_64-linux-gnu/libasan.so.6+0x4f5b5)
      #1 0x562ae0c9c40e in regexec_buf /home/peff/compile/git/git-compat-util.h:1117
      #2 0x562ae0c9c40e in diff_grep /home/peff/compile/git/diffcore-pickaxe.c:52
      #3 0x562ae0c9cc28 in pickaxe_match /home/peff/compile/git/diffcore-pickaxe.c:166
      [...]

In this case we're looking in a buffer which was mmap'd via
reuse_worktree_file(), and whose size is 4096 bytes. But libasan's
regex tries to look at byte 4097 anyway! If we tweak Git like this:

  diff --git a/diff.c b/diff.c
  index 8e2914c031..cfae60c120 100644
  --- a/diff.c
  +++ b/diff.c
  @@ -3880,7 +3880,7 @@ static int reuse_worktree_file(struct index_state *istate,
           */
          if (ce_uptodate(ce) ||
              (!lstat(name, &st) && !ie_match_stat(istate, ce, &st, 0)))
  -               return 1;
  +               return 0;

          return 0;
   }

to use a regular buffer (with a trailing NUL) instead of an mmap, then
the complaint goes away.

The other failures are actually diff output with an incorrect funcname
header. If I instrument xdiff to show the funcname matching like so:

  diff --git a/xdiff-interface.c b/xdiff-interface.c
  index 8509f9ea22..f6c3dc1986 100644
  --- a/xdiff-interface.c
  +++ b/xdiff-interface.c
  @@ -197,6 +197,7 @@ struct ff_regs {
    struct ff_reg {
    regex_t re;
    int negate;
  + char *printable;
    } *array;
   };

  @@ -218,7 +219,12 @@ static long ff_regexp(const char *line, long len,

    for (i = 0; i < regs->nr; i++) {
    struct ff_reg *reg = regs->array + i;
  - if (!regexec_buf(&reg->re, line, len, 2, pmatch, 0)) {
  + int ret = regexec_buf(&reg->re, line, len, 2, pmatch, 0);
  + warning("regexec %s:\n  regex: %s\n  buf: %.*s",
  + ret == 0 ? "matched" : "did not match",
  + reg->printable,
  + (int)len, line);
  + if (!ret) {
    if (reg->negate)
    return -1;
    break;
  @@ -264,6 +270,7 @@ void xdiff_set_find_func(xdemitconf_t *xecfg, const char *value, int cflags)
    expression = value;
    if (regcomp(&reg->re, expression, cflags))
    die("Invalid regexp to look for hunk header: %s", expression);
  + reg->printable = xstrdup(expression);
    free(buffer);
    value = ep + 1;
    }

then when compiling with ASan and gcc-10, running the diff from t4018.66
produces this:

  $ git diff -U1 cpp-skip-access-specifiers
  warning: regexec did not match:
    regex: ^[     ]*[A-Za-z_][A-Za-z_0-9]*:[[:space:]]*($|/[/*])
    buf: private:
  warning: regexec matched:
    regex: ^((::[[:space:]]*)?[A-Za-z_].*)$
    buf: private:
  diff --git a/cpp-skip-access-specifiers b/cpp-skip-access-specifiers
  index 4d4a9db..ebd6f42 100644
  --- a/cpp-skip-access-specifiers
  +++ b/cpp-skip-access-specifiers
  @@ -6,3 +6,3 @@ private:
          void DoSomething();
          int ChangeMe;
  };
          void DoSomething();
  -       int ChangeMe;
  +       int IWasChanged;
   };

That first regex should match (and is negated, so it should be telling
us _not_ to match "private:"). But it wouldn't if regexec() is looking
at the whole buffer, and not just the length-limited line we've fed to
regexec_buf(). So this is consistent again with REG_STARTEND being
ignored.

The correct output (compiling without ASan, or gcc-9 with Asan) looks
like this:

  warning: regexec matched:
    regex: ^[     ]*[A-Za-z_][A-Za-z_0-9]*:[[:space:]]*($|/[/*])
    buf: private:
  [...more lines that we end up not using...]
  warning: regexec matched:
    regex: ^((::[[:space:]]*)?[A-Za-z_].*)$
    buf: class RIGHT : public Baseclass
  diff --git a/cpp-skip-access-specifiers b/cpp-skip-access-specifiers
  index 4d4a9db..ebd6f42 100644
  --- a/cpp-skip-access-specifiers
  +++ b/cpp-skip-access-specifiers
  @@ -6,3 +6,3 @@ class RIGHT : public Baseclass
          void DoSomething();
  -       int ChangeMe;
  +       int IWasChanged;
   };

So it really does seem like libasan's regex engine is ignoring
REG_STARTEND. We should be able to work around it by compiling with
NO_REGEX, which would use our local regexec(). But to make matters even
more interesting, this isn't enough by itself.

Because ASan has support from the compiler, it doesn't seem to intercept
our call to regexec() at the dynamic library level. It actually
recognizes when we are compiling a call to regexec() and replaces it
with ASan-specific code at that point. And unlike most of our other
compat code, where we might have git_mmap() or similar, the actual
symbol name in the compiled compat/regex code is regexec(). So just
compiling with NO_REGEX isn't enough; we still end up in libasan!

We can work around that by having the preprocessor replace regexec with
git_regexec (both in the callers and in the actual implementation), and
we truly end up with a call to our custom regex code, even when
compiling with ASan. That's probably a good thing to do anyway, as it
means anybody looking at the symbols later (e.g., in a debugger) would
have a better indication of which function is which. So we'll do the
same for the other common regex functions (even though just regexec() is
enough to fix this ASan problem).

Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>

built-in add -i: accept open-ended ranges again

The interactive `add` command allows selecting multiple files for some
of its sub-commands, via unique prefixes, indices or index ranges.

When re-implementing `git add -i` in C, we even added a code comment
talking about ranges with a missing end index, such as `2-`, but the
code did not actually accept those, as pointed out in
https://github.com/git-for-windows/git/issues/2466#issuecomment-574142760.

Let's fix this, and add a test case to verify that this stays fixed
forever.

Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>

built-in add -i: do not try to `patch`/`diff` an empty list of files

When the user does not select any files to `patch` or `diff`, there is
no need to call `run_add_p()` on them.

Even worse: we _have_ to avoid calling `parse_pathspec()` with an empty
list because that would trigger this error:

BUG: pathspec.c:557: PATHSPEC_PREFER_CWD requires arguments

So let's avoid doing any work on a list of files that is empty anyway.

This fixes https://github.com/git-for-windows/git/issues/2466.

Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>

submodule.c: mark more strings for translation

Signed-off-by: Ralf Thielow <redacted>
Signed-off-by: Junio C Hamano <redacted>

dir: point treat_leading_path() warning to the right place

Commit 777b420347 (dir: synchronize treat_leading_path() and
read_directory_recursive(), 2019-12-19) tried to add two warning
comments in those functions, pointing at each other. But the one in
treat_leading_path() just points at itself.

Let's fix that. Since the comment also redirects the reader for more
details to "the commit that added this warning", and since we're now
modifying the warning (creating a new commit without those details),
let's mention the actual commit id.

Signed-off-by: Jeff King <redacted>
Reviewed-by: Elijah Newren <redacted>
Signed-off-by: Junio C Hamano <redacted>

dir: restructure in a way to avoid passing around a struct dirent

Restructure the code slightly to avoid passing around a struct dirent
anywhere, which also enables us to avoid trying to manufacture one.

Signed-off-by: Jeff King <redacted>
Signed-off-by: Elijah Newren <redacted>
Signed-off-by: Junio C Hamano <redacted>

dir: treat_leading_path() and read_directory_recursive(), round 2

I was going to title this "dir: more synchronizing of
treat_leading_path() and read_directory_recursive()", a nod to commit
777b42034764 ("dir: synchronize treat_leading_path() and
read_directory_recursive()", 2019-12-19), but the title was too long.

Anyway, first the backstory...

fill_directory() has always had a slightly error-prone interface: it
returns a subset of paths which *might* match the specified pathspec; it
was intended to prune away some paths which didn't match the specified
pathspec and keep at least all the ones that did match it.  Given this
interface, callers were responsible to post-process the results and
check whether each actually matched the pathspec.

builtin/clean.c did this.  It would first prune out duplicates (e.g. if
"dir" was returned as well as all files under "dir/", then it would
simplify this to just "dir"), and after pruning duplicates it would
compare the remaining paths to the specified pathspec(s).  This
post-processing itself could run into problems, though, as noted in
commit 404ebceda01c ("dir: also check directories for matching
pathspecs", 2019-09-17):

    For the case of git-clean and a set of pathspecs of "dir/file" and
    "more", this caused a problem because we'd end up with dir entries
    for both of
      "dir"
      "dir/file"
    Then correct_untracked_entries() would try to helpfully prune
    duplicates for us by removing "dir/file" since it's under "dir",
    leaving us with
      "dir"
    Since the original pathspec only had "dir/file", the only entry left
    doesn't match and leaves nothing to be removed.  (Note that if only
    one pathspec was specified, e.g. only "dir/file", then the
    common_prefix_len optimizations in fill_directory would cause us to
    bypass this problem, making it appear in simple tests that we could
    correctly remove manually specified pathspecs.)

That commit fixed the issue -- when multiple pathspecs were specified --
by making sure fill_directory() wouldn't return both "dir" and
"dir/file" outside the common_prefix_len optimization path.  This is
where it starts to get fun.

In commit b9670c1f5e6b ("dir: fix checks on common prefix directory",
2019-12-19), we noticed that the common_prefix_len wasn't doing
appropriate checks and letting all kinds of stuff through, resulting in
recursing into .git/ directories and other craziness.  So it started
locking down and doing checks on pathnames within that code path.  That
continued with commit 777b42034764 ("dir: synchronize
treat_leading_path() and read_directory_recursive()", 2019-12-19), which
noted the following:

    Our optimization to avoid calling into read_directory_recursive()
    when all pathspecs have a common leading directory mean that we need
    to match the logic that read_directory_recursive() would use if we
    had just called it from the root.  Since it does more than call
    treat_path() we need to copy that same logic.

...and then it more forcefully addressed the issue with this wonderfully
ironic statement:

    Needing to duplicate logic like this means it is guaranteed someone
    will eventually need to make further changes and forget to update
    both locations.  It is tempting to just nuke the leading_directory
    special casing to avoid such bugs and simplify the code, but
    unpack_trees' verify_clean_subdirectory() also calls
    read_directory() and does so with a non-empty leading path, so I'm
    hesitant to try to restructure further.  Add obnoxious warnings to
    treat_leading_path() and read_directory_recursive() to try to warn
    people of such problems.

You would think that with such a strongly worded description, that its
author would have actually ensured that the logic in
treat_leading_path() and read_directory_recursive() did actually match
and that *everything* that was needed had at least been copied over at
the time that this paragraph was written.  But you'd be wrong, I messed
it up by missing part of the logic.

Copy the missing bits to fix the new final test in t7300.

Signed-off-by: Elijah Newren <redacted>
Signed-off-by: Junio C Hamano <redacted>

clean: demonstrate a bug with pathspecs

b9670c1f5e (dir: fix checks on common prefix directory, 2019-12-19)
modified the way pathspecs are handled when handling a directory
during "git clean -f <path>". While this improved the behavior for
known test breakages, it also regressed in how the clean command
handles cleaning a specified file.

Add a test case that demonstrates this behavior. This test passes
before b9670c1f5e then fails after.

Helped-by: Kevin Willford <redacted>
Signed-off-by: Derrick Stolee <redacted>
Reviewed-by: Elijah Newren <redacted>
Signed-off-by: Junio C Hamano <redacted>

msvc: accommodate for vcpkg's upgrade to OpenSSL v1.1.x

With the upgrade, the library names changed from libeay32/ssleay32 to
libcrypto/libssl.

Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>

t5604: make hash independent

To make our values hash independent, we turn the directory of the object
into "Y" and the file name into "Z" after having sorted items by their
name. However, when using SHA-256, one of our file names begins with an
"a" character, which means it sorts into the wrong place in the list,
causing the test to fail.

Since we don't care about the order of these items, just sort them after
stripping actual hash contents, which means they'll work with any hash
algorithm.

Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>

t5601: switch into repository to hash object

This test performs a clone from outside any repository. Consequently,
the hash algorithm used defaults to SHA-1. When the test is running with
SHA-256, this results in an object ID that is not usable by the rest of
the test. In order to ensure that we provide a usable value, switch into
the source repository before hashing.

Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>

t5562: use $ZERO_OID

This test uses $_z40 to express an all-zeros object ID, which doesn't
work for SHA-256. Use $ZERO_OID instead, which is the right size for
all hash values.

Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>

t5540: make hash size independent

Use regex values based on $OID_REGEX instead of hard-coding them based
on expected object ID lengths.

Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>

t5537: make hash size independent

This test modifies a pkt-line stream with sed to change a line with
"shallow" to "unshallow". However, this modification is dependent on
the size of the hash in use; with SHA-256, the pkt-line length is
different, leading to the sed command having no effect.

Use test_oid_cache to specify the correct values for each hash so that
the test continues to work.

Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>

t5530: compute results based on object length

Compute the various pkt-line values based on the length of the object
IDs in use.

Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>

t5512: abstract away SHA-1-specific constants

Adjust the test so that it computes variables for object IDs instead of
using hard-coded hashes.

Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>

t5510: make hash size independent

Use $OID_REGEX instead of hard-coding 40-based regular expressions.
Change invocations of cut with a hard-coded constant to split using a
delimiter instead.

Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>

t5504: make hash algorithm independent

Instead of hard-coding invalid object IDs in this test, use test_oid to
look up ones of the appropriate length.

Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>

t5324: make hash size independent

There are some offsets in the commit graph files used to corrupt data.
Compute these offsets for both SHA-1 and SHA-256 so that the test works
with either.

Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>

t5319: make test work with SHA-256

This test corrupts various locations in a multi-pack index to test
various error responses. However, these offsets differ between SHA-1
indexes and SHA-256 indexes due to differences in object length. Use
test_oid to look up the correct offsets based on the algorithm.

Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>

t5319: change invalid offset for SHA-256 compatibility

When using SHA-1, the existing value of the byte we use is 0x13, so
writing the byte 0x07 serves to corrupt the test and verify that we
detect corruption. However, when we use SHA-256, the value at that
offset is already 0x07, so our "corruption" doesn't work and the test
fails to detect it.

To provide a value that is truly out of range, let's use 0xff, which is
not likely to be a valid value as the high byte of a two-byte offset in
a multi-pack index this small.

Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>

t5318: update for SHA-256

When running with SHA-256 as the hash algorithm, the hash version octet
is 2 instead of 1. Pick the right value depending on the hash algorithm
and use it where we look for the existing value. To ensure the test
checking for invalid data passes, use 3 as the test value for an invalid
hash version.

Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>

t4300: abstract away SHA-1-specific constants

Adjust the test so that it computes values for object IDs instead of
using hard-coded hashes. Move the heredocs later in the tests so we can
take advantage of computed values.

Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>

t4204: make hash size independent

Use $OID_REGEX instead of a hard-coded regular expression.

Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>

t4202: abstract away SHA-1-specific constants

Adjust the test so that it computes values for object IDs instead of
using hard-coded hashes. Additionally, update the sanitize_output
function to sanitize the index lines in diff output, since it's clear
from the assertions in question that we are not interested in the
specific object IDs.

Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>

t4200: make hash size independent

Instead of hard-coding a fixed length example object ID in the test,
look one up using the translation tables.

Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>

t4134: compute appropriate length constant

Instead of using a specific invalid hard-coded object ID, generate one
of the appropriate length by looking one up in the translation tables.

Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>

t4066: compute index line in diffs

Since the object ID used in the index line will differ between different
algorithms, compute these values instead of hard-coding them.

Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>

t4054: make hash-size independent

Instead of hard-coding the length of an object ID when creating a tree,
generate it based on $ZERO_OID.

Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>

completion: list paths and refs for 'git worktree add'

Complete paths after 'git worktree add <TAB>' and refs after 'git
worktree add -b <TAB>' and 'git worktree add some/dir <TAB>'.

Uncharacteristically for a Git command, 'git worktree add' takes a
mandatory path parameter before a commit-ish as its optional last
parameter. In addition, it has both standalone --options and options
with a mandatory unstuck parameter ('-b <new-branch>'). Consequently,
trying to complete refs for that last optional commit-ish parameter
resulted in a more convoluted than usual completion function, but
hopefully all the included comments will make it not too hard to
digest.

Signed-off-by: SZEDER Gábor <redacted>
Signed-off-by: Junio C Hamano <redacted>

completion: list existing working trees for 'git worktree' subcommands

Complete the paths of existing working trees for 'git worktree's
'move', 'remove', 'lock', and 'unlock' subcommands.

Note that 'git worktree list --porcelain' shows absolute paths, so for
simplicity's sake we'll complete full absolute paths as well (as
opposed to turning them into relative paths by finding common leading
directories between $PWD and the working tree's path and removing
them, risking trouble with symbolic links or Windows drive letters; or
completing them one path component at a time).

Never list the path of the main working tree, as it cannot be moved,
removed, locked, or unlocked.

Ideally we would only list unlocked working trees for the 'move',
'remove', and 'lock' subcommands, and only locked ones for 'unlock'.
Alas, 'git worktree list --porcelain' doesn't indicate which working
trees are locked, so for now we'll complete the paths of all existing
working trees.

Signed-off-by: SZEDER Gábor <redacted>
Signed-off-by: Junio C Hamano <redacted>

completion: simplify completing 'git worktree' subcommands and options

The completion function for 'git worktree' uses separate but very
similar case arms to complete --options for each subcommand.

Combine these into a single case arm to avoid repetition.

Note that after this change we won't complete 'git worktree remove's
'--force' option, but that is consistent with our general stance on
not offering '--force', as it should be used with care.

Signed-off-by: SZEDER Gábor <redacted>
Signed-off-by: Junio C Hamano <redacted>

completion: return the index of found word from __git_find_on_cmdline()

When using the __git_find_on_cmdline() helper function so far we've
only been interested in which one of a set of words appear on the
command line. To complete options for some of 'git worktree's
subcommands in the following patches we'll need not only that, but the
index of that word on the command line as well.

Extend __git_find_on_cmdline() to optionally show the index of the
found word on the command line (IOW in the $words array) when the
'--show-idx' option is given.

Signed-off-by: SZEDER Gábor <redacted>
Signed-off-by: Junio C Hamano <redacted>

completion: clean up the __git_find_on_cmdline() helper function

The __git_find_on_cmdline() helper function started its life as
__git_find_subcommand() [1], but it served a more general purpose than
looking for subcommands, so later it was renamed accordingly [2].
However, that rename didn't touch the body of the function, and left
the $subcommand local variable behind, still reminiscent of the
function's original purpose.

Let's clean up the names of __git_find_on_cmdline()'s local variables
and get rid of that $subcommand variable name.

While at it, add a short comment describing the function's purpose.

[1] 3ff1320d4b (bash: refactor searching for subcommands on the
command line, 2008-03-10),
[2] 918c03c2a7 (bash: rename __git_find_subcommand() to
__git_find_on_cmdline(), 2009-09-15)

Signed-off-by: SZEDER Gábor <redacted>
Signed-off-by: Junio C Hamano <redacted>

t9902-completion: add tests for the __git_find_on_cmdline() helper

The following two patches will refactor and extend the
__git_find_on_cmdline() helper function, so let's add a few tests
first to make sure that its basic behavior doesn't change.

Signed-off-by: SZEDER Gábor <redacted>
Signed-off-by: Junio C Hamano <redacted>

gpg-interface: add minTrustLevel as a configuration option

Previously, signature verification for merge and pull operations checked
if the key had a trust-level of either TRUST_NEVER or TRUST_UNDEFINED in
verify_merge_signature().  If that was the case, the process die()d.

The other code paths that did signature verification relied entirely on
the return code from check_commit_signature().  And signatures made with
a good key, irregardless of its trust level, was considered valid by
check_commit_signature().

This difference in behavior might induce users to erroneously assume
that the trust level of a key in their keyring is always considered by
Git, even for operations where it is not (e.g. during a verify-commit or
verify-tag).

The way it worked was by gpg-interface.c storing the result from the
key/signature status *and* the lowest-two trust levels in the `result`
member of the signature_check structure (the last of these status lines
that were encountered got written to `result`).  These are documented in
GPG under the subsection `General status codes` and `Key related`,
respectively [1].

The GPG documentation says the following on the TRUST_ status codes [1]:

    """
    These are several similar status codes:

    - TRUST_UNDEFINED <error_token>
    - TRUST_NEVER     <error_token>
    - TRUST_MARGINAL  [0  [<validation_model>]]
    - TRUST_FULLY     [0  [<validation_model>]]
    - TRUST_ULTIMATE  [0  [<validation_model>]]

    For good signatures one of these status lines are emitted to
    indicate the validity of the key used to create the signature.
    The error token values are currently only emitted by gpgsm.
    """

My interpretation is that the trust level is conceptionally different
from the validity of the key and/or signature.  That seems to also have
been the assumption of the old code in check_signature() where a result
of 'G' (as in GOODSIG) and 'U' (as in TRUST_NEVER or TRUST_UNDEFINED)
were both considered a success.

The two cases where a result of 'U' had special meaning were in
verify_merge_signature() (where this caused git to die()) and in
format_commit_one() (where it affected the output of the %G? format
specifier).

I think it makes sense to refactor the processing of TRUST_ status lines
such that users can configure a minimum trust level that is enforced
globally, rather than have individual parts of git (e.g. merge) do it
themselves (except for a grace period with backward compatibility).

I also think it makes sense to not store the trust level in the same
struct member as the key/signature status.  While the presence of a
TRUST_ status code does imply that the signature is good (see the first
paragraph in the included snippet above), as far as I can tell, the
order of the status lines from GPG isn't well-defined; thus it would
seem plausible that the trust level could be overwritten with the
key/signature status if they were stored in the same member of the
signature_check structure.

This patch introduces a new configuration option: gpg.minTrustLevel.  It
consolidates trust-level verification to gpg-interface.c and adds a new
`trust_level` member to the signature_check structure.

Backward-compatibility is maintained by introducing a special case in
verify_merge_signature() such that if no user-configurable
gpg.minTrustLevel is set, then the old behavior of rejecting
TRUST_UNDEFINED and TRUST_NEVER is enforced.  If, on the other hand,
gpg.minTrustLevel is set, then that value overrides the old behavior.

Similarly, the %G? format specifier will continue show 'U' for
signatures made with a key that has a trust level of TRUST_UNDEFINED or
TRUST_NEVER, even though the 'U' character no longer exist in the
`result` member of the signature_check structure.  A new format
specifier, %GT, is also introduced for users that want to show all
possible trust levels for a signature.

Another approach would have been to simply drop the trust-level
requirement in verify_merge_signature().  This would also have made the
behavior consistent with other parts of git that perform signature
verification.  However, requiring a minimum trust level for signing keys
does seem to have a real-world use-case.  For example, the build system
used by the Qubes OS project currently parses the raw output from
verify-tag in order to assert a minimum trust level for keys used to
sign git tags [2].

[1] https://git.gnupg.org/cgi-bin/gitweb.cgi?p=gnupg.git;a=blob;f=doc/doc/DETAILS;h=bd00006e933ac56719b1edd2478ecd79273eae72;hb=refs/heads/master
[2] https://github.com/QubesOS/qubes-builder/blob/9674c1991deef45b1a1b1c71fddfab14ba50dccf/scripts/verify-git-tag#L43

Signed-off-by: Hans Jerry Illikainen <redacted>
Signed-off-by: Junio C Hamano <redacted>

fetch: default to protocol version 2

The Git users at $DAYJOB have been using protocol v2 as a default for
~1.5 years now and others have been also reporting good experiences
with it, so it seems like a good time to bump the default version.  It
produces a significant performance improvement when fetching from
repositories with many refs, such as
https://chromium.googlesource.com/chromium/src.

This only affects the client, not the server.  (The server already
defaults to supporting protocol v2.)  The protocol change is backward
compatible, so this should produce no significant effect when
contacting servers that only speak protocol v0.

Signed-off-by: Jonathan Nieder <redacted>
Signed-off-by: Junio C Hamano <redacted>

protocol test: let protocol.version override GIT_TEST_PROTOCOL_VERSION

The GIT_TEST_PROTOCOL_VERSION environment variable can be used to
upgrade the version of Git protocol used in tests.  If both
GIT_TEST_PROTOCOL_VERSION and 'protocol.version' are set, the higher
value wins.

For usage within tests, these semantics are too complex.  Instead,
always use the value from protocol.version configuration when it is
set, falling back to GIT_TEST_PROTOCOL_VERSION.  This way, the envvar
provides a reliable preview of what will happen if the default
protocol version is changed.

Signed-off-by: Jonathan Nieder <redacted>
Signed-off-by: Junio C Hamano <redacted>

test: request GIT_TEST_PROTOCOL_VERSION=0 when appropriate

Since 8cbeba0632 (tests: define GIT_TEST_PROTOCOL_VERSION,
2019-02-25), it has been possible to run tests with a newer protocol
version by setting the GIT_TEST_PROTOCOL_VERSION envvar to a version
number. Tests that assume protocol v0 handle this by explicitly
setting

GIT_TEST_PROTOCOL_VERSION=

or similar constructs like 'test -z "$GIT_TEST_PROTOCOL_VERSION" ||
return 0' to declare that they only handle the default (v0) protocol.

The emphasis there is a bit off: it would be clearer to specify
GIT_TEST_PROTOCOL_VERSION=0 to inform the reader that these tests are
specifically testing and relying on details of protocol v0. Do so.

This way, a reader does not need to know what the default protocol
version is, and the tests can continue to work when the default
protocol version used by Git advances past v0.

Signed-off-by: Jonathan Nieder <redacted>
Signed-off-by: Junio C Hamano <redacted>

config doc: protocol.version is not experimental

Git's protocol version 2 has been working well in production for over
a year. Simplify documentation by no longer referring to it as
experimental.

Signed-off-by: Jonathan Nieder <redacted>
Signed-off-by: Junio C Hamano <redacted>

fetch test: use more robust test for filtered objects

"git cat-file -e" uses has_object_file, which can fetch from promisor
remotes when an object is missing.  These tests end up checking that
that fetch fails instead of for the object being missing.

By luck, the tests pass anyway:

- in one of these tests ("filtering by size"), the fetch fails because
  (in protocol v0) the server does not support fetches by SHA-1

- in the second, the object is present but the test could pass even if
  it weren't if the fetch succeeds

- in the third, the test sets extensions.partialClone to "arbitrary
  string" so that when it tries to fetch, it looks up the "arbitrary
  string" remote which does not exist

Use "git rev-list --objects --missing=allow-any", so that the tests
pass for the right reason.

Noticed while testing with protocol v2, which allows fetching by sha1
by default, causing the first fetch to succeed and the test to fail.

Signed-off-by: Jonathan Nieder <redacted>
Signed-off-by: Junio C Hamano <redacted>

fetch test: mark test of "skipping" haves as v0-only

Since 633a53179e (fetch test: avoid use of "VAR= cmd" with a shell
function, 2019-12-26), t5552.5 (do not send "have" with ancestors of
commits that server ACKed) fails when run with
GIT_TEST_PROTOCOL_VERSION=2.

The cause:

The progression of "have"s sent in negotiation depends on whether we
are using a stateless RPC based transport or a stateful bidirectional
one (see for example 44d8dc54e7, "Fix potential local deadlock during
fetch-pack", 2011-03-29).  In protocol v2, all transports are
stateless transports, while in protocol v0, transports such as local
access and ssh are stateful.

In stateful transports, the number of "have"s to send multiplies by
two each round until we reach PIPESAFE_FLUSH (that is, 32), and then
it increases by PIPESAFE_FLUSH each round.  In stateless transport,
the count multiplies by two each round until we reach LARGE_FLUSH
(which is 16384) and then multiplies by 1.1 each round after that.

Moreover, in stateful transports, as fetch-pack.c explains:

We keep one window "ahead" of the other side, and will wait
for an ACK only on the next one.

This affects t5552.5 because it looks for "have"s from the negotiator
that appear in that second window.  With protocol version 2, the
second window never arrives, and the test fails.

Until 633a53179e (2019-12-26), a previous test in the same file
contained

GIT_TEST_PROTOCOL_VERSION= trace_fetch client origin to_fetch

In many common shells (e.g. bash when run as "sh"), the setting of
GIT_TEST_PROTOCOL_VERSION to the empty string lasts beyond the
intended duration of the trace_fetch invocation.  This causes it to
override the GIT_TEST_PROTOCOL_VERSION setting that was passed in to
the test during the remainder of the test script, so t5552.5 never got
run using protocol v2 on those shells, regardless of the
GIT_TEST_PROTOCOL_VERSION setting from the environment.  633a53179e
fixed that, revealing the failing test.

Signed-off-by: Jonathan Nieder <redacted>
Signed-off-by: Junio C Hamano <redacted>

t/check-non-portable-shell: detect "FOO= shell_func", too

Just like assigning a nonempty value, assigning an empty value to a
shell variable when calling a function produces non-portable behavior:
in some shells, the assignment lasts for the duration of the function
invocation, and in others, it persists after the function returns.

Signed-off-by: Jonathan Nieder <redacted>
Signed-off-by: Junio C Hamano <redacted>

fetch test: avoid use of "VAR= cmd" with a shell function

Just like assigning a nonempty value, assigning an empty value to a
shell variable when calling a function produces non-portable behavior:
in some shells, the assignment lasts for the duration of the function
invocation, and in others, it persists after the function returns.

Use an explicit subshell with the envvar exported to make the behavior
consistent across shells and crystal clear.

All previous instances of this pattern used "VAR=value" (with nonempty
`value`), which is already diagnosed automatically by "make test-lint"
since a0a630192d (t/check-non-portable-shell: detect "FOO=bar
shell_func", 2018-07-13).

Noticed using an improved "make test-lint".

Signed-off-by: Jonathan Nieder <redacted>
Signed-off-by: Junio C Hamano <redacted>

git-p4: use python3's input() everywhere

Python3 deprecates raw_input() from 2.7 and replaced it with input().
Since we do not need 2.7's input() semantics, `raw_input()` is aliased
to `input()` for easy forward compatability.

Signed-off-by: Yang Zhao <redacted>
Signed-off-by: Junio C Hamano <redacted>

git-p4: simplify regex pattern generation for parsing diff-tree

It is not clear why a generator was used to create the regex used to
parse git-diff-tree output; I assume an early implementation required
it, but is not part of the mainline change.

Simply use a lazily initialized global instead.

Signed-off-by: Yang Zhao <redacted>
Reviewed-by: Ben Keene <redacted>
Signed-off-by: Junio C Hamano <redacted>

git-p4: use dict.items() iteration for python3 compatibility

Python3 uses dict.items() instead of .iteritems() to provide
iteratoration over dict. Although items() is technically less efficient
for python2.7 (allocates a new list instead of simply iterating), the
amount of data involved is very small and the penalty negligible.

Signed-off-by: Yang Zhao <redacted>
Reviewed-by: Ben Keene <redacted>
Signed-off-by: Junio C Hamano <redacted>

git-p4: use functools.reduce instead of reduce

For python3, reduce() has been moved to functools.reduce(). This is
also available in python2.7.

Signed-off-by: Yang Zhao <redacted>
Signed-off-by: Junio C Hamano <redacted>

git-p4: fix freezing while waiting for fast-import progress

As part of its importing process, git-p4 sends a `checkpoint` followed
immediately by `progress` to fast-import to force synchronization.
Due to buffering, it is possible for the `progress` command to not be
flushed before git-p4 begins to wait for the corresponding response.
This causes the script to freeze completely, and is consistently
observable at least on python-3.6.9.

Make sure this command sequence is completely flushed before waiting.

Signed-off-by: Yang Zhao <redacted>
Reviewed-by: Ben Keene <redacted>
Signed-off-by: Junio C Hamano <redacted>

git-p4: use marshal format version 2 when sending to p4

p4 does not appear to understand marshal format version 3 and above.
Version 2 was the latest supported by python-2.7.

Signed-off-by: Yang Zhao <redacted>
Reviewed-by: Ben Keene <redacted>
Signed-off-by: Junio C Hamano <redacted>

git-p4: open .gitp4-usercache.txt in text mode

Opening .gitp4-usercache.txt in text mode makes python 3 happy without
explicitly adding encoding and decoding.

Signed-off-by: Yang Zhao <redacted>
Reviewed-by: Ben Keene <redacted>
Signed-off-by: Junio C Hamano <redacted>

git-p4: convert path to unicode before processing them

P4 allows essentially arbitrary encoding for path data while we would
perfer to be dealing only with unicode strings. Since path data need to
survive round-trip back to p4, this patch implements the general policy
that we store path data as-is, but decode them to unicode before doing
any non-trivial processing.

A new `decode_path()` method is provided that generally does the correct
conversion, taking into account `git-p4.pathEncoding` configuration.

For python2.7, path strings will be left as-is if it only contains ASCII
characters.

For python3, decoding is always done so that we have str objects.

Signed-off-by: Yang Zhao <redacted>
Reviewed-by: Ben Keene <redacted>
Signed-off-by: Junio C Hamano <redacted>

git-p4: encode/decode communication with git for python3

Under python3, calls to write() on the stream to `git fast-import` must
be encoded. This patch wraps the IO object such that this encoding is
done transparently.

Conversely, any text data read from subprocesses must also be decoded
before running through the rest of the pipeline.

Signed-off-by: Yang Zhao <redacted>
Reviewed-by: Ben Keene <redacted>
Signed-off-by: Junio C Hamano <redacted>

git-p4: encode/decode communication with p4 for python3

The marshalled dict in the response given on STDOUT by p4 uses `str` for
keys and string values. When run using python3, these values are
deserialized as `bytes`, leading to a whole host of problems as the rest
of the code assumes `str` is used throughout.

This patch changes the deserialization behaviour such that, as much as
possible, text output from p4 is decoded to native unicode strings.
Exceptions are made for the field `data` as it is usually arbitrary
binary data. `depotFile[0-9]*`, `path`, and `clientFile` are also exempt
as they contain path strings not encoded with UTF-8, and must survive
round-trip back to p4.

Conversely, text data being piped to p4 must always be encoded when
running under python3.

encode_text_stream() and decode_text_stream() were added to make these
transformations more convenient.

Signed-off-by: Yang Zhao <redacted>
Signed-off-by: Junio C Hamano <redacted>

git-p4: remove string type aliasing

Now that python2.7 is the minimum required version and we no longer use
the basestring type, it is not necessary to use type aliasing to ensure
python3 compatibility.

Signed-off-by: Yang Zhao <redacted>
Signed-off-by: Junio C Hamano <redacted>

git-p4: change the expansion test from basestring to list

Python 3 handles strings differently than Python 2.7. Since Python 2
is reaching it's end of life, a series of changes are being submitted to
enable python 3.5 and following support. The current code fails basic
tests under python 3.5.

Some codepaths can represent a command line the program
internally prepares to execute either as a single string
(i.e. each token properly quoted, concatenated with $IFS) or
as a list of argv[] elements, and there are 9 places where
we say "if X is isinstance(_, basestring), then do this
thing to handle X as a command line in a single string; if
not, X is a command line in a list form".

This does not work well with Python 3, as there is no
basestring (everything is Unicode now), and even with Python
2, it was not an ideal way to tell the two cases apart,
because an internally formed command line could have been in
a single Unicode string.

Flip the check to say "if X is not a list, then handle X as
a command line in a single string; otherwise treat it as a
command line in a list form".

This will get rid of references to 'basestring', to migrate
the code ready for Python 3.

Thanks-to: Junio C Hamano <redacted>
Signed-off-by: Ben Keene <redacted>
Signed-off-by: Yang Zhao <redacted>
Signed-off-by: Junio C Hamano <redacted>

git-p4: make python2.7 the oldest supported version

Python2.6 and earlier have been end-of-life'd for many years now, and
we actually already use 2.7-only features in the code. Make the version
check reflect current realities.

This also removes the need to explicitly define CalledProcessError if
it's not available.

Signed-off-by: Yang Zhao <redacted>
Signed-off-by: Junio C Hamano <redacted>

add: use advise function to display hints

Use the advise function in advice.c to display hints to the users, as
it provides a neat and a standard format for hint messages, i.e: the
text is colored in yellow and the line starts by the word "hint:".

Also this will enable us to control the messages using advice.*
configuration variables.

Signed-off-by: Heba Waly <redacted>
Signed-off-by: Junio C Hamano <redacted>

graph: fix collapse of multiple edges

This fix resolves the previously-added test_expect_failure in
t4215-log-skewed-merges.sh.

The issue lies in the "else" condition while updating the mapping
inside graph_output_collapsing_line(). In 0f0f389f (graph: tidy up
display of left-skewed merges, 2019-10-15), the output of left-
skewed merges was changed to allow an immediate horizontal edge in
the first parent, output by graph_output_post_merge_line() instead
of by graph_output_collapsing_line(). This condensed the first line
behavior as follows:

Before 0f0f389f:

| | | | | | *-.
| | | | | | |\ \
| |_|_|_|_|/ | |
|/| | | | | / /

After 0f0f389f:

| | | | | | *
| |_|_|_|_|/|\
|/| | | | |/ /
| | | | |/| /

However, a very subtle issue arose when the second and third parent
edges are collapsed in later steps. The second parent edge is now
immediately adjacent to a vertical edge. This means that the
condition

} else if (graph->mapping[i - 1] < 0) {

in graph_output_collapsing_line() evaluates as false. The block for
this condition was the only place where we connected the target
column with the current position with horizontal edge markers.

In this case, the final "else" block is run, and the edge is marked
as horizontal, but did not back-fill the blank columns between the
target and the current edge. Since the second parent edge is marked
as horizontal, the third parent edge is not marked as horizontal.
This causes the output to continue as follows:

Before this change:

| | | | | | *
| |_|_|_|_|/|\
|/| | | | |/ /
| | | | |/| /
| | | |/| |/
| | |/| |/|
| |/| |/| |
| | |/| | |

By adding the logic for "filling" a horizontal edge between the
target column and the current column, we are able to resolve the
issue.

After this change:

| | | | | | *
| |_|_|_|_|/|\
|/| | | | |/ /
| | |_|_|/| /
| |/| | | |/
| | | |_|/|
| | |/| | |

This output properly matches the expected blend of the edge
behavior before 0f0f389f and the merge commit rendering from
0f0f389f.

Signed-off-by: Derrick Stolee <redacted>
Signed-off-by: Junio C Hamano <redacted>

graph: add test to demonstrate horizontal line bug

A previous test in t4215-log-skewed-merges.sh was added to demonstrate
exactly the topology of a reported failure in "git log --graph". While
investigating the fix, we realized that multiple edges that could
collapse with horizontal lines were not doing so.

Specifically, examine this section of the graph:

| | | | | | *
| |_|_|_|_|/|\
|/| | | | |/ /
| | | | |/| /
| | | |/| |/
| | |/| |/|
| |/| |/| |
| | |/| | |
| | * | | |

Document this behavior with a test. This behavior is new, as the
behavior in v2.24.1 has the following output:

| | | | | | *-.
| | | | | | |\ \
| |_|_|_|_|/ / /
|/| | | | | / /
| | |_|_|_|/ /
| |/| | | | /
| | | |_|_|/
| | |/| | |
| | * | | |

The behavior changed logically in 479db18b ("graph: smooth appearance
of collapsing edges on commit lines", 2019-10-15), but was actually
broken due to an assert() bug in 458152cc ("graph: example of graph
output that can be simplified", 2019-10-15). A future change could
modify this behavior to do the following instead:

| | | | | | *
| |_|_|_|_|/|\
|/| | | | |/ /
| | |_|_|/| /
| |/| | | |/
| | | |_|/|
| | |/| | |
| | * | | |

Signed-off-by: Derrick Stolee <redacted>
Signed-off-by: Junio C Hamano <redacted>

t: directly test parse_pathspec_file()

Previously, `parse_pathspec_file()` was tested indirectly by invoking
git commands with properly crafted inputs. As demonstrated by the
previous bugfix, testing complicated black boxes indirectly can lead to
tests that silently test the wrong thing.

Introduce direct tests for `parse_pathspec_file()`.

Signed-off-by: Alexandr Miloslavskiy <redacted>
Signed-off-by: Junio C Hamano <redacted>

t: fix quotes tests for --pathspec-from-file

While working on the next patch, I also noticed that quotes testing via
`"\"file\\101.t\""` was somewhat incorrect: I escaped `\` one time while
I had to escape it two times! Tests still worked due to `"` being
preserved which in turn prevented pathspec from matching files.

Fix this by using here-doc instead.

Signed-off-by: Alexandr Miloslavskiy <redacted>
Signed-off-by: Junio C Hamano <redacted>

t: add tests for error conditions with --pathspec-from-file

Also move some old tests into the new tests: it doesn't seem reasonable
to have individual error condition tests.

Old test for `git commit` was corrected, previously it was instructed
to use stdin but wasn't provided with any stdin. While this works at
the moment, it's not exactly perfect.

Old tests for `git reset` were improved to test for a specific error
message.

Suggested-By: Phillip Wood <redacted>
Signed-off-by: Alexandr Miloslavskiy <redacted>
Signed-off-by: Junio C Hamano <redacted>

ci: include the built-in `git add -i` in the `linux-gcc` job

This job runs the test suite twice, once in regular mode, and once with
a whole slew of `GIT_TEST_*` variables set.

Now that the built-in version of `git add --interactive` is
feature-complete, let's also throw `GIT_TEST_ADD_I_USE_BUILTIN` into
that fray.

Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>

built-in add -p: handle Escape sequences more efficiently

When `interactive.singlekey = true`, we react immediately to keystrokes,
even to Escape sequences (e.g. when pressing a cursor key).

The problem with Escape sequences is that we do not really know when
they are done, and as a heuristic we poll standard input for half a
second to make sure that we got all of it.

While waiting half a second is not asking for a whole lot, it can become
quite annoying over time, therefore with this patch, we read the
terminal capabilities (if available) and extract known Escape sequences
from there, then stop polling immediately when we detected that the user
pressed a key that generated such a known sequence.

This recapitulates the remaining part of b5cc003253c8 (add -i: ignore
terminal escape sequences, 2011-05-17).

Note: We do *not* query the terminal capabilities directly. That would
either require a lot of platform-specific code, or it would require
linking to a library such as ncurses.

Linking to a library in the built-ins is something we try very hard to
avoid (we even kicked the libcurl dependency to a non-built-in remote
helper, just to shave off a tiny fraction of a second from Git's startup
time). And the platform-specific code would be a maintenance nightmare.

Even worse: in Git for Windows' case, we would need to query MSYS2
pseudo terminals, which `git.exe` simply cannot do (because it is
intentionally *not* an MSYS2 program).

To address this, we simply spawn `infocmp -L -1` and parse its output
(which works even in Git for Windows, because that helper is included in
the end-user facing installations).

This is done only once, as in the Perl version, but it is done only when
the first Escape sequence is encountered, not upon startup of `git add
-i`; This saves on startup time, yet makes reacting to the first Escape
sequence slightly more sluggish. But it allows us to keep the
terminal-related code encapsulated in the `compat/terminal.c` file.

Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>

built-in add -p: handle Escape sequences in interactive.singlekey mode

This recapitulates part of b5cc003253c8 (add -i: ignore terminal escape
sequences, 2011-05-17):

    add -i: ignore terminal escape sequences

    On the author's terminal, the up-arrow input sequence is ^[[A, and
    thus fat-fingering an up-arrow into 'git checkout -p' is quite
    dangerous: git-add--interactive.perl will ignore the ^[ and [
    characters and happily treat A as "discard everything".

    As a band-aid fix, use Term::Cap to get all terminal capabilities.
    Then use the heuristic that any capability value that starts with ^[
    (i.e., \e in perl) must be a key input sequence.  Finally, given an
    input that starts with ^[, read more characters until we have read a
    full escape sequence, then return that to the caller.  We use a
    timeout of 0.5 seconds on the subsequent reads to avoid getting stuck
    if the user actually input a lone ^[.

    Since none of the currently recognized keys start with ^[, the net
    result is that the sequence as a whole will be ignored and the help
    displayed.

Note that we leave part for later which uses "Term::Cap to get all
terminal capabilities", for several reasons:

1. it is actually not really necessary, as the timeout of 0.5 seconds
   should be plenty sufficient to catch Escape sequences,

2. it is cleaner to keep the change to special-case Escape sequences
   separate from the change that reads all terminal capabilities to
   speed things up, and

3. in practice, relying on the terminal capabilities is a bit overrated,
   as the information could be incomplete, or plain wrong. For example,
   in this developer's tmux sessions, the terminal capabilities claim
   that the "cursor up" sequence is ^[M, but the actual sequence
   produced by the "cursor up" key is ^[[A.

Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>

built-in add -p: respect the `interactive.singlekey` config setting

The Perl version of `git add -p` supports this config setting to allow
users to input commands via single characters (as opposed to having to
press the <Enter> key afterwards).

This is an opt-in feature because it requires Perl packages
(Term::ReadKey and Term::Cap, where it tries to handle an absence of the
latter package gracefully) to work. Note that at least on Ubuntu, that
Perl package is not installed by default (it needs to be installed via
`sudo apt-get install libterm-readkey-perl`), so this feature is
probably not used a whole lot.

In C, we obviously do not have these packages available, but we just
introduced `read_single_keystroke()` that is similar to what
Term::ReadKey provides, and we use that here.

Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>

terminal: add a new function to read a single keystroke

Typically, input on the command-line is line-based. It is actually not
really easy to get single characters (or better put: keystrokes).

We provide two implementations here:

- One that handles `/dev/tty` based systems as well as native Windows.
  The former uses the `tcsetattr()` function to put the terminal into
  "raw mode", which allows us to read individual keystrokes, one by one.
  The latter uses `stty.exe` to do the same, falling back to direct
  Win32 Console access.

  Thanks to the refactoring leading up to this commit, this is a single
  function, with the platform-specific details hidden away in
  conditionally-compiled code blocks.

- A fall-back which simply punts and reads back an entire line.

Note that the function writes the keystroke into an `strbuf` rather than
a `char`, in preparation for reading Escape sequences (e.g. when the
user hit an arrow key). This is also required for UTF-8 sequences in
case the keystroke corresponds to a non-ASCII letter.

Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>

terminal: accommodate Git for Windows' default terminal

Git for Windows' Git Bash runs in MinTTY by default, which does not have
a Win32 Console instance, but uses MSYS2 pseudo terminals instead.

This is a problem, as Git for Windows does not want to use the MSYS2
emulation layer for Git itself, and therefore has no direct way to
interact with that pseudo terminal.

As a workaround, use the `stty` utility (which is included in Git for
Windows, and which *is* an MSYS2 program, so it knows how to deal with
the pseudo terminal).

Note: If Git runs in a regular CMD or PowerShell window, there *is* a
regular Win32 Console to work with. This is not a problem for the MSYS2
`stty`: it copes with this scenario just fine.

Also note that we introduce support for more bits than would be
necessary for a mere `disable_echo()` here, in preparation for the
upcoming `enable_non_canonical()` function.

Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>

terminal: make the code of disable_echo() reusable

We are about to introduce the function `enable_non_canonical()`, which
shares almost the complete code with `disable_echo()`.

Let's prepare for that, by refactoring out that shared code.

Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>

built-in add -p: handle diff.algorithm

The Perl version of `git add -p` reads the config setting
`diff.algorithm` and if set, uses it to generate the diff using the
specified algorithm.

This patch ports that functionality to the C version.

Note: just like `git-add--interactive.perl`, we do _not_ respect this
config setting in `git add -i`'s `diff` command, but _only_ in the
`patch` command.

Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>

built-in add -p: support interactive.diffFilter

The Perl version supports post-processing the colored diff (that is
generated in addition to the uncolored diff, intended to offer a
prettier user experience) by a command configured via that config
setting, and now the built-in version does that, too.

Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>

t3701: adjust difffilter test

In 42f7d45428e (add--interactive: detect bogus diffFilter output,
2018-03-03), we added a test case that verifies that the diffFilter
feature complains appropriately when the output is too short.

In preparation for the upcoming change where the built-in `add -p` is
taught to respect that setting, let's adjust that test a little. The
problem is that `echo too-short` is configured as diffFilter, and it
does not read the `stdin`. When calling it through `pipe_command()`, it
is therefore possible that we try to feed the `diff` to it while it is
no longer listening, and we receive a `SIGPIPE`.

The Perl code apparently handles this in a way similar to an
end-of-file, but taking a step back, we realize that a diffFilter that
does not even _look_ at its standard input is very unrealistic. The
entire point of this feature is to transform the diff, not to ignore it
altogether.

So let's modify the test case to reflect that insight: instead of
printing some bogus text, let's use a diffFilter that deletes the first
line of the diff instead.

This still tests for the same thing, but it does not confuse the
built-in `add -p` with that `SIGPIPE`.

Helped-by: SZEDER Gábor <redacted>
Helped-by: Jeff King <redacted>
Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>

submodule add: show 'add --dry-run' stderr when aborting

Unless --force is specified, 'submodule add' checks if the destination
path is ignored by calling 'git add --dry-run --ignore-missing', and,
if that call fails, aborts with a custom "path is ignored" message (a
slight variant of what 'git add' shows).  Aborting early rather than
letting the downstream 'git add' call fail is done so that the command
exits before cloning into the destination path.  However, in rare
cases where the dry-run call fails for a reason other than the path
being ignored---for example, due to a preexisting index.lock
file---displaying the "ignored path" error message hides the real
source of the failure.

Instead of displaying the tailored "ignored path" message, let's
report the standard error from the dry run to give the caller more
accurate information about failures that are not due to an ignored
path.

For the ignored path case, this leads to the following change in the
error message:

  The following [-path is-]{+paths are+} ignored by one of your .gitignore files:
  <destination path>
  Use -f if you really want to add [-it.-]{+them.+}

The new phrasing is a bit awkward, because 'submodule add' is only
dealing with one destination path.  Alternatively, we could continue
to use the tailored message when the exit code is 1 (the expected
status for a failure due to an ignored path) and relay the standard
error for all other non-zero exits.  That, however, risks hiding the
message of unrelated failures that share an exit code of 1, so it
doesn't seem worth doing just to avoid a clunkier, but still clear,
error message.

Signed-off-by: Kyle Meyer <redacted>
Signed-off-by: Junio C Hamano <redacted>

fsmonitor: handle version 2 of the hooks that will use opaque token

Some file monitors like watchman will use something other than a timestamp
to keep better track of what changes happen in between calls to query
the fsmonitor. The clockid in watchman is a string. Now that the index
is storing an opaque token for the last update the code needs to be
updated to pass that opaque token to a verion 2 of the fsmonitor hook.

Because there are repos that already have version 1 of the hook and we
want them to continue to work when git is updated, we need to handle
both version 1 and version 2 of the hook. In order to do that a
config value is being added core.fsmonitorHookVersion to force what
version of the hook should be used. When this is not set it will default
to -1 and then the code will attempt to call version 2 of the hook first.
If that fails it will fallback to trying version 1.

Signed-off-by: Kevin Willford <redacted>
Signed-off-by: Junio C Hamano <redacted>

fsmonitor: change last update timestamp on the index_state to opaque token

Some file system monitors might not use or take a timestamp for processing
and in the case of watchman could have race conditions with using a
timestamp. Watchman uses something called a clockid that is used for race
free queries to it. The clockid for watchman is simply a string.

Change the fsmonitor_last_update from being a uint64_t to a char pointer
so that any arbitrary data can be stored in it and passed back to the
fsmonitor.

Signed-off-by: Kevin Willford <redacted>
Signed-off-by: Junio C Hamano <redacted>

Git 2.25

Signed-off-by: Junio C Hamano <redacted>

Merge tag 'l10n-2.25.0-rnd1' of git://github.com/git-l10n/git-po

l10n-2.25.0-rnd1

* tag 'l10n-2.25.0-rnd1' of git://github.com/git-l10n/git-po:
  l10n: zh_CN: for git v2.25.0 l10n round 1
  l10n: Update Catalan translation
  l10n: de.po: Update German translation v2.25.0 round 1
  l10n: de.po: Reword generation numbers
  l10n: bg.po: Updated Bulgarian translation (4800t)
  l10n: es: 2.25.0 round #1
  l10n: sv.po: Update Swedish translation (4800t0f0u)
  l10n: fr.po v2.25.0 rnd 1
  l10n: vi(4800t): Updated Vietnamese translation v2.25.0
  l10n: zh_TW.po: update translation for v2.25.0 round 1
  l10n: it.po: update the Italian translation for Git 2.25.0
  l10n: git.pot: v2.25.0 round 1 (119 new, 13 removed)
  l10n: Update Catalan translation
  l10n: zh_TW: add translation for v2.24.0

Revert "Merge branch 'ra/rebase-i-more-options'"

This reverts commit 5d9324e0f4210bb7d52bcb79efe3935703083f72, reversing
changes made to c58ae96fc4bb11916b62a96940bb70bb85ea5992.

The topic turns out to be too buggy for real use.

cf. <redacted>

l10n: zh_CN: for git v2.25.0 l10n round 1

Translate 119 new messages (4800t0f0u) for git 2.25.0.

Signed-off-by: Jiang Xin <redacted>

Merge branch 'master' of github.com:Softcatala/git-po into git-po-master

* 'master' of github.com:Softcatala/git-po:
l10n: Update Catalan translation

Merge branch 'js/mingw-loosen-overstrict-tree-entry-checks'

Further tweak to a "no backslash in indexed paths" for Windows port
we applied earlier.

* js/mingw-loosen-overstrict-tree-entry-checks:
mingw: safeguard better against backslashes in file names

Merge branch 'ma/config-advice-markup-fix'

Documentation markup fix.

* ma/config-advice-markup-fix:
config/advice.txt: fix description list separator

l10n: Update Catalan translation

Signed-off-by: Jordi Mas <redacted>

mingw: safeguard better against backslashes in file names

In 224c7d70fa1 (mingw: only test index entries for backslashes, not tree
entries, 2019-12-31), we relaxed the check for backslashes in tree
entries to check only index entries.

However, the code change was incorrect: it was added to
`add_index_entry_with_check()`, not to `add_index_entry()`, so under
certain circumstances it was possible to side-step the protection.

Besides, the description of that commit purported that all index entries
would be checked when in fact they were only checked when being added to
the index (there are code paths that do not do that, constructing
"transient" index entries).

In any case, it was pointed out in one insightful review at
https://github.com/git-for-windows/git/pull/2437#issuecomment-566771835
that it would be a much better idea to teach `verify_path()` to perform
the check for a backslash. This is safer, even if it comes with two
notable drawbacks:

- `verify_path()` cannot say _what_ is wrong with the path, therefore
  the user will no longer be told that there was a backslash in the
  path, only that the path was invalid.

- The `git apply` command also calls the `verify_path()` function, and
  might have been able to handle Windows-style paths (i.e. with
  backslashes instead of forward slashes). This will no longer be
  possible unless the user (temporarily) sets `core.protectNTFS=false`.

Note that `git add <windows-path>` will _still_ work because
`normalize_path_copy_len()` will convert the backslashes to forward
slashes before hitting the code path that creates an index entry.

The clear advantage is that `verify_path()`'s purpose is to check the
validity of the file name, therefore we naturally tap into all the code
paths that need safeguarding, also implicitly into future code paths.

The benefits of that approach outweigh the downsides, so let's move the
check from `add_index_entry_with_check()` to `verify_path()`.

Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>

unpack-trees: correctly compute result count

The clear_ce_flags_dir() method processes the cache entries within
a common directory. The returned int is the number of cache entries
processed by that directory. When using the sparse-checkout feature
in cone mode, we can skip the pattern matching for entries in the
directories that are entirely included or entirely excluded.

eb42feca (unpack-trees: hash less in cone mode, 2019-11-21)
introduced this performance feature. The old mechanism relied on
the counts returned by calling clear_ce_flags_1(), but the new
mechanism calculated the number of rows by subtracting "cache_end"
from "cache" to find the size of the range. However, the equation
is wrong because it divides by sizeof(struct cache_entry *). This
is not how pointer arithmetic works!

A coverity build of Git for Windows in preparation for the 2.25.0
release found this issue with the warning, "Pointer differences,
such as cache_end - cache, are automatically scaled down by the
size (8 bytes) of the pointed-to type (struct cache_entry *).
Most likely, the division by sizeof(struct cache_entry *) is
extraneous and should be eliminated." This warning is correct.

This leaves us with the question "how did this even work?" The
problem that occurs with this incorrect pointer arithmetic is
a performance-only bug, and a very slight one at that. Since
the entry count returned by clear_ce_flags_dir() is reduced by
a factor of 8, the loop in clear_ce_flags_1() will re-process
entries from those directories.

By inserting global counters into unpack-tree.c and tracing
them with trace2_data_intmax() (in a private change, for
testing), I was able to see count how many times the loop inside
clear_ce_flags_1() processed an entry and how many times
clear_ce_flags_dir() was called. Each of these are reduced by at
least a factor of 8 with the current change. A factor larger
than 8 happens when multiple levels of directories are repeated.

Specifically, in the Linux kernel repo, the command

git sparse-checkout set LICENSES

restricts the working directory to only the files at root and
in the LICENSES directory. Here are the measured counts:

clear_ce_flags_1 loop blocks:
Before: 11,520
After: 1,621

clear_ce_flags_dir calls:
Before: 7,048
After: 606

While these are dramatic counts, the time spent in
clear_ce_flags_1() is under one millisecond in each case, so
the improvement is not measurable as an end-to-end time.

Reported-by: Johannes Schindelin <redacted>
Signed-off-by: Derrick Stolee <redacted>
Signed-off-by: Junio C Hamano <redacted>

l10n: de.po: Update German translation v2.25.0 round 1

Signed-off-by: Matthias Rüster <redacted>
Reviewed-by: Ralf Thielow <redacted>
Reviewed-by: Phillip Szelat <redacted>

l10n: de.po: Reword generation numbers

The english term generation is here not used in the sense of "to
generate" but in the sense of "generations of beings".

This corrects the initial translation from cf4c0c25 (l10n: update German
translation, 2018-12-06).

Fixed-by: SZEDER Gábor <redacted>
Signed-off-by: Ralf Thielow <redacted>

l10n: bg.po: Updated Bulgarian translation (4800t)

Signed-off-by: Alexander Shopov <redacted>

config/advice.txt: fix description list separator

The whole submoduleAlternateErrorStrategyDie item is interpreted as
being part of the supporting content of the preceding item. This is
because we don't give a double-colon "::" for the separator, but just a
single colon, ":". Let's fix that.

There are a few other matches for [^:]:\s*$ in Documentation/config, but
I didn't spot any similar bugs among them.

Signed-off-by: Martin Ågren <redacted>
Signed-off-by: Junio C Hamano <redacted>

Git 2.25-rc2

Signed-off-by: Junio C Hamano <redacted>

Merge branch 'ds/graph-assert-fix'

Since recent updates to the log graph rendering code, drawing
certain merges started triggering an assert on a condition that
would no longer hold true, which has been corrected.

* ds/graph-assert-fix:
graph: fix lack of color in horizontal lines
graph: drop assert() for merge with two collapsing parents

Merge branch 'tm/doc-submodule-absorb-fix'

Typofix.

* tm/doc-submodule-absorb-fix:
doc: submodule: fix typo for command absorbgitdirs

Merge branch 'pm/am-in-body-header-doc-update'

Doc update.

* pm/am-in-body-header-doc-update:
am: document that Date: can appear as an in-body header

Merge branch 'jb/doc-multi-pack-idx-fix'

Typofix.

* jb/doc-multi-pack-idx-fix:
multi-pack-index: correct configuration in documentation

Merge branch 'do/gitweb-typofix-in-comments'

Typofix.

* do/gitweb-typofix-in-comments:
gitweb: fix a couple spelling errors in comments

Merge https://github.com/prati0100/git-gui

* https://github.com/prati0100/git-gui:
  git-gui: allow opening currently selected file in default app
  git-gui: allow closing console window with Escape
  git gui: fix branch name encoding error
  git-gui: revert untracked files by deleting them
  git-gui: update status bar to track operations
  git-gui: consolidate naming conventions

graph: fix lack of color in horizontal lines

In some cases, horizontal lines in rendered graphs can lose their
coloring. This is due to a use of graph_line_addch() instead of
graph_line_write_column(). Using a ternary operator to pick the
character is nice for compact code, but we actually need a column to
provide the color.

Add a test to t4215-log-skewed-merges.sh to prevent regression.

Reported-by: Jeff King <redacted>
Signed-off-by: Derrick Stolee <redacted>
Signed-off-by: Junio C Hamano <redacted>

graph: drop assert() for merge with two collapsing parents

When "git log --graph" shows a merge commit that has two collapsing
lines, like:

    | | | | *
    | |_|_|/|
    |/| | |/
    | | |/|
    | |/| |
    | * | |
    * | | |

we trigger an assert():

        graph.c:1228: graph_output_collapsing_line: Assertion
                      `graph->mapping[i - 3] == target' failed.

The assert was introduced by eaf158f8 ("graph API: Use horizontal
lines for more compact graphs", 2009-04-21), which is quite old.
This assert is trying to say that when we complete a horizontal
line with a single slash, it is because we have reached our target.

It is actually the _second_ collapsing line that hits this assert.
The reason we are in this code path is because we are collapsing
the first line, and in that case we are hitting our target now
that the horizontal line is complete. However, the second line
cannot be a horizontal line, so it will collapse without horizontal
lines. In this case, it is inappropriate to assert that we have
reached our target, as we need to continue for another column
before reaching the target. Dropping the assert is safe here.

The new behavior in 0f0f389f12 (graph: tidy up display of
left-skewed merges, 2019-10-15) caused the behavior change that
made this assertion failure possible. In addition to making the
assert possible, it also changed how multiple edges collapse.

In a larger example, the current code will output a collapse
as follows:

| | | | | | *
| |_|_|_|_|/|\
|/| | | | |/ /
| | | | |/| /
| | | |/| |/
| | |/| |/|
| |/| |/| |
| | |/| | |
| | * | | |

However, the intended collapse should allow multiple horizontal lines
as follows:

| | | | | | *
| |_|_|_|_|/|\
|/| | | | |/ /
| | |_|_|/| /
| |/| | | |/
| | | |_|/|
| | |/| | |
| | * | | |

This behavior is not corrected by this change, but is noted for a later
update.

Helped-by: Jeff King <redacted>
Reported-by: Bradley Smith <redacted>
Signed-off-by: Derrick Stolee <redacted>
Signed-off-by: Junio C Hamano <redacted>