git.99rst.org Git - git.git/log

revision.c: fix whitespace

Here, four spaces were used instead of tab characters.

Reported-by: Taylor Blau <redacted>
Signed-off-by: Derrick Stolee <redacted>
Signed-off-by: Junio C Hamano <redacted>

commit-graph: check chunk sizes after writing

In my experience while experimenting with new commit-graph chunks,
early versions of the corresponding new write_commit_graph_my_chunk()
functions are, sadly but not surprisingly, often buggy, and write more
or less data than they are supposed to, especially if the chunk size
is not directly proportional to the number of commits. This then
causes all kinds of issues when reading such a bogus commit-graph
file, raising the question of whether the writing or the reading part
happens to be buggy this time.

Let's catch such issues early, already when writing the commit-graph
file, and check that each write_graph_chunk_*() function wrote the
amount of data that it was expected to, and what has been encoded in
the Chunk Lookup table. Now that all commit-graph chunks are written
in a loop we can do this check in a single place for all chunks, and
any chunks added in the future will get checked as well.

Helped-by: René Scharfe <redacted>
Signed-off-by: SZEDER Gábor <redacted>
Signed-off-by: Derrick Stolee <redacted>
Signed-off-by: Junio C Hamano <redacted>

commit-graph: simplify chunk writes into loop

In write_commit_graph_file() we now have one block of code filling the
array of 'struct chunk_info' with the IDs and sizes of chunks to be
written, and an other block of code calling the functions responsible
for writing individual chunks. In case of optional chunks like Extra
Edge List an Base Graphs List there is also a condition checking
whether that chunk is necessary/desired, and that same condition is
repeated in both blocks of code. Other, newer chunks have similar
optional conditions.

Eliminate these repeated conditions by storing the function pointers
responsible for writing individual chunks in the 'struct chunk_info'
array as well, and calling them in a loop to write the commit-graph
file. This will open up the possibility for a bit of foolproofing in
the following patch.

Helped-by: René Scharfe <redacted>
Signed-off-by: SZEDER Gábor <redacted>
Signed-off-by: Derrick Stolee <redacted>
Signed-off-by: Junio C Hamano <redacted>

commit-graph: unify the signatures of all write_graph_chunk_*() functions

Update the write_graph_chunk_*() helper functions to have the same
signature:

  - Return an int error code from all these functions.
    write_graph_chunk_base() already has an int error code, now the
    others will have one, too, but since they don't indicate any
    error, they will always return 0.

  - Drop the hash size parameter of write_graph_chunk_oids() and
    write_graph_chunk_data(); its value can be read directly from
    'the_hash_algo' inside these functions as well.

This opens up the possibility for further cleanups and foolproofing in
the following two patches.

Helped-by: René Scharfe <redacted>
Signed-off-by: SZEDER Gábor <redacted>
Signed-off-by: Derrick Stolee <redacted>
Signed-off-by: Junio C Hamano <redacted>

commit-graph: persist existence of changed-paths

The changed-path Bloom filters were released in v2.27.0, but have a
significant drawback. A user can opt-in to writing the changed-path
filters using the "--changed-paths" option to "git commit-graph write"
but the next write will drop the filters unless that option is
specified.

This becomes even more important when considering the interaction with
gc.writeCommitGraph (on by default) or fetch.writeCommitGraph (part of
features.experimental). These config options trigger commit-graph writes
that the user did not signal, and hence there is no --changed-paths
option available.

Allow a user that opts-in to the changed-path filters to persist the
property of "my commit-graph has changed-path filters" automatically. A
user can drop filters using the --no-changed-paths option.

In the process, we need to be extremely careful to match the Bloom
filter settings as specified by the commit-graph. This will allow future
versions of Git to customize these settings, and the version with this
change will persist those settings as commit-graphs are rewritten on
top.

Use the trace2 API to signal the settings used during the write, and
check that output in a test after manually adjusting the correct bytes
in the commit-graph file.

Signed-off-by: Derrick Stolee <redacted>
Signed-off-by: Junio C Hamano <redacted>

bloom: fix logic in get_bloom_filter()

The get_bloom_filter() method is a bit complicated in some parts where
it does not need to be. In particular, it needs to return a NULL filter
only when compute_if_not_present is zero AND the filter data cannot be
loaded from a commit-graph file. This currently happens by accident
because the commit-graph does not load changed-path Bloom filters from
an existing commit-graph when writing a new one. This will change in a
later patch.

Also clean up some style issues while we are here.

One side-effect of returning a NULL filter is that the filters that are
reported as "too large" will now be reported as NULL insead of length
zero. This case was not properly covered before, so add a test. Further,
remote the counting of the zero-length filters from revision.c and the
trace2 logs.

Helped-by: René Scharfe <redacted>
Helped-by: SZEDER Gábor <redacted>
Signed-off-by: Derrick Stolee <redacted>
Signed-off-by: Junio C Hamano <redacted>

contrib: subtree: adjust test to change in fmt-merge-msg

We're starting to stop treating `master' specially in fmt-merge-msg.
Adjust the test to reflect that change.

Signed-off-by: Đoàn Trần Công Danh <redacted>
Signed-off-by: Junio C Hamano <redacted>

The sixth batch

Signed-off-by: Junio C Hamano <redacted>

Merge branch 'sk/diff-files-show-i-t-a-as-new'

"git diff-files" has been taught to say paths that are marked as
intent-to-add are new files, not modified from an empty blob.

* sk/diff-files-show-i-t-a-as-new:
diff-files: treat "i-t-a" files as "not-in-index"

Merge branch 'rs/commit-reach-leakfix'

Leakfix.

* rs/commit-reach-leakfix:
commit-reach: plug minor memory leak after using is_descendant_of()

Merge branch 'rs/pull-leakfix'

Leakfix.

* rs/pull-leakfix:
pull: plug minor memory leak after using is_descendant_of()

Merge branch 'rs/retire-strbuf-write-fd'

A misdesigned strbuf_write_fd() function has been retired.

* rs/retire-strbuf-write-fd:
strbuf: remove unreferenced strbuf_write_fd method.
bugreport.c: replace strbuf_write_fd with write_in_full

Merge branch 'dl/diff-usage-comment-update'

An in-code comment in "git diff" has been updated.

* dl/diff-usage-comment-update:
builtin/diff: fix botched update of usage comment
builtin/diff: update usage comment

Merge branch 'xl/upgrade-repo-format'

Allow runtime upgrade of the repository format version, which needs
to be done carefully.

There is a rather unpleasant backward compatibility worry with the
last step of this series, but it is the right thing to do in the
longer term.

* xl/upgrade-repo-format:
  check_repository_format_gently(): refuse extensions for old repositories
  sparse-checkout: upgrade repository to version 1 when enabling extension
  fetch: allow adding a filter after initial clone
  repository: add a helper function to perform repository format upgrade

ci: modification of main.yml to use cmake for vs-build job

Teach .github/workflows/main.yml to use CMake for VS builds.

Modified the vs-test step to match windows-test step. This speeds
up the vs-test. Calling git-cmd from powershell and then calling git-bash
to perform the tests slows things down(factor of about 6). So git-bash
is directly called from powershell to perform the tests using prove.

NOTE: Since GitHub keeps the same directory for each job
(with respect to path) absolute paths are used in the bin-wrapper
scripts.

GitHub has switched to CMake 3.17.1 which changed the behaviour of
FindCURL module. An extra definition (-DCURL_NO_CURL_CMAKE=ON) has been
added to revert to the old behaviour.

In the configuration phase CMake looks for the required libraries for
building git (eg zlib,libiconv). So we extract the libraries before we
configure.

To check for ICONV_OMITS_BOM libiconv.dll needs to be in the working
directory of script or path. So we copy the dlls before we configure.

Signed-off-by: Sibi Siddharthan <redacted>
Signed-off-by: Junio C Hamano <redacted>

cmake: support for building git on windows with msvc and clang.

This patch adds support for Visual Studio and Clang builds

The minimum required version of CMake is upgraded to 3.15 because
this version offers proper support for Clang builds on Windows.

Libintl is not searched for when building with Visual Studio or Clang
because there is no binary compatible version available yet.

NOTE: In the link options invalidcontinue.obj has to be included.
The reason for this is because by default, Windows calls abort()'s
instead of setting errno=EINVAL when invalid arguments are passed to
standard functions.
This commit explains it in detail:
4b623d80f73528a632576990ca51e34c333d5dd6

On Windows the default generator is Visual Studio,so for Visual Studio
builds do this:

cmake `relative-path-to-srcdir`

NOTE: Visual Studio generator is a multi config generator, which means
that Debug and Release builds can be done on the same build directory.

For Clang builds do this:

On bash
CC=clang cmake `relative-path-to-srcdir` -G Ninja
-DCMAKE_BUILD_TYPE=[Debug or Release]

On cmd
set CC=Clang
cmake `relative-path-to-srcdir` -G Ninja
-DCMAKE_BUILD_TYPE=[Debug or Release]

Signed-off-by: Sibi Siddharthan <redacted>
Signed-off-by: Junio C Hamano <redacted>

cmake: support for building git on windows with mingw

This patch facilitates building git on Windows with CMake using MinGW

NOTE: The funtions unsetenv and hstrerror are not checked in Windows
builds.
Reasons
NO_UNSETENV is not compatible with Windows builds.
lines 262-264 compat/mingw.h

compat/mingw.h(line 25) provides a definition of hstrerror which
conflicts with the definition provided in
git-compat-util.h(lines 733-736).

To use CMake on Windows with MinGW do this:
cmake `relative-path-to-srcdir` -G "MinGW Makefiles"

Signed-off-by: Sibi Siddharthan <redacted>
Signed-off-by: Junio C Hamano <redacted>

cmake: support for testing git when building out of the source tree

This patch allows git to be tested when performin out of source builds.

This involves changing GIT_BUILD_DIR in t/test-lib.sh to point to the
build directory. Also some miscellaneous copies from the source directory
to the build directory.
The copies are:
t/chainlint.sed needed by a bunch of test scripts
po/is.po needed by t0204-gettext-rencode-sanity
mergetools/tkdiff needed by t7800-difftool
contrib/completion/git-prompt.sh needed by t9903-bash-prompt
contrib/completion/git-completion.bash needed by t9902-completion
contrib/svn-fe/svnrdump_sim.py needed by t9020-remote-svn

NOTE: t/test-lib.sh is only modified when tests are run not during
the build or configure.
The trash directory is still srcdir/t

Signed-off-by: Sibi Siddharthan <redacted>
Signed-off-by: Junio C Hamano <redacted>

cmake: support for testing git with ctest

This patch provides an alternate way to test git using ctest.
CTest ships with CMake, so there is no additional dependency being
introduced.

To perform the tests with ctest do this after building:
ctest -j[number of jobs]

NOTE: -j is optional, the default number of jobs is 1

Each of the jobs does this:
cd t/ && sh t[something].sh

The reason for using CTest is that it logs the output of the tests
in a neat way, which can be helpful during diagnosis of failures.

After the tests have run ctest generates three log files located in
`build-directory`/Testing/Temporary/

These log files are:

CTestCostData.txt:
This file contains the time taken to complete each test.

LastTestsFailed.log:
This log file contains the names of the tests that have failed in the
run.

LastTest.log:
This log file contains the log of all the tests that have run.
A snippet of the file is given below.

10/901 Testing: D:/my/git-master/t/t0009-prio-queue.sh
10/901 Test: D:/my/git-master/t/t0009-prio-queue.sh
Command: "sh.exe" "D:/my/git-master/t/t0009-prio-queue.sh"
Directory: D:/my/git-master/t
"D:/my/git-master/t/t0009-prio-queue.sh"
Output:
----------------------------------------------------------
ok 1 - basic ordering
ok 2 - mixed put and get
ok 3 - notice empty queue
ok 4 - stack order
passed all 4 test(s)
1..4
<end of output>
Test time = 1.11 sec

NOTE: Testing only works when building in source for now.

Signed-off-by: Sibi Siddharthan <redacted>
Signed-off-by: Junio C Hamano <redacted>

cmake: installation support for git

Install the built binaries and scripts using CMake

This is very similar to `make install`.
By default the destination directory(DESTDIR) is /usr/local/ on Linux
To set a custom installation path do this:
cmake `relative-path-to-srcdir`
-DCMAKE_INSTALL_PREFIX=`preferred-install-path`

Then run `make install`

Signed-off-by: Sibi Siddharthan <redacted>
Signed-off-by: Junio C Hamano <redacted>

cmake: generate the shell/perl/python scripts and templates, translations

Implement the placeholder substitution to generate scripted
Porcelain commands, e.g. git-request-pull out of
git-request-pull.sh

Generate shell/perl/python scripts and template using CMake instead of
using sed like the build procedure in the Makefile does.

The text translations are only build if `msgfmt` is found in your path.

NOTE: The scripts and templates are generated during configuration.

Signed-off-by: Sibi Siddharthan <redacted>
Signed-off-by: Junio C Hamano <redacted>

fast-export: use local array to store anonymized oid

Some older versions of gcc complain about this line:

  builtin/fast-export.c:412:2: error: dereferencing type-punned pointer
       will break strict-aliasing rules [-Werror=strict-aliasing]
    put_be32(oid.hash + hashsz - 4, counter++);
    ^

This seems to be a false positive, as there's no type-punning at all
here. oid.hash is an array of unsigned char; when we pass it to a
function it decays to a pointer to unsigned char. We do take a void
pointer in put_be32(), but it's immediately aliased with another pointer
to unsigned char (and clearly the compiler is looking inside the inlined
put_be32(), since the warning doesn't happen with -O0).

This happens on gcc 4.8 and 4.9, but not later versions (I tested gcc 6,
7, 8, and 9).

We can work around it by using a local array instead of an object_id
struct. This is a little more intimate with the details of object_id,
but for whatever reason doesn't seem to trigger the compiler warning.
We can revert this patch once we decide that those gcc versions are too
old to care about for a warning like this (gcc 4.8 is the default
compiler for Ubuntu Trusty, which is out-of-support but not fully
end-of-life'd until April 2022).

Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>

fast-export: anonymize "master" refname

Running "fast-export --anonymize" will leave "refs/heads/master"
untouched in the output, for two reasons:

  - it helped to have some known reference point between the original
    and anonymized repository

  - since it's historically the default branch name, it doesn't leak any
    information

Now that we can ask fast-export to retain particular tokens, we have a
much better tool for the first one (because it works for any ref, not
just master).

For the second, the notion of "default branch name" is likely to become
configurable soon, at which point the name _does_ leak information.
Let's drop this special case in preparation.

Note that we have to adjust the test a bit, since it relied on using the
name "master" in the anonymized repos. We could just use
--anonymize-map=master to keep the same output, but then we wouldn't
know if it works because of our hard-coded master or because of the
explicit map.

So let's flip the test a bit, and confirm that we anonymize "master",
but keep "other" in the output.

Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>

fast-export: allow seeding the anonymized mapping

After you anonymize a repository, it can be hard to find which commits
correspond between the original and the result, and thus hard to
reproduce commands that triggered bugs in the original.

Let's make it possible to seed the anonymization map. This lets users
either:

  - mark names to be retained as-is, if they don't consider them secret
    (in which case their original commands would just work)

  - map names to new values, which lets them adapt the reproduction
    recipe to the new names without revealing the originals

The implementation is fairly straight-forward. We already store each
anonymized token in a hashmap (so that the same token appearing twice is
converted to the same result). We can just introduce a new "seed"
hashmap which is consulted first.

This does make a few more promises to the user about how we'll anonymize
things (e.g., token-splitting pathnames). But it's unlikely that we'd
want to change those rules, even if the actual anonymization of a single
token changes. And it makes things much easier for the user, who can
unblind only a directory name without having to specify each path within
it.

One alternative to this approach would be to anonymize as we see fit,
and then dump the whole refname and pathname mappings to a file. This
does work, but it's a bit awkward to use (you have to manually dig the
items you care about out of the mapping).

Helped-by: Eric Sunshine <redacted>
Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>

The fifth batch

Signed-off-by: Junio C Hamano <redacted>

Merge branch 'pb/t4014-unslave'

A branch name used in a test has been clarified to match what is
going on.

* pb/t4014-unslave:
t4014: do not use "slave branch" nomenclature

Merge branch 'jt/cdn-offload'

The "fetch/clone" protocol has been updated to allow the server to
instruct the clients to grab pre-packaged packfile(s) in addition
to the packed object data coming over the wire.

* jt/cdn-offload:
  upload-pack: fix a sparse '0 as NULL pointer' warning
  upload-pack: send part of packfile response as uri
  fetch-pack: support more than one pack lockfile
  upload-pack: refactor reading of pack-objects out
  Documentation: add Packfile URIs design doc
  Documentation: order protocol v2 sections
  http-fetch: support fetching packfiles by URL
  http-fetch: refactor into function
  http: refactor finish_http_pack_request()
  http: use --stdin when indexing dumb HTTP pack

Merge branch 'ss/submodule-set-branch-in-c'

Rewrite of parts of the scripted "git submodule" Porcelain command
continues; this time it is "git submodule set-branch" subcommand's
turn.

* ss/submodule-set-branch-in-c:
submodule: port subcommand 'set-branch' from shell to C

Merge branch 'ds/merge-base-is-ancestor-optim'

"git merge-base --is-ancestor" is taught to take advantage of the
commit graph.

* ds/merge-base-is-ancestor-optim:
commit-reach: use fast logic in repo_in_merge_base
commit-reach: create repo_is_descendant_of()

Merge branch 'dl/branch-cleanup'

Code clean-up around "git branch" with a minor bugfix.

* dl/branch-cleanup:
  branch: don't mix --edit-description
  t3200: test for specific errors
  t3200: rename "expected" to "expect"

Merge branch 'cc/upload-pack-data-3'

Code clean-up in the codepath that serves "git fetch" continues.

* cc/upload-pack-data-3:
  upload-pack: refactor common code into do_got_oid()
  upload-pack: move oldest_have to upload_pack_data
  upload-pack: pass upload_pack_data to got_oid()
  upload-pack: pass upload_pack_data to ok_to_give_up()
  upload-pack: pass upload_pack_data to send_acks()
  upload-pack: pass upload_pack_data to process_haves()
  upload-pack: change allow_unadvertised_object_request to an enum
  upload-pack: move allow_unadvertised_object_request to upload_pack_data
  upload-pack: move extra_edge_obj to upload_pack_data
  upload-pack: move shallow_nr to upload_pack_data
  upload-pack: pass upload_pack_data to send_unshallow()
  upload-pack: pass upload_pack_data to deepen_by_rev_list()
  upload-pack: pass upload_pack_data to deepen()
  upload-pack: pass upload_pack_data to send_shallow_list()

Merge branch 'ct/diff-with-merge-base-clarification'

"git diff" used to take arguments in random and nonsense range
notation, e.g. "git diff A..B C", "git diff A..B C...D", etc.,
which has been cleaned up.

* ct/diff-with-merge-base-clarification:
  Documentation: usage for diff combined commits
  git diff: improve range handling
  t/t3430: avoid undefined git diff behavior

Merge branch 'en/clean-cleanups'

Code clean-up of "git clean" resulted in a fix of recent
performance regression.

* en/clean-cleanups:
  clean: optimize and document cases where we recurse into subdirectories
  clean: consolidate handling of ignored parameters
  dir, clean: avoid disallowed behavior
  dir: fix a few confusing comments

Merge branch 'jk/complete-git-switch'

The command line completion (in contrib/) learned to complete
options that the "git switch" command takes.

* jk/complete-git-switch:
  completion: improve handling of --orphan option of switch/checkout
  completion: improve handling of -c/-C and -b/-B in switch/checkout
  completion: improve handling of --track in switch/checkout
  completion: improve handling of --detach in checkout
  completion: improve completion for git switch with no options
  completion: improve handling of DWIM mode for switch/checkout
  completion: perform DWIM logic directly in __git_complete_refs
  completion: extract function __git_dwim_remote_heads
  completion: replace overloaded track term for __git_complete_refs
  completion: add tests showing subpar switch/checkout --orphan logic
  completion: add tests showing subpar -c/C argument completion
  completion: add tests showing subpar -c/-C startpoint completion
  completion: add tests showing subpar switch/checkout --track logic
  completion: add tests showing subar checkout --detach logic
  completion: add tests showing subpar DWIM logic for switch/checkout
  completion: add test showing subpar git switch completion

tests: reference `seen` wherever `pu` was referenced

As our test suite partially reflects how we work in the Git project, it
is natural that the branch name `pu` was used in a couple places.

Since that branch was renamed to `seen`, let's use the new name
consistently.

Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>

docs: adjust the technical overview for the rename `pu` -> `seen`

This patch tries to rewrite history a bit: the mail contents that have
been added to Git's source code are actually fixed, we cannot change
them in hindsight.

But as the `pu` branch _was_ renamed, and as the documents were added to
Git's source code not so much as historical record, but to describe the
status quo, let's pretend that we have a time machine and adjust the
provided information accordingly.

Where appropriate, quotes were added for readability.

Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>

docs: adjust for the recent rename of `pu` to `seen`

As of "What's cooking in git.git (Jun 2020, #04; Mon, 22)", there is no
longer any `pu` branch, but a `seen` branch.

While we technically do not even need to update the manual pages, it
makes sense to update them because they clearly talk about branches in
git.git.

Please note that in two instances, this patch not only updates the
branch name, but also the description "(proposed updates)".

Where appropriate, quotes have been added for readability.

Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>

worktree: avoid dead-code in conditional

get_worktrees() retrieves a list of all worktrees associated with a
repository, including the main worktree. The location of the main
worktree is determined by get_main_worktree() which needs to handle
three distinct cases for the main worktree after absolute-path
conversion:

    * <bare-repository>/.
    * <main-worktree>/.git/. (when $CWD is .git)
    * <main-worktree>/.git (when $CWD is any worktree)

They all need to be normalized to just the <path> portion, dropping any
"/." or "/.git" suffix.

It turns out, however, that get_main_worktree() was only handling the
first and last cases, i.e.:

    if (!strip_suffix(path, "/.git"))
        strip_suffix(path, "/.");

This shortcoming was addressed by 45f274fbb1 (get_main_worktree(): allow
it to be called in the Git directory, 2020-02-23) by changing the logic
to:

    strip_suffix(path, "/.");
    if (!strip_suffix(path, "/.git"))
        strip_suffix(path, "/.");

which makes the final strip_suffix() invocation dead-code.

Fix this oversight by enumerating the three distinct cases explicitly
rather than attempting to strip the suffix(es) incrementally.

Signed-off-by: Eric Sunshine <redacted>
Signed-off-by: Junio C Hamano <redacted>

testsvn: respect `init.defaultBranch`

The default name of the initial branch in new repositories can now be
configured. The `testsvn` remote helper translates the remote Subversion
repository's branch name `trunk` to the hard-coded name `master`.
Clearly, the intention was to make the name align with Git's defaults.

So while we are not talking about a newly-created repository in the
`testsvn` context, it is a newly-created _Git_ repository, si it _still_
makes sense to use the overridden default name for the initial branch
whenever users configured it.

Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>

remote: use the configured default branch name when appropriate

When guessing the default branch name of a remote, and there are no refs
to guess from, we want to go with the preference specified by the user
for the fall-back, i.e. the default name to be used for the initial
branch of new repositories (because as far as the user is concerned, a
remote that has no branches yet is a new repository).

At the same time, when talking to an older Git server that does not
report a symref for `HEAD` (but instead reports a commit hash), let's
try to guess the configured default branch name first. If it does not
match the reported commit hash, let's fall back to `master` as before.

Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>

clone: use configured default branch name when appropriate

When cloning a repository without any branches, Git chooses a default
branch name for the as-yet unborn branch.

As part of the implicit initialization of the local repository, Git just
learned to respect `init.defaultBranch` to choose a different initial
branch name. We now really want that branch name to be used as a
fall-back.

Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>

init: allow setting the default for the initial branch name via the config

We just introduced the command-line option
`--initial-branch=<branch-name>` to allow initializing a new repository
with a different initial branch than the hard-coded one.

To allow users to override the initial branch name more permanently
(i.e. without having to specify the name manually for each and every
`git init` invocation), let's introduce the `init.defaultBranch` config
setting.

Helped-by: Johannes Schindelin <redacted>
Helped-by: Derrick Stolee <redacted>
Signed-off-by: Don Goodman-Wilson <redacted>
Signed-off-by: Junio C Hamano <redacted>

init: allow specifying the initial branch name for the new repository

There is a growing number of projects and companies desiring to change
the main branch name of their repositories (see e.g.
https://twitter.com/mislav/status/1270388510684598272 for background on
this).

To change that branch name for new repositories, currently the only way
to do that automatically is by copying all of Git's template directory,
then hard-coding the desired default branch name into the `.git/HEAD`
file, and then configuring `init.templateDir` to point to those copied
template files.

To make this process much less cumbersome, let's introduce a new option:
`--initial-branch=<branch-name>`.

Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>

docs: add missing diamond brackets

There were a couple of instances in our manual pages that had an
opening diamond bracket without a corresponding closing one.

Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>

submodule: fall back to remote's HEAD for missing remote.<name>.branch

When `remote.<name>.branch` is not configured, `git submodule update`
currently falls back to using the branch name `master`. A much better
idea, however, is to use the remote `HEAD`: on all Git servers running
reasonably recent Git versions, the symref `HEAD` points to the main
branch.

Note: t7419 demonstrates that there _might_ be use cases out there that
_expect_ `git submodule update --remote` to update submodules to the
remote `master` branch even if the remote `HEAD` points to another
branch. Arguably, this patch makes the behavior more intuitive, but
there is a slight possibility that this might cause regressions in
obscure setups.

Even so, it should be okay to fix this behavior without anything like a
longer transition period:

- The `git submodule update --remote` command is not really common.

- Current Git's behavior when running this command is outright
  confusing, unless the remote repository's current branch _is_ `master`
  (in which case the proposed behavior matches the old behavior).

- If a user encounters a regression due to the changed behavior, the fix
  is actually trivial: setting `submodule.<name>.branch` to `master`
  will reinstate the old behavior.

Helped-by: Philippe Blain <redacted>
Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>

send-pack/transport-helper: avoid mentioning a particular branch

When trying to push all matching branches, but none match, we offer a
message suggesting to push the `master` branch.

However, we want to step away from making that branch any more special
than any other branch, so let's reword that message to mention no branch
in particular.

Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>

submodule: use submodule repository when preparing summary

In show_submodule_header(), we gather the left and right commits
of the submodule repository, as well as the merge bases. However,
prepare_submodule_summary() initializes the rev_info with the_repository,
so we end up parsing the commit in the wrong repository.

This results in a fatal error in parse_commit_in_graph(), since the
passed item does not belong to the repository's commit graph.

Signed-off-by: Michael Forney <redacted>
Acked-by: Derrick Stolee <redacted>
Signed-off-by: Junio C Hamano <redacted>

revision: use repository from rev_info when parsing commits

This is needed when repo_init_revisions() is called with a repository
that is not the_repository to ensure appropriate repository is used
in repo_parse_commit_internal(). If the wrong repository is used,
a fatal error is the commit-graph machinery occurs:

fatal: invalid commit position. commit-graph is likely corrupt

Since revision.c was the only user of the parse_commit_gently
compatibility define, remove it from commit.h.

Signed-off-by: Michael Forney <redacted>
Acked-by: Derrick Stolee <redacted>
Signed-off-by: Junio C Hamano <redacted>

revision: reallocate TOPO_WALK object flags

The bit fields in struct object have an unfortunate layout.  Here's what
pahole reports on x86_64 GNU/Linux:

struct object {
unsigned int               parsed:1;             /*     0: 0  4 */
unsigned int               type:3;               /*     0: 1  4 */

/* XXX 28 bits hole, try to pack */

/* Force alignment to the next boundary: */
unsigned int               :0;

unsigned int               flags:29;             /*     4: 0  4 */

/* XXX 3 bits hole, try to pack */

struct object_id           oid;                  /*     8    32 */

/* size: 40, cachelines: 1, members: 4 */
/* sum members: 32 */
/* sum bitfield members: 33 bits, bit holes: 2, sum bit holes: 31 bits */
/* last cacheline: 40 bytes */
};

Notice the 1+3+29=33 bits in bit fields and 28+3=31 bits in holes.

There are holes inside the flags bit field as well -- while some object
flags are used for more than one purpose, 22, 23 and 24 are still free.
Use  23 and 24 instead of 27 and 28 for TOPO_WALK_EXPLORED and
TOPO_WALK_INDEGREE.  This allows us to reduce FLAG_BITS by one so that
all bitfields combined fit into a single 32-bit slot:

struct object {
unsigned int               parsed:1;             /*     0: 0  4 */
unsigned int               type:3;               /*     0: 1  4 */
unsigned int               flags:28;             /*     0: 4  4 */
struct object_id           oid;                  /*     4    32 */

/* size: 36, cachelines: 1, members: 4 */
/* last cacheline: 36 bytes */
};

With this tight packing the size of struct object is reduced by 10%.
Other architectures probably benefit as well.

Signed-off-by: René Scharfe <redacted>
Signed-off-by: Junio C Hamano <redacted>

lib-submodule-update: pass 'test_must_fail' as an argument

When we run a test helper function in test_submodule_switch_common(), we
sometimes specify a whole helper function as the $command. When we do
this, in some test cases, we just mark the whole function with
`test_must_fail`. However, it's possible that the helper function might
fail earlier or later than expected due to an introduced bug. If this
happens, then the test case will still report as passing but it should
really be marked as failing since it didn't actually display the
intended behaviour.

Instead of invoking `test_must_fail $command`, pass the string
"test_must_fail" as the second argument in case where the git command is
expected to fail.

When $command is a helper function, the parent function calling
test_submodule_switch_common() is test_submodule_switch_func(). For all
test_submodule_switch_func() invocations, increase the granularity of
the argument test helper function by prefixing the git invocation which is
meant to fail with the second argument like this:

$2 git checkout "$1"

In the other cases, test_submodule_switch() and
test_submodule_forced_switch(), instead of passing in the git command
directly, wrap it using the git_test_func() and pass the git arguments
using the global variable $gitcmd. Unfortunately, since closures aren't
a thing in shell scripts, the global variable is necessary. Another
unfortunate result is that the "git_test_func" will used as the test
case name when $command is printed but it's worth it for the cleaner
code.

Finally, as an added bonus, `test_must_fail` will now only run on git
commands.

Signed-off-by: Denton Liu <redacted>
Signed-off-by: Junio C Hamano <redacted>

fast-export: add a "data" callback parameter to anonymize_str()

The anonymize_str() function takes a generator callback, but there's no
way to pass extra context to it. Let's add the usual "void *data"
parameter to the generator interface and pass it along.

This is mildly annoying for existing callers, all of which pass NULL,
but is necessary to avoid extra globals in some cases we'll add in a
subsequent patch.

While we're touching each of these callbacks, we can further observe
that none of them use the existing orig/len parameters at all. This
makes sense, since the point is for their output to have no discernable
basis in the original (my original version had some notion that we might
use a one-way function to obfuscate the names, but it was never
implemented). So let's drop those extra parameters. If a caller really
wants to do something with them, it can pass a struct through the new
data parameter.

Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>

fast-export: move global "idents" anonymize hashmap into function

All of the other anonymization functions keep their static mappings
inside the function to avoid polluting the global namespace. Let's do
the same for "idents", as nobody needs it outside of
anonymize_ident_line().

Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>

fast-export: use a flex array to store anonymized entries

Now that we're using a separate keydata struct for hash lookups, we have
more flexibility in how we allocate anonymized_entry structs. Let's push
the "orig" key into a flex member within the struct. That should save us
a few bytes of memory per entry (a pointer plus any malloc overhead),
and may make lookups a little faster (since it's one less pointer to
chase in the comparison function).

Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>

fast-export: stop storing lengths in anonymized hashmaps

Now that the anonymize_str() interface is restricted to NUL-terminated
strings, there's no need for us to keep track of the length of each
entry in the hashmap. This simplifies the code and saves a bit of
memory.

Note that we do still need to compare the stored results to partial
strings passed in by the callers. We can do that by using hashmap's
keydata feature to get the ptr/len pair into the comparison function,
and then using strncmp().

Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>

fast-export: tighten anonymize_mem() interface to handle only strings

While the anonymize_mem() interface _can_ store arbitrary byte
sequences, none of the callers uses this feature (as of the previous
commit). We'd like to keep it that way, as we'll be exposing the
string-like nature of the anonymization routines to the user. So let's
tighten up the interface a bit:

  - don't treat "len" as an out-parameter from anonymize_mem(); this
    ensures callers treat the pointer result as a NUL-terminated string

  - likewise, don't treat "len" as an out-parameter from generator
    functions

  - swap out "void *" for "char *" as appropriate to signal that we
    don't handle arbitrary memory

  - rename the function to anonymize_str()

This will also open up some optimization opportunities in a future
patch.

Note that we can't drop the "len" parameter entirely. Some callers do
pass in partial strings (e.g., "foo/bar", len=3) to avoid copying, and
we need to handle those still.

Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>

fast-export: store anonymized oids as hex strings

When fast-export stores anonymized oids, it does so as binary strings.
And while the anonymous mapping storage is binary-clean (at least as of
the previous commit), this will become awkward when we start exposing
more of it to the user. In particular, if we allow a method for
retaining token "foo", then users may want to specify a hex oid as such
a token.

Let's just switch to storing the hex strings. The difference in memory
usage is negligible (especially considering how infrequently we'd
generally store an oid compared to, say, path components).

Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>

fast-export: use xmemdupz() for anonymizing oids

Our anonymize_mem() function is careful to take a ptr/len pair to allow
storing binary tokens like object ids, as well as partial strings (e.g.,
just "foo" of "foo/bar"). But it duplicates the hash key using
xstrdup()! That means that:

  - for a partial string, we'd store all bytes up to the NUL, even
    though we'd never look at anything past "len". This didn't produce
    wrong behavior, but was wasteful.

  - for a binary oid that doesn't contain a zero byte, we'd copy garbage
    bytes off the end of the array (though as long as nothing complained
    about reading uninitialized bytes, further reads would be limited by
    "len", and we'd produce the correct results)

  - for a binary oid that does contain a zero byte, we'd copy _fewer_
    bytes than intended into the hashmap struct. When we later try to
    look up a value, we'd access uninitialized memory and potentially
    falsely claim that a particular oid is not present.

The most common reason to store an oid is an anonymized gitlink, but our
test case doesn't have any gitlinks at all. So let's add one whose oid
contains a NUL and is present at two different paths. ASan catches the
memory error, but even without it we can detect the bug because the oid
is not anonymized the same way for both paths.

And of course the fix is to copy the correct number of bytes. We don't
technically need the appended NUL from xmemdupz(), but it doesn't hurt
as an extra protection against anybody treating it like a string (plus a
future patch will push us more in that direction).

Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>

t9351: derive anonymized tree checks from original repo

Our tests of the anonymized repo just hard-code the expected set of
objects in the root and subdirectory trees. This makes them brittle to
the test setup changing (e.g., adding new paths that need tested).

Let's look at the original repo to compute our expected set of objects.
Note that this isn't completely perfect (e.g., we still rely on there
being only one tree in the root), but it does simplify later patches.

Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>

fmt-merge-msg: stop treating `master` specially

In the context of many projects renaming their primary branch names away
from `master`, Git wants to stop treating the `master` branch specially.

Let's start with `git fmt-merge-msg`.

Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>

commit-graph: change test to die on parse, not load

43d3561 (commit-graph write: don't die if the existing graph is corrupt,
2019-03-25) introduced the GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD environment
variable. This was created to verify that commit-graph was not loaded
when writing a new non-incremental commit-graph.

An upcoming change wants to load a commit-graph in some valuable cases,
but we want to maintain that we don't trust the commit-graph data when
writing our new file. Instead of dying on load, instead die if we ever
try to parse a commit from the commit-graph. This functionally verifies
the same intended behavior, but allows a more advanced feature in the
next change.

Signed-off-by: Derrick Stolee <redacted>
Signed-off-by: Junio C Hamano <redacted>

commit-graph: place bloom_settings in context

Place an instance of struct bloom_settings into the struct
write_commit_graph_context. This allows simplifying the function
prototype of write_graph_chunk_bloom_data(). This will allow us
to combine the function prototypes and use function pointers to
simplify write_commit_graph_file().

By using a pointer, we can later replace the settings to match those
that exist in the current commit-graph, in case a future Git version
allows customization of these parameters.

Reported-by: SZEDER Gábor <redacted>
Signed-off-by: Derrick Stolee <redacted>
Signed-off-by: Junio C Hamano <redacted>

doc: fix author vs. committer copy/paste error

Signed-off-by: Miroslav Koškár <redacted>
Signed-off-by: Junio C Hamano <redacted>

builtin/diff: fix botched update of usage comment

In the previous commit, an attempt was made to correct the "N=1, M=0"
case. However, the fix was botched and it introduced two half-correct
sections by mistake. Combine these half-correct sections into one fully
correct section.

Signed-off-by: Denton Liu <redacted>
Signed-off-by: Junio C Hamano <redacted>

commit-reach: avoid is_descendant_of() shim

d91d6fbf26 (commit-reach: create repo_is_descendant_of(), 2020-06-17)
adds a repository aware version of is_descendant_of() and a backward
compatibility shim that is barely used.

Update all callers to directly use the new repo_is_descendant_of()
function instead; making the codebase simpler and pushing more
the_repository references higher up the stack.

Helped-by: Derrick Stolee <redacted>
Signed-off-by: Carlo Marcelo Arenas Belón <redacted>
Reviewed-by: Derrick Stolee <redacted>
Signed-off-by: Junio C Hamano <redacted>

http-push: ensure unforced pushes fail when data would be lost

When we push using the DAV-based protocol, the client is the one that
performs the ref updates and therefore makes the checks to see whether
an unforced push should be allowed.  We make this check by determining
if either (a) we lack the object file for the old value of the ref or
(b) the new value of the ref is not newer than the old value, and in
either case, reject the push.

However, the ref_newer function, which performs this latter check, has
an odd behavior due to the reuse of certain object flags.  Specifically,
it will incorrectly return false in its first invocation and then
correctly return true on a subsequent invocation.  This occurs because
the object flags used by http-push.c are the same as those used by
commit-reach.c, which implements ref_newer, and one piece of code
misinterprets the flags set by the other.

Note that this does not occur in all cases.  For example, if the example
used in the tests is changed to use one repository instead of two and
rewind the head to add a commit, the test passes and we correctly reject
the push.  However, the example provided does trigger this behavior, and
the code has been broken in this way since at least Git 2.0.0.

To solve this problem, let's move the two sets of object flags so that
they don't overlap, since we're clearly using them at the same time.
The new set should not conflict with other usage because other users are
either builtin code (which is not compiled into git http-push) or
upload-pack (which we similarly do not use here).

Reported-by: Michael Ward <redacted>
Signed-off-by: René Scharfe <redacted>
Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>

The fourth batch

Signed-off-by: Junio C Hamano <redacted>

Merge branch 'en/sparse-with-submodule-doc'

The effect of sparse checkout settings on submodules is documented.

* en/sparse-with-submodule-doc:
git-sparse-checkout: clarify interactions with submodules

Merge branch 'es/worktree-duplicate-paths'

The same worktree directory must be registered only once, but
"git worktree move" allowed this invariant to be violated, which
has been corrected.

* es/worktree-duplicate-paths:
  worktree: make "move" refuse to move atop missing registered worktree
  worktree: generalize candidate worktree path validation
  worktree: prune linked worktree referencing main worktree path
  worktree: prune duplicate entries referencing same worktree path
  worktree: make high-level pruning re-usable
  worktree: give "should be pruned?" function more meaningful name
  worktree: factor out repeated string literal

Merge branch 'jt/redact-all-cookies'

The interface to redact sensitive information in the trace output
has been simplified.

* jt/redact-all-cookies:
http: redact all cookies, teach GIT_TRACE_REDACT=0

Merge branch 'cc/upload-pack-data-2'

Further code clean-up.

* cc/upload-pack-data-2:
  upload-pack: move pack_objects_hook to upload_pack_data
  upload-pack: move allow_sideband_all to upload_pack_data
  upload-pack: move allow_ref_in_want to upload_pack_data
  upload-pack: move allow_filter to upload_pack_data
  upload-pack: move keepalive to upload_pack_data
  upload-pack: pass upload_pack_data to upload_pack_config()
  upload-pack: change multi_ack to an enum
  upload-pack: move multi_ack to upload_pack_data
  upload-pack: move filter_capability_requested to upload_pack_data
  upload-pack: move use_sideband to upload_pack_data
  upload-pack: move static vars to upload_pack_data
  upload-pack: annotate upload_pack_data fields
  upload-pack: actually use some upload_pack_data bitfields

bash-completion: add git-prune into bash completion

Sometimes git would suggest the user to run `git prune` when there are
too many unreachable loose objects. It's more user-friendly if we add
git-prune into bash completion.

Signed-off-by: John Lin <redacted>
Signed-off-by: Junio C Hamano <redacted>

git-cvsexportcommit: port to SHA-256

When we apply a binary patch, we must have the full object ID in the
header in order to apply it; without that, any attempt to apply it will
fail.  If we set GIT_DIR to empty, git apply does not know about the
hash algorithm we're using, and consequently any attempt to apply a
patch using SHA-256 will fail, since the object ID is the wrong length.

The reason we set the GIT_DIR environment variable is because we don't
want to modify the index; we just want to know whether the patch
applies.  Instead, let's just use a temporary file for the index, which
will be cleaned up automatically when the object goes out of scope.

Additionally, read the configuration for the repository and compute the
length of an object ID based on it.  Use that when matching object IDs
with a regex or computing the all-zeros object ID.

Signed-off-by: brian m. carlson <redacted>
Acked-by: Eric Wong <redacted>
Signed-off-by: Junio C Hamano <redacted>

git-cvsimport: port to SHA-256

Instead of calling the function is_sha1, call it is_oid and update it to
match either a SHA-1 or a SHA-256 hex object ID.

Signed-off-by: brian m. carlson <redacted>
Acked-by: Eric Wong <redacted>
Signed-off-by: Junio C Hamano <redacted>

git-cvsserver: port to SHA-256

The code of git-cvsserver currently has several hard-coded 20 and 40
constants that are the length of SHA-1. When parsing the configuration
file, read the extensions.objectformat configuration setting as well as
CVS-related ones and adjust the hash sizes accordingly. Use these
computed values in all the places we match object IDs.

Signed-off-by: brian m. carlson <redacted>
Acked-by: Eric Wong <redacted>
Signed-off-by: Junio C Hamano <redacted>

git-svn: set the OID length based on hash algorithm

When reading the configuration or when creating a new repository, load
the extensions.objectFormat value and set the object ID length to 64 if
it's "sha256". Note that we use the hex length in git-svn because most
of our processing is done on hex values, not binary ones.

Signed-off-by: brian m. carlson <redacted>
Acked-by: Eric Wong <redacted>
Signed-off-by: Junio C Hamano <redacted>

perl: make SVN code hash independent

There are several places throughout git-svn that use various hard-coded
constants. For matching object IDs, use the $oid variable. Compute the
record size we use for our revision storage based on the object ID.

When parsing the revision map format, use a wildcard in the pack format
since we know that the data we're parsing is always exactly the record
size. This lets us continue to use a constant for the pack format.

Finally, update several comments to reflect the fact that an object ID
may be of one of multiple sizes.

Signed-off-by: brian m. carlson <redacted>
Acked-by: Eric Wong <redacted>
Signed-off-by: Junio C Hamano <redacted>

perl: make Git::IndexInfo work with SHA-256

Most of the Git modules, git-svn excepted, don't know anything about the
hash algorithm and mostly work. However, when we're printing an
all-zero object ID in Git::IndexInfo, we need to know the hash length.

Since we don't want to change the API to have that information passed
in, let's query the config to find the hash algorithm and compute the
right value.

Signed-off-by: brian m. carlson <redacted>
Acked-by: Eric Wong <redacted>
Signed-off-by: Junio C Hamano <redacted>

perl: create and switch variables for hash constants

git-svn has several variables for SHA-1 constants, including short hash
values and full length hash values.  Since these are no longer SHA-1
specific, let's start them with "oid" instead of "sha1".  Add a
constant, oid_length, which is the length of the hash algorithm in use
in hex.  We use the hex version because overwhelmingly that's what's
used by git-svn.

We don't currently set oid_length based on the repository algorithm, but
we will in a future commit.

Signed-off-by: brian m. carlson <redacted>
Acked-by: Eric Wong <redacted>
Signed-off-by: Junio C Hamano <redacted>

t/lib-git-svn: make hash size independent

The record size used in the git svn storage is four bytes plus the
length of the binary hash. Pass the hash length into our Perl
invocation and use it to compute the size of the records.

Signed-off-by: brian m. carlson <redacted>
Acked-by: Eric Wong <redacted>
Signed-off-by: Junio C Hamano <redacted>

diff-files: treat "i-t-a" files as "not-in-index"

The `diff-files' command and related commands which call the function
`cmd_diff_files()', consider the "intent-to-add" files as a part of the
index when comparing the work-tree against it. This was previously
addressed in commits [1] and [2] by turning the option
`--ita-invisible-in-index' (introduced in [3]) on by default.

For `diff-files' (and `add -p' as a consequence) to show the i-t-a
files as as new, `ita_invisible_in_index' will be enabled by default
here as well.

[1] 0231ae71d3 (diff: turn --ita-invisible-in-index on by default,
2018-05-26)
[2] 425a28e0a4 (diff-lib: allow ita entries treated as "not yet exist
in index", 2016-10-24)
[3] b42b451919 (diff: add --ita-[in]visible-in-index, 2016-10-24)

Signed-off-by: Srinidhi Kaushik <redacted>
Signed-off-by: Junio C Hamano <redacted>

worktree: drop get_worktrees() unused 'flags' argument

get_worktrees() accepts a 'flags' argument, however, there are no
existing flags (the lone flag GWT_SORT_LINKED was recently retired) and
no behavior which can be tweaked. Therefore, drop the 'flags' argument.

Signed-off-by: Eric Sunshine <redacted>
Signed-off-by: Junio C Hamano <redacted>

worktree: drop get_worktrees() special-purpose sorting option

Of all the clients of get_worktrees(), only "git worktree list" wants
the list sorted in a very specific way; other clients simply don't care
about the order. Rather than imbuing get_worktrees() with special
knowledge about how various clients -- now and in the future -- may want
the list sorted, drop the sorting capability altogether and make it the
client's responsibility to sort the list if needed.

Signed-off-by: Eric Sunshine <redacted>
Signed-off-by: Junio C Hamano <redacted>

t9101: make hash independent

Instead of hard-coding the object ID for our test .gitignore file, let's
compute it.

Signed-off-by: brian m. carlson <redacted>
Acked-by: Eric Wong <redacted>
Signed-off-by: Junio C Hamano <redacted>

t9104: make hash size independent

The size of a record in the database used by git svn is four bytes plus
the length of the binary hash. Instead of hard-coding 24, compute this
value based on the size of the hash in use.

Signed-off-by: brian m. carlson <redacted>
Acked-by: Eric Wong <redacted>
Signed-off-by: Junio C Hamano <redacted>

t9100: make test work with SHA-256

Compute the relevant tree objects for SHA-256 and use those when
appropriate instead of using the SHA-1 ones.

Signed-off-by: brian m. carlson <redacted>
Acked-by: Eric Wong <redacted>
Signed-off-by: Junio C Hamano <redacted>

t9108: make test hash independent

Instead of stripping off the first 41 characters of git log output,
let's just strip off the first space-separated component, which will
work for any size hash.

Signed-off-by: brian m. carlson <redacted>
Acked-by: Eric Wong <redacted>
Signed-off-by: Junio C Hamano <redacted>

t9168: make test hash independent

Instead of stripping off the first 41 characters of git log output,
let's just strip off the first space-separated component, which will
work for any size hash.

Signed-off-by: brian m. carlson <redacted>
Acked-by: Eric Wong <redacted>
Signed-off-by: Junio C Hamano <redacted>

t9109: make test hash independent

Instead of stripping off the first 41 characters of git log output,
let's just strip off the first space-separated component, which will
work for any size hash.

Signed-off-by: brian m. carlson <redacted>
Acked-by: Eric Wong <redacted>
Signed-off-by: Junio C Hamano <redacted>

git-prompt: include sparsity state as well

git-prompt includes the current branch, a bunch of single character
mini-state displayers, and some much longer in-progress state
notifications.  The current branch is always shown.  The single
character mini-state displayers are all off by default (they are not
self explanatory) but each has an environment variable for turning it
on.  The in-progress state notifications provide no configuration
options for turning them off, and can be up to 15 characters long (e.g.
"|REBASE (12/18)" or "|CHERRY-PICKING").

The single character mini-state tends to be used for things like "Do you
have any stashes in refs/stash?" or "Are you ahead or behind of
upstream?".  These are things which users can take advantage of but do
not affect most normal git operations.  The in-progress states, by
contrast, suggest the user needs to interact differently and may also
prevent some normal operations from succeeding (e.g. git switch may show
an error instead of switching branches).

Sparsity is like the in-progress states in that it suggests a
fundamental different interaction with the repository (many of the files
from the repository are not present in your working copy!).  A few
commits ago added sparsity information to wt_longstatus_print_state(),
grouping it with other in-progress state displays.  We do similarly here
with the prompt and show the extra state, by default, with an extra
    |SPARSE
This state can be present simultaneously with the in-progress states, in
which case it will appear before the other states; for example,
    (branchname|SPARSE|REBASE 6/10)

The reason for showing the "|SPARSE" substring before other states is to
emphasize those other states.  Sparsity is probably not going to change
much within a repository, while temporary operations will.  So we want
the state changes related to temporary operations to be listed last, to
make them appear closer to where the user types and make them more
likely to be noticed.

The fact that sparsity isn't just cached metadata or additional
information is what leads us to show it more similarly to the
in-progress states, but the fact that sparsity is not transient like the
in-progress states might cause some users to want an abbreviated
notification of sparsity state or perhaps even be able to turn it off.
Allow GIT_PS1_COMPRESSSPARSESTATE to be set to request that it be
shortened to a single character ('?'), and GIT_PS1_OMITSPARSESTATE to be
set to request that sparsity state be omitted from the prompt entirely.

Signed-off-by: Elijah Newren <redacted>
Signed-off-by: Junio C Hamano <redacted>

git-prompt: document how in-progress operations affect the prompt

Signed-off-by: Elijah Newren <redacted>
Signed-off-by: Junio C Hamano <redacted>

Merge branch 'mt/open-worktree'

Clean up the code that checks if a directory is a Git repo. Use git
rev-parse instead of rolling our own logic to find that out. A side
effect (which also happens to be the main motivation behind it) of this
change is that git-gui can now open worktrees other than the main
worktree.

* mt/open-worktree:
git-gui: allow opening work trees from the startup dialog

remote-testgit: adapt for object-format

When using an algorithm other than SHA-1, we need the remote helper to
advertise support for the object-format extension and provide
information back to us so that we can properly parse refs and return
data. Ensure that the test remote helper understands these extensions.

Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>

bundle: detect hash algorithm when reading refs

Much like with the dumb HTTP transport, there isn't a way to explicitly
specify the hash algorithm when dealing with a bundle, so detect the
algorithm based on the length of the object IDs in the prerequisites and
ref advertisements.

Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>

t5300: pass --object-format to git index-pack

git index-pack by default reads the repository to determine the object
format. However, when outside of a repository, it's necessary to specify
the hash algorithm in use so that the pack can be properly indexed. Add
an --object-format argument when invoking git index-pack outside of a
repository.

Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>

t5704: send object-format capability with SHA-256

When we speak protocol v2 in this test, we must pass the object-format
header if the algorithm is not SHA-1. Otherwise, git upload-pack fails
because the hash algorithm doesn't match and not because we've failed to
speak the protocol correctly. Pass the header so that our assertions
test what we're really interested in.

Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>

t5703: use object-format serve option

When we're using an algorithm other than SHA-1, we need to specify the
algorithm in use so we don't get a failure with an "unknown format"
message. Add a wrapper function that specifies this header if required.
Skip specifying this header for SHA-1 to test that it works both with an
without this header.

Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>

t5702: offer an object-format capability in the test

In order to make this test work with SHA-256, offer an object-format
capability so that both sides use the same algorithm.

Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>

t/helper: initialize the repository for test-sha1-array

test-sha1-array uses the_hash_algo under the hood. Since t0064 wants to
use the value that is correct for the hash algorithm that we're testing,
make sure the test helper initializes the repository to set
the_hash_algo correctly.

Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>

remote-curl: avoid truncating refs with ls-remote

Normally, the remote-curl transport helper is aware of the hash
algorithm we're using because we're in a repo with the appropriate hash
algorithm set. However, when using git ls-remote outside of a
repository, we won't have initialized the hash algorithm properly, so
use hash_to_hex_algop to print the ref corresponding to the algorithm
we've detected.

Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>

t1050: pass algorithm to index-pack when outside repo

When outside a repository, git index-pack is unable to guess the hash
algorithm in use for a pack, since packs don't contain any information
on the algorithm in use. Pass an option to index-pack to help it out in
this test.

Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>