Junio C Hamano [Tue, 9 Jul 2019 22:25:41 +0000 (15:25 -0700)]
Merge branch 'dl/includeif-onbranch'
The conditional inclusion mechanism learned to base the choice on
the branch the HEAD currently is on.
* dl/includeif-onbranch:
config: learn the "onbranch:" includeIf condition
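As a minimal sketch of the new condition, a configuration file could pull in extra settings only while a matching branch is checked out. The branch pattern `topic/**`, the included file name `topic.inc`, and the helper name `write_onbranch_config` are made up for illustration.

```shell
# write_onbranch_config DIR: create DIR/config containing a conditional
# include that applies only on branches matching "topic/**".
# (Pattern and file names are hypothetical.)
write_onbranch_config () {
	cat >"$1/config" <<'EOF'
[includeIf "onbranch:topic/**"]
	path = topic.inc
EOF
}

demo_dir=$(mktemp -d)
write_onbranch_config "$demo_dir"
cat "$demo_dir/config"
rm -r "$demo_dir"
```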
Junio C Hamano [Tue, 9 Jul 2019 22:25:41 +0000 (15:25 -0700)]
Merge branch 'pw/rebase-abort-clean-rewritten'
"git rebase --abort" used to leave refs/rewritten/ when concluding
"git rebase -r", which has been corrected.
* pw/rebase-abort-clean-rewritten:
rebase --abort/--quit: cleanup refs/rewritten
sequencer: return errors from sequencer_remove_state()
rebase: warn if state directory cannot be removed
rebase: fix a memory leak
Junio C Hamano [Tue, 9 Jul 2019 22:25:40 +0000 (15:25 -0700)]
Merge branch 'am/p4-branches-excludes'
"git p4" update.
* am/p4-branches-excludes:
git-p4: respect excluded paths when detecting branches
git-p4: add failing test for "git-p4: respect excluded paths when detecting branches"
git-p4: don't exclude other files with same prefix
git-p4: add failing test for "don't exclude other files with same prefix"
git-p4: don't groom exclude path list on every commit
git-p4: match branches case insensitively if configured
git-p4: add failing test for "git-p4: match branches case insensitively if configured"
git-p4: detect/prevent infinite loop in gitCommitByP4Change()
Junio C Hamano [Tue, 9 Jul 2019 22:25:40 +0000 (15:25 -0700)]
Merge branch 'tg/stash-ref-by-index-fix'
"git stash show 23" used to work, but no more after getting
rewritten in C; this regression has been corrected.
* tg/stash-ref-by-index-fix:
stash: fix show referencing stash index
Junio C Hamano [Tue, 9 Jul 2019 22:25:40 +0000 (15:25 -0700)]
Merge branch 'cb/mkstemps-uint-type-fix'
Variable type fix.
* cb/mkstemps-uint-type-fix:
wrapper: avoid undefined behaviour in macOS
Junio C Hamano [Tue, 9 Jul 2019 22:25:40 +0000 (15:25 -0700)]
Merge branch 'jk/trailers-use-config'
"git interpret-trailers" always treated '#' as the comment
character, regardless of core.commentChar setting, which has been
corrected.
* jk/trailers-use-config:
interpret-trailers: load default config
Junio C Hamano [Tue, 9 Jul 2019 22:25:39 +0000 (15:25 -0700)]
Merge branch 'js/t3404-typofix'
Typofix.
* js/t3404-typofix:
t3404: fix a typo
Junio C Hamano [Tue, 9 Jul 2019 22:25:38 +0000 (15:25 -0700)]
Merge branch 'pw/doc-synopsis-markup-opmode-options'
Docfix.
* pw/doc-synopsis-markup-opmode-options:
show --continue/skip etc. consistently in synopsis
Junio C Hamano [Tue, 9 Jul 2019 22:25:38 +0000 (15:25 -0700)]
Merge branch 'rs/copy-array'
Code clean-up.
* rs/copy-array:
use COPY_ARRAY for copying arrays
coccinelle: use COPY_ARRAY for copying arrays
Junio C Hamano [Tue, 9 Jul 2019 22:25:38 +0000 (15:25 -0700)]
Merge branch 'cb/fsmonitor-intfix'
Variable type fix.
* cb/fsmonitor-intfix:
fsmonitor: avoid signed integer overflow / infinite loop
Junio C Hamano [Tue, 9 Jul 2019 22:25:37 +0000 (15:25 -0700)]
Merge branch 'rs/avoid-overflow-in-midpoint-computation'
Code clean-up to avoid signed integer overflows during binary search.
* rs/avoid-overflow-in-midpoint-computation:
cleanup: fix possible overflow errors in binary search, part 2
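The usual way to avoid this kind of overflow is to compute the midpoint from the distance between the bounds rather than from their sum. The sketch below shows that idea with shell arithmetic; the function name `midpoint` is an assumption for illustration and is not taken from the patch.

```shell
# midpoint LO HI: print the index halfway between LO and HI.
# LO + (HI - LO) / 2 never forms the sum LO + HI, so it stays in
# range whenever 0 <= LO <= HI are themselves representable.
midpoint () {
	echo $(( $1 + ($2 - $1) / 2 ))
}

midpoint 10 20
# With 32-bit signed integers, (LO + HI) / 2 would overflow here:
midpoint 2000000000 2000000002
```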
Junio C Hamano [Tue, 9 Jul 2019 22:25:37 +0000 (15:25 -0700)]
Merge branch 'pw/add-p-recount'
"git checkout -p" needs to selectively apply a patch in reverse,
which did not work well.
* pw/add-p-recount:
add -p: fix checkout -p with pathological context
Junio C Hamano [Tue, 9 Jul 2019 22:25:37 +0000 (15:25 -0700)]
Merge branch 'ds/close-object-store'
The commit-graph file is now part of the "files that the runtime
may keep open file descriptors on, all of which would need to be
closed when done with the object store", and the file descriptor to
an existing commit-graph file now is closed before "gc" finalizes a
new instance to replace it.
* ds/close-object-store:
packfile: rename close_all_packs to close_object_store
packfile: close commit-graph in close_all_packs
commit-graph: use raw_object_store when closing
Junio C Hamano [Tue, 9 Jul 2019 22:25:36 +0000 (15:25 -0700)]
Merge branch 'ds/commit-graph-write-refactor'
Renamed from commit-graph-format-v2 and changed scope.
* ds/commit-graph-write-refactor:
commit-graph: extract write_commit_graph_file()
commit-graph: extract copy_oids_to_commits()
commit-graph: extract count_distinct_commits()
commit-graph: extract fill_oids_from_all_packs()
commit-graph: extract fill_oids_from_commit_hex()
commit-graph: extract fill_oids_from_packs()
commit-graph: create write_commit_graph_context
commit-graph: remove Future Work section
commit-graph: collapse parameters into flags
commit-graph: return with errors during write
commit-graph: fix the_repository reference
Junio C Hamano [Tue, 9 Jul 2019 22:25:36 +0000 (15:25 -0700)]
Merge branch 'sg/trace2-rename'
Dev support update to help tracing out tests.
* sg/trace2-rename:
trace2: correct typo in technical documentation
Revert "test-lib: whitelist GIT_TR2_* in the environment"
Junio C Hamano [Tue, 9 Jul 2019 22:25:35 +0000 (15:25 -0700)]
Merge branch 'nd/completion-no-cache-failure'
An incorrect list of options was cached after command line
completion failed (e.g. trying to complete a command that requires
a repository outside one), which has been corrected.
* nd/completion-no-cache-failure:
completion: do not cache if --git-completion-helper fails
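The principle behind this fix can be sketched as "remember the helper's output only when the helper succeeded". The names `cache_helper_output` and `cached_options` below are hypothetical and do not come from the completion script.

```shell
# cache_helper_output CMD [ARGS...]: run a helper command and store its
# output in $cached_options only if the command exits successfully.
cached_options=
cache_helper_output () {
	if output=$("$@" 2>/dev/null)
	then
		cached_options=$output
	else
		return 1
	fi
}

cache_helper_output false || echo "helper failed, nothing cached"
cache_helper_output echo --all --verbose && echo "cached: $cached_options"
```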
Junio C Hamano [Tue, 9 Jul 2019 22:25:35 +0000 (15:25 -0700)]
Merge branch 'js/mergetool-optim'
"git mergetool" and its tests now spawn fewer subprocesses.
* js/mergetool-optim:
mergetool: use shell variable magic instead of `awk`
mergetool: dissect strings with shell variable magic instead of `expr`
t7610-mergetool: use test_cmp instead of test $(cat file) = $txt
t7610-mergetool: do not place pipelines headed by `yes` in subshells
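To illustrate what "shell variable magic" can look like, here is a small sketch that dissects a string with parameter expansion instead of spawning `awk` or `expr`. The input format "<mode> <path>" and the function name `parse_entry` are assumptions for the example, not the code from the patches.

```shell
# parse_entry LINE: set $mode, $path and $ext from a line of the
# hypothetical form "<mode> <path>", without forking any process.
parse_entry () {
	mode=${1%% *}    # drop the longest suffix starting at the first space
	path=${1#* }     # drop the shortest prefix ending at the first space
	ext=${path##*.}  # drop the longest prefix ending at the last dot
}

parse_entry "100644 docs/notes.txt"
echo "$mode $path $ext"
```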
Junio C Hamano [Tue, 9 Jul 2019 22:25:35 +0000 (15:25 -0700)]
Merge branch 'mo/hpux-dynpath'
Auto-detect how to tell HP-UX aCC where to find dynamically linked
libraries at runtime.
* mo/hpux-dynpath:
configure: Detect linking style for HP aCC on HP-UX
Junio C Hamano [Tue, 9 Jul 2019 22:25:35 +0000 (15:25 -0700)]
Merge branch 'dl/config-alias-doc'
Doc update.
* dl/config-alias-doc:
config/alias.txt: document alias accepting non-command first word
config/alias.txt: change " and ' to `
Junio C Hamano [Tue, 9 Jul 2019 22:25:34 +0000 (15:25 -0700)]
Merge branch 'tm/tag-gpgsign-config'
A new tag.gpgSign configuration variable turns "git tag -a" into
"git tag -s".
* tm/tag-gpgsign-config:
tag: add tag.gpgSign config option to force all tags be GPG-signed
Junio C Hamano [Tue, 9 Jul 2019 22:25:34 +0000 (15:25 -0700)]
Merge branch 'fc/fetch-with-import-fix'
Code restructuring during 2.20 period broke fetching tags via
"import" based transports.
* fc/fetch-with-import-fix:
fetch: fix regression with transport helpers
fetch: make the code more understandable
fetch: trivial cleanup
t5801 (remote-helpers): add test to fetch tags
t5801 (remote-helpers): cleanup refspec stuff
Junio C Hamano [Tue, 9 Jul 2019 22:25:34 +0000 (15:25 -0700)]
Merge branch 'po/doc-branch'
Doc update.
* po/doc-branch:
doc branch: provide examples for listing remote tracking branches
Junio C Hamano [Tue, 9 Jul 2019 22:25:33 +0000 (15:25 -0700)]
Merge branch 'nb/branch-show-other-worktrees-head'
"git branch --list" learned to show branches that are checked out
in other worktrees connected to the same repository prefixed with
'+', similar to the way the currently checked out branch is shown
with '*' in front.
* nb/branch-show-other-worktrees-head:
branch: add worktree info on verbose output
branch: update output to include worktree info
ref-filter: add worktreepath atom
Edmundo Carmona Antoranz [Tue, 9 Jul 2019 03:15:59 +0000 (21:15 -0600)]
builtin/merge.c - cleanup of code in for-cycle that tests strategies
The cmd_merge() function has a loop that tries different
merge strategies in turn, and stops when a strategy gets a
clean merge, while keeping the "best" conflicted merge so
far.
Make the loop easier to follow by moving the code around,
ensuring that there is only one "break" in the loop where
an automerge succeeds. Also group the actions that are
performed after an automerge succeeds together to a single
location, outside and after the loop.
Signed-off-by: Edmundo Carmona Antoranz <redacted>
Signed-off-by: Junio C Hamano <redacted>
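The loop shape described above can be sketched as follows. This is an illustration in shell, not the C code in builtin/merge.c; `try_strategy` is a hypothetical stand-in that reports a made-up conflict count per strategy.

```shell
# try_strategy NAME: hypothetical stand-in that prints how many
# conflicts a strategy left behind (0 means a clean automerge).
try_strategy () {
	case $1 in
	resolve) echo 0 ;;
	recursive) echo 3 ;;
	*) echo 5 ;;
	esac
}

# pick_strategy NAME...: try each strategy in turn.  The loop has a
# single "break", taken when a strategy merges cleanly; the follow-up
# actions are grouped in one place after the loop.
pick_strategy () {
	clean= best= best_count=
	for strategy
	do
		count=$(try_strategy "$strategy")
		if [ "$count" -eq 0 ]
		then
			clean=$strategy
			break
		fi
		# Remember the least-conflicted attempt so far.
		if [ -z "$best" ] || [ "$count" -lt "$best_count" ]
		then
			best=$strategy
			best_count=$count
		fi
	done
	if [ -n "$clean" ]
	then
		echo "clean: $clean"
	else
		echo "conflicted: $best"
	fi
}

pick_strategy octopus recursive resolve
```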
Thomas Gummerer [Mon, 8 Jul 2019 16:33:06 +0000 (17:33 +0100)]
apply: only pass required data to find_name_*
Currently the 'find_name_*()' functions take 'struct apply_state' as
parameter, even though they only need the 'root' member from that
struct.
These functions are in the callchain of 'parse_git_header()', which we
want to make more generally useful in a subsequent commit. To make
that happen we only want to pass in the required data to
'parse_git_header()', and not the whole 'struct apply_state', and thus
we want functions in the callchain of 'parse_git_header()' to only
take arguments they really need.
Signed-off-by: Thomas Gummerer <redacted>
Signed-off-by: Junio C Hamano <redacted>
Thomas Gummerer [Mon, 8 Jul 2019 16:33:05 +0000 (17:33 +0100)]
apply: only pass required data to check_header_line
Currently the 'check_header_line()' function takes 'struct
apply_state' as parameter, even though it only needs the linenr from
that struct.
This function is in the callchain of 'parse_git_header()', which we
want to make more generally useful in a subsequent commit. To make
that happen we only want to pass in the required data to
'parse_git_header()', and not the whole 'struct apply_state', and thus
we want functions in the callchain of 'parse_git_header()' to only
take arguments they really need.
Signed-off-by: Thomas Gummerer <redacted>
Signed-off-by: Junio C Hamano <redacted>
Thomas Gummerer [Mon, 8 Jul 2019 16:33:04 +0000 (17:33 +0100)]
apply: only pass required data to git_header_name
Currently the 'git_header_name()' function takes 'struct apply_state'
as parameter, even though it only needs the p_value from that struct.
This function is in the callchain of 'parse_git_header()', which we
want to make more generally useful in a subsequent commit. To make
that happen we only want to pass in the required data to
'parse_git_header()', and not the whole 'struct apply_state', and thus
we want functions in the callchain of 'parse_git_header()' to only
take arguments they really need.
Signed-off-by: Thomas Gummerer <redacted>
Signed-off-by: Junio C Hamano <redacted>
Thomas Gummerer [Mon, 8 Jul 2019 16:33:03 +0000 (17:33 +0100)]
apply: only pass required data to skip_tree_prefix
Currently the 'skip_tree_prefix()' function takes 'struct apply_state'
as parameter, even though it only needs the p_value from that struct.
This function is in the callchain of 'parse_git_header()', which we
want to make more generally useful in a subsequent commit. To make
that happen we only want to pass in the required data to
'parse_git_header()', and not the whole 'struct apply_state', and thus
we want functions in the callchain of 'parse_git_header()' to only
take arguments they really need.
Signed-off-by: Thomas Gummerer <redacted>
Signed-off-by: Junio C Hamano <redacted>
Thomas Gummerer [Mon, 8 Jul 2019 16:33:02 +0000 (17:33 +0100)]
apply: replace marc.info link with public-inbox
public-inbox.org links include the whole message ID by default. This
means the message can still be found even if the site goes away, which
is not the case with the marc.info link. Replace the marc.info link
with a more future proof one.
Signed-off-by: Thomas Gummerer <redacted>
Signed-off-by: Junio C Hamano <redacted>
Phillip Wood [Thu, 4 Jul 2019 09:47:02 +0000 (02:47 -0700)]
t3420: remove progress lines before comparing output
Some of the tests check the output of rebase is what we expect. These
were added after a regression that added unwanted stash output when
using --autostash. They are useful as they prevent unintended changes to
the output of the various rebase commands. However they also include all
the progress output which is less useful as it only tests what would be
written to a dumb terminal which is not the normal use case. The recent
changes to fix clearing the line when printing progress necessarily
meant making an ugly change to these tests. Address this by removing the
progress output before comparing it to the expected output. We do this
by removing everything before the final "\r" on each line as we don't
care about the progress indicator, but we do care about what is printed
immediately after it.
Signed-off-by: Phillip Wood <redacted>
Signed-off-by: Junio C Hamano <redacted>
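A minimal sketch of the filtering step described above: delete everything up to and including the final carriage return on each line. The function name `strip_progress` and the sample input are invented for illustration and may differ from what the test script actually uses.

```shell
# strip_progress: on each input line, remove everything before (and
# including) the last carriage return, keeping what follows it.
strip_progress () {
	cr=$(printf '\r')
	sed "s/.*$cr//"
}

printf 'Rebasing (1/2)\rRebasing (2/2)\rSuccessfully rebased.\n' | strip_progress
```

Because `.*` is greedy, the substitution consumes all progress updates on the line, not just the first one.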
Karsten Blees [Thu, 4 Jul 2019 09:20:33 +0000 (02:20 -0700)]
mingw: initialize HOME on startup
HOME initialization was historically duplicated in many different places,
including /etc/profile, launch scripts such as git-bash.vbs and gitk.cmd,
and (although slightly broken) in the git-wrapper.
Even unrelated projects such as GitExtensions and TortoiseGit need to
implement the same logic to be able to call git directly.
Initialize HOME in git's own startup code so that we can eventually retire
all the duplicate initialization code.
Signed-off-by: Karsten Blees <redacted>
Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>
Johannes Schindelin [Thu, 4 Jul 2019 22:36:57 +0000 (15:36 -0700)]
mingw: fix possible buffer overrun when calling `GetUserNameW()`
In 39a98e9b68b8 (mingw: get pw_name in UTF-8 format, 2019-06-27), this
developer missed the fact that the `GetUserNameW()` function takes the
number of characters as `len` parameter, not the number of bytes.
Reported-by: Beat Bolli <redacted>
Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>
SZEDER Gábor [Sat, 6 Jul 2019 16:21:14 +0000 (18:21 +0200)]
ci/lib.sh: update a comment about installed P4 and Git-LFS versions
A comment in 'ci/lib.sh' claims that the "OS X build installs the
latest available versions" of P4 and Git-LFS, but since f2f47150
("ci: don't update Homebrew", 2019-07-03) that's no longer the case,
as it will install the versions which were recorded in the image's
Homebrew database when the image was created.
Update this comment accordingly.
Signed-off-by: SZEDER Gábor <redacted>
Acked-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>
Karsten Blees [Wed, 3 Jul 2019 20:46:04 +0000 (13:46 -0700)]
gettext: always use UTF-8 on native Windows
On native Windows, Git exclusively uses UTF-8 for console output (both
with MinTTY and native Win32 Console). Gettext uses `setlocale()` to
determine the output encoding for translated text, however, MSVCRT's
`setlocale()` does not support UTF-8. As a result, translated text is
encoded in system encoding (as per `GetACP()`), and non-ASCII chars are
mangled in console output.
Side note: There is actually a code page for UTF-8: 65001. In practice,
it does not work as expected at least on Windows 7, though, so we cannot
use it in Git. Besides, if we overrode the code page, any process
spawned from Git would inherit that code page (as opposed to the code
page configured for the current user), which would quite possibly break
e.g. diff or merge helpers. So we really cannot override the code page.
In `init_gettext_charset()`, Git calls gettext's
`bind_textdomain_codeset()` with the character set obtained via
`locale_charset()`; Let's override that latter function to force the
encoding to UTF-8 on native Windows.
In Git for Windows' SDK, there is a `libcharset.h` and therefore we
define `HAVE_LIBCHARSET_H` in the MINGW-specific section in
`config.mak.uname`, therefore we need to add the override before that
conditionally-compiled code block.
Rather than simply defining `locale_charset()` to return the string
`"UTF-8"`, though, we are careful not to break `LC_ALL=C`: the
`ab/no-kwset` patch series, for example, needs to have a way to prevent
Git from expecting UTF-8-encoded input.
Signed-off-by: Karsten Blees <redacted>
Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>
SZEDER Gábor [Wed, 3 Jul 2019 10:47:48 +0000 (12:47 +0200)]
ci: disable Homebrew's auto cleanup
Lately Homebrew learned to automagically clean up information about
outdated packages during other 'brew' commands, which might be useful
for the average user, but is a waste of time in CI build jobs, because
the next build jobs will start from the exact same image containing
the same outdated packages anyway.
Export HOMEBREW_NO_INSTALL_CLEANUP=1 to disable this auto cleanup feature,
shaving off about 20-30s from the time needed to install dependencies
in our macOS build jobs on Travis CI.
Signed-off-by: SZEDER Gábor <redacted>
Signed-off-by: Junio C Hamano <redacted>
SZEDER Gábor [Wed, 3 Jul 2019 10:47:47 +0000 (12:47 +0200)]
ci: don't update Homebrew
Lately our GCC macOS build job on Travis CI has been erroring out
while installing dependencies with:
+brew link gcc@8
Error: No such keg: /usr/local/Cellar/gcc@8
The command "ci/install-dependencies.sh" failed and exited with 1 during .
Now, while gcc@8 is still pre-installed (but not linked) and would be
perfectly usable in the Travis CI macOS image we use [1], it's at
version 8.2. However, when installing dependencies we first
explicitly run 'brew update', which spends over two minutes to update
itself and information about the available packages, and it learns
about GCC 8.3. After that point gcc@8 exclusively refers to v8.3,
and, unfortunately, 'brew' is just too dumb to be able to do anything
with the still installed 8.2 package, and the subsequent 'brew link
gcc@8' fails. (Even 'brew uninstall gcc@8' fails with the same
error!)
Don't run 'brew update' to keep the already installed GCC 8.2 'brew
link'-able. Note that in addition we have to 'export
HOMEBREW_NO_AUTO_UPDATE=1' first, because 'brew' is so very helpful
that it would implicitly run update for us on the next 'brew install
<pkg>' otherwise.
Disabling 'brew update' has additional benefits:
- It shaves off 2-3mins from the ~4mins currently spent on
installing dependencies, and the macOS build jobs have always been
prone to exceeding the time limit on Travis CI.
- Our builds won't suddenly break because of the occasional Homebrew
breakages [2].
The drawback is that we'll be stuck with slightly older versions of
the packages that we install via Homebrew (Git-LFS 2.5.2 and Perforce
2018.1; they are currently at 2.7.2 and 2019.1, respectively). We
might want to reconsider this decision as time goes on and/or switch
to a more recent macOS image as they become available.
[1] 2000ac9fbf (travis-ci: switch to Xcode 10.1 macOS image,
2019-01-17)
[2] See e.g. a1ccaedd62 (travis-ci: make the OSX build jobs' 'brew
update' more quiet, 2019-02-02) or
https://public-inbox.org/git/20180907032002.23366-1-szeder.dev@gmail.com/T/#+u
Signed-off-by: SZEDER Gábor <redacted>
Signed-off-by: Junio C Hamano <redacted>
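Taken together with HOMEBREW_NO_INSTALL_CLEANUP from the auto-cleanup change, the environment for the macOS dependency step might be set up roughly as below. This is a sketch, not the contents of ci/install-dependencies.sh, and the 'brew' invocation is left as a comment so the snippet has no side effects.

```shell
# Keep 'brew install' from implicitly running 'brew update' ...
export HOMEBREW_NO_AUTO_UPDATE=1
# ... and skip the automatic cleanup of outdated packages.
export HOMEBREW_NO_INSTALL_CLEANUP=1

# A later step would then link the pre-installed compiler, e.g.:
#   brew link gcc@8
echo "auto update off: $HOMEBREW_NO_AUTO_UPDATE, auto cleanup off: $HOMEBREW_NO_INSTALL_CLEANUP"
```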
Dimitriy Ryazantcev [Tue, 2 Jul 2019 18:22:48 +0000 (21:22 +0300)]
l10n: localizable upload progress messages
Currently the data rate in throughput_string(...) method is
output by simple strbuf_humanise_bytes(...) call and '/s' append.
But for proper translation of such string the translator needs
full context.
Add strbuf_humanise_rate(...) method to properly print out
localizable version of data rate ('3.5 MiB/s' etc) with full context.
Strings with the units in strbuf_humanise_bytes(...) are marked
for translation.
Signed-off-by: Dimitriy Ryazantcev <redacted>
Signed-off-by: Junio C Hamano <redacted>
Quentin Nerden [Tue, 2 Jul 2019 14:37:41 +0000 (07:37 -0700)]
docs: git-clone: list short form of options first
List the short form of options (e.g.: '-l') before the long form (e.g.
'--local').
This is to match the doc of git-add, git-commit, git-clean, git-branch...
Signed-off-by: Quentin Nerden <redacted>
Signed-off-by: Junio C Hamano <redacted>
Quentin Nerden [Tue, 2 Jul 2019 14:37:40 +0000 (07:37 -0700)]
docs: git-clone: refer to long form of options
To make the doc of git-clone easier to read,
refer to the long form of the options
(it is easier to guess what '--verbose' is doing than '-v').
Signed-off-by: Quentin Nerden <redacted>
Signed-off-by: Junio C Hamano <redacted>
Rohit Ashiwal [Tue, 2 Jul 2019 09:11:29 +0000 (14:41 +0530)]
cherry-pick/revert: advise using --skip
The previous commit introduced a --skip flag for cherry-pick and
revert. Update the advice messages, to tell users about this less
cumbersome way of skipping commits. Also add tests to ensure
everything is working fine.
Signed-off-by: Rohit Ashiwal <redacted>
Signed-off-by: Junio C Hamano <redacted>
Rohit Ashiwal [Tue, 2 Jul 2019 09:11:28 +0000 (14:41 +0530)]
cherry-pick/revert: add --skip option
git am or rebase have a --skip flag to skip the current commit if the
user wishes to do so. During a cherry-pick or revert a user could
likewise skip a commit, but needs to use 'git reset' (or in the case
of conflicts 'git reset --merge'), followed by 'git (cherry-pick |
revert) --continue' to skip the commit. This is more annoying and
sometimes confusing on the users' part. Add a `--skip` option to make
skipping commits easier for the user and to make the commands more
consistent.
In the next commit, we will change the advice messages hence finishing
the process of teaching revert and cherry-pick "how to skip commits".
Signed-off-by: Rohit Ashiwal <redacted>
Signed-off-by: Junio C Hamano <redacted>
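The difference between the two workflows can be sketched as a pair of shell functions. The function names are hypothetical, and both are meant to be called inside a repository while a cherry-pick is stopped on a commit, so the snippet only defines them.

```shell
# skip_pick_old: drop the commit currently being cherry-picked the way
# it had to be done before "--skip" existed.  'git reset --merge' is
# the variant for the conflicted case; plain 'git reset' otherwise.
skip_pick_old () {
	git reset --merge &&
	git cherry-pick --continue
}

# skip_pick_new: the same intent, expressed with the new option.
skip_pick_new () {
	git cherry-pick --skip
}
```

For 'git revert', the same pattern applies with 'revert' in place of 'cherry-pick'.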
Rohit Ashiwal [Tue, 2 Jul 2019 09:11:27 +0000 (14:41 +0530)]
sequencer: use argv_array in reset_merge
Avoid using magic numbers for array size and index under `reset_merge`
function. Use `argv_array` instead. This will make code shorter and
easier to extend.
Signed-off-by: Rohit Ashiwal <redacted>
Signed-off-by: Junio C Hamano <redacted>
Rohit Ashiwal [Tue, 2 Jul 2019 09:11:26 +0000 (14:41 +0530)]
sequencer: rename reset_for_rollback to reset_merge
We are on a path to teach cherry-pick/revert how to skip commits. To
achieve this, we could really make use of existing functions.
reset_for_rollback is one such function, but the name does not
intuitively suggest to use it to reset a merge, which it was born to
perform, see 539047c ("revert: introduce --abort to cancel a failed
cherry-pick", 2011-11-23). Change the name to reset_merge to make
it more intuitive.
Signed-off-by: Rohit Ashiwal <redacted>
Signed-off-by: Junio C Hamano <redacted>
Rohit Ashiwal [Tue, 2 Jul 2019 09:11:25 +0000 (14:41 +0530)]
sequencer: add advice for revert
In the case of merge conflicts, while performing a revert, we are
currently advised to use `git cherry-pick --<sequencer-options>`.
Introduce a separate advice message for `git revert`. Also change
the signature of `create_seq_dir` to handle which advice to display
selectively.
Signed-off-by: Rohit Ashiwal <redacted>
Signed-off-by: Junio C Hamano <redacted>
Jeff King [Fri, 28 Jun 2019 09:42:07 +0000 (05:42 -0400)]
t5703: use test_commit_bulk
There are two loops that create 33 commits each using test_commit. Using
test_commit_bulk speeds this up from:
Benchmark #1: ./t5703-upload-pack-ref-in-want.sh --root=/var/ram/git-tests
Time (mean ± σ): 2.142 s ± 0.161 s [User: 1.136 s, System: 0.974 s]
Range (min … max): 1.903 s … 2.401 s 10 runs
to:
Benchmark #1: ./t5703-upload-pack-ref-in-want.sh --root=/var/ram/git-tests
Time (mean ± σ): 1.440 s ± 0.114 s [User: 737.7 ms, System: 615.4 ms]
Range (min … max): 1.230 s … 1.604 s 10 runs
for an average savings of almost 33%.
Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>
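The quoted savings figure follows from the two mean run times as (before - after) / before. A small sketch of that arithmetic, with a hypothetical helper name `savings`:

```shell
# savings BEFORE AFTER: print the relative time saved, in percent,
# rounded to one decimal place.
savings () {
	LC_ALL=C awk -v before="$1" -v after="$2" \
		'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

# Mean times quoted above for t5703, before and after the change.
savings 2.142 1.440
```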
Jeff King [Fri, 28 Jun 2019 09:41:54 +0000 (05:41 -0400)]
t5702: use test_commit_bulk
There are two loops that create 32 commits each using test_commit. Using
test_commit_bulk speeds this up from:
Benchmark #1: ./t5702-protocol-v2.sh --root=/var/ram/git-tests
Time (mean ± σ): 5.409 s ± 0.513 s [User: 2.382 s, System: 2.466 s]
Range (min … max): 4.633 s … 5.927 s 10 runs
to:
Benchmark #1: ./t5702-protocol-v2.sh --root=/var/ram/git-tests
Time (mean ± σ): 3.956 s ± 0.242 s [User: 1.775 s, System: 1.627 s]
Range (min … max): 3.449 s … 4.239 s 10 runs
for an average savings of over 25%.
Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>
Jeff King [Fri, 28 Jun 2019 09:41:35 +0000 (05:41 -0400)]
t3311: use test_commit_bulk
One of the tests in t3311 creates 300 commits by running "test_commit"
in a loop. This requires 900 processes. Instead, we can use
test_commit_bulk to do it with only four. This improves the runtime of
the script from:
Benchmark #1: ./t3311-notes-merge-fanout.sh --root=/var/ram/git-tests
Time (mean ± σ): 5.821 s ± 0.691 s [User: 3.146 s, System: 2.782 s]
Range (min … max): 4.783 s … 6.841 s 10 runs
to:
Benchmark #1: ./t3311-notes-merge-fanout.sh --root=/var/ram/git-tests
Time (mean ± σ): 1.743 s ± 0.116 s [User: 1.144 s, System: 0.691 s]
Range (min … max): 1.629 s … 1.994 s 10 runs
for an average speedup of over 70%.
Unfortunately we still have to run 300 instances of "git notes add",
since the point is to test the fanout that comes from adding notes one
by one.
Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>
Jeff King [Fri, 28 Jun 2019 09:39:42 +0000 (05:39 -0400)]
t5310: increase the number of bitmapped commits
The bitmap index we compute in t5310 has only 20 commits in it. This
gives poor coverage of bitmap_writer_select_commits(), which simply
writes a bitmap for everything when there are fewer than 100 commits.
Let's bump the number of commits in the test to cover the more complex
code paths (this does drop coverage of the individual lines of the
trivial path, but the complex path does everything it does and more).
Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>
Jeff King [Tue, 2 Jul 2019 05:16:49 +0000 (01:16 -0400)]
test-lib: introduce test_commit_bulk
Some tests need to create a string of commits. Doing this with
test_commit is very heavy-weight, as it needs at least one process per
commit (and in fact, uses several).
For bulk creation, we can do much better by using fast-import, but it's
often a pain to generate the input. Let's provide a helper to do so.
We'll use t5310 as a guinea pig, as it has three 10-commit loops. Here
are hyperfine results before and after:
[before]
Benchmark #1: ./t5310-pack-bitmaps.sh --root=/var/ram/git-tests
Time (mean ± σ): 2.846 s ± 0.305 s [User: 3.042 s, System: 0.919 s]
Range (min … max): 2.250 s … 3.210 s 10 runs
[after]
Benchmark #1: ./t5310-pack-bitmaps.sh --root=/var/ram/git-tests
Time (mean ± σ): 2.210 s ± 0.174 s [User: 2.570 s, System: 0.604 s]
Range (min … max): 1.999 s … 2.590 s 10 runs
So we're over 20% faster, while making the callers slightly shorter. We
added a lot more lines in test-lib-function.sh, of course, and the
helper is way more featureful than we need here. But my hope is that it
will be flexible enough to use in more places.
Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>
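To give a feel for why fast-import helps, the sketch below generates a stream that creates N commits, which a single 'git fast-import' process could then consume. It is an illustration only and not the test_commit_bulk implementation; the branch name "bulk", the identity, the timestamps and the file names are all made up.

```shell
# gen_commits N: print a fast-import stream that creates N commits on
# the (hypothetical) branch "bulk", each adding one small file.
gen_commits () {
	i=1
	while [ "$i" -le "$1" ]
	do
		printf 'commit refs/heads/bulk\n'
		printf 'committer A U Thor <author@example.com> %d +0000\n' \
			$((1112912000 + i))
		# Commit message, in fast-import's delimited "data" format.
		printf 'data <<EOM\ncommit %d\nEOM\n' "$i"
		# One new file per commit, with its content given inline.
		printf 'M 644 inline file-%d.t\n' "$i"
		printf 'data <<EOD\ncontent %d\nEOD\n\n' "$i"
		i=$((i + 1))
	done
}

gen_commits 3
# Inside a repository, the whole series needs only one process:
#   gen_commits 300 | git fast-import
```

Later commits in the stream omit a 'from' line, relying on fast-import to use the current tip of the same branch as the parent.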
Ævar Arnfjörð Bjarmason [Mon, 1 Jul 2019 21:21:00 +0000 (23:21 +0200)]
grep: use PCRE v2 for optimized fixed-string search
Bring back optimized fixed-string search for "grep", this time with
PCRE v2 as an optional backend. As noted in [1] with kwset we were
slower than PCRE v1 and v2 JIT with the kwset backend, so that
optimization was counterproductive.
This brings back the optimization for "--fixed-strings", without
changing the semantics of having a NUL-byte in patterns. As seen in
previous commits in this series we could support it now, but I'd
rather just leave that edge-case aside so we don't have one behavior
or the other depending what "--fixed-strings" backend we're using. It
makes the behavior harder to understand and document, and makes tests
for the different backends more painful.
This does change the behavior under non-C locales when "log"'s
"--encoding" option is used and the haystack/needle in the
content/command-line doesn't have a matching encoding. See the recent
change in "t4210: skip more command-line encoding tests on MinGW" in
this series. I think that's OK. We did nothing sensible before
then (just compared raw bytes that had no hope of matching). At least
now the user will get some idea why their grep/log never matches in
that edge case.
I could also support the PCRE v1 backend here, but that would make the
code more complex. I'd rather aim for simplicity here and in future
changes to the diffcore. We're not going to have someone who
absolutely must have faster search, but for whom building PCRE v2
isn't acceptable.
The difference between this series of commits and the current "master"
is, using the same t/perf commands shown in the last commit:
plain grep:
Test origin/master HEAD
-------------------------------------------------------------------------
7821.1: fixed grep int 0.55(1.67+0.56) 0.41(0.98+0.60) -25.5%
7821.2: basic grep int 0.58(1.65+0.52) 0.41(0.96+0.57) -29.3%
7821.3: extended grep int 0.57(1.66+0.49) 0.42(0.93+0.60) -26.3%
7821.4: perl grep int 0.54(1.67+0.50) 0.43(0.88+0.65) -20.4%
7821.6: fixed grep uncommon 0.21(0.52+0.42) 0.16(0.24+0.51) -23.8%
7821.7: basic grep uncommon 0.20(0.49+0.45) 0.17(0.28+0.47) -15.0%
7821.8: extended grep uncommon 0.20(0.54+0.39) 0.16(0.25+0.50) -20.0%
7821.9: perl grep uncommon 0.20(0.58+0.36) 0.16(0.23+0.50) -20.0%
7821.11: fixed grep æ 0.35(1.24+0.43) 0.16(0.23+0.50) -54.3%
7821.12: basic grep æ 0.36(1.29+0.38) 0.16(0.20+0.54) -55.6%
7821.13: extended grep æ 0.35(1.23+0.44) 0.16(0.24+0.50) -54.3%
7821.14: perl grep æ 0.35(1.33+0.34) 0.16(0.28+0.46) -54.3%
grep with -i:
Test origin/master HEAD
----------------------------------------------------------------------------
7821.1: fixed grep -i int 0.62(1.81+0.70) 0.47(1.11+0.64) -24.2%
7821.2: basic grep -i int 0.67(1.90+0.53) 0.46(1.07+0.62) -31.3%
7821.3: extended grep -i int 0.62(1.92+0.53) 0.53(1.12+0.58) -14.5%
7821.4: perl grep -i int 0.66(1.85+0.58) 0.45(1.10+0.59) -31.8%
7821.6: fixed grep -i uncommon 0.21(0.54+0.43) 0.17(0.20+0.55) -19.0%
7821.7: basic grep -i uncommon 0.20(0.52+0.45) 0.17(0.29+0.48) -15.0%
7821.8: extended grep -i uncommon 0.21(0.52+0.44) 0.17(0.26+0.50) -19.0%
7821.9: perl grep -i uncommon 0.21(0.53+0.44) 0.17(0.20+0.56) -19.0%
7821.11: fixed grep -i æ 0.26(0.79+0.44) 0.16(0.29+0.46) -38.5%
7821.12: basic grep -i æ 0.26(0.79+0.42) 0.16(0.20+0.54) -38.5%
7821.13: extended grep -i æ 0.26(0.84+0.39) 0.16(0.24+0.50) -38.5%
7821.14: perl grep -i æ 0.16(0.24+0.49) 0.17(0.25+0.51) +6.3%
plain log:
Test origin/master HEAD
--------------------------------------------------------------------------------
4221.1: fixed log --grep='int' 7.24(6.95+0.28) 7.20(6.95+0.18) -0.6%
4221.2: basic log --grep='int' 7.31(6.97+0.22) 7.20(6.93+0.21) -1.5%
4221.3: extended log --grep='int' 7.37(7.04+0.24) 7.22(6.91+0.25) -2.0%
4221.4: perl log --grep='int' 7.31(7.04+0.21) 7.19(6.89+0.21) -1.6%
4221.6: fixed log --grep='uncommon' 6.93(6.59+0.32) 7.04(6.66+0.37) +1.6%
4221.7: basic log --grep='uncommon' 6.92(6.58+0.29) 7.08(6.75+0.29) +2.3%
4221.8: extended log --grep='uncommon' 6.92(6.55+0.31) 7.00(6.68+0.31) +1.2%
4221.9: perl log --grep='uncommon' 7.03(6.59+0.33) 7.12(6.73+0.34) +1.3%
4221.11: fixed log --grep='æ' 7.41(7.08+0.28) 7.05(6.76+0.29) -4.9%
4221.12: basic log --grep='æ' 7.39(6.99+0.33) 7.00(6.68+0.25) -5.3%
4221.13: extended log --grep='æ' 7.34(7.00+0.25) 7.15(6.81+0.31) -2.6%
4221.14: perl log --grep='æ' 7.43(7.13+0.26) 7.01(6.60+0.36) -5.7%
log with -i:
Test origin/master HEAD
------------------------------------------------------------------------------------
4221.1: fixed log -i --grep='int' 7.31(7.07+0.24) 7.23(7.00+0.22) -1.1%
4221.2: basic log -i --grep='int' 7.40(7.08+0.28) 7.19(6.92+0.20) -2.8%
4221.3: extended log -i --grep='int' 7.43(7.13+0.25) 7.27(6.99+0.21) -2.2%
4221.4: perl log -i --grep='int' 7.34(7.10+0.24) 7.10(6.90+0.19) -3.3%
4221.6: fixed log -i --grep='uncommon' 7.07(6.71+0.32) 7.11(6.77+0.28) +0.6%
4221.7: basic log -i --grep='uncommon' 6.99(6.64+0.28) 7.12(6.69+0.38) +1.9%
4221.8: extended log -i --grep='uncommon' 7.11(6.74+0.32) 7.10(6.77+0.27) -0.1%
4221.9: perl log -i --grep='uncommon' 6.98(6.60+0.29) 7.05(6.64+0.34) +1.0%
4221.11: fixed log -i --grep='æ' 7.85(7.45+0.34) 7.03(6.68+0.32) -10.4%
4221.12: basic log -i --grep='æ' 7.87(7.49+0.29) 7.06(6.69+0.31) -10.3%
4221.13: extended log -i --grep='æ' 7.87(7.54+0.31) 7.09(6.69+0.31) -9.9%
4221.14: perl log -i --grep='æ' 7.06(6.77+0.28) 6.91(6.57+0.31) -2.1%
So as with e05b027627 ("grep: use PCRE v2 for optimized fixed-string
search", 2019-06-26) there's a huge improvement in performance for
"grep", but in "log" most of our time is spent elsewhere, so we don't
notice it that much.
Signed-off-by: Ævar Arnfjörð Bjarmason <redacted>
Signed-off-by: Junio C Hamano <redacted>
Ævar Arnfjörð Bjarmason [Mon, 1 Jul 2019 21:20:59 +0000 (23:20 +0200)]
grep: remove the kwset optimization
A later change will replace this optimization with optimistic use of
PCRE v2. I'm completely removing it as an intermediate step, as
opposed to replacing it with PCRE v2, to demonstrate that no grep
semantics depend on this (or any other) optimization for the fixed
backend anymore.
For now this is mostly (but not entirely) a performance regression, as
shown by this hacky one-liner:
for opt in '' ' -i'
do
GIT_PERF_7821_GREP_OPTS=$opt GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux GIT_PERF_MAKE_OPTS='-j8 CFLAGS=-O3 USE_LIBPCRE=YesPlease' ./run origin/master HEAD -- p7821-grep-engines-fixed.sh
done &&
for opt in '' ' -i'
do
GIT_PERF_4221_LOG_OPTS=$opt GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux GIT_PERF_MAKE_OPTS='-j8 CFLAGS=-O3 USE_LIBPCRE=YesPlease' ./run origin/master HEAD -- p4221-log-grep-engines-fixed.sh
done
Which produces:
plain grep:
Test origin/master HEAD
-------------------------------------------------------------------------
7821.1: fixed grep int 0.55(1.60+0.63) 0.82(3.11+0.51) +49.1%
7821.2: basic grep int 0.62(1.68+0.49) 0.85(3.02+0.52) +37.1%
7821.3: extended grep int 0.61(1.63+0.53) 0.91(3.09+0.44) +49.2%
7821.4: perl grep int 0.55(1.60+0.57) 0.41(0.93+0.57) -25.5%
7821.6: fixed grep uncommon 0.20(0.50+0.44) 0.35(1.27+0.42) +75.0%
7821.7: basic grep uncommon 0.20(0.49+0.45) 0.35(1.29+0.41) +75.0%
7821.8: extended grep uncommon 0.20(0.45+0.48) 0.35(1.25+0.44) +75.0%
7821.9: perl grep uncommon 0.20(0.53+0.41) 0.16(0.24+0.49) -20.0%
7821.11: fixed grep æ 0.35(1.27+0.40) 0.25(0.82+0.39) -28.6%
7821.12: basic grep æ 0.35(1.28+0.38) 0.25(0.75+0.44) -28.6%
7821.13: extended grep æ 0.36(1.21+0.46) 0.25(0.86+0.35) -30.6%
7821.14: perl grep æ 0.35(1.33+0.34) 0.16(0.26+0.47) -54.3%
grep with -i:
Test origin/master HEAD
-----------------------------------------------------------------------------
7821.1: fixed grep -i int 0.61(1.84+0.64) 1.11(4.12+0.64) +82.0%
7821.2: basic grep -i int 0.72(1.86+0.57) 1.15(4.48+0.49) +59.7%
7821.3: extended grep -i int 0.94(1.83+0.60) 1.53(4.12+0.58) +62.8%
7821.4: perl grep -i int 0.66(1.82+0.59) 0.55(1.08+0.58) -16.7%
7821.6: fixed grep -i uncommon 0.21(0.51+0.44) 0.44(1.74+0.34) +109.5%
7821.7: basic grep -i uncommon 0.21(0.55+0.41) 0.44(1.72+0.40) +109.5%
7821.8: extended grep -i uncommon 0.21(0.57+0.39) 0.42(1.64+0.45) +100.0%
7821.9: perl grep -i uncommon 0.21(0.48+0.48) 0.17(0.30+0.45) -19.0%
7821.11: fixed grep -i æ 0.25(0.73+0.45) 0.25(0.75+0.45) +0.0%
7821.12: basic grep -i æ 0.25(0.71+0.49) 0.26(0.77+0.44) +4.0%
7821.13: extended grep -i æ 0.25(0.75+0.44) 0.25(0.74+0.46) +0.0%
7821.14: perl grep -i æ 0.17(0.26+0.48) 0.16(0.20+0.52) -5.9%
plain log:
Test origin/master HEAD
---------------------------------------------------------------------------------
4221.1: fixed log --grep='int' 7.31(7.06+0.21) 8.11(7.85+0.20) +10.9%
4221.2: basic log --grep='int' 7.30(6.94+0.27) 8.16(7.89+0.19) +11.8%
4221.3: extended log --grep='int' 7.34(7.05+0.21) 8.08(7.76+0.25) +10.1%
4221.4: perl log --grep='int' 7.27(6.94+0.24) 7.05(6.76+0.25) -3.0%
4221.6: fixed log --grep='uncommon' 6.97(6.62+0.32) 7.86(7.51+0.30) +12.8%
4221.7: basic log --grep='uncommon' 7.05(6.69+0.29) 7.89(7.60+0.28) +11.9%
4221.8: extended log --grep='uncommon' 6.89(6.56+0.32) 7.99(7.66+0.24) +16.0%
4221.9: perl log --grep='uncommon' 7.02(6.66+0.33) 6.97(6.54+0.36) -0.7%
4221.11: fixed log --grep='æ' 7.37(7.03+0.33) 7.67(7.30+0.31) +4.1%
4221.12: basic log --grep='æ' 7.41(7.00+0.31) 7.60(7.28+0.26) +2.6%
4221.13: extended log --grep='æ' 7.35(6.96+0.38) 7.73(7.31+0.34) +5.2%
4221.14: perl log --grep='æ' 7.43(7.10+0.32) 6.95(6.61+0.27) -6.5%
log with -i:
Test origin/master HEAD
------------------------------------------------------------------------------------
4221.1: fixed log -i --grep='int' 7.40(7.05+0.23) 8.66(8.38+0.20) +17.0%
4221.2: basic log -i --grep='int' 7.39(7.09+0.23) 8.67(8.39+0.20) +17.3%
4221.3: extended log -i --grep='int' 7.29(6.99+0.26) 8.69(8.31+0.26) +19.2%
4221.4: perl log -i --grep='int' 7.42(7.16+0.21) 7.14(6.80+0.24) -3.8%
4221.6: fixed log -i --grep='uncommon' 6.94(6.58+0.35) 8.43(8.04+0.30) +21.5%
4221.7: basic log -i --grep='uncommon' 6.95(6.62+0.31) 8.34(7.93+0.32) +20.0%
4221.8: extended log -i --grep='uncommon' 7.06(6.75+0.25) 8.32(7.98+0.31) +17.8%
4221.9: perl log -i --grep='uncommon' 6.96(6.69+0.26) 7.04(6.64+0.32) +1.1%
4221.11: fixed log -i --grep='æ' 7.92(7.55+0.33) 7.86(7.44+0.34) -0.8%
4221.12: basic log -i --grep='æ' 7.88(7.49+0.32) 7.84(7.46+0.34) -0.5%
4221.13: extended log -i --grep='æ' 7.91(7.51+0.32) 7.87(7.48+0.32) -0.5%
4221.14: perl log -i --grep='æ' 7.01(6.59+0.35) 6.99(6.64+0.28) -0.3%
Some of those, as noted in [1], are because PCRE is faster at finding
fixed strings. This looks bad for some engines, but in the next change
we'll optimistically use PCRE v2 for all of these, so it'll look
better.
1. https://public-inbox.org/git/87v9x793qi.fsf@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <redacted>
Signed-off-by: Junio C Hamano <redacted>
Ævar Arnfjörð Bjarmason [Mon, 1 Jul 2019 21:20:58 +0000 (23:20 +0200)]
grep: drop support for \0 in --fixed-strings <pattern>
Change "-f <file>" to not support patterns with a NUL-byte in them
under --fixed-strings. We'll now only support these under
"--perl-regexp" with PCRE v2.
A previous change to grep's documentation changed the description of
"-f <file>" to be vague enough as to not promise that this would work.
By dropping support for this we make it a whole lot easier to move
away from the kwset backend, which we'll do in a subsequent change.
Signed-off-by: Ævar Arnfjörð Bjarmason <redacted>
Signed-off-by: Junio C Hamano <redacted>
Ævar Arnfjörð Bjarmason [Mon, 1 Jul 2019 21:20:57 +0000 (23:20 +0200)]
grep: make the behavior for NUL-byte in patterns sane
The behavior of "grep" when patterns contained a NUL-byte has always
been haphazard, and has served the vagaries of the implementation more
than anything else. A pattern containing a NUL-byte can only be
provided via "-f <file>". Since pickaxe (log search) has no such flag
the NUL-byte in patterns has only ever been supported by "grep" (and
not "log --grep").
Since 9eceddeec6 ("Use kwset in grep", 2011-08-21) patterns containing
"\0" were considered fixed. In 966be95549 ("grep: add tests to fix
blind spots with \0 patterns", 2017-05-20) I added tests for this
behavior.
Change the behavior to do the obvious thing, i.e. don't silently
discard a regex pattern and make it implicitly fixed just because it
contains a NUL-byte. Instead die if the backend in question can't
handle it, e.g. when --basic-regexp is combined with such a pattern.
This is desired because from a user's point of view it's the obvious
thing to do. Whether we support BRE/ERE/Perl syntax is different from
whether our implementation is limited by C-strings. These patterns are
obscure enough that I think this behavior change is OK, especially
since we never documented the old behavior.
Doing this also makes it easier to replace the kwset backend with
something else, since we'll no longer strictly need it for anything we
can't easily use another fixed-string backend for.
Signed-off-by: Ævar Arnfjörð Bjarmason <redacted>
Signed-off-by: Junio C Hamano <redacted>
Ævar Arnfjörð Bjarmason [Mon, 1 Jul 2019 21:20:56 +0000 (23:20 +0200)]
grep tests: move binary pattern tests into their own file
Move the tests for "-f <file>" where "<file>" contains a NUL byte
pattern into their own file. I added most of these tests in
966be95549 ("grep: add tests to fix blind spots with \0 patterns",
2017-05-20).
Whether a regex engine supports matching binary content is very
different from whether it matches binary patterns. Since
2f8952250a ("regex: add regexec_buf() that can work on a non
NUL-terminated string", 2016-09-21) we've required REG_STARTEND of our
regex engines so we can match binary content, but only the PCRE v2
engine can sensibly match binary patterns.
Since 9eceddeec6 ("Use kwset in grep", 2011-08-21) we've been punting
patterns containing a NUL-byte and considering them fixed, except in
cases where "--ignore-case" is provided and they're non-ASCII, see
5c1ebcca4d ("grep/icase: avoid kwsset on literal non-ascii strings",
2016-06-25). Subsequent commits will change this behavior.
Signed-off-by: Ævar Arnfjörð Bjarmason <redacted>
Signed-off-by: Junio C Hamano <redacted>
Ævar Arnfjörð Bjarmason [Mon, 1 Jul 2019 21:20:55 +0000 (23:20 +0200)]
grep tests: move "grep binary" alongside the rest
Move the "grep binary" test case added in aca20dd558 ("grep: add test
script for binary file handling", 2010-05-22) so that it lives
alongside the rest of the "grep" tests in t781*. This would have left
a gap in the t/700* namespace, so move a "filter-branch" test down,
leaving the "t7010-setup.sh" test as the next one after that.
Signed-off-by: Ævar Arnfjörð Bjarmason <redacted>
Signed-off-by: Junio C Hamano <redacted>
Ævar Arnfjörð Bjarmason [Mon, 1 Jul 2019 21:20:54 +0000 (23:20 +0200)]
grep: inline the return value of a function call used only once
Since e944d9d932 ("grep: rewrite an if/else condition to avoid
duplicate expression", 2016-06-25) the "ascii_only" variable has only
been used once in compile_regexp(). Let's just inline it there.
This makes the code easier to read, and might make it marginally
faster depending on compiler optimizations.
Signed-off-by: Ævar Arnfjörð Bjarmason <redacted>
Signed-off-by: Junio C Hamano <redacted>
Ævar Arnfjörð Bjarmason [Mon, 1 Jul 2019 21:20:53 +0000 (23:20 +0200)]
t4210: skip more command-line encoding tests on MinGW
In 5212f91deb ("t4210: skip command-line encoding tests on mingw",
2014-07-17) the positive tests in this file were skipped. That left
the negative tests that don't produce a match.
An upcoming change to migrate the "fixed" backend of grep to PCRE v2
will cause these "log" commands to produce an error instead on
MinGW. This is because the command-line on that platform implicitly
has its encoding changed before being passed to git. See [1].
1. https://public-inbox.org/git/nycvar.QRO.7.76.6.1907011515150.44@tvgsbejvaqbjf.bet/
Signed-off-by: Ævar Arnfjörð Bjarmason <redacted>
Signed-off-by: Junio C Hamano <redacted>
Derrick Stolee [Mon, 1 Jul 2019 13:16:19 +0000 (06:16 -0700)]
t5319: use 'test-tool path-utils' instead of 'ls -l'
Using 'ls -l' and parsing the columns to find file sizes is
problematic when the platform could report the owner as a name
with spaces. Instead, use the 'test-tool path-utils file-size'
command to list only the sizes.
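As a rough illustration of why column parsing is fragile, here is a minimal sketch that gets a byte count without touching "ls -l" at all. It uses "wc -c" as a stand-in, because 'test-tool' exists only inside git's build tree; the function name file_size is made up for this example and is not the helper the test uses.

```shell
# Hypothetical stand-in for "test-tool path-utils file-size".
# wc -c counts the bytes itself, so owner and group names (which may
# contain spaces) never appear in the output. tr strips the padding
# that some wc implementations add in front of the number.
file_size () {
	wc -c <"$1" | tr -d ' '
}

tmp=$(mktemp) &&
printf 'hello' >"$tmp" &&
file_size "$tmp"
rm -f "$tmp"
```

The point of either approach is the same: the output is a single number, so nothing depends on how many whitespace-separated fields precede it.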
Reported-by: Johannes Sixt <redacted>
Helped-by: Johannes Schindelin <redacted>
Signed-off-by: Derrick Stolee <redacted>
Signed-off-by: Junio C Hamano <redacted>
brian m. carlson [Fri, 28 Jun 2019 22:59:28 +0000 (22:59 +0000)]
t2203: avoid hard-coded object ID values
In order to make this test work with multiple hash algorithms, compute
the object ID used in this test instead of hard-coding it.
Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>
brian m. carlson [Fri, 28 Jun 2019 22:59:27 +0000 (22:59 +0000)]
t1710: make hash independent
This test uses several index hashes, which necessarily depend on the
version of the index and the hash algorithm in use. Use test_oid_cache
to provide values for these for both SHA-1 and SHA-256. Also, compute
an object ID and use $EMPTY_BLOB to make the remainder of the tests
independent of the hash algorithm in use.
Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>
brian m. carlson [Fri, 28 Jun 2019 22:59:26 +0000 (22:59 +0000)]
t1007: remove SHA1 prerequisites
Update this test to use test_oid_cache to specify the object IDs for
both SHA-1 and SHA-256. Since this test now works with both algorithms,
remove the SHA1 prerequisite.
Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>
brian m. carlson [Fri, 28 Jun 2019 22:59:25 +0000 (22:59 +0000)]
t0090: make test pass with SHA-256
One assertion of this test checks for a shrinking cache tree. The
initial index contains a cache tree with two directory names but no
object ID, and the second index contains a cache tree with an object ID
but no directory name.
With SHA-1, the second index is smaller than the first, because the
directory information stored takes more than the 20 bytes of an SHA-1
hash, but with SHA-256, the hash is longer, and the test fails the
assertion that the second index is smaller than the first.
To address this issue, increase the length of the subdirectory name to
ensure that the cache tree does indeed shrink in size regardless of the
algorithm in use.
Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>
brian m. carlson [Fri, 28 Jun 2019 22:59:24 +0000 (22:59 +0000)]
t0027: make hash size independent
Several parts of this test generate files that have specific hard-coded
object IDs in them. We don't really care about what the object ID in
question is, so we turn them all to zeros.
However, because some of these values are fixed and some are generated,
they can be of different lengths, which causes problems when running
with SHA-256. Furthermore, some assertions in this test use only fixed
object IDs and some use both fixed and generated ones, so converting
only the expected results fixes some tests while breaking others.
Convert both actual and expected object IDs to the all-zeros object ID
of the appropriate length to ensure that the test passes when using
SHA-256.
The astute observer will notice that both tr and sed are used here.
Converting the tr call to a sed y/// command looks logical at first, but
it isn't possible because POSIX doesn't allow escapes in y/// commands
other than "\\" and "\n".
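The length-preserving conversion can be sketched with tr alone. This is a simplified illustration, not the code from the test: the function name zero_oid is an assumption, and it zeroes every hex digit on its input, so it only suits text that is compared after both sides have been normalized the same way.

```shell
# Hypothetical sketch: turn every hex digit into "0".
# An ID keeps its length (40 characters for SHA-1, 64 for SHA-256) but
# loses its value. Both tr sets are written out in full, because a
# second set shorter than the first is not handled identically by all
# tr implementations.
zero_oid () {
	tr '0123456789abcdef' '0000000000000000'
}

printf '%s\n' e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 | zero_oid
```

Applying the same filter to both the actual and the expected file is what makes the comparison independent of the hash length.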
Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>
brian m. carlson [Fri, 28 Jun 2019 22:59:23 +0000 (22:59 +0000)]
t6030: make test work with SHA-256
Compute several object ID values instead of hard-coding them, and use
test_oid_to_path to cleanly produce a path for an object.
Note that the bisect code which is tested here remains sensitive to the
hash algorithm in use because it uses the object ID to disambiguate
between two equidistant commits. Fortunately, SHA-1 and SHA-256
disambiguate identically in the cases we care about, so there is no need
to modify the test to accommodate this situation. However, if a further
hash algorithm change occurs, this test may require some restructuring.
Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>
brian m. carlson [Fri, 28 Jun 2019 22:59:22 +0000 (22:59 +0000)]
t5000: make hash independent
This test uses a stub of a very large (64 GB) object to test our
generation of tar archives. In doing so, it uses the object ID of the
object so it can insert it into the database properly. Look up these
values using test_oid. Restructure the test slightly to use
test_oid_to_path.
Since we care about the object, not how it is named in a particular hash
algorithm, rename it to "huge-object", which is shorter and more
descriptive.
Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>
brian m. carlson [Fri, 28 Jun 2019 22:59:21 +0000 (22:59 +0000)]
t1450: make hash size independent
Replace several hard-coded full and partial object IDs with variables or
computed values. Create junk data to stuff inside an invalid tree that
can be either 20 or 32 bytes long. Compute a binary all-zeros object ID
instead of hard-coding a 20-byte length.
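A minimal sketch of computing a binary all-zeros ID whose length follows the hash algorithm. The helper names raw_size and zero_oid_bin are hypothetical and this is not how the test itself is written; it also assumes /dev/zero is available.

```shell
# Hypothetical helper: raw (binary) hash size in bytes.
raw_size () {
	case "$1" in
	sha1) echo 20 ;;
	sha256) echo 32 ;;
	*) return 1 ;;
	esac
}

# Emit that many NUL bytes, i.e. a binary all-zeros object ID.
zero_oid_bin () {
	n=$(raw_size "$1") || return 1
	dd if=/dev/zero bs=1 count="$n" 2>/dev/null
}

# Show the byte count for each algorithm.
zero_oid_bin sha1 | wc -c
zero_oid_bin sha256 | wc -c
```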
Additionally, compute various object IDs by using test_oid and
$EMPTY_BLOB so that this test works with multiple hash algorithms.
Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>
brian m. carlson [Fri, 28 Jun 2019 22:59:20 +0000 (22:59 +0000)]
t1410: make hash size independent
Instead of parsing object IDs using fixed-length shell patterns, use cut
to extract the first two characters of an object ID in addition to the
test helper for object paths. Update another test to look up an
appropriate object ID fragment from the all-zeros object ID instead of
hardcoding the value.
Although the test for parsing reflogs at BUFSIZ boundaries passes, mark
it with the SHA1 prerequisite, as it doesn't currently usefully test
anything when using a hash longer than 20 bytes.
Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>
brian m. carlson [Fri, 28 Jun 2019 22:59:19 +0000 (22:59 +0000)]
t: add helper to convert object IDs to paths
There are several places in our testsuite where we want to insert a
slash after an object ID to make it into a path we can reference under
.git/objects, and we have various ways of doing so. Add a helper to
provide a standard way of doing this that works for all size hashes.
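The idea can be sketched in a few lines of POSIX shell. The function name oid_to_path below is an assumption made for this example; it only illustrates the "two characters, slash, remainder" split and is not presented as the helper's actual implementation.

```shell
# Hypothetical sketch of an ID-to-path helper.
oid_to_path () {
	# Everything after the first two characters of the ID.
	rest=${1#??}
	# Strip that remainder from the end to get the two-character
	# fan-out directory, then join the halves with a slash.
	printf '%s/%s\n' "${1%"$rest"}" "$rest"
}

oid_to_path e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
# prints e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Because nothing in the function refers to a fixed length, the same call works for a 64-character SHA-256 ID.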
Signed-off-by: brian m. carlson <redacted>
Signed-off-by: Junio C Hamano <redacted>
Phillip Wood [Mon, 1 Jul 2019 14:21:06 +0000 (07:21 -0700)]
git-prompt: improve cherry-pick/revert detection
If the user commits or resets a conflict resolution in the middle of a
sequence of cherry-picks or reverts then CHERRY_PICK_HEAD/REVERT_HEAD
will be removed and so in the absence of those files we need to check
.git/sequencer/todo to see if there is a cherry-pick or revert in
progress.
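The fallback order described above could look roughly like this. It is a sketch only: the function name sequencer_status and its output strings are chosen for the example, and it assumes the first line of the todo file starts with the command name ("pick", "p" or "revert").

```shell
# Hypothetical sketch; $1 is the path to the .git directory.
sequencer_status () {
	if test -f "$1/CHERRY_PICK_HEAD"
	then
		echo CHERRY-PICKING
	elif test -f "$1/REVERT_HEAD"
	then
		echo REVERTING
	elif test -r "$1/sequencer/todo" &&
		read -r todo <"$1/sequencer/todo"
	then
		# Neither marker file exists, so fall back to the first
		# line of the todo list to see what is in progress.
		case "$todo" in
		p\ *|pick\ *) echo CHERRY-PICKING ;;
		revert\ *) echo REVERTING ;;
		*) return 1 ;;
		esac
	else
		return 1
	fi
}

gitdir=$(mktemp -d) &&
mkdir "$gitdir/sequencer" &&
printf 'pick 1234abc some subject\n' >"$gitdir/sequencer/todo" &&
sequencer_status "$gitdir"
rm -rf "$gitdir"
```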
Signed-off-by: Phillip Wood <redacted>
Signed-off-by: Junio C Hamano <redacted>
Michael Platings [Sun, 30 Jun 2019 18:17:32 +0000 (19:17 +0100)]
t8014: remove unnecessary braces
Signed-off-by: Michael Platings <redacted>
Signed-off-by: Junio C Hamano <redacted>
SZEDER Gábor [Sat, 29 Jun 2019 08:24:57 +0000 (10:24 +0200)]
Document that 'git -C ""' works and doesn't change directory
It's been behaving so since 6a536e2076 (git: treat "git -C '<path>'"
as a no-op when <path> is empty, 2015-03-06).
Signed-off-by: SZEDER Gábor <redacted>
Signed-off-by: Junio C Hamano <redacted>
Eric Wong [Sat, 29 Jun 2019 19:13:59 +0000 (19:13 +0000)]
repack: disable bitmaps-by-default if .keep files exist
Bitmaps aren't useful with multiple packs, and users with
.keep files ended up with redundant packs when bitmaps
got enabled by default in bare repos.
So detect when .keep files exist and stop enabling bitmaps
by default in that case.
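The detection step can be pictured as a simple scan of the pack directory. This is a shell sketch with a made-up function name, has_keep_files; repack does this check internally, and the sketch ignores alternates and other details.

```shell
# Hypothetical sketch; $1 is an object directory such as .git/objects.
has_keep_files () {
	for keep in "$1"/pack/*.keep
	do
		# An unmatched glob stays literal, so confirm that the
		# file really exists before reporting success.
		test -e "$keep" && return 0
	done
	return 1
}

objdir=$(mktemp -d) &&
mkdir "$objdir/pack" &&
: >"$objdir/pack/pack-1234.keep" &&
has_keep_files "$objdir" &&
echo "keep file found: leave bitmaps off by default"
rm -rf "$objdir"
```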
Wasteful (but otherwise harmless) race conditions with .keep files
documented by Jeff King still apply and there's a chance we'd
still end up with redundant data on the FS:
https://public-inbox.org/git/20190623224244.GB1100@sigill.intra.peff.net/
v2: avoid subshell in test case, be multi-index aware
Fixes: 36eba0323d3288a8 ("repack: enable bitmaps by default on bare repos")
Signed-off-by: Eric Wong <redacted>
Helped-by: Jeff King <redacted>
Reported-by: Janos Farkas <redacted>
Signed-off-by: Junio C Hamano <redacted>
Christian Couder [Sat, 29 Jun 2019 07:57:47 +0000 (09:57 +0200)]
t0016: add 'remove' subcommand test
Testing the 'remove' subcommand was forgotten when t0016
was created. Let's fix that.
Helped-by: Derrick Stolee <redacted>
Signed-off-by: Christian Couder <redacted>
Signed-off-by: Junio C Hamano <redacted>
Christian Couder [Sat, 29 Jun 2019 07:57:46 +0000 (09:57 +0200)]
test-oidmap: remove 'add' subcommand
The 'add' subcommand is useless as it is mostly identical
to the 'put' subcommand, so let's remove it.
Helped-by: Derrick Stolee <redacted>
Signed-off-by: Christian Couder <redacted>
Signed-off-by: Junio C Hamano <redacted>
Jeff King [Mon, 1 Jul 2019 13:18:15 +0000 (09:18 -0400)]
check_everything_connected: assume alternate ref tips are valid
When we receive a remote ref update to sha1 "X", we want to check that
we have all of the objects needed by "X". We can assume that our
repository is not currently corrupted, and therefore if we have a ref
pointing at "Y", we have all of its objects. So we can stop our
traversal from "X" as soon as we hit "Y".
If we make the same non-corruption assumption about any repositories we
use to store alternates, then we can also use their ref tips to shorten
the traversal.
This is especially useful when cloning with "--reference", as we
otherwise do not have any local refs to check against, and have to
traverse the whole history, even though the other side may have sent us
few or no objects. Here are results for the included perf test (which
shows off more or less the maximal savings, getting one new commit and
sharing the whole history):
Test                        HEAD^               HEAD
--------------------------------------------------------------------
[on git.git]
5600.3: clone --reference   2.94(2.86+0.08)     0.09(0.08+0.01) -96.9%
[on linux.git]
5600.3: clone --reference   45.74(45.34+0.41)   0.36(0.30+0.08) -99.2%
Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>
Jeff King [Mon, 1 Jul 2019 13:17:40 +0000 (09:17 -0400)]
object-store.h: move for_each_alternate_ref() from transport.h
There's nothing inherently transport-related about enumerating the
alternate ref tips. The code has lived in transport.[ch] because the
only use so far had been advertising available tips during transport.
But it could be used for more, and a future patch will teach rev-list to
access these refs.
Let's move it alongside the other alt-odb code, declaring it in
object-store.h with the implementation in sha1-file.c.
This lets us drop the inclusion of transport.h from receive-pack, which
perhaps shows how it was misplaced (though receive-pack is about
transporting objects, transport.h is mostly about the client side).
Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>
Johannes Schindelin [Mon, 1 Jul 2019 11:58:15 +0000 (04:58 -0700)]
rebase --am: ignore rebase.rescheduleFailedExec
The `exec` command is specific to the interactive backend, therefore it
does not make sense for non-interactive rebases to heed that config
setting.
We still want to error out if a non-interactive rebase is started with
`--reschedule-failed-exec`, of course.
Reported by Vas Sudanagunta via:
https://github.com/git/git/commit/969de3ff0e0#commitcomment-33257187
Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>
Carmine Zaccagnino [Sat, 29 Jun 2019 05:49:12 +0000 (07:49 +0200)]
l10n: it.po: remove an extra space
Remove an extra space between the dashes and the start of the
"abort" option.
Signed-off-by: Carmine Zaccagnino <redacted>
Signed-off-by: Alessandro Menti <redacted>
Jeff King [Fri, 28 Jun 2019 06:24:57 +0000 (02:24 -0400)]
blame: drop some unused function parameters
These unused parameters were introduced recently as part of the
br/blame-ignore topic. I assume they are not indicative of bugs, but are
just leftovers from the development process (they were introduced by the
series but not used in any of its iterations).
Signed-off-by: Jeff King <redacted>
Signed-off-by: Junio C Hamano <redacted>
Nguyễn Thái Ngọc Duy [Fri, 28 Jun 2019 09:35:28 +0000 (16:35 +0700)]
t7814: do not generate same commits in different repos
t7814 has repo tree like this
initial-repo
submodule
sub
In each repo 'submodule' and 'sub', a commit is made to add the same
initial file 'a' with the same message 'add a'. If tests run fast
enough, the two commits are made in the same second, resulting in
identical commits.
There is nothing wrong with that per se. But it could make the test
flaky. Currently all submodule odbs are merged back in the main
one (because we can't, or couldn't, access separate submodule repos
otherwise). But eventually we need to access objects from the right
repo.
Because the same commit could sometimes be present in both 'submodule'
and 'sub', if there is a bug looking up objects in the wrong repo,
sometimes it will go unnoticed because it finds the needed object in the
wrong repo anyway.
Fix this by changing commit time after every commit. This makes all
commits unique. Of course there are still identical blobs in different
repos, but because we often look up the commit first, then tree and blob,
unique commits are already quite safe.
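One way to picture "changing commit time after every commit" is a counter that advances the committer and author dates before each commit. The sketch below is modelled on the test suite's test_tick helper, but the name tick, the variable tick_time and the starting value are illustrative assumptions.

```shell
# Hypothetical sketch modelled on test_tick.
tick () {
	if test -z "${tick_time+set}"
	then
		tick_time=1112911993
	else
		# Advance by a minute so that no two commits share a
		# timestamp, however fast the test runs.
		tick_time=$((tick_time + 60))
	fi
	GIT_COMMITTER_DATE="$tick_time -0700"
	GIT_AUTHOR_DATE="$tick_time -0700"
	export GIT_COMMITTER_DATE GIT_AUTHOR_DATE
}

tick && echo "$GIT_COMMITTER_DATE"
tick && echo "$GIT_COMMITTER_DATE"
```

With a distinct date on each commit, two commits with the same tree, parents, author and message still hash to different object IDs.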
Signed-off-by: Nguyễn Thái Ngọc Duy <redacted>
Signed-off-by: Junio C Hamano <redacted>
Ævar Arnfjörð Bjarmason [Thu, 27 Jun 2019 23:39:05 +0000 (01:39 +0200)]
grep: don't use PCRE2?_UTF8 with "log --encoding=<non-utf8>"
Fix a bug introduced in 18547aacf5 ("grep/pcre: support utf-8",
2016-06-25) that was missed due to a blind spot in our tests, as
discussed in the previous commit. I then blindly copied the same bug
in 94da9193a6 ("grep: add support for PCRE v2", 2017-06-01) when
adding the PCRE v2 code.
We should not tell PCRE that we're processing UTF-8 just because we're
dealing with non-ASCII. In the case of e.g. "log --encoding=<...>"
under is_utf8_locale() the haystack might be in ISO-8859-1, and the
needle might be in a non-UTF-8 encoding.
Maybe we should be more strict here and die earlier? Should we also be
converting the needle to the encoding in question, and failing if it's
not a string that's valid in that encoding? Maybe.
But for now matching this as non-UTF8 at least has some hope of
producing sensible results, since we know that our default heuristic
of assuming the text to be matched is in the user locale encoding
isn't true when we've explicitly encoded it to be in a different
encoding.
Signed-off-by: Ævar Arnfjörð Bjarmason <redacted>
Signed-off-by: Junio C Hamano <redacted>
Ævar Arnfjörð Bjarmason [Thu, 27 Jun 2019 23:39:04 +0000 (01:39 +0200)]
log tests: test regex backends in "--encode=<enc>" tests
Improve the tests added in 04deccda11 ("log: re-encode commit messages
before grepping", 2013-02-11) to test the regex backends. Those tests
never worked as advertised, due to the is_fixed() optimization in
grep.c (which was in place at the time), and the needle in the tests
being a fixed string.
We'd thus always use the "fixed" backend during the tests, which would
use the kwset() backend. This backend liberally accepts any garbage
input, so invalid encodings would be silently accepted.
In a follow-up commit we'll fix this bug; this test just demonstrates
the existing issue.
In practice this issue happened on Windows, see [1], but due to the
structure of the existing tests & how liberal the kwset code is about
garbage we missed this.
Cover this blind spot by testing all our regex engines. The PCRE
backend will spot these invalid encodings. It's possible that this
test breaks the "basic" and "extended" backends on some systems that
are stricter than glibc about encoding and locale issues in POSIX
functions, though none that I can remember; PCRE, at least, is more
careful about the validation.
1. https://public-inbox.org/git/nycvar.QRO.7.76.6.1906271113090.44@tvgsbejvaqbjf.bet/
Signed-off-by: Ævar Arnfjörð Bjarmason <redacted>
Signed-off-by: Junio C Hamano <redacted>
Matthew DeVore [Thu, 27 Jun 2019 22:54:14 +0000 (15:54 -0700)]
list-objects-filter-options: make parser void
This function always returns 0, so make it return void instead.
Signed-off-by: Matthew DeVore <redacted>
Signed-off-by: Junio C Hamano <redacted>
Matthew DeVore [Thu, 27 Jun 2019 22:54:13 +0000 (15:54 -0700)]
list-objects-filter-options: clean up use of ALLOC_GROW
Introduce a new macro ALLOC_GROW_BY which automatically zeros the added
array elements and takes care of updating the nr value. Use the macro in
code introduced earlier in this patchset.
Signed-off-by: Matthew DeVore <redacted>
Signed-off-by: Junio C Hamano <redacted>
Matthew DeVore [Thu, 27 Jun 2019 22:54:12 +0000 (15:54 -0700)]
list-objects-filter-options: allow mult. --filter
Allow combining of multiple filters by simply repeating the --filter
flag. Before this patch, the user had to combine them in a single flag
somewhat awkwardly (e.g. --filter=combine:FOO+BAR), including
URL-encoding the individual filters.
To make this work, in the --filter flag parsing callback, rather than
error out when we detect that the filter_options struct is already
populated, we modify it in-place to contain the added sub-filter. The
existing sub-filter becomes the LHS of the combined filter, and the
next sub-filter becomes the RHS. We also have to URL-encode the LHS and
RHS sub-filters.
We can simplify the operation if the LHS is already a combine: filter.
In that case, we just append the URL-encoded RHS sub-filter to the LHS
spec to get the new spec.
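To make the equivalence concrete, here is a small shell sketch that builds the single combine: spec that a series of repeated --filter flags would correspond to. The helper name combine_filters is hypothetical, and the encoding is deliberately simplified: only "%" and "+" are percent-encoded here, whereas the real encoder may escape more characters.

```shell
# Hypothetical sketch: join sub-filter specs into one "combine:" spec.
combine_filters () {
	spec=combine:
	sep=
	for f in "$@"
	do
		# Simplified URL-encoding: "%" first, then "+", so that
		# a literal "+" cannot be confused with the separator.
		enc=$(printf '%s\n' "$f" |
			sed -e 's/%/%25/g' -e 's/+/%2b/g')
		spec=$spec$sep$enc
		sep=+
	done
	printf '%s\n' "$spec"
}

combine_filters blob:none tree:1
# prints combine:blob:none+tree:1
```

Appending one more sub-filter to an existing combine: spec is then just a matter of adding "+" and the encoded sub-filter, which is the shortcut described above for an LHS that is already a combine: filter.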
Helped-by: Emily Shaffer <redacted>
Helped-by: Jeff Hostetler <redacted>
Helped-by: Jeff King <redacted>
Helped-by: Junio C Hamano <redacted>
Signed-off-by: Matthew DeVore <redacted>
Signed-off-by: Junio C Hamano <redacted>
Matthew DeVore [Thu, 27 Jun 2019 22:54:11 +0000 (15:54 -0700)]
strbuf: give URL-encoding API a char predicate fn
Allow callers to specify exactly what characters need to be URL-encoded
and which do not. This new API will be taken advantage of in a patch
later in this set.
Helped-by: Jeff King <redacted>
Signed-off-by: Matthew DeVore <redacted>
Signed-off-by: Junio C Hamano <redacted>
Matthew DeVore [Thu, 27 Jun 2019 22:54:10 +0000 (15:54 -0700)]
list-objects-filter-options: make filter_spec a string_list
Make the filter_spec string a string_list rather than a raw C string.
The list of strings must be concatenated together to make a complete
filter_spec. A future patch will use this capability to build "combine:"
filter specs gradually.
A strbuf would seem to be a more natural choice for this object, but it
unfortunately requires initialization besides just zero'ing out the
memory. This results in all container structs, and all containers of
those structs, etc., to also require initialization. Initializing them
all would be more cumbersome than simply using a string_list, which
behaves properly when its contents are zero'd.
For the purposes of code simplification, change behavior in how filter
specs are conveyed over the protocol: do not normalize the tree:<depth>
filter specs since there should be no server in existence that supports
tree:# but not tree:#k etc.
Helped-by: Junio C Hamano <redacted>
Signed-off-by: Matthew DeVore <redacted>
Signed-off-by: Junio C Hamano <redacted>
Matthew DeVore [Thu, 27 Jun 2019 22:54:09 +0000 (15:54 -0700)]
list-objects-filter-options: move error check up
Move the check that filter_options->choice is set to higher in the call
stack. This can only be set when the gentle parse function is called
from one of the two call sites.
This is important because in an upcoming patch this may or may not be an
error, and whether it is an error is only known to the
parse_list_objects_filter function.
Signed-off-by: Matthew DeVore <redacted>
Signed-off-by: Junio C Hamano <redacted>
Matthew DeVore [Thu, 27 Jun 2019 22:54:08 +0000 (15:54 -0700)]
list-objects-filter: implement composite filters
Allow combining filters such that only objects accepted by all filters
are shown. The motivation for this is to allow getting directory
listings without also fetching blobs. This can be done by combining
blob:none with tree:<depth>. There are massive repositories that have
larger-than-expected trees - even if you include only a single commit.
A combined filter supports any number of subfilters, and is written in
the following form:
combine:<filter 1>+<filter 2>+<filter 3>
Certain non-alphanumeric characters in each filter must be
URL-encoded.
For now, combined filters must be specified in this form. In a
subsequent commit, rev-list will support multiple --filter arguments
which will have the same effect as specifying one filter argument
starting with "combine:". The documentation will be updated in that
commit, as the URL-encoding scheme is in general not meant to be used
directly by the user, and it is better to describe the URL-encoding
feature in terms of the repeated flag.
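As a rough illustration of the combined form described above, the following
sketch joins subfilters with '+' after percent-encoding them. The helper
names (needs_encoding, append_encoded, build_combined_filter) and the exact
set of characters that get encoded are assumptions made for this example;
the commit message only says that "certain non-alphanumeric characters"
must be URL-encoded.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * ASSUMPTION: illustrative reserved set. Everything that is not
 * alphanumeric or one of ":=/-_." is encoded, which includes the '+'
 * separator and '%' itself. Git's real set may differ.
 */
static int needs_encoding(unsigned char c)
{
	if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
	    (c >= '0' && c <= '9'))
		return 0;
	return strchr(":=/-_.", c) == NULL;
}

/* Append src to the NUL-terminated dst; return -1 if dst is too small. */
static int append_encoded(char *dst, size_t size, const char *src)
{
	size_t len = strlen(dst);

	for (; *src; src++) {
		unsigned char c = (unsigned char)*src;

		if (needs_encoding(c)) {
			if (len + 3 >= size)
				return -1;
			snprintf(dst + len, size - len, "%%%02X", c);
			len += 3;
		} else {
			if (len + 1 >= size)
				return -1;
			dst[len++] = (char)c;
			dst[len] = '\0';
		}
	}
	return 0;
}

/* Build "combine:<filter 1>+<filter 2>+..." into dst. */
static int build_combined_filter(char *dst, size_t size,
				 const char **subfilters, size_t nr)
{
	size_t i;

	assert(dst && size > strlen("combine:"));
	strcpy(dst, "combine:");
	for (i = 0; i < nr; i++) {
		if (i) {
			size_t len = strlen(dst);

			if (len + 1 >= size)
				return -1;
			dst[len] = '+';
			dst[len + 1] = '\0';
		}
		if (append_encoded(dst, size, subfilters[i]) < 0)
			return -1;
	}
	return 0;
}
```

With the subfilters "blob:none" and "tree:3", this sketch produces
"combine:blob:none+tree:3". A literal '+' inside a subfilter comes out as
"%2B", so it cannot be mistaken for the separator.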
Helped-by: Emily Shaffer <redacted>
Helped-by: Jeff Hostetler <redacted>
Helped-by: Johannes Schindelin <redacted>
Helped-by: Jonathan Tan <redacted>
Helped-by: Junio C Hamano <redacted>
Signed-off-by: Matthew DeVore <redacted>
Signed-off-by: Junio C Hamano <redacted>
Matthew DeVore [Thu, 27 Jun 2019 22:54:07 +0000 (15:54 -0700)]
list-objects-filter-options: always supply *errbuf
Making errbuf an optional argument complicates error reporting. Fix this
by making all callers supply an errbuf, even if they may ignore it. This
will be important in follow-up patches where the filter-spec parsing has
more pitfalls and possible errors.
Signed-off-by: Matthew DeVore <redacted>
Signed-off-by: Junio C Hamano <redacted>
Matthew DeVore [Thu, 27 Jun 2019 22:54:06 +0000 (15:54 -0700)]
list-objects-filter: put omits set in filter struct
The oidset *omits pointer must be accessed by the combine filter in a
type-agnostic way once the graph traversal is over. Store that pointer
in the general `filter` struct. This will be used in a follow-up patch
to implement the combine filter.
Signed-off-by: Matthew DeVore <redacted>
Signed-off-by: Junio C Hamano <redacted>
Matthew DeVore [Thu, 27 Jun 2019 22:54:05 +0000 (15:54 -0700)]
list-objects-filter: encapsulate filter components
Encapsulate filter_fn, filter_free_fn, and filter_data into their own
opaque struct.
Due to opaqueness, filter_fn and filter_free_fn can no longer be
accessed directly by users. Currently, all usages of filter_fn are
guarded by a necessary check:
(obj->flags & NOT_USER_GIVEN) && filter_fn
Take the opportunity to include this check into the new function
list_objects_filter__filter_object(), so that we no longer need to write
this check at every caller of the filter function.
Also, the init functions in list-objects-filter.c no longer need to
confusingly return the filter constituents in various places (filter_fn
and filter_free_fn as out parameters, and filter_data as the function's
return value); they can just initialize the "struct filter" passed in.
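The shape of this encapsulation can be sketched roughly as follows. This is
a simplified illustration, not the actual code: the struct layout, the
enum, the NOT_USER_GIVEN bit value, and the name sketch_filter_object() are
all assumptions made here; only the guarded check itself is taken from the
text above.

```c
#include <assert.h>
#include <stddef.h>

#define NOT_USER_GIVEN (1u << 0)	/* bit position is illustrative */

struct object {
	unsigned flags;
};

enum filter_result { FILTER_SHOW, FILTER_OMIT };	/* hypothetical */

typedef enum filter_result (*filter_object_fn)(struct object *obj,
					       void *filter_data);

/* Callers only hold a pointer; the members stay private to one file. */
struct filter {
	filter_object_fn filter_fn;
	void (*filter_free_fn)(void *filter_data);
	void *filter_data;
};

/*
 * The check "(obj->flags & NOT_USER_GIVEN) && filter_fn" lives here,
 * so callers no longer repeat it.
 */
static enum filter_result sketch_filter_object(struct filter *filter,
					       struct object *obj)
{
	assert(obj != NULL);
	if (filter && (obj->flags & NOT_USER_GIVEN) && filter->filter_fn)
		return filter->filter_fn(obj, filter->filter_data);
	return FILTER_SHOW;
}

/* Example filter used only to exercise the wrapper. */
static enum filter_result omit_everything(struct object *obj, void *data)
{
	(void)obj;
	(void)data;
	return FILTER_OMIT;
}
```

In this sketch, an object named directly by the user (flag unset) is always
shown, while an object reached through traversal is passed to the filter
function.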
Helped-by: Jeff Hostetler <redacted>
Helped-by: Jonathan Tan <redacted>
Helped-by: Junio C Hamano <redacted>
Signed-off-by: Matthew DeVore <redacted>
Signed-off-by: Junio C Hamano <redacted>
Taylor Blau [Wed, 26 Jun 2019 22:41:48 +0000 (17:41 -0500)]
ref-filter.c: find disjoint pattern prefixes
Since cfe004a5a9 (ref-filter: limit traversal to prefix, 2017-05-22),
the ref-filter code has sought to limit the traversals to a prefix of
the given patterns.
That code stopped short of handling more than one pattern, because it
means invoking 'for_each_ref_in' multiple times. If we're not careful
about which patterns overlap, we will output the same refs multiple
times.
For instance, consider the set of patterns 'refs/heads/a/*',
'refs/heads/a/b/c', and 'refs/tags/v1.0.0'. If we naïvely ran:
for_each_ref_in("refs/heads/a/*", ...);
for_each_ref_in("refs/heads/a/b/c", ...);
for_each_ref_in("refs/tags/v1.0.0", ...);
we would see 'refs/heads/a/b/c' (and everything underneath it) twice.
Instead, we want to partition the patterns into disjoint sets, where we
know that no ref will be matched by any two patterns in different sets.
In the above, these are:
- {'refs/heads/a/*', 'refs/heads/a/b/c'}, and
- {'refs/tags/v1.0.0'}
Given one of these disjoint sets, what is a suitable pattern to pass to
'for_each_ref_in'? One approach is to compute the longest common prefix
over all elements in that disjoint set, and let the caller cull out the
refs they didn't want. Computing the longest prefix means that in most
cases, we won't match too many things the caller would like to ignore.
The longest common prefixes of the above are:
- {'refs/heads/a/*', 'refs/heads/a/b/c'} -> refs/heads/a/*
- {'refs/tags/v1.0.0'} -> refs/tags/v1.0.0
We instead invoke:
for_each_ref_in("refs/heads/a/*", ...);
for_each_ref_in("refs/tags/v1.0.0", ...);
Which provides us with the refs we were looking for with a minimal
amount of extra cruft, but never a duplicate of the ref we asked for.
Implemented here is an algorithm which accomplishes the above, which
works as follows:
1. Lexicographically sort the given list of patterns.
2. Initialize 'prefix' to the empty string, where our goal is to
build each element in the above set of longest common prefixes.
3. Consider each pattern in the given set, and emit 'prefix' if it
reaches the end of a pattern, or touches a wildcard character. The
end of a string is treated as if it precedes a wildcard. (Note that
there is some room for future work to detect that, e.g., 'a?b' and
'abc' are disjoint).
4. Otherwise, recurse on step (3) with the slice of the list
corresponding to our current prefix (i.e., the subset of patterns
that have our prefix as a literal string prefix.)
This algorithm is 'O(kn + n log(n))', where 'k' is max(len(pattern)) for
each pattern in the list, and 'n' is len(patterns).
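The four steps above can be sketched in self-contained C as follows. This
is a minimal illustration under stated assumptions, not the code from this
commit: the fixed-size prefix_list, the function names, and the set of
characters treated as wildcards ('*', '?', '[' and '\') are choices made
for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_PREFIX_LEN 256	/* illustrative limits, not Git's */
#define MAX_PREFIXES 64

struct prefix_list {
	char items[MAX_PREFIXES][MAX_PREFIX_LEN];
	size_t nr;
};

/* ASSUMPTION: the characters this sketch treats as wildcards. */
static int is_glob_char(char c)
{
	return c == '*' || c == '?' || c == '[' || c == '\\';
}

static int cmp_str(const void *a, const void *b)
{
	return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Steps 3 and 4: every pattern in patterns[0..nr) starts with prefix[0..len). */
static void find_prefixes_1(struct prefix_list *out, char *prefix, size_t len,
			    const char **patterns, size_t nr)
{
	size_t i;

	/* Step 3: emit if any pattern ends or hits a wildcard here. */
	for (i = 0; i < nr; i++) {
		char c = patterns[i][len];

		if (c == '\0' || is_glob_char(c)) {
			assert(out->nr < MAX_PREFIXES);
			memcpy(out->items[out->nr], prefix, len);
			out->items[out->nr][len] = '\0';
			out->nr++;
			return;
		}
	}

	/* Step 4: recurse on each run of patterns sharing the next character. */
	i = 0;
	while (i < nr) {
		size_t end = i + 1;

		while (end < nr && patterns[end][len] == patterns[i][len])
			end++;
		assert(len + 1 < MAX_PREFIX_LEN);
		prefix[len] = patterns[i][len];
		find_prefixes_1(out, prefix, len + 1, patterns + i, end - i);
		i = end;
	}
}

/* Steps 1 and 2: sort in place, then start from the empty prefix. */
static void find_prefixes(struct prefix_list *out, const char **patterns,
			  size_t nr)
{
	char prefix[MAX_PREFIX_LEN];

	out->nr = 0;
	if (!nr)
		return;
	qsort(patterns, nr, sizeof(*patterns), cmp_str);
	find_prefixes_1(out, prefix, 0, patterns, nr);
}
```

For the three example patterns above, this sketch yields two prefixes,
'refs/heads/a/' and 'refs/tags/v1.0.0'. Note that it stops just before the
wildcard, so the first prefix is the literal part of 'refs/heads/a/*'.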
By discovering this set of interesting patterns, we reduce the runtime
of multi-pattern 'git for-each-ref' (and other ref traversals) from
O(N) to O(n log(N)), where 'N' is the total number of packed references.
Running 'git for-each-ref refs/tags/a refs/tags/b' on a repository with
10,000,000 refs in 'refs/tags/huge-N', my best-of-five times drop from:
real 0m5.805s
user 0m5.188s
sys 0m0.468s
to:
real 0m0.001s
user 0m0.000s
sys 0m0.000s
On linux.git, the time to dig out two of the latest -rc tags drops from
0.002s to 0.001s, so the change on repositories with fewer tags is much
less noticeable.
Co-authored-by: Jeff King <redacted>
Signed-off-by: Jeff King <redacted>
Signed-off-by: Taylor Blau <redacted>
Signed-off-by: Junio C Hamano <redacted>
SZEDER Gábor [Mon, 24 Jun 2019 18:13:18 +0000 (20:13 +0200)]
progress: use term_clear_line()
To make sure that the previously displayed progress line is completely
covered up when the new line is shorter, commit 545dc345eb (progress:
break too long progress bar lines, 2019-04-12) added a bunch of
calculations to figure out how many characters it needs to overwrite
with spaces.
Use the just introduced term_clear_line() helper function to, well,
clear the last line, making all these calculations unnecessary, and
thus simplifying the code considerably.
Three tests in 't5541-http-push-smart.sh' 'grep' for specific text
shown in the progress lines at the beginning of the line, but now
those lines begin either with the ANSI escape sequence or with the
terminal width worth of space characters clearing the line. Relax the
'grep' patterns to match anywhere on the line. Note that only two of
these three tests fail without relaxing their 'grep' pattern, but the
third looks for the absence of the pattern, so it still succeeds, but
without the adjustment would potentially hide future regressions.
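The two line-clearing strategies mentioned above (an ANSI escape sequence,
or a terminal width worth of spaces) can be sketched as below. This is an
assumption-laden illustration rather than the helper's actual
implementation: format_clear_line() and its 'dumb' and 'columns' parameters
are hypothetical, and the output goes to a buffer so it can be inspected.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write the bytes that would clear the current terminal line into buf.
 * Returns the number of characters produced, like snprintf().
 *
 * dumb != 0: no escape sequences; overwrite with `columns` spaces and
 *            return the cursor to the start of the line.
 * dumb == 0: carriage return followed by ESC [ K (erase to end of line).
 */
static int format_clear_line(char *buf, size_t size, int dumb, int columns)
{
	assert(buf && size > 0 && columns >= 0);
	if (dumb)
		return snprintf(buf, size, "\r%*s\r", columns, "");
	return snprintf(buf, size, "\r\033[K");
}
```

Either way the cleared line now starts with control or space characters,
which is why a 'grep' anchored at the beginning of the line stops matching.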
Note also that with this change we no longer need the length of the
previously displayed progress line, so the strbuf added to 'struct
progress' in d53ba841d4 (progress: assemble percentage and counters in
a strbuf before printing, 2019-04-05) is not strictly necessary
anymore. We still keep it, though, as it avoids allocating and
releasing a strbuf each time the progress is updated.
Signed-off-by: SZEDER Gábor <redacted>
Signed-off-by: Junio C Hamano <redacted>
SZEDER Gábor [Thu, 27 Jun 2019 13:42:48 +0000 (15:42 +0200)]
rebase: fix garbled progress display with '-x'
When running a command with the 'exec' instruction during an
interactive rebase session, or for a range of commits using 'git
rebase -x', the output can be a bit garbled when the name of the
command is short enough:
$ git rebase -x true HEAD~5
Executing: true
Executing: true
Executing: true
Executing: true
Executing: true)
Successfully rebased and updated refs/heads/master.
Note the ')' at the end of the last line. It gets more garbled as the
range of commits increases:
$ git rebase -x true HEAD~50
Executing: true)
[ repeated 3 more times ]
Executing: true0)
[ repeated 44 more times ]
Executing: true00)
Successfully rebased and updated refs/heads/master.
Those extra numbers and ')' are remnants of the previously displayed
"Rebasing (N/M)" progress lines that are usually completely
overwritten by the "Executing: <cmd>" lines, unless 'cmd' is short and
the "N/M" part is long.
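How such remnants arise can be reproduced with a small simulation of a
terminal line being overwritten after a carriage return. The function
overwrite_line() is hypothetical, and the progress strings fed to it are
assumed values chosen to be consistent with the garbled output shown above.

```c
#include <assert.h>
#include <string.h>

/*
 * Simulate printing `next` on top of `prev` after a '\r': characters of
 * `prev` beyond the length of `next` remain visible.
 */
static void overwrite_line(const char *prev, const char *next,
			   char *out, size_t size)
{
	size_t plen = strlen(prev), nlen = strlen(next);
	size_t len = plen > nlen ? plen : nlen;

	assert(len < size);
	memcpy(out, prev, plen);	/* what was on screen */
	memcpy(out, next, nlen);	/* new text overwrites from column 0 */
	out[len] = '\0';
}
```

Overwriting "Rebasing (100/100)" with "Executing: true" leaves
"Executing: true00)" in this simulation, and overwriting "Rebasing (10/10)"
leaves "Executing: true)".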
Make sure that the previously displayed "Rebasing (N/M)" line is
cleared by using the term_clear_line() helper function added in the
previous patch. Do so only when not being '--verbose', because in
that case these "Rebasing (N/M)" lines are not printed as progress
(i.e. as lines with '\r' at the end), but as "regular" output (with
'\n' at the end).
A couple of other rebase commands print similar messages, e.g.
"Stopped at <abbrev-oid>... <subject>" for the 'edit' or 'break'
commands, or the "Successfully rebased and updated <full-ref>." at the
very end. These are so long that they practically always overwrite
that "Rebasing (N/M)" progress line, but let's be prudent, and clear
the last line before printing these, too.
In 't3420-rebase-autostash.sh' two helper functions prepare the
expected output of four tests that check the full output of 'git
rebase' and thus are affected by this change, so adjust their
expectations to account for the new line clearing.
Note that this patch doesn't completely eliminate the possibility of
similar garbled outputs, e.g. some error messages from rebase or the
"Auto-merging <file>" message from within the depths of the merge
machinery might not be long enough to completely cover the last
"Rebasing (N/M)" line. This patch doesn't do anything about them,
because dealing with them individually would result in way too much
churn, while having a catch-all term_clear_line() call in the common
code path of pick_commits() would hide the "Rebasing (N/M)" line way
too soon, and it would either flicker or be invisible.
Signed-off-by: SZEDER Gábor <redacted>
Signed-off-by: Junio C Hamano <redacted>
Johannes Schindelin [Thu, 27 Jun 2019 09:37:19 +0000 (02:37 -0700)]
mingw: use Unicode functions explicitly
Many Win32 API functions actually exist in two variants: one with
the `A` suffix that takes ANSI parameters (`char *` or `const char *`)
and one with the `W` suffix that takes Unicode parameters (`wchar_t *`
or `const wchar_t *`).
The ANSI variant assumes that the strings are encoded according to
whatever is the current locale. This is not what Git wants to use on
Windows: we assume that `char *` variables point to strings encoded in
UTF-8.
There is a pseudo UTF-8 locale on Windows, but it does not work
as one might expect. In addition, if we overrode the user's locale, that
would modify the behavior of programs spawned by Git (such as editors,
difftools, etc), therefore we cannot use that pseudo locale.
Further, it is actually highly encouraged to use the Unicode versions
instead of the ANSI versions, so let's do precisely that.
Note: when calling the Win32 API functions _without_ any suffix, it
depends whether the `UNICODE` constant is defined before the relevant
headers are #include'd. Without that constant, the ANSI variants are
used. Let's be explicit and avoid that ambiguity.
Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>
Johannes Schindelin [Thu, 27 Jun 2019 09:37:18 +0000 (02:37 -0700)]
mingw: get pw_name in UTF-8 format
Previously, we would have obtained the user name encoded in whatever the
current code page is.
Note: the "user name" here does not denote the full name but instead the
short logon name.
Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>
Cesar Eduardo Barros [Thu, 27 Jun 2019 08:49:33 +0000 (01:49 -0700)]
mingw: embed a manifest to trick UAC into Doing The Right Thing
On Windows >= Vista, not having an application manifest with a
requestedExecutionLevel can cause several kinds of confusing behavior.
The first and more obvious behavior is "Installer Detection" of the
"User Account Control" (also known as "UAC") feature, where Windows
sometimes decides (by looking at things like the file name and even
sequences of bytes within the executable) that an executable is an
installer and should run elevated (causing the well-known popup dialog
to appear). In Git's context, subcommands such as "git patch-id" or "git
update-index" fall prey to this behavior.
The second and more confusing behavior is "File Virtualization". It
means that when files are written without having write permission, it
does not fail (as expected), but they are instead redirected to
somewhere else. When the files are read, the original contents are
returned, though, not the ones that were just written somewhere else.
Even more confusing, not all write accesses are redirected; trying to
write to write-protected .exe files, for example, will fail instead of
redirecting.
In addition to being unwanted behavior, File Virtualization causes
dramatic slowdowns in Git (see for instance
http://code.google.com/p/msysgit/issues/detail?id=320).
A third unwanted behavior of Windows >= Vista is that it lies about the
Windows version when calling `GetWindowsVersionEx()`.
There are two ways to prevent these unwanted behaviors: Either you embed
an application manifest (which really is an XML document conforming to a
specific schema) within all your executables, or you add an external
manifest (a file with the same name followed by `.manifest`) to all your
executables. Since Git's builtins are hardlinked (or copied), it is
simpler and more robust to embed a manifest.
Recent enough MSVC compilers already embed a working internal manifest,
and building with mingw-w64 (which is the case in Git for Windows' SDK)
does it, too, but for MinGW you have to do so by hand.
In any case, it is better to be explicit about this manifest, that way
changes in the compiler toolchain won't surprise us (as mingw-w64 once
did when it broke `GetWindowsVersionEx()` by mistake).
References:
- New UAC Technologies for Windows Vista
http://msdn.microsoft.com/en-us/library/bb756960.aspx
- Create and Embed an Application Manifest (UAC)
http://msdn.microsoft.com/en-us/library/bb756929.aspx
Signed-off-by: Cesar Eduardo Barros <redacted>
Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>
Johannes Schindelin [Thu, 27 Jun 2019 09:29:02 +0000 (02:29 -0700)]
mingw: enable stack smashing protector
To reduce Git for Windows' attack surface, we started using the Address
Space Layout Randomization and Data Execution Prevention features in
ce6a158561f9 (mingw: enable DEP and ASLR, 2019-05-08).
To remove yet another attack vector, let's make use of gcc's stack
smashing protector that helps detect stack buffer overruns early.
Rather than using -fstack-protector, we use -fstack-protector-strong
because, on Windows, the latter appears to strike a better balance
between the performance impact and the provided safety.
In a non-scientific test (time git log --grep=is -p), best of 5 timings
went from 23.009s to 22.997s, i.e. the performance impact was *well*
lost in the noise.
This fixes https://github.com/git-for-windows/git/issues/501
Signed-off-by: Johannes Schindelin <redacted>
Signed-off-by: Junio C Hamano <redacted>
Nguyễn Thái Ngọc Duy [Thu, 27 Jun 2019 09:28:52 +0000 (16:28 +0700)]
Use the right 'struct repository' instead of the_repository
There are a couple of places where 'struct repository' is already passed
around, but the_repository is still used. Use the right repo.
Signed-off-by: Nguyễn Thái Ngọc Duy <redacted>
Signed-off-by: Junio C Hamano <redacted>