Tested with a port with 5500 output files, of which 53 were candidates
for stripping. The timing was:
- 30 secs (old code)
- 0.2 secs (new code - current patch)
- 1.4 secs (new code with proper quoting - not committed)
Most of the time is spent getting the output of the file program.
The old code started the file program once for every file.
The new/present code starts N=$(nproc) processes in parallel with 10 input
files for each 'file' process. The output of the file program is fed
to an awk process which keeps only the candidates for stripping.
The strip processes run in parallel too (but with one file per strip process).
The --no-buffer option is used because it sounds good (strip should
start as soon as one of the file processes has a verdict for one of
its 10 files), but I didn't measure it.
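The pipeline described above can be sketched roughly as follows. This is
only an illustration, not the code from the patch: the function names
strip_candidates and strip_tree, the '>' separator and the awk patterns
are assumptions made for the example, and file names containing spaces
or '>' are not handled.

```shell
# Hypothetical helper: read `file` output ("name> description") on stdin
# and print one line of strip arguments per candidate.
strip_candidates() {
    awk -F'>[ ]+' '
        /ELF.*executable.*not stripped/    { print "--strip-all " $1; next }
        /ELF.*shared object.*not stripped/ { print "--strip-unneeded " $1; next }
        $2 == "current ar archive"         { print "--strip-debug " $1 }
    '
}

# Hypothetical driver: N parallel `file` processes with 10 names each,
# then N parallel strip processes with one candidate each.
strip_tree() {
    local N
    N=$(nproc)
    find "$1" -type f \
        | xargs -r -L10 -P"$N" file --no-buffer --separator '>' \
        | strip_candidates \
        | xargs -r -L1 -P"$N" strip
}
```

Under these assumptions, something like strip_tree "$PKG" would process a
whole package tree.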
The "xargs -r -L10 -P$N" command will mishandle file names containing spaces.
For a file named "a b" it will spawn:
file "a" "b"
A slower version, with proper quoting, "xargs -r -L1 -P$N -I{} file ... '{}'",
will spawn:
file "a b"
* xargs will force -L1 if -I{} is used
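The difference between the two xargs invocations can be reproduced
without touching any real files. In this small sketch printf stands in
for the file program (an assumption made only for the demonstration) and
prints every argument it receives in brackets.

```shell
# -L10: the input line "a b" is split on the blank into two arguments.
split=$(printf '%s\n' 'a b' | xargs -r -L10 printf '[%s]\n')

# -I{}: the whole input line replaces {} and stays a single argument.
whole=$(printf '%s\n' 'a b' | xargs -r -I{} printf '[%s]\n' '{}')

echo "$split"   # two arguments: [a] and [b], one per line
echo "$whole"   # one argument:  [a b]
```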
Given that the file program doesn't return an error code for non-existent
files, and that there is a very low probability that we have ports with
file names containing spaces that are worth stripping, I chose to
keep the faster (imperfect) version.
The build_needed() function returns true/yes if a source is missing,
even when the target/package exists and is up to date. This
behaviour triggers unnecessary rebuilds.
Because only the remote sources can be missing and
we don't want to rebuild a port just because we've deleted
some of its remote sources, this patch changes that condition
from:
( the source is missing OR is newer than the target/package )
to
( the source exists AND is newer than the target/package )
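The changed condition can be sketched as below. This is a simplified
illustration and not the actual build_needed() body: the function name
needs_rebuild, its arguments and its return-code convention are
assumptions made for the example.

```shell
# Hypothetical check: succeed (0) when a rebuild is needed.
# $1 is the target/package, the remaining arguments are the sources.
needs_rebuild() {
    local target src
    target=$1
    shift
    [ -f "$target" ] || return 0    # no package yet: always build
    for src in "$@"; do
        # old condition: [ ! -e "$src" ] || [ "$src" -nt "$target" ]
        if [ -e "$src" ] && [ "$src" -nt "$target" ]; then
            return 0                # an existing source is newer
        fi
    done
    return 1                        # missing sources no longer count
}
```

With this form a deleted remote source is simply skipped, while a source
that exists and is newer than the package still triggers a rebuild.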
This patch aborts the package installation and removes the package
from the database on any extraction error. This fixes FS#620.
I don't know why the extraction errors were ignored (this was even documented).
The initial commit already had this behaviour. Another odd thing
is that the install status of the package was committed to the database
before it was installed, even though pkg_install() throws exceptions on:
- archive open error
- empty archive
- archive read error
Any of these errors will falsely mark the package as installed (maybe
not a big problem with upgrades).
To avoid breaking something (else), this fix kicks in
only with fresh installs (not upgrades).
Thanks to Erich Eckner for pointing out that the issue
is reproducible with a read-only destination. Otherwise,
I don't think I would have looked at this issue.
A test case is attached to the FS issue, just in case someone
who loves C++ enough to dig deeper, or who knows more about
the history of this program, wants to take a closer look.
This is an artifact from make_md5sum() and is NOT needed for make_signature().
In fact, this check breaks make_signature() because now packages that don't
have a $source don't get any .signature at all, even though it can
(and should) create a sha256 for the footprint and Pkgfile.
Patch by Camille (onodera).
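The intended behaviour can be illustrated with a minimal sketch. It is
not the real make_signature(): the function name signature_checksums is
hypothetical, the signing step is left out, and the function is assumed
to run in the port directory. The point is only that the Pkgfile and the
footprint are hashed unconditionally, with the sources as optional extra
arguments.

```shell
# Hypothetical helper: print the sha256 lines that would go into .signature.
# Pkgfile and .footprint are always included; "$@" holds the local source
# files and may be empty for ports without a $source.
signature_checksums() {
    sha256sum --tag Pkgfile .footprint "$@"
}
```

Called without arguments in a port directory, this still produces the two
checksum lines for Pkgfile and .footprint.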