updates with contents generation were super syscall-heavy. for each path
in a package (so at least 2-4, but ordinarily >4) we'd do a db.Put in
ContentsIndex which results in one syscall.Write. so, for every package in
a published repo we'd have to do *at least* 2 but ordinarily >4 syscalls.
this gets abysmally slow very quickly depending on the available system
specs.
instead, start a batch inside each package and finish it when we are done
with the package. this should keep the memory footprint negligible, but
reduce the write() calls from N to 1.
on one of KDE's servers I have seen update publishing of 7600 packages go
from ~28s to ~9s when using batch putting on an HDD.
on my local system the same set of packages go from ~14s to ~6s on an SSD.
(all inodes in cache in both cases)
- new publish calls can now enable AcquireByHash by right away (previously
one would have had to create a new publishing endpoint and then
explicitly switch it to AcquireByHash)
- all json marshals of PublishedRepo now contain AcquireByHash (allows
inspecting if a given endpoint has AcquireByHash enabled already; also
enables verification that a switch/update actually applied a
potential AcquireByHash change
- update all tests to reflect that default state of AcquireByHash
- update creation and switch testing to explicitly toggle AcquireByHash to
make sure state mutation works as expected
The added "aptly publish repo" option "-access-by-hash" publishes
the index files (Packages*, Sources*) also as hardlinked hashes.
Example:
/dists/yakkety/main/binary-amd64/by-hash/SHA512/31833ec39acc...
The Release files indicate this with the option "Acquire-By-Hash: yes"
This is used by apt >= 1.2.0 and prevents the "Hash sum mismatch" race
condition between a server side "aptly publish repo" and "apt-get update"
on a client.
See: http://www.chiark.greenend.org.uk/~cjwatson/blog/no-more-hash-sum-mismatch-errors.html
This implementation uses symlinks in the by-hash/*/ directory for keeping
only two versions of the index files and deleting older files
automatically.
Note: this only works with aptly.FileSystemPublishedStorage
Closes: #536
Signed-off-by: André Roth <neolynx@gmail.com>
Allow skipping unreferenced files cleanup on publish switch/update/drop
via the -skip-cleanup command line option.
Also support API SkipCleanup parameter.
Fixes#570.
NB: Go `defer` order execution is reverse to the order `defer` statements
are executed.
So before the change, `Drop()` was called before `Close()`, which was no-op.
Change that to explicit order in single func, print errors if they happen.
This replaces `panic` which aborts aptly execution with warning
message on console. So aptly continues publishing actions, but
`Contents` indexes might be incomplete.
Error will be printed every time contents generation is triggered.
Debian's installer validates a mirror by downloading a Release,
and then cross-checking it based on its Codename and Suite. Without
a Suite field, the installer becomes unhappy (e.g. segfaults) and
won't continue the install.
Making the Codename and Suite the same validates with no problem.