On hosts that have wildcard DNS domains in their local domain search
list, builds failed because "nosuch.host" could actually be resolved.
Since ".host" is not one of the TLDs reserved by RFC 2606, we now use
".invalid" instead. Because that alone is not enough to fix the problem,
we also use absolute domain names (with a trailing '.').
The previous reflist logic would early-exit the loop body if one of the
lists was empty, but that skips the compacting logic entirely.
Instead of doing the early-exit, we can leave a list's ref as nil when
the list end is reached and then flip the comparison result, which will
essentially treat it as being greater than all others. This should
preserve the general behavior without omitting the compaction.
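
A minimal sketch of the idea, not the actual reflist code: two sorted ref
lists are merged, an exhausted list yields a nil ref, and the comparison
treats nil as greater than any real ref. The names compareRefs and
mergeSorted are hypothetical, and the duplicate-dropping step is only a
stand-in for the real compaction logic.

```go
package main

import (
	"bytes"
	"fmt"
)

// compareRefs orders two refs. A nil ref marks an exhausted list and is
// treated as greater than any real ref, so the other list keeps draining.
func compareRefs(l, r []byte) int {
	switch {
	case l == nil && r == nil:
		return 0
	case l == nil:
		return 1
	case r == nil:
		return -1
	}
	return bytes.Compare(l, r)
}

// mergeSorted merges two sorted ref lists. Dropping adjacent duplicates is
// an assumed stand-in for the compaction step; it runs for every element,
// including the tail of the longer list.
func mergeSorted(left, right [][]byte) [][]byte {
	out := make([][]byte, 0, len(left)+len(right))
	il, ir := 0, 0
	for il < len(left) || ir < len(right) {
		var l, r []byte // stay nil once the list end is reached
		if il < len(left) {
			l = left[il]
		}
		if ir < len(right) {
			r = right[ir]
		}

		var next []byte
		c := compareRefs(l, r)
		switch {
		case c < 0:
			next = l
			il++
		case c > 0:
			next = r
			ir++
		default:
			next = l
			il++
			ir++
		}

		if n := len(out); n == 0 || !bytes.Equal(out[n-1], next) {
			out = append(out, next)
		}
	}
	return out
}

func main() {
	left := [][]byte{[]byte("a"), []byte("b"), []byte("b")}
	merged := mergeSorted(left, nil) // right list is empty
	fmt.Printf("%s\n", merged)
}
```

With an early exit on the empty right list, the duplicate "b" in the
example would have been returned untouched.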
Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
The output doesn't actually depend on the reflists, and loading them for
every published repo starts to take substantial time and memory.
Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
Reflists are basically stored as arrays of strings, which are quite
space-efficient in MessagePack. Thus, using zero-copy decoding results
in nice performance and memory savings, because the overhead of separate
allocations ends up far exceeding the overhead of the original slice.
With the included benchmark run for 20s with -benchmem, the runtime,
memory usage, and allocations go from ~740us/op, ~192KiB/op, and 4100
allocs/op to ~240us/op, ~97KiB/op, and 13 allocs/op, respectively.
Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
The cleanup phase needs to list out all the files in each component in
order to determine what's still in use. When there's a large number of
sources (e.g. from having many snapshots), the time spent just loading
the package information becomes substantial. However, in many cases,
most of the packages being loaded are actually shared across the
sources; if you're taking frequent snapshots, for instance, most of the
packages in each snapshot will be the same as other snapshots. In these
cases, re-reading the packages repeatedly is just a waste of time.
To improve this, we maintain a list of refs that we know were processed
for each component. When listing the refs from a source, only the ones
that have not yet been processed will be examined. Some tests were also
added specifically to check listing the files in a component.
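
A rough sketch of the approach, assuming refs are plain strings and using
a map as the "already processed" set (the real change may well track this
differently). listComponentFiles and newCountingLoader are hypothetical
names introduced only for this illustration.

```go
package main

import "fmt"

// newCountingLoader returns a hypothetical package loader backed by an
// in-memory map, plus a counter of how many loads were performed.
func newCountingLoader(files map[string][]string) (func(ref string) []string, *int) {
	loads := 0
	load := func(ref string) []string {
		loads++
		return files[ref]
	}
	return load, &loads
}

// listComponentFiles collects the files of every ref in every source of a
// component, loading each distinct ref at most once.
func listComponentFiles(sources [][]string, load func(ref string) []string) []string {
	processed := make(map[string]struct{})
	var files []string
	for _, refs := range sources {
		for _, ref := range refs {
			if _, done := processed[ref]; done {
				continue // shared with an earlier source, skip the reload
			}
			processed[ref] = struct{}{}
			files = append(files, load(ref)...)
		}
	}
	return files
}

func main() {
	load, loads := newCountingLoader(map[string][]string{
		"pkg-a": {"a.deb"},
		"pkg-b": {"b.deb"},
		"pkg-c": {"c.deb"},
	})
	// Two snapshots that share most of their packages.
	sources := [][]string{
		{"pkg-a", "pkg-b"},
		{"pkg-a", "pkg-b", "pkg-c"},
	}
	files := listComponentFiles(sources, load)
	fmt.Println(files, *loads)
}
```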
With this change, listing the files in components on a copy of our
production database went from >10 minutes to ~10 seconds, and the newly
added benchmark went from ~300ms to ~43ms.
Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
When merging reflists with ignoreConflicting set to true and
overrideMatching set to false, the individual ref components are never
examined, but the refs are still split anyway. Avoiding the split when
we never use the components brings a massive speedup: on my system, the
included benchmark goes from ~1500 us/it to ~180 us/it.
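
To illustrate the fast path, here is a simplified sketch. It assumes refs
are space-separated strings whose first field is a package name, which is
an assumption made for the example and not a statement about the real ref
layout. mergeRefs is a hypothetical helper, and conflict detection is
omitted entirely.

```go
package main

import (
	"fmt"
	"strings"
)

// mergeRefs appends right to left and reports how many refs were split.
// With overrideMatching set, a left ref is dropped when a right ref has
// the same first field (a hypothetical "name" component).
func mergeRefs(left, right []string, ignoreConflicting, overrideMatching bool) ([]string, int) {
	// Fast path: with this flag combination no component is ever
	// inspected, so the refs are never split.
	if ignoreConflicting && !overrideMatching {
		merged := append([]string{}, left...)
		return append(merged, right...), 0
	}

	splits := 0
	name := func(ref string) string {
		splits++
		return strings.SplitN(ref, " ", 2)[0]
	}

	overridden := make(map[string]bool)
	for _, ref := range right {
		overridden[name(ref)] = true
	}

	merged := []string{}
	for _, ref := range left {
		if overrideMatching && overridden[name(ref)] {
			continue // replaced by the matching ref from the right list
		}
		merged = append(merged, ref)
	}
	return append(merged, right...), splits
}

func main() {
	left := []string{"a 1.0", "b 1.0"}
	right := []string{"a 2.0"}

	merged, splits := mergeRefs(left, right, true, false)
	fmt.Println(merged, splits)

	merged, splits = mergeRefs(left, right, true, true)
	fmt.Println(merged, splits)
}
```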
Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
In some local tests with a slowed-down filesystem, this cut the time to
clean up a repository by ~3x, bringing the total 'publish update' time
from ~16s to ~13s.
Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
- use s3 mirror instead of internet download
- reduce download verbosity
- do not use venv in docker-system-tests
- be more verbose on test output
- do not run golangci-lint in system-tests
Ubuntu started deprecating the Debian installer in focal and moved the
installer images to a different path. In versions after focal, they are
completely removed. This basically gives us more time to figure out how
to use the new system.
When publishing under a publish prefix, instead of listing the contents
of the whole bucket under the storage prefix, only list the contents of
the bucket under the storage prefix and publish prefix, and cache the
listing by publish prefix.
This speeds up publish operations under a prefix.
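
A minimal sketch of a per-prefix cache, assuming a generic listing
callback rather than any real S3 client API. pathCache, newPathCache and
newFakeBucket are hypothetical names used only for this illustration.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// newFakeBucket returns a hypothetical listing function over in-memory
// keys, plus a counter of how many listings were requested.
func newFakeBucket(keys []string) (func(prefix string) []string, *int) {
	calls := 0
	list := func(prefix string) []string {
		calls++
		var out []string
		for _, key := range keys {
			if strings.HasPrefix(key, prefix+"/") {
				out = append(out, key)
			}
		}
		return out
	}
	return list, &calls
}

// pathCache caches bucket listings keyed by publish prefix.
type pathCache struct {
	storagePrefix   string
	list            func(prefix string) []string
	byPublishPrefix map[string]map[string]bool
}

func newPathCache(storagePrefix string, list func(prefix string) []string) *pathCache {
	return &pathCache{
		storagePrefix:   storagePrefix,
		list:            list,
		byPublishPrefix: make(map[string]map[string]bool),
	}
}

// exists lists only storagePrefix/publishPrefix on first use, then
// answers later lookups for that publish prefix from the cache.
func (c *pathCache) exists(publishPrefix, relPath string) bool {
	entries, ok := c.byPublishPrefix[publishPrefix]
	if !ok {
		entries = make(map[string]bool)
		for _, key := range c.list(path.Join(c.storagePrefix, publishPrefix)) {
			entries[key] = true
		}
		c.byPublishPrefix[publishPrefix] = entries
	}
	return entries[path.Join(c.storagePrefix, publishPrefix, relPath)]
}

func main() {
	list, calls := newFakeBucket([]string{
		"repo/debian/dists/Release",
		"repo/ubuntu/dists/Release",
	})
	cache := newPathCache("repo", list)
	fmt.Println(cache.exists("debian", "dists/Release"))
	fmt.Println(cache.exists("debian", "dists/InRelease"))
	fmt.Println(*calls)
}
```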
Instead of caching the whole S3 bucket, cache only the pool path. This
requires an additional parameter, and since this is an interface, all
implementations need to follow. It might help in other backends too.
Closes #1181