Commit Graph

2198 Commits

Author SHA1 Message Date
Ryan Gonzalez
f9325fbc91 Add support for Azure package pools
This adds support for storing packages directly on Azure, with no truly
"local" (on-disk) repo used. The existing Azure PublishedStorage
implementation was refactored to move the shared code to a separate
context struct, which can then be re-used by the new PackagePool. In
addition, the files package's mockChecksumStorage was made public so
that it could be used in the Azure PackagePool tests as well.

Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
2024-06-17 11:51:18 +02:00
Ryan Gonzalez
810df17009 Clean up temporary files when mirroring
Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
2024-06-17 11:51:18 +02:00
Ryan Gonzalez
19255debb9 Reduce required usage of LocalPackagePool
Several sections of the code *required* a LocalPackagePool, but they
could still perform their operations with a standard PackagePool.

Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
2024-06-17 11:51:18 +02:00
Ryan Gonzalez
1ebd37f9ad Move Stat() into LocalPackagePool
The contents of `os.Stat` are rather fitted towards local package pools,
but the method is in the generic PackagePool interface. This moves it to
LocalPackagePool, and the use case of simply finding a file's size is
delegated to a new, more generic PackagePool.Size() method.

Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
2024-06-17 11:51:18 +02:00
Ryan Gonzalez
8e37813129 Add support for custom package pool locations
Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
2024-06-17 11:51:18 +02:00
Ryan Gonzalez
91574b53d9 Add functional tests for Azure publishing
Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
2024-06-17 11:51:18 +02:00
Ryan Gonzalez
b8e0aba3cf Document Azure configuration
Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
2024-06-17 11:51:18 +02:00
Ryan Gonzalez
0eccaf2b60 Change Azure endpoint syntax to require a full URL
Before, a "partial" URL (either "localhost:port" or an endpoint URL
*without* the account name as the subdomain) would be specified, and the
full one would automatically be inferred. Although this is somewhat
nice, it means that the endpoint string doesn't match the official Azure
syntax:

https://docs.microsoft.com/en-us/azure/storage/common/storage-configure-connection-string

This also raises issues for the creation of functional tests for Azure,
as the code to determine the endpoint string needs to be duplicated
there as well.

Instead, it's just easiest to follow Azure's own standard, and then
sidestep the need for any custom logic in the functional tests.

Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
2024-06-17 11:51:18 +02:00
Ryan Gonzalez
859edb10cd Fix S3 tests on Python 3
read_path() can read in binary, which the S3 tests don't support (simply
because they don't need it)...but it needs to be able to take the `mode`
argument anyway.

Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
2024-06-17 11:51:18 +02:00
André Roth
f0bf519d36 cleanup 2024-06-15 19:18:14 +02:00
André Roth
d8cfafccc9 system tests: improve log output 2024-06-15 19:18:14 +02:00
André Roth
34c1c1f83a system-tests: make docker abortable and use colors 2024-06-15 19:18:14 +02:00
André Roth
3e1485faf5 queue sync calls 2024-06-15 19:18:14 +02:00
André Roth
efc5573b8e Makefile: run unit tests in docker 2024-06-15 19:18:14 +02:00
André Roth
45035802be implement task queue waiting for resources 2024-06-15 19:18:14 +02:00
André Roth
2d97ba2bbd fix flake8 error 2024-06-15 19:18:14 +02:00
André Roth
05fed16f6d fix golangci-lint error 2024-06-15 19:18:14 +02:00
Ramón N.Rodriguez
9b1f4272c0 AUTHORS: claiming the fame and glory 2024-06-15 19:18:14 +02:00
Ramón N.Rodriguez
1987220f1e api: publish: block on concurrent calls
This commit blocks concurrent calls to RunTaskInBackground which is
intended to fix the quirky behaviour where concurrent PUT calls to
api/publish/<prefix>/<distribution> would immedietly reuturn an error.

The solution proposed in this commit is not elegant and probaly has
unintended side-effects. The intention of this commit is to highlight
the area that actually needs to be addressed.
Ideally this patch is amended or dropped entierly in favor of a better
fixup.
2024-06-15 19:18:14 +02:00
Ramón N.Rodriguez
47696a3303 api:publish: test concurrency
This commit introduces a test which runs concurrent publishes (which
could be parallell with multiproccessing, python is fun).
The test purposly fails (at the point in history that this patch is
written) in order to make it as easy as possible to verify later patches,
which hopefully addresses concurrency problems.

The same behaviour can easily be tested outside of the system tests with
the following (or similar) shell

$ aptly serve -listen=:8080 -no-lock
$ aptly repo create create -distributions=testing local-repo
$ atply publish repo -architectures=amd64 local-repo
$ apt download aptly
$ aptly repo add local-repo ./aptly*.deb
$ for _ in $(seq 10); do curl -X PUT 127.0.0.1:8080/api/publish//testing

In the local testing perfomed (on a dual core vm) the first 1-4 jobs
would typically succeed and the rest would error out.
2024-06-15 19:18:14 +02:00
André Roth
787f954833 mirror: do not download already downloaded packages
this change imports downloaded packages into the pool immediately after download.
in case mirroring is aborted and later resumed, already downloaded packages will not be downloaded anymore.
2024-06-15 16:15:23 +02:00
André Roth
e9bdb983c8 tasks: improve log level 2024-06-15 16:15:23 +02:00
Noa Resare
b4cd86aa14 Introduce option multi-dist to the publish commands
This change makes it possible to publish multiple distributions
with packages named the same but with different content by changing
structure of the generated pool hierarchy. The option not enabled
by default as this changes the structure of the output which could
break the expectations of other tools.
2024-06-15 11:27:26 +02:00
Cal Jurgella
4bd26f5977 Enable SkipArchitectureCheck and IgnoreSignatures in mirror API 2024-06-14 14:30:29 +02:00
André Roth
57ff7c69cd mirror api: add test for SkipArchitectureCheck and SkipComponentCheck 2024-06-14 14:30:29 +02:00
André Roth
c843709eab add github sponsors 2024-06-12 13:01:40 +02:00
Jens Reinsberger
4bc2180eed fix failing build on hosts with wildcard dns
on hosts which have wildcard dns domains in their local domain search
list, builds failed because "nosuch.host" could actually be resolved.

Since ".host" isn't a recommended TLD by RFC2606, we use ".invalid" now.
And since this is not enough to fix the problem, we use now absoulte
domain names (having a '.' at the end)
2024-05-15 16:42:37 +02:00
Paul Cacheux
5353890b24 use tagged version of ProtonMail/go-crypto in go.mod 2024-05-15 16:41:35 +02:00
Ryan Gonzalez
79975bf2b6 Fix reflist diffs failing to compact when one of the inputs ends
The previous reflist logic would early-exit the loop body if one of the
lists was empty, but that skips the compacting logic entirely.

Instead of doing the early-exit, we can leave a list's ref as nil when
the list end is reached and then flip the comparison result, which will
essentially treat it as being greater than all others. This should
preserve the general behavior without omitting the compaction.

Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
2024-04-24 17:36:36 +02:00
Ryan Gonzalez
8d09c202db Skip loading reflists when listing published repos
The output doesn't actually depend on the reflists, and loading them for
every published repo starts to take substantial time and memory.

Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
2024-04-24 17:35:44 +02:00
André Roth
27013c0b2b apply go mod tidy 2024-04-24 16:46:16 +02:00
André Roth
756c499314 fix go mod tidy and use go 1.19
let's be compatible with debian/bookworm
2024-04-24 16:46:16 +02:00
Ryan Gonzalez
6c6d1b18ba Use zero-copy decoding for reflists
Reflists are basically stored as arrays of strings, which are quite
space-efficient in MessagePack. Thus, using zero-copy decoding results
in nice performance and memory savings, because the overhead of separate
allocations ends up far exceeding the overhead of the original slice.

With the included benchmark run for 20s with -benchmem, the runtime,
memory usage, and allocations go from ~740us/op, ~192KiB/op, and 4100
allocs/op to ~240us/op, ~97KiB/op, and 13 allocs/op, respectively.

Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
2024-04-24 16:46:16 +02:00
Ryan Gonzalez
8cb1236a8c Improve publish cleanup perf when sources share most of their packages
The cleanup phase needs to list out all the files in each component in
order to determine what's still in use. When there's a large number of
sources (e.g. from having many snapshots), the time spent just loading
the package information becomes substantial. However, in many cases,
most of the packages being loaded are actually shared across the
sources; if you're taking frequent snapshots, for instance, most of the
packages in each snapshot will be the same as other snapshots. In these
cases, re-reading the packages repeatedly is just a waste of time.

To improve this, we maintain a list of refs that we know were processed
for each component. When listing the refs from a source, only the ones
that have not yet been processed will be examined. Some tests were also
added specifically to check listing the files in a component.

With this change, listing the files in components on a copy of our
production database went from >10 minutes to ~10 seconds, and the newly
added benchmark went from ~300ms to ~43ms.

Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
2024-04-24 16:46:16 +02:00
Ryan Gonzalez
5636a9990b Improve performance of simple reflist merges
When merging reflists with ignoreConflicting set to true and
overrideMatching set to false, the individual ref components are never
examined, but the refs are still split anyway. Avoiding the split when
we never use the components brings a massive speedup: on my system, the
included benchmark goes from ~1500 us/it to ~180 us/it.

Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
2024-04-24 16:46:16 +02:00
Ryan Gonzalez
8ab8398c50 Use github.com/saracen/walker for file walk operations
In some local tests w/ a slowed down filesystem, this massively cut down
on the time to clean up a repository by ~3x, bringing a total 'publish
update' time from ~16s to ~13s.

Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
2024-04-24 16:46:16 +02:00
André Roth
53c4a567c0 README: add new gitter url 2024-04-21 13:22:06 +02:00
dependabot[bot]
ff8f79f883 Bump golang.org/x/net from 0.17.0 to 0.23.0
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.17.0 to 0.23.0.
- [Commits](https://github.com/golang/net/compare/v0.17.0...v0.23.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-04-20 22:51:50 +02:00
André Roth
06be6f23a6 Revert "Bump requests from 2.28.2 to 2.31.0 in /system"
This reverts commit 24e62b58bc.
2024-04-17 22:08:27 +02:00
dependabot[bot]
24e62b58bc Bump requests from 2.28.2 to 2.31.0 in /system
Bumps [requests](https://github.com/psf/requests) from 2.28.2 to 2.31.0.
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](https://github.com/psf/requests/compare/v2.28.2...v2.31.0)

---
updated-dependencies:
- dependency-name: requests
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-04-17 21:35:16 +02:00
André Roth
b37a3341e8 trigger pipeline 2024-04-17 18:39:45 +02:00
André Roth
eee6ccc311 trigger pipeline 2024-04-17 18:19:29 +02:00
André Roth
f233a21787 github CI: nightly builds for multiple distributions
Since the pipeline changed to use ucuntu22.04 runners, the nighty debian packages do not work on debian buster anymore.
    This change updates the pipeline to build for Ubuntu 20.04 and 22.04 as well as for
    Debian 10, 11 and 12.

    The distribution specific apt sources are as follows:

      "deb http://repo.aptly.info/nightly-bookworm bookworm main"

    (replace bookworm with focal, buster, bullseye. Install aptly repo key with: curl -fsS https://www.aptly.info/pubkey.txt | sudo gpg --dearmor -o /etc/apt/trusted.gpg.d/aptly.gpg)

    The builds on focal will also release to the legacy nightly apt repo: https://github.com/aptly-dev/aptly/actions/runs/8723898496/job/23933824692#step:7:24

    This only affects nightly builds, for now.
    Pipeline example: [](https://github.com/aptly-dev/aptly/actions/runs/8723898496)
2024-04-17 17:48:37 +02:00
dependabot[bot]
940538e39b Bump google.golang.org/protobuf from 1.31.0 to 1.33.0
Bumps google.golang.org/protobuf from 1.31.0 to 1.33.0.

---
updated-dependencies:
- dependency-name: google.golang.org/protobuf
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-04-11 23:34:38 +02:00
André Roth
3a29e08ff2 fix typo 2024-04-11 19:40:25 +02:00
André Roth
2a2e35c096 add test for publishing distribution with slash (/) 2024-04-11 19:37:51 +02:00
Ariel D'Alessandro
287a947c62 Revert "Don't allow '/' in distribution name, auto-replace '/' with '-' while guessing. #110"
This reverts commit 1daa076d65.

Signed-off-by: Ariel D'Alessandro <ariel.dalessandro@collabora.com>
2024-04-11 19:37:51 +02:00
André Roth
32595e7cb7 openpgp: export full key for internal gpg 2024-04-11 10:15:02 +02:00
André Roth
9deb031c44 fix system tests
- use s3 mirror instead of internet download
- reduce download verbosity
- do not use venv in docker-system-tests
- be more verbose on test output
- do not run golangci-lint in system-tests
2024-04-11 10:15:02 +02:00
André Roth
6be4f5e8d0 gpg api: allow self signed and use default gpg version 2024-04-03 10:16:56 +02:00