aptly

mirror of https://github.com/aptly-dev/aptly.git synced 2026-07-02 09:47:46 +00:00

Author	SHA1	Message	Date
Nick Bozhenko	463c34a38e	Fix race conditions and improve etcd timeout handling This commit addresses several critical race conditions and improves the reliability of etcd operations through better timeout and retry handling. ## Race Condition Fixes 1. Task Resource Management Bug - Fixed incorrect variable usage in task/list.go:78 - Was using completed task's resources instead of idle task's resources - This caused resource conflicts and potential deadlocks 2. Database Channel Initialization - Added sync.Once pattern to ensure thread-safe channel initialization - Prevents panic from concurrent access during startup - Created initDBRequests() function for safe initialization 3. Published Storage Double-Checked Locking - Implemented double-checked locking pattern in GetPublishedStorage - Reduces lock contention while preventing concurrent initialization - Improves performance for frequently accessed storage 4. File Operation Synchronization - Created FileLockRegistry in utils/filelock.go - Prevents concurrent file operations (create, rename, delete, link) - Implements deadlock prevention for multi-file operations - Critical for preventing file corruption during parallel publishes 5. WaitGroup Miscount Prevention - Added defer pattern to ensure Done() is always called - Protects against panics during task execution - Prevents "negative WaitGroup counter" errors ## etcd Improvements 1. Timeout Protection - Replaced global context.TODO() with per-operation timeout contexts - Default timeout: 60 seconds (configurable) - Prevents indefinite hangs when etcd is unresponsive 2. Environment Variable Configuration - APTLY_ETCD_TIMEOUT: Operation timeout (default: 60s) - APTLY_ETCD_DIAL_TIMEOUT: Connection timeout (default: 60s) - APTLY_ETCD_KEEPALIVE: Keep-alive timeout (default: 7200s) - APTLY_ETCD_MAX_MSG_SIZE: Max message size (default: 50MB) 3. Retry Logic for Read Operations - Get operations retry up to 3 times with exponential backoff - Only retries on temporary/network errors - Improves reliability without risking data inconsistency 4. Enhanced Error Logging - All etcd errors now logged with operation context - Replaces silent failures with actionable error messages - Improves debugging and monitoring capabilities 5. Increased Message Size Limits - Default increased from 10MB to 50MB - Configurable via environment variable - Prevents "message too large" errors for large operations ## Testing - Added comprehensive tests for etcd timeout functionality - Tests verify context timeout, retry logic, and configuration - All existing tests pass with the new implementation ## Documentation - Updated README.rst with etcd configuration section - Documented all environment variables and their defaults - Added examples and feature descriptions These changes significantly improve the reliability and debuggability of aptly when using etcd as the database backend, while also fixing critical race conditions that could cause data corruption or service crashes.	2025-07-10 10:05:49 -04:00
Nick Bozhenko	660cee2ce3	Fix concurrent map access race conditions in config publish roots This commit addresses critical race conditions that were causing "map write failed" errors and pod crashes in production environments. The issue occurred when multiple goroutines accessed shared configuration maps simultaneously without proper synchronization. Root Cause: The global utils.Config structure contains several maps (FileSystemPublishRoots, S3PublishRoots, SwiftPublishRoots, AzurePublishRoots) that were being accessed directly by concurrent HTTP handlers. While context.Config() uses a mutex, it returns a pointer to the global config, leaving subsequent map access unprotected. Changes Made: 1. Added safe accessor methods in utils/config.go: - GetFileSystemPublishRoots() - returns defensive copy of map - GetS3PublishRoots() - returns defensive copy of map - GetSwiftPublishRoots() - returns defensive copy of map - GetAzurePublishRoots() - returns defensive copy of map 2. Updated API handlers to use safe accessors: - api/s3.go: apiS3List() now uses GetS3PublishRoots() - api/router.go: reposListInAPIMode() now uses GetFileSystemPublishRoots() 3. Updated context package storage initialization: - context/context.go: GetPublishedStorage() now uses safe accessors for all storage type configurations (filesystem, s3, swift, azure) Impact: - Eliminates "concurrent map writes" panics that were causing service instability - Prevents pod crashes and restarts in Kubernetes environments - Ensures thread-safe access to configuration maps during concurrent API requests - Minimal performance overhead (microseconds) from creating map copies The fix is backward compatible and requires no configuration changes. The defensive copying approach ensures that even if config maps are modified after initialization (which shouldn't happen in production), concurrent readers remain safe. This addresses the production issues observed in lf-aptly-* pods where multiple parallel publish requests or API calls were triggering race conditions.	2025-07-10 01:35:09 -04:00
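The defensive-copy accessors described in commit 660cee2ce3 might look something like the sketch below. The method name `GetFileSystemPublishRoots` comes from the commit message, but everything else is an assumption made for illustration: the `config` type, its mutex field, and the `map[string]string` value type are simplified stand-ins, not aptly's actual `utils.Config` definition.

```go
package main

import (
	"fmt"
	"sync"
)

// config is a simplified, hypothetical stand-in for a global configuration
// struct that holds a map shared between goroutines.
type config struct {
	mu                     sync.RWMutex
	fileSystemPublishRoots map[string]string // value type simplified for the sketch
}

// GetFileSystemPublishRoots returns a copy of the map taken under a read lock.
// Callers can iterate over or modify the copy without touching the shared map.
func (c *config) GetFileSystemPublishRoots() map[string]string {
	c.mu.RLock()
	defer c.mu.RUnlock()
	out := make(map[string]string, len(c.fileSystemPublishRoots))
	for name, root := range c.fileSystemPublishRoots {
		out[name] = root
	}
	return out
}

func main() {
	cfg := &config{fileSystemPublishRoots: map[string]string{"main": "/srv/aptly/public"}}

	// Several goroutines read concurrently; each works on its own copy.
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done() // Done runs even if the body panics
			roots := cfg.GetFileSystemPublishRoots()
			roots[fmt.Sprintf("scratch-%d", id)] = "/tmp" // only the copy changes
		}(i)
	}
	wg.Wait()

	fmt.Println(len(cfg.GetFileSystemPublishRoots())) // shared map still has one entry
}
```

The cost is one map allocation per call, which matches the commit's note that the overhead is small compared with the risk of unsynchronized access to a shared map.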
André Roth	ad4d0c7b96	doc: add swagger doc for /api/gpg/key - cleanup swagger validation errors	2025-06-08 14:24:27 +02:00
André Roth	f7057a9517	go1.24: fix lint, unit and system tests - development env: base on debian trixie with go1.24 - lint: run with default config - fix lint errors - fix unit tests - fix system test	2025-04-26 13:29:50 +02:00
André Roth	c07bf2b108	s3: add debug logs for commands * initialize zerolog for commands * Change default log format: remove colors and timestamp	2025-04-24 12:13:38 +02:00
André Roth	e062df68c5	go1.23: update golangci-lint version and fix warnings.	2025-04-20 20:32:55 +02:00
André Roth	9abbd74a9f	improve doc do not set default value for FromSnapshot when creating a repo	2024-12-21 20:23:52 +01:00
André Roth	93650efddb	Merge pull request #1404 from schoenherrg/fix/with-sources-ignored Fix `-with-sources` not fetching differently named source packages	2024-12-11 13:01:30 +01:00
André Roth	e319f3cd14	update doc make descriptions consistent	2024-12-11 11:19:46 +01:00
André Roth	1f469e23b5	fix optional params	2024-12-11 10:40:44 +01:00
André Roth	d8b9777b40	swagger: document params	2024-12-11 10:40:44 +01:00
André Roth	e5e3c49ace	swagger: document async	2024-12-11 10:40:44 +01:00
André Roth	c6e0a06b14	swagger: cleanup	2024-12-11 10:40:44 +01:00
André Roth	75e5f95277	task-dummy: remove internal testing API	2024-12-11 10:40:44 +01:00
André Roth	4ff3c894fa	swagger: cleanup Snapshots	2024-12-11 10:40:44 +01:00
André Roth	abfad37640	swagger: cleanup files doc	2024-12-11 10:40:44 +01:00
André Roth	a69c00a5bc	swagger: improve layout and fix lint	2024-12-11 10:40:44 +01:00
André Roth	4f229a5bcf	update doc	2024-12-11 10:40:44 +01:00
André Roth	397362bb1a	fix swagger build	2024-12-11 10:40:44 +01:00
iofq	d5571c41c7	Update files api docs	2024-12-11 10:40:44 +01:00
iofq	39921809ee	Update db api docs	2024-12-11 10:40:44 +01:00
iofq	68fe2bc852	Update gpg, graph api docs	2024-12-11 10:40:44 +01:00
iofq	398fec13b0	Update packages api docs	2024-12-11 10:40:44 +01:00
iofq	9fc7ebdac2	Update repos, task, snapshot api docs	2024-12-11 10:40:44 +01:00
André Roth	2171c05ef8	fix lint	2024-12-11 10:40:44 +01:00
André Roth	8f8de4bd29	update	2024-12-11 10:40:44 +01:00
André Roth	9b8f6b1d56	fix conflict	2024-12-11 10:40:43 +01:00
André Roth	69a1e2561d	docs: improve swagger - use markdown files in swagger - automate version, use swagger.conf template - embed swagger ui index.html as docs.html	2024-12-11 10:40:43 +01:00
André Roth	ba86851d07	add api documentation stubs	2024-12-11 10:40:43 +01:00
Gordian Schoenherr	3b785e4165	Refactor Filter options into a struct It was already a lot of options for one method and I am going to add another one in the next commit.	2024-12-09 13:17:41 +09:00
André Roth	9ca9569714	fix build and golangci-lint	2024-11-17 14:09:37 +01:00
Mauro Regli	1357d246d8	rename addon files to skel files	2024-11-17 14:09:37 +01:00
Mauro Regli	c75c2c7594	pass down addonpath from api and cmd context	2024-11-17 14:09:37 +01:00
André Roth	eafec74c29	allow to exclude provided packages from list.Search	2024-11-04 17:02:54 +01:00
André Roth	f79423a4ee	update swagger documentation	2024-11-01 17:48:03 +01:00
André Roth	eb94211053	fix race conditions	2024-11-01 17:48:03 +01:00
André Roth	bd01cd4033	update swagger documentation	2024-11-01 17:48:03 +01:00
Christoph Fiehe	451de79666	Improve consistency between API and Swagger docs. Signed-off-by: Christoph Fiehe <c.fiehe@eurodata.de>	2024-11-01 17:48:03 +01:00
André Roth	755fdfaca2	update swagger documentation - add default values - set default values	2024-11-01 17:48:03 +01:00
André Roth	f4057850b9	fix compile and lint errors	2024-11-01 17:47:50 +01:00
André Roth	4d6688d68e	sanitize archs	2024-10-22 16:58:15 +02:00
Christoph Fiehe	7a7ff1142c	Minor code and documentation changes. Signed-off-by: Christoph Fiehe <c.fiehe@eurodata.de>	2024-10-22 16:58:15 +02:00
Christoph Fiehe	8cceed12f7	Fix tests. Signed-off-by: Christoph Fiehe <c.fiehe@eurodata.de>	2024-10-22 16:58:15 +02:00
Christoph Fiehe	f8f28e9554	Fixing tests and fix cleanup. Signed-off-by: Christoph Fiehe <c.fiehe@eurodata.de>	2024-10-22 16:58:15 +02:00
Christoph Fiehe	ac5ecf946d	Cleanup improved and redundant code removed. Signed-off-by: Christoph Fiehe <c.fiehe@eurodata.de>	2024-10-22 16:58:15 +02:00
Christoph Fiehe	d87d8bac92	Fix test cases. Signed-off-by: Christoph Fiehe <c.fiehe@eurodata.de>	2024-10-22 16:58:15 +02:00
Christoph Fiehe	14c29ff912	Fixing tests. Signed-off-by: Christoph Fiehe <c.fiehe@eurodata.de>	2024-10-22 16:58:15 +02:00
Christoph Fiehe	73cdf5417b	Use POST instead of PUT for source creation. Signed-off-by: Christoph Fiehe <c.fiehe@eurodata.de>	2024-10-22 16:58:15 +02:00
André Roth	fa0d2860f0	fix multidist in publish	2024-10-22 16:58:15 +02:00
André Roth	dcbb2a06a5	fix build	2024-10-22 16:58:15 +02:00


227 Commits