Commit Graph

16 Commits

Author SHA1 Message Date
Nick Bozhenko 40ba104838 Add comprehensive CI/CD improvements and test coverage
This commit introduces major enhancements to the CI/CD pipeline and testing infrastructure:

CI/CD Improvements:
- Consolidated modern and legacy CI workflows into a single comprehensive pipeline
- Removed all publishing functionality from CI (no longer needed)
- Added 8 new advanced testing jobs for pull requests:
  * advanced-coverage: Detailed coverage analysis with base branch comparison
  * performance-profile: CPU and memory profiling with benchmarks
  * fuzz-test: Automated fuzz testing for supported packages
  * deep-analysis: Multiple static analysis tools (shadow, ineffassign, gosec, staticcheck)
  * mutation-test: Tests effectiveness of test suite on changed files
  * dependency-audit: Security vulnerabilities and outdated dependency checks
  * stress-test: Race detection with 100 iterations and parallel testing
  * test-report-summary: Aggregates all reports into a single PR comment
- Enabled RUN_LONG_TESTS by default for thorough testing
- Added automatic PR comment generation with all test results

Testing Infrastructure:
- Added comprehensive test files across all packages to improve coverage
- Implemented unit tests for previously untested packages
- Added race condition tests for concurrent operations
- Created integration tests for API endpoints
- Added storage backend tests (etcd, goleveldb)
- Implemented command-line interface tests

Local Testing Support:
- Added act configuration for testing GitHub Actions locally
- Created docker-compose.ci.yml for full CI environment simulation
- Updated CONTRIBUTING.md with detailed local testing instructions

Documentation Updates:
- Added comprehensive CI documentation to CONTRIBUTING.md
- Removed obsolete references to Travis CI
- Updated Go version requirements to 1.24
- Added act usage instructions and examples

Other Improvements:
- Updated .gitignore to exclude coverage reports and build artifacts
- Added test-act.yml workflow for testing act functionality
- Created CI_SUMMARY.md documenting all CI capabilities

These changes transform aptly's CI from a basic testing pipeline into a comprehensive quality assurance system that provides immediate feedback on code quality, performance, security, and test effectiveness.
2025-07-10 12:00:54 -04:00
Nick Bozhenko 463c34a38e Fix race conditions and improve etcd timeout handling
This commit addresses several critical race conditions and improves the reliability
of etcd operations through better timeout and retry handling.

## Race Condition Fixes

1. **Task Resource Management Bug**
   - Fixed incorrect variable usage in task/list.go:78
   - Was using completed task's resources instead of idle task's resources
   - This caused resource conflicts and potential deadlocks

2. **Database Channel Initialization**
   - Added sync.Once pattern to ensure thread-safe channel initialization
   - Prevents panic from concurrent access during startup
   - Created initDBRequests() function for safe initialization

3. **Published Storage Double-Checked Locking**
   - Implemented double-checked locking pattern in GetPublishedStorage
   - Reduces lock contention while preventing concurrent initialization
   - Improves performance for frequently accessed storage

4. **File Operation Synchronization**
   - Created FileLockRegistry in utils/filelock.go
   - Prevents concurrent file operations (create, rename, delete, link)
   - Implements deadlock prevention for multi-file operations
   - Critical for preventing file corruption during parallel publishes

5. **WaitGroup Miscount Prevention**
   - Added defer pattern to ensure Done() is always called
   - Protects against panics during task execution
   - Prevents "negative WaitGroup counter" errors

## etcd Improvements

1. **Timeout Protection**
   - Replaced global context.TODO() with per-operation timeout contexts
   - Default timeout: 60 seconds (configurable)
   - Prevents indefinite hangs when etcd is unresponsive

2. **Environment Variable Configuration**
   - APTLY_ETCD_TIMEOUT: Operation timeout (default: 60s)
   - APTLY_ETCD_DIAL_TIMEOUT: Connection timeout (default: 60s)
   - APTLY_ETCD_KEEPALIVE: Keep-alive timeout (default: 7200s)
   - APTLY_ETCD_MAX_MSG_SIZE: Max message size (default: 50MB)

3. **Retry Logic for Read Operations**
   - Get operations retry up to 3 times with exponential backoff
   - Only retries on temporary/network errors
   - Improves reliability without risking data inconsistency

4. **Enhanced Error Logging**
   - All etcd errors now logged with operation context
   - Replaces silent failures with actionable error messages
   - Improves debugging and monitoring capabilities

5. **Increased Message Size Limits**
   - Default increased from 10MB to 50MB
   - Configurable via environment variable
   - Prevents "message too large" errors for large operations

## Testing

- Added comprehensive tests for etcd timeout functionality
- Tests verify context timeout, retry logic, and configuration
- All existing tests pass with the new implementation

## Documentation

- Updated README.rst with etcd configuration section
- Documented all environment variables and their defaults
- Added examples and feature descriptions

These changes significantly improve the reliability and debuggability of aptly
when using etcd as the database backend, while also fixing critical race
conditions that could cause data corruption or service crashes.
2025-07-10 10:05:49 -04:00
André Roth f7057a9517 go1.24: fix lint, unit and system tests
- development env: base on debian trixie with go1.24
- lint: run with default config
- fix lint errors
- fix unit tests
- fix system test
2025-04-26 13:29:50 +02:00
André Roth 3e1485faf5 queue sync calls 2024-06-15 19:18:14 +02:00
André Roth 45035802be implement task queue waiting for resources 2024-06-15 19:18:14 +02:00
André Roth 05fed16f6d fix golangci-lint error 2024-06-15 19:18:14 +02:00
Ramón N.Rodriguez 1987220f1e api: publish: block on concurrent calls
This commit blocks concurrent calls to RunTaskInBackground which is
intended to fix the quirky behaviour where concurrent PUT calls to
api/publish/<prefix>/<distribution> would immedietly reuturn an error.

The solution proposed in this commit is not elegant and probaly has
unintended side-effects. The intention of this commit is to highlight
the area that actually needs to be addressed.
Ideally this patch is amended or dropped entierly in favor of a better
fixup.
2024-06-15 19:18:14 +02:00
Mauro Regli ae61706a34 Fix: Implement golangci-lint suggestions 2023-09-21 11:25:18 +02:00
Benj Fassbind 71fd730598 Return an empty array if no tasks are available
All other api endpoints also send empty arrays instead of nil.
Closes #1123
2022-11-17 10:44:35 +01:00
Lorenzo Bolla 3775d69a60 Fix linting errors 2022-01-27 09:30:14 +01:00
Lorenzo Bolla 6826efc723 Fix pure-go unittests
So they can run on e.g. LXC containers as root, or other conceivable setups.
2022-01-27 09:30:14 +01:00
Lorenzo Bolla ff51c46915 More informative return value for task.Process 2022-01-27 09:30:14 +01:00
Lorenzo Bolla 9b28d8984f Configurable background task execution 2022-01-27 09:30:14 +01:00
Oliver Sauder b4efe6a810 Add db cleanup api 2022-01-27 09:30:14 +01:00
Oliver Sauder f09a273ad7 Add publish output progress counting remaining number of packages 2022-01-27 09:30:14 +01:00
Oliver Sauder 6ab5e60833 Add task api and resource locking ability 2022-01-27 09:30:14 +01:00