This benchmark compares Semaphore to GitHub Actions, GitLab CI, Buildkite, and CircleCI using the same repository, pipeline logic, versions, and equivalent machine classes.
The goal is to measure real execution time and compute cost under identical conditions.
Repository and Workload
Repository: Redmine (Ruby on Rails application). The workload consists of dependency installation and full test execution. Runs were measured after cache warm-up to focus on steady-state execution time. After initializing the cache, 10 consecutive runs were executed per provider. No outliers were removed.
Infrastructure Configuration
All providers used 2 vCPU machines, matching memory as closely as possible:
- Semaphore: f1-standard-2 (2 vCPU, 8 GB RAM)
- GitHub Actions: ubuntu-latest Linux runner (2 vCPU, 7 GB RAM)
- GitLab: saas-linux-small-amd64 (2 vCPU, 8 GB RAM)
- CircleCI: Docker medium (2 vCPU, 4 GB RAM)
- Buildkite: LINUX_AMD64_2X4 (2 vCPU, 4 GB RAM)
OS family, Ruby version, dependency installation strategy, and database backend are the same across the board. This benchmark measures single-job execution speed (no parallelism).
Semaphore Pipeline Overview
The Semaphore pipeline defines a single job running on f1-standard-2. It checks out the repository, restores cache, installs dependencies, sets up the database, and executes the full test suite.
version: v1.0
name: Redmine
agent:
  machine:
    type: f1-standard-2
    os_image: ubuntu2404
blocks:
  - name: Tests
    task:
      jobs:
        - name: tests
          commands:
            - checkout
            - cache restore gems-$(checksum Gemfile)-ruby-4.0
            - sudo apt-get update
            - sudo DEBIAN_FRONTEND=noninteractive apt-get install --yes --quiet build-essential pkg-config libpq-dev postgresql-client ghostscript gsfonts locales bzr cvs imagemagick
            - sudo locale-gen en_US.UTF-8
            - |
              cat > policy.xml <<'EOF'
              <policymap>
                <policy domain="coder" rights="read | write" pattern="PDF" />
              </policymap>
              EOF
            - sudo rm -f /etc/ImageMagick-6/policy.xml
            - sudo mv policy.xml /etc/ImageMagick-6/policy.xml
            - sem-version ruby "4.0"
            - sem-service start postgres 14 --db redmine_test
            - |
              cat > config/database.yml <<'EOF'
              test:
                adapter: postgresql
                database: redmine_test
                username: postgres
                password:
                host: 127.0.0.1
              EOF
            - bundle config set path 'vendor/bundle'
            - bundle install --jobs 4 --retry 3
            - cache store gems-$(checksum Gemfile)-ruby-4.0 vendor/bundle
            - export RAILS_ENV=test
            - 'export SCMS=subversion,git,git_utf8,filesystem,bazaar,cvs'
            - 'bundle exec rake ci:about'
            - 'bundle exec rake ci:setup'
            - 'bundle exec rake db:environment:set'
            - bin/rails test
            - LANG=en_US.ISO8859-1 LC_ALL=en_US.ISO8859-1 bin/rails test test/unit/repository_bazaar_test.rb
            - 'bin/rails test:autoload'
Benchmark Results
| Provider | Runs 1-10 (mm:ss) | Average | Semaphore faster by | Price / minute | Cost / run | Cost increase |
|---|---|---|---|---|---|---|
| Semaphore | 05:10, 04:25, 04:30, 05:12, 04:34, 04:47, 06:27, 04:54, 05:00, 05:06 | 05:01 | | $0.0075 | $0.04 | |
| GitHub Actions | 09:56, 10:29, 09:50, 10:03, 09:40, 09:37, 09:15, 09:42, 09:08, 09:44 | 09:44 | 94.48% | $0.0060 | $0.06 | 55.58% |
| GitLab | 10:13, 11:45, 11:11, 11:27, 09:45, 11:56, 11:10, 11:23, 11:30, 12:06 | 11:15 | 124.49% | $0.0100 | $0.11 | 199.32% |
| Buildkite | 05:08, 09:13, 05:51, 06:38, 07:36, 07:31, 06:53, 08:47, 06:44, 08:12 | 07:15 | 44.86% | $0.0130 | $0.09 | 151.09% |
| CircleCI | 10:44, 11:00, 10:56, 17:08, 14:02, 09:25, 16:05, 13:25, 14:38, 15:33 | 13:18 | 165.42% | $0.0060 | $0.08 | 112.34% |
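As a cross-check, the averages can be recomputed from the ten run times listed for each provider. The following is a minimal sketch, not the tooling used for the benchmark: the helper names (`to_seconds`, `to_mmss`, `average_runtime`) are illustrative, and rounding the mean to the nearest second (half up) is an assumption about how the published averages were rounded.

```python
from statistics import mean

# Run times copied from the results table, in run order (mm:ss).
RUNS = {
    "Semaphore": "05:10 04:25 04:30 05:12 04:34 04:47 06:27 04:54 05:00 05:06",
    "GitHub Actions": "09:56 10:29 09:50 10:03 09:40 09:37 09:15 09:42 09:08 09:44",
    "GitLab": "10:13 11:45 11:11 11:27 09:45 11:56 11:10 11:23 11:30 12:06",
    "Buildkite": "05:08 09:13 05:51 06:38 07:36 07:31 06:53 08:47 06:44 08:12",
    "CircleCI": "10:44 11:00 10:56 17:08 14:02 09:25 16:05 13:25 14:38 15:33",
}


def to_seconds(mmss: str) -> int:
    """Convert an 'mm:ss' string to whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def to_mmss(seconds: float) -> str:
    """Format seconds as 'mm:ss', rounding half up (assumed rounding rule)."""
    total = int(seconds + 0.5)
    return f"{total // 60:02d}:{total % 60:02d}"


def average_runtime(runs: str) -> float:
    """Mean duration in seconds of a space-separated list of 'mm:ss' runs."""
    return mean(to_seconds(run) for run in runs.split())


for provider, runs in RUNS.items():
    durations = [to_seconds(run) for run in runs.split()]
    print(
        f"{provider:15s} avg {to_mmss(average_runtime(runs))} "
        f"(min {to_mmss(min(durations))}, max {to_mmss(max(durations))})"
    )
```

Under that rounding assumption, the sketch reproduces the Average column. The min and max values give a rough sense of run-to-run spread, which matters here because no outliers were removed.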
Cost Calculation Method
Formulas used:
- Semaphore faster by = (ProviderAvg – SemaphoreAvg) / SemaphoreAvg
- Cost per run = AverageDurationMinutes × PricePerMinute
- Cost increase = (ProviderCost – SemaphoreCost) / SemaphoreCost
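The three formulas can be sketched in a few lines of Python. The function names are illustrative, and the average durations are the unrounded means (in seconds) of the ten runs per provider, derived here from the results table rather than quoted from it. Prices per minute are taken from the table.

```python
# Unrounded mean duration in seconds (derived from the ten listed runs)
# and price per minute in USD (from the results table).
PROVIDERS = {
    "Semaphore": (300.5, 0.0075),
    "GitHub Actions": (584.4, 0.0060),
    "GitLab": (674.6, 0.0100),
    "Buildkite": (435.3, 0.0130),
    "CircleCI": (797.6, 0.0060),
}


def faster_by(provider_avg_s: float, semaphore_avg_s: float) -> float:
    """Relative slowdown of a provider versus Semaphore (0.94 means 94%)."""
    return (provider_avg_s - semaphore_avg_s) / semaphore_avg_s


def cost_per_run(avg_s: float, price_per_minute: float) -> float:
    """Compute cost of one run: duration in minutes times price per minute."""
    return avg_s / 60 * price_per_minute


def cost_increase(provider_cost: float, semaphore_cost: float) -> float:
    """Relative cost difference of a provider versus Semaphore."""
    return (provider_cost - semaphore_cost) / semaphore_cost


sem_avg, sem_price = PROVIDERS["Semaphore"]
sem_cost = cost_per_run(sem_avg, sem_price)

for name, (avg_s, price) in PROVIDERS.items():
    cost = cost_per_run(avg_s, price)
    print(
        f"{name:15s} faster by {faster_by(avg_s, sem_avg):8.2%}  "
        f"cost/run ${cost:.2f}  cost increase {cost_increase(cost, sem_cost):8.2%}"
    )
```

With these inputs the percentages match the published table, which suggests they were computed from the unrounded averages rather than from the rounded mm:ss values.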
Productivity Impact
A reduction from 7-13 minutes down to about 5 minutes shortens feedback cycles. For a team running 100 builds per day, saving 4 minutes per build adds up to 400 minutes saved daily. That equals over 6.5 engineer hours regained per day.
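The arithmetic behind that estimate is easy to adapt to other build volumes. The sketch below uses a hypothetical helper, `daily_hours_saved`, and keeps the simplification made above: every saved build minute is counted as engineer waiting time.

```python
def daily_hours_saved(builds_per_day: int, minutes_saved_per_build: float) -> float:
    """Hours saved per day, assuming each saved minute is time someone waits."""
    return builds_per_day * minutes_saved_per_build / 60


# The example from the text: 100 builds per day, 4 minutes saved per build.
print(f"{daily_hours_saved(100, 4):.2f} hours per day")  # 6.67 hours per day
```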
Budget and Engineering Capacity Impact at Scale
Assume your organization consumes 1,000,000 build minutes on Semaphore for this workload.
Based on the benchmark runtime ratios, the same workload would require:
| Provider | Build minutes required | Additional pipeline hours |
|---|---|---|
| Semaphore | 1.00M build minutes | |
| GitHub Actions | 1.94M build minutes | +15,670 hours |
| GitLab | 2.24M build minutes | +20,709 hours |
| Buildkite | 1.45M build minutes | +7,420 hours |
| CircleCI | 2.65M build minutes | +27,519 hours |
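The scaled figures can be reproduced from the benchmark averages. In the sketch below, the function names are illustrative, and the use of the rounded mm:ss averages (for example 05:01 = 301 seconds) is an inference: those inputs reproduce the hour figures in the table, while the unrounded means give slightly different values.

```python
BASELINE_MINUTES = 1_000_000  # build minutes consumed on Semaphore
SEMAPHORE_AVG_S = 301         # 05:01, rounded average from the benchmark

# Rounded average durations in seconds (09:44, 11:15, 07:15, 13:18).
AVERAGES_S = {
    "GitHub Actions": 584,
    "GitLab": 675,
    "Buildkite": 435,
    "CircleCI": 798,
}


def scaled_minutes(provider_avg_s: int) -> float:
    """Build minutes needed for the same workload at the provider's runtime ratio."""
    return BASELINE_MINUTES * provider_avg_s / SEMAPHORE_AVG_S


def extra_hours(provider_avg_s: int) -> float:
    """Additional pipeline hours compared with the Semaphore baseline."""
    return (scaled_minutes(provider_avg_s) - BASELINE_MINUTES) / 60


for name, avg_s in AVERAGES_S.items():
    print(
        f"{name:15s} {scaled_minutes(avg_s) / 1e6:.2f}M build minutes  "
        f"+{round(extra_hours(avg_s)):,} hours"
    )
```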
These are not abstract numbers. They represent real waiting time in feedback loops, and a slower CI means:
- Longer pull request cycles
- Slower bug detection
- Slower incident resolution
- More context switching
- Reduced deployment frequency
Now consider your internal engineering cost model:
- H = fully loaded engineering hourly cost
Then the organizational impact of slower CI is:
Additional personnel cost exposure = Extra pipeline hours × H
Without assuming a specific compensation level, the relationship is linear and direct:
- If H increases, the cost penalty increases proportionally
- If your build volume increases, the cost penalty scales proportionally
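To plug in your own numbers, the relationship can be written as a single function. This is a sketch under the same assumption as the formula above, namely that every extra pipeline hour is charged at the full hourly cost H. The function name is illustrative, and the value `H = 1.0` is a unit placeholder, not a compensation estimate, so the output reads as multiples of H.

```python
# Additional pipeline hours per 1,000,000 Semaphore build minutes (from the table).
EXTRA_PIPELINE_HOURS = {
    "GitHub Actions": 15_670,
    "GitLab": 20_709,
    "Buildkite": 7_420,
    "CircleCI": 27_519,
}


def personnel_cost_exposure(extra_pipeline_hours: float, hourly_cost: float) -> float:
    """Extra pipeline hours multiplied by H, the fully loaded hourly cost."""
    return extra_pipeline_hours * hourly_cost


H = 1.0  # placeholder unit cost; replace with your own fully loaded hourly cost

for provider, hours in EXTRA_PIPELINE_HOURS.items():
    exposure = personnel_cost_exposure(hours, H)
    print(f"{provider:15s} exposure = {exposure:,.0f} x H")
```

Because the function is a plain product, doubling either H or the build volume doubles the result, which is the proportionality described in the two bullets above.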
CI performance therefore affects the budget in two ways:
- Direct compute cost
- Indirect engineering time cost
The benchmark shows that under identical workload conditions, Semaphore minimizes both simultaneously.
When performance and cost efficiency move in the same direction, CI infrastructure reduces both compute waste and feedback loop waiting times.
TL;DR
Under an identical workload, equivalent machine classes, and single-job execution, Semaphore delivered the fastest execution time and the lowest cost per run in this benchmark configuration.
For engineering teams optimizing feedback loops and infrastructure spend, execution time and cost per run are measurable levers. This benchmark measures both.
Next Step
CI performance directly influences two variables that scale with your organization: infrastructure spend and engineering throughput.
This benchmark demonstrates measurable differences under controlled conditions. The most relevant comparison, however, is against your own repository, build frequency, and test volume.
Create a project, run your existing pipeline, and measure the execution time and cost against your current setup.
You will have the data you need to quantify the difference where it matters.