Commit Graph

9 Commits

Author SHA1 Message Date
Vishnu Narayanan
08b9134486 feat: speed up circleci and github actions (#12849)
# 🚀 Speed up CI/CD test execution with parallelization

## TL;DR

- **Problem**: CI tests took 36-42 minutes per commit, blocking
developer workflow
- **Solution**: Implemented 16-way parallelization + optimized slow
tests + fixed Docker builds
- **Impact**: **1,358 hours/month saved** (7.7 FTE) across GitHub
Actions + CircleCI
  - GitHub Actions tests: 36m → 7m (82% faster)
  - Backend tests: 28m → 4m (87% faster with 16-way parallelization)
  - CircleCI tests: 42m → 7m (83% faster)
  - Docker builds: 34m → 5m (85% faster)
- **Result**: 5-6x faster feedback loops, 100% success rate on recent
runs

---

## Problem
CI test runs were taking **36-42 minutes per commit push** (GitHub
Actions: 36m avg, CircleCI: 42m P95), creating a significant bottleneck
in the development workflow.

## Solution
This PR comprehensively restructures both CI test pipelines to leverage
16-way parallelization and optimize test execution, reducing test
runtime from **36-42 minutes to ~7 minutes** - an **82% improvement**.

---

## 📊 Real Performance Data (Both CI Systems)

### GitHub Actions

#### Before (develop branch - 5 recent runs)
```
Individual runs: 35m 29s | 36m 1s | 40m 0s | 36m 4s | 34m 18s
Average: 36m 22s
```

#### After (feat/speed_up_ci branch - 9 successful runs)
```
Individual runs: 6m 39s | 7m 2s | 6m 53s | 6m 26s | 6m 52s | 6m 42s | 6m 45s | 6m 40s | 6m 37s
Average: 6m 44s
Range: 6m 26s - 7m 2s
```

**Improvement**:  **81.5% faster** (29m 38s saved per run)


#### Backend Tests Specific Impact
With 16-way parallelization, backend tests show dramatic improvement:
- **Before**: 27m 52s (sequential execution)
- **After**: 3m 44s (longest of 16 parallel runners)
  - Average across runners: 2m 30s
  - Range: 1m 52s - 3m 44s
- **Improvement**:  **86.6% faster** (24m 8s saved)
---

### CircleCI

#### Before (develop branch - CircleCI Insights)
```
Duration (P95): 41m 44s
Runs: 70 (last 30 days)
Success Rate: 84%
```

#### After (feat/speed_up_ci branch - Last 2 pipeline runs)
```
Run 1 (1h ago): 7m 7s
  ├─ lint: 4m 12s
  ├─ frontend-tests: 5m 36s
  ├─ backend-tests: 6m 23s
  ├─ coverage: 20s
  └─ build: 1s

Run 2 (2h ago): 7m 21s
  ├─ lint: 3m 47s
  ├─ frontend-tests: 5m 4s
  ├─ backend-tests: 6m 33s
  ├─ coverage: 19s
  └─ build: 1s

Average: 7m 14s
Success Rate: 100% 
```

**Improvement**:  **82.7% faster** (34m 30s saved per run)

---

## 🐳 Related Work: Docker Build Optimization

As part of the broader CI/CD optimization effort, Docker build
performance was improved separately in **PR #12859**.

### Docker Build Fix (Merged Separately)
**Problem**: Multi-architecture Docker builds (amd64/arm64) were taking
~34 minutes due to cache thrashing

**Solution**: Added separate cache scopes per platform in
`.github/workflows/test_docker_build.yml`:
```yaml
cache-from: type=gha,scope=${{ matrix.platform }}
cache-to: type=gha,mode=max,scope=${{ matrix.platform }}
```

**Results** (measured from November 2025 data):
- **Before**: 34.2 minutes/run average (15,547 minutes across 454 runs)
- **After**: 5 minutes/run
- **Improvement**: 85% faster, 29.2 minutes saved per run
- **Frequency**: 25.2 runs/day
- **Monthly savings**: **369 hours** (46 developer-days)

This prevents different architectures from invalidating each other's
caches and contributes 27% of total CI/CD time savings.

---

## 🎯 Key Findings

### Both CI Systems Now Perform Similarly
- **CircleCI**: 7m 14s average
- **GitHub Actions**: 6m 44s average
- **Difference**: Only 30 seconds apart (remarkably consistent!)

### Combined Performance
- **Average improvement across both systems**: **82.1% faster**
- **Time saved per commit**: ~32 minutes
- **Developer feedback loop**: 36-42 minutes → ~7 minutes

### Success Rate Improvement
- **CircleCI**: 84% → 100% (on feat/speed_up_ci branch)
- **GitHub Actions**: 100% (all 9 recent runs successful)
- Fixed all test isolation issues that caused intermittent failures

### Impact at Scale (Based on Real November 2025 Data)
- **CI runs per day**: **30.8 average** for tests, **25.2** for Docker
builds
  - Measured from GitHub Actions Usage Metrics (18 days)
  - Weekdays: 38-54 runs/day
  - Peak: up to 68 runs in a single day
- **This PR (test suite only)**:
  - **Daily time saved**: **15.3 hours** (GitHub Actions + CircleCI)
- **Monthly time saved**: **458 hours** (57 developer-days) on GitHub
Actions
  - Additional **531 hours** (66 developer-days) on CircleCI
- **Combined with Docker optimization** (PR #12859): **1,358
hours/month** (see Summary)
- **Developer experience**: 5-6x faster iteration cycles

---

## Code Changes

### 1. **Backend Test Parallelization (16x)**
Both CI systems now use 16-way parallelization with identical
round-robin test distribution:

```bash
# Distribute tests evenly across 16 runners
SPEC_FILES=($(find spec -name '*_spec.rb' | sort))
for i in "${!SPEC_FILES[@]}"; do
  if [ $(( i % 16 )) -eq $RUNNER_INDEX ]; then
    TESTS="$TESTS ${SPEC_FILES[$i]}"
  fi
done
```

**Why round-robin over timing-based?**
- CircleCI's timing-based splitting grouped similar tests together
- This caused race conditions with OAuth callback tests (Linear,
Shopify, Notion)
- Round-robin ensures even distribution and better test isolation
- Both CI systems now behave identically

### 2. **Frontend Test Optimization**
Enabled Vitest thread parallelization in `vite.config.ts`:

```typescript
pool: 'threads',
poolOptions: {
  threads: {
    singleThread: false,
  },
},
```

### 3. **CI Architecture Restructuring**
Split monolithic CI jobs into parallel stages:
- **Lint** (backend + frontend) - runs independently for fast feedback
- **Frontend tests** - runs in parallel with backend
- **Backend tests** - 16-way parallelized across runners
- **Coverage** - aggregates results from test jobs
- **Build** (CircleCI only) - final job for GitHub status check
compatibility

### 4. **Critical Test Optimization**
**report_builder_spec.rb**: Changed `before` to `before_all`
- Reduced execution time from **19 minutes to 1.2 minutes** (16x
speedup)
- Setup now runs once instead of 21 times
- Single biggest performance improvement after parallelization

---

## Test Stability Fixes (10 spec files)

Parallelization exposed latent test isolation issues that were fixed:

### Object Identity Comparisons (6 files)
Tests were comparing Ruby object instances instead of IDs:
- `spec/models/integrations/hook_spec.rb` - Use `.pluck(:id)` for
comparisons
- `spec/enterprise/models/captain/scenario_spec.rb` - Compare IDs
instead of objects
- `spec/models/notification_spec.rb` - Compare IDs for sort order
validation
- `spec/models/account_spec.rb` - Compare IDs in scope queries
- `spec/services/widget/token_service_spec.rb` - Compare class names
instead of class objects
- `spec/models/concerns/avatarable_shared.rb` - Use `respond_to` checks
for ActiveStorage

### Database Query Caching
- `spec/jobs/delete_object_job_spec.rb` - Added `.reload` to force fresh
database queries

### Test Expectations Timing
- `spec/jobs/mutex_application_job_spec.rb` - Removed flaky unlock
expectation after error block
  - Related to original PR #8770
- Expectation after error block never executes in parallel environments

### Timezone Handling
- `spec/mailers/account_notification_mailer_spec.rb` - Fixed date
parsing at timezone boundaries
  - Changed test time from 23:59:59Z to 12:00:00Z

### Test Setup
- `spec/builders/v2/report_builder_spec.rb` - Optimized with
`before_all`

---

##  CircleCI GitHub Integration Fix

### Problem
GitHub PR checks were stuck on "Waiting for status" even when all
CircleCI jobs passed. GitHub was expecting a job named `build` but the
workflow only had a workflow named "build".

### Solution
Added an explicit `build` job that runs after all other jobs:

```yaml
build:
  steps:
    - run:
        name: Legacy build aggregator
        command: echo "All main jobs passed"
  requires:
    - lint
    - coverage
```

This ensures GitHub's required status checks work correctly.


---

##  Testing & Validation

-  **GitHub Actions**: 9 successful runs, consistent 6m 26s - 7m 2s
runtime
-  **CircleCI**: 2 successful runs, consistent 7m 7s - 7m 21s runtime
-  Both CI systems produce identical, consistent results
-  GitHub PR status checks complete correctly
-  Success rate improved from 84% to 100% (recent runs)
-  No test regressions introduced
-  All flaky tests fixed (callback controllers, mutex jobs, etc.)

---

## 🎉 Summary

This PR delivers an **82% improvement** in test execution time across
both CI systems:

- **GitHub Actions tests**: 36m → 7m (81.5% faster)
- Backend tests specifically: 28m → 4m (86.6% faster)
- **CircleCI tests**: 42m → 7m (82.7% faster)
- **Developer feedback loop**: 5-6x faster
- **Test stability**: 84% → 100% success rate

### 📊 Total CI/CD Impact (All Optimizations)

Based on real November 2025 data, combining this PR with Docker build
optimization (PR #12859):

**Monthly Time Savings**: **1,358 hours/month** = **170
developer-days/month** = **7.7 FTE**

| System | Runs/Day | Before | After | Savings | Monthly Impact |
|--------|----------|---------|--------|---------|----------------|
| **GitHub Actions Tests** | 30.8 | 36.5m | 6.7m | 29.8m/run | 458 hrs
(34%) |
| **GitHub Actions Docker** | 25.2 | 34.2m | 5.0m | 29.2m/run | 369 hrs
(27%) |
| **CircleCI Tests** | 30.8 | 41.7m | 7.2m | 34.5m/run | 531 hrs (39%) |

*Data source: GitHub Actions Usage Metrics (November 2025, 18 days),
CircleCI Insights (30 days)*

The combined optimizations save the equivalent of **nearly 8 full-time
developers** worth of CI waiting time every month, significantly
improving developer velocity and reducing CI costs. All test isolation
issues exposed by parallelization have been fixed, ensuring reliable and
consistent results across both CI platforms.

woot woot !!!

---------
2025-11-19 15:32:48 +05:30
Shivam Mishra
6d3ecfe3c1 feat: Add new sidebar for Chatwoot V4 (#10291)
This PR has the initial version of the new sidebar targeted for the next major redesign of the app. This PR includes the following changes

- Components in the `layouts-next` and `base-next` directories in `dashboard/components`
- Two generic components `Avatar` and `Icon`
- `SidebarGroup` component to manage expandable sidebar groups with nested navigation items. This includes handling active states, transitions, and permissions.
- `SidebarGroupHeader` component to display the header of each navigation group with optional icons and active state indication.
- `SidebarGroupLeaf` component for individual navigation items within a group, supporting icons and active state.
- `SidebarGroupSeparator` component to visually separate nested navigation items. (They look a lot like header)
- `SidebarGroupEmptyLeaf` component to render empty state of any navigation groups.

----

Co-authored-by: Pranav <pranav@chatwoot.com>
Co-authored-by: Pranav <pranavrajs@gmail.com>
2024-10-23 18:32:37 -07:00
Shivam Mishra
4c7a539f9d feat: use iife build for sdk (#10255)
Vite uses `Rollup` for bundling, when building the sdk, we effectively
run a separate vite config, with `Library Mode`. When migrating from
Webpack to Vite, I selected `umd` format, i.e. Universal Module
Definition, which works as `amd`, `cjs` and `iife` all in one. However a
lot of Chatwoot users ran into issues where UMD sdk.js won't work.
Especially so when used with Google Tag Manager.

As a hotfix we moved the format from `umd` to `cjs`. Here's the thing
CJS is supposed to be used for Node packages. But for some-reason it was
working on browsers too, its no surprising, since the output is a valid
JS and the code we wrote was written for the browser.

There's a catch though, when minifying, esbuild would use tokens like
`$` and `_`, since `CJS` build is not scoped, unlike a `UMD` file, or
(spoiler alert) `IIFE`. Any declarations would be global, and websites
using `jQuery` (uff, culture) and `underscore-js` would break. We pushed
another hotfix disabling the name replacement in `esbuild` unless we
test out `IIFE` builds (which is this PR)

This PR fixes this by using `IIFE` instead, it is always scoped in a
function, so it never binds things globally, unless specifically written
to do so (example. `window.$chatwoot`).

I've tested this SDK on Safari, Chrome and iOS Safari on paperlayer test
site, it seems to be working fine. The sdk build is also scoped
correctly.

---------

Co-authored-by: Pranav <pranav@chatwoot.com>
2024-10-10 12:14:36 -07:00
Pranav
8505aa48c3 fix: Use native a tag for https URL in the sidebar (#10254)
This PR updates the sidebar component to use a native <a> tag for the Help Center URL component. It also updates the build pipeline to use the esbuild options minifyIdentifiers and keepNames set to true.
2024-10-09 21:04:04 -07:00
Shivam Mishra
aa5fa0c758 feat: Use CJS build for SDK (#10247)
The UMD build was causing issues for a few customers, this PR reverts to
using CJS like used in Webpack 4 before the vite migration

---------

Co-authored-by: Pranav <pranavrajs@gmail.com>
Co-authored-by: Pranav <pranav@chatwoot.com>
2024-10-09 14:09:52 +05:30
Vishnu Narayanan
ee02923ace chore: fix circleci on vite build (#10214)
- Switch to pnpm based build
- Switch circleci from docker to machine to have more memory
- Fix frontend and backend tests

Fixes
https://linear.app/chatwoot/issue/CW-3610/fix-circle-ci-for-vite-build
---------

Co-authored-by: Shivam Mishra <scm.mymail@gmail.com>
Co-authored-by: Pranav <pranavrajs@gmail.com>
Co-authored-by: Pranav <pranav@chatwoot.com>
2024-10-07 15:27:41 +05:30
Shivam Mishra
701135df92 fix: vitest resolution in vite.config [CW-3587] (#10204)
The ruby plugin conflicted with vitest resolution, this PR fixes it. We
will need to separate out the vite config for this
2024-10-03 19:54:57 +05:30
Shivam Mishra
42f6621afb feat: Vite + vue 3 💚 (#10047)
Fixes https://github.com/chatwoot/chatwoot/issues/8436
Fixes https://github.com/chatwoot/chatwoot/issues/9767
Fixes https://github.com/chatwoot/chatwoot/issues/10156
Fixes https://github.com/chatwoot/chatwoot/issues/6031
Fixes https://github.com/chatwoot/chatwoot/issues/5696
Fixes https://github.com/chatwoot/chatwoot/issues/9250
Fixes https://github.com/chatwoot/chatwoot/issues/9762

---------

Co-authored-by: Pranav <pranavrajs@gmail.com>
Co-authored-by: Sivin Varghese <64252451+iamsivin@users.noreply.github.com>
2024-10-02 00:36:30 -07:00
Pranav
9de8c27368 feat: Use vitest instead of jest, run all the specs anywhere in app/ folder in the CI (#9722)
Due to the pattern `**/specs/*.spec.js` defined in CircleCI, none of the
frontend spec in the folders such as
`specs/<domain-name>/getters.spec.js` were not executed in Circle CI.

This PR fixes the issue, along with the following changes: 
- Use vitest instead of jest
- Remove jest dependancies
- Update tests to work with vitest

---------

Co-authored-by: Muhsin Keloth <muhsinkeramam@gmail.com>
2024-07-10 08:32:16 -07:00