Commit Graph

24 Commits

Author SHA1 Message Date
Vishnu Narayanan
08b9134486 feat: speed up circleci and github actions (#12849)
# 🚀 Speed up CI/CD test execution with parallelization

## TL;DR

- **Problem**: CI tests took 36-42 minutes per commit, blocking
developer workflow
- **Solution**: Implemented 16-way parallelization + optimized slow
tests + fixed Docker builds
- **Impact**: **1,358 hours/month saved** (7.7 FTE) across GitHub
Actions + CircleCI
  - GitHub Actions tests: 36m → 7m (82% faster)
  - Backend tests: 28m → 4m (87% faster with 16-way parallelization)
  - CircleCI tests: 42m → 7m (83% faster)
  - Docker builds: 34m → 5m (85% faster)
- **Result**: 5-6x faster feedback loops, 100% success rate on recent
runs

---

## Problem
CI test runs were taking **36-42 minutes per commit push** (GitHub
Actions: 36m avg, CircleCI: 42m P95), creating a significant bottleneck
in the development workflow.

## Solution
This PR comprehensively restructures both CI test pipelines to leverage
16-way parallelization and optimize test execution, reducing test
runtime from **36-42 minutes to ~7 minutes** - an **82% improvement**.

---

## 📊 Real Performance Data (Both CI Systems)

### GitHub Actions

#### Before (develop branch - 5 recent runs)
```
Individual runs: 35m 29s | 36m 1s | 40m 0s | 36m 4s | 34m 18s
Average: 36m 22s
```

#### After (feat/speed_up_ci branch - 9 successful runs)
```
Individual runs: 6m 39s | 7m 2s | 6m 53s | 6m 26s | 6m 52s | 6m 42s | 6m 45s | 6m 40s | 6m 37s
Average: 6m 44s
Range: 6m 26s - 7m 2s
```

**Improvement**:  **81.5% faster** (29m 38s saved per run)


#### Backend Tests Specific Impact
With 16-way parallelization, backend tests show dramatic improvement:
- **Before**: 27m 52s (sequential execution)
- **After**: 3m 44s (longest of 16 parallel runners)
  - Average across runners: 2m 30s
  - Range: 1m 52s - 3m 44s
- **Improvement**:  **86.6% faster** (24m 8s saved)
---

### CircleCI

#### Before (develop branch - CircleCI Insights)
```
Duration (P95): 41m 44s
Runs: 70 (last 30 days)
Success Rate: 84%
```

#### After (feat/speed_up_ci branch - Last 2 pipeline runs)
```
Run 1 (1h ago): 7m 7s
  ├─ lint: 4m 12s
  ├─ frontend-tests: 5m 36s
  ├─ backend-tests: 6m 23s
  ├─ coverage: 20s
  └─ build: 1s

Run 2 (2h ago): 7m 21s
  ├─ lint: 3m 47s
  ├─ frontend-tests: 5m 4s
  ├─ backend-tests: 6m 33s
  ├─ coverage: 19s
  └─ build: 1s

Average: 7m 14s
Success Rate: 100% 
```

**Improvement**:  **82.7% faster** (34m 30s saved per run)

---

## 🐳 Related Work: Docker Build Optimization

As part of the broader CI/CD optimization effort, Docker build
performance was improved separately in **PR #12859**.

### Docker Build Fix (Merged Separately)
**Problem**: Multi-architecture Docker builds (amd64/arm64) were taking
~34 minutes due to cache thrashing

**Solution**: Added separate cache scopes per platform in
`.github/workflows/test_docker_build.yml`:
```yaml
cache-from: type=gha,scope=${{ matrix.platform }}
cache-to: type=gha,mode=max,scope=${{ matrix.platform }}
```

**Results** (measured from November 2025 data):
- **Before**: 34.2 minutes/run average (15,547 minutes across 454 runs)
- **After**: 5 minutes/run
- **Improvement**: 85% faster, 29.2 minutes saved per run
- **Frequency**: 25.2 runs/day
- **Monthly savings**: **369 hours** (46 developer-days)

This prevents different architectures from invalidating each other's
caches and contributes 27% of total CI/CD time savings.

---

## 🎯 Key Findings

### Both CI Systems Now Perform Similarly
- **CircleCI**: 7m 14s average
- **GitHub Actions**: 6m 44s average
- **Difference**: Only 30 seconds apart (remarkably consistent!)

### Combined Performance
- **Average improvement across both systems**: **82.1% faster**
- **Time saved per commit**: ~32 minutes
- **Developer feedback loop**: 36-42 minutes → ~7 minutes

### Success Rate Improvement
- **CircleCI**: 84% → 100% (on feat/speed_up_ci branch)
- **GitHub Actions**: 100% (all 9 recent runs successful)
- Fixed all test isolation issues that caused intermittent failures

### Impact at Scale (Based on Real November 2025 Data)
- **CI runs per day**: **30.8 average** for tests, **25.2** for Docker
builds
  - Measured from GitHub Actions Usage Metrics (18 days)
  - Weekdays: 38-54 runs/day
  - Peak: up to 68 runs in a single day
- **This PR (test suite only)**:
  - **Daily time saved**: **15.3 hours** (GitHub Actions + CircleCI)
- **Monthly time saved**: **458 hours** (57 developer-days) on GitHub
Actions
  - Additional **531 hours** (66 developer-days) on CircleCI
- **Combined with Docker optimization** (PR #12859): **1,358
hours/month** (see Summary)
- **Developer experience**: 5-6x faster iteration cycles

---

## Code Changes

### 1. **Backend Test Parallelization (16x)**
Both CI systems now use 16-way parallelization with identical
round-robin test distribution:

```bash
# Distribute tests evenly across 16 runners
SPEC_FILES=($(find spec -name '*_spec.rb' | sort))
for i in "${!SPEC_FILES[@]}"; do
  if [ $(( i % 16 )) -eq $RUNNER_INDEX ]; then
    TESTS="$TESTS ${SPEC_FILES[$i]}"
  fi
done
```

**Why round-robin over timing-based?**
- CircleCI's timing-based splitting grouped similar tests together
- This caused race conditions with OAuth callback tests (Linear,
Shopify, Notion)
- Round-robin ensures even distribution and better test isolation
- Both CI systems now behave identically

### 2. **Frontend Test Optimization**
Enabled Vitest thread parallelization in `vite.config.ts`:

```typescript
pool: 'threads',
poolOptions: {
  threads: {
    singleThread: false,
  },
},
```

### 3. **CI Architecture Restructuring**
Split monolithic CI jobs into parallel stages:
- **Lint** (backend + frontend) - runs independently for fast feedback
- **Frontend tests** - runs in parallel with backend
- **Backend tests** - 16-way parallelized across runners
- **Coverage** - aggregates results from test jobs
- **Build** (CircleCI only) - final job for GitHub status check
compatibility

### 4. **Critical Test Optimization**
**report_builder_spec.rb**: Changed `before` to `before_all`
- Reduced execution time from **19 minutes to 1.2 minutes** (16x
speedup)
- Setup now runs once instead of 21 times
- Single biggest performance improvement after parallelization

---

## Test Stability Fixes (10 spec files)

Parallelization exposed latent test isolation issues that were fixed:

### Object Identity Comparisons (6 files)
Tests were comparing Ruby object instances instead of IDs:
- `spec/models/integrations/hook_spec.rb` - Use `.pluck(:id)` for
comparisons
- `spec/enterprise/models/captain/scenario_spec.rb` - Compare IDs
instead of objects
- `spec/models/notification_spec.rb` - Compare IDs for sort order
validation
- `spec/models/account_spec.rb` - Compare IDs in scope queries
- `spec/services/widget/token_service_spec.rb` - Compare class names
instead of class objects
- `spec/models/concerns/avatarable_shared.rb` - Use `respond_to` checks
for ActiveStorage

### Database Query Caching
- `spec/jobs/delete_object_job_spec.rb` - Added `.reload` to force fresh
database queries

### Test Expectations Timing
- `spec/jobs/mutex_application_job_spec.rb` - Removed flaky unlock
expectation after error block
  - Related to original PR #8770
- Expectation after error block never executes in parallel environments

### Timezone Handling
- `spec/mailers/account_notification_mailer_spec.rb` - Fixed date
parsing at timezone boundaries
  - Changed test time from 23:59:59Z to 12:00:00Z

### Test Setup
- `spec/builders/v2/report_builder_spec.rb` - Optimized with
`before_all`

---

##  CircleCI GitHub Integration Fix

### Problem
GitHub PR checks were stuck on "Waiting for status" even when all
CircleCI jobs passed. GitHub was expecting a job named `build` but the
workflow only had a workflow named "build".

### Solution
Added an explicit `build` job that runs after all other jobs:

```yaml
build:
  steps:
    - run:
        name: Legacy build aggregator
        command: echo "All main jobs passed"
  requires:
    - lint
    - coverage
```

This ensures GitHub's required status checks work correctly.


---

##  Testing & Validation

-  **GitHub Actions**: 9 successful runs, consistent 6m 26s - 7m 2s
runtime
-  **CircleCI**: 2 successful runs, consistent 7m 7s - 7m 21s runtime
-  Both CI systems produce identical, consistent results
-  GitHub PR status checks complete correctly
-  Success rate improved from 84% to 100% (recent runs)
-  No test regressions introduced
-  All flaky tests fixed (callback controllers, mutex jobs, etc.)

---

## 🎉 Summary

This PR delivers an **82% improvement** in test execution time across
both CI systems:

- **GitHub Actions tests**: 36m → 7m (81.5% faster)
- Backend tests specifically: 28m → 4m (86.6% faster)
- **CircleCI tests**: 42m → 7m (82.7% faster)
- **Developer feedback loop**: 5-6x faster
- **Test stability**: 84% → 100% success rate

### 📊 Total CI/CD Impact (All Optimizations)

Based on real November 2025 data, combining this PR with Docker build
optimization (PR #12859):

**Monthly Time Savings**: **1,358 hours/month** = **170
developer-days/month** = **7.7 FTE**

| System | Runs/Day | Before | After | Savings | Monthly Impact |
|--------|----------|---------|--------|---------|----------------|
| **GitHub Actions Tests** | 30.8 | 36.5m | 6.7m | 29.8m/run | 458 hrs
(34%) |
| **GitHub Actions Docker** | 25.2 | 34.2m | 5.0m | 29.2m/run | 369 hrs
(27%) |
| **CircleCI Tests** | 30.8 | 41.7m | 7.2m | 34.5m/run | 531 hrs (39%) |

*Data source: GitHub Actions Usage Metrics (November 2025, 18 days),
CircleCI Insights (30 days)*

The combined optimizations save the equivalent of **nearly 8 full-time
developers** worth of CI waiting time every month, significantly
improving developer velocity and reducing CI costs. All test isolation
issues exposed by parallelization have been fixed, ensuring reliable and
consistent results across both CI platforms.

woot woot !!!

---------
2025-11-19 15:32:48 +05:30
Sojan Jose
de5fb7a405 chore: remove unused telegram bot model (#12417)
## Summary
- remove unused TelegramBot model and its association
- drop obsolete telegram_bots table
2025-09-11 22:25:26 +05:30
Shivam Mishra
b533980880 feat: Add support for minutes in auto resolve feature (#11269)
### Summary

- Converts conversation auto-resolution duration from days to minutes
for more
granular control
- Updates validation to allow values from 10 minutes (minimum) to 999
days (maximum)
- Implements smart messaging to show appropriate time units in activity
messages

###  Changes

- Created migration to convert existing durations from days to minutes
(x1440)
- Updated conversation resolver to use minutes instead of days
- Added dynamic translation key selection based on duration value
- Updated related specs and documentation
- Added support for displaying durations in days, hours, or minutes
based on value

###  Test plan

- Verify account validation accepts new minute-based ranges
- Confirm existing account settings are correctly migrated
- Test auto-resolution works properly with minute values
- Ensure proper time unit display in activity messages

---------

Co-authored-by: Sivin Varghese <64252451+iamsivin@users.noreply.github.com>
2025-05-07 00:36:15 -07:00
Shivam Mishra
3a4249da11 feat: Add support for multi-language support for Captain (#11068)
This PR implements the following features

- FAQs from conversations will be generated in account language
- Contact notes will be generated in account language
- Copilot chat will respond in user language, unless the agent asks the
question in a different language

## Changes
### Copilot Chat

- Update the prompt to include an instruction for the language, the bot
will reply in asked language, but will default to account language
- Update the `ChatService` class to include pass the language to
`SystemPromptsService`

### FAQ and Contact note generation

- Update contact note generator and conversation generator to include
account locale
- Pass the account locale to `SystemPromptsService`


<details><summary>Screenshots</summary>

#### FAQs being generated in system langauge

![CleanShot 2025-03-12 at 13 32
30@2x](https://github.com/user-attachments/assets/84685bd8-3785-4432-aff3-419f60d96dd3)


#### Copilot responding in system language

![CleanShot 2025-03-12 at 13 47
03@2x](https://github.com/user-attachments/assets/38383293-4228-47bd-b74a-773e9a194f90)


</details>

---------

Co-authored-by: Muhsin Keloth <muhsinkeramam@gmail.com>
Co-authored-by: Pranav <pranav@chatwoot.com>
2025-03-19 18:25:33 -07:00
Shivam Mishra
3b366f43e6 feat: setup captain limits (#10713)
This pull request introduces several changes to implement and manage
usage limits for the Captain AI service. The key changes include adding
configuration for plan limits, updating error messages, modifying
controllers and models to handle usage limits, and updating tests to
ensure the new functionality works correctly.

## Implementation Checklist

- [x] Ability to configure captain limits per check
- [x] Update response for `usage_limits` to include captain limits
- [x] Methods to increment or reset captain responses limits in the
`limits` column for the `Account` model
- [x] Check documents limit using a count query
- [x] Ensure Captain hand-off if a limit is reached
- [x] Ensure limits are enforced for Copilot Chat
- [x] Ensure limits are reset when stripe webhook comes in 
- [x] Increment usage for FAQ generation and Contact notes
- [x] Ensure documents limit is enforced

These changes ensure that the Captain AI service operates within the defined usage limits for different subscription plans, providing appropriate error messages and handling when limits are exceeded.
2025-01-23 01:23:18 +05:30
Sojan Jose
978a8a4cb2 fix: support_email and inbound_email_domain returning empty string (#8963)
chore: Fix for inbound email domain being nil
2024-02-19 15:43:35 +05:30
Shivam Mishra
07ea9694a3 feat: new accounts controller for signup+onboarding (#8804)
* feat: add v2 accounts controller

* feat: allow empty account and user name

* feat: ensure  and  is present for v1 signup

* test: remove validation checks

* chore: apply suggestions

* chore: revert en.yml formatting

* chore: line at EOF

* fix: routes

---------

Co-authored-by: Muhsin Keloth <muhsinkeramam@gmail.com>
2024-02-02 16:10:45 +05:30
Sojan Jose
385eab6b96 chore: Add max length validation to text fields (#7073)
Introduces a default max length validation for all string and text fields to prevent processing large payloads.
2023-05-12 22:12:21 +05:30
Tejaswini Chile
f6891aebbb fix: Set account default language in account API (#5086) 2022-07-26 22:42:20 +05:30
Tejaswini Chile
938fb887c4 feat: Portal endpoint (#4633) 2022-05-16 13:59:59 +05:30
Vishnu Narayanan
7577c9c888 fix: drop conv and campaign seq on account delete (#4256)
Conversation and campaign sequences persist in the database even after the related account is deleted. This PR adds an after_desttory callback on the account model that will delete the associated sequences.

Fixes: #4252
2022-03-24 13:33:15 +05:30
Sojan Jose
4260441f8c Chore: clean up Reporting Events (#4044)
Tech debt clean up

Fixes #4057

Co-authored-by: Aswin Dev P S <aswin@chatwoot.com>
2022-02-28 18:16:12 +05:30
Shivam Chahar
9617137688 chore: Account auto resolution time validation (#3651) 2022-01-19 13:16:21 +05:30
Sojan Jose
b1eea7f7d1 chore: Introduce enterprise edition license (#3209)
- Initialize an "enterprise" folder that is copyrighted.
- You can remove this folder and the system will continue functioning normally, in case you want a purely MIT licensed product.
- Enable limit on the number of user accounts in enterprise code.
- Use enterprise edition injector methods (inspired from Gitlab).
- SaaS software would run enterprise edition software always.

Co-authored-by: Pranav Raj S <pranav@chatwoot.com>
2021-12-09 12:07:48 +05:30
Akhil G Krishnan
b81a9f2010 Chore: Replaced dependent destroy with dependent destroy_async in all models (#3249) 2021-11-18 10:32:29 +05:30
Sojan Jose
a0c33254e7 feat: Team APIs (#1654)
Co-authored-by: Pranav Raj S <pranav@chatwoot.com>
2021-01-17 23:56:56 +05:30
Akash Srivastava
074084b258 feat: Auto resolve conversations after n days of inactivity (#1308)
fixes: #418
2020-11-01 12:53:25 +05:30
Subin T P
701eccb35c Feature: Knowledge Base APIs (#1002)
- Introduce models & migrations for portals, categories, folders and articles
- CRUD API for portals
- CRUD API for categories

Addresses: #714

Co-authored-by: Sojan <sojan@pepalo.com>
2020-09-26 02:32:34 +05:30
Sojan Jose
52d28105e4 Chore: Remove dead code related to billing (#935)
- remove subscription model
- remove billing-related code
2020-06-07 20:31:48 +05:30
Subin T P
8f6f07177d Enhancement: Move reporting metrics to postgres (#606) 2020-03-18 16:53:35 +05:30
Sojan Jose
8b4df986bf Chore: Enable Users to create multiple accounts (#440)
Addresses: #402
- migrations to split roles and other attributes from users table
- make changes in code to accommodate this change

Co-authored-by: Sojan Jose <sojan@pepalo.com>
Co-authored-by: Pranav Raj Sreepuram <pranavrajs@gmail.com>
2020-03-07 12:18:16 +05:30
Sony Mathew
7f26b34b15 Feature: Add new notification settings for user (#569)
Added new notification settings API for user 

Co-authored-by: Sojan Jose <sojan@pepalo.com>
2020-02-29 20:41:09 +05:30
Subin T P
919261d843 Feature: Webhooks (#489) 2020-02-14 23:19:17 +05:30
Karthik Sivadas
c758c13ffb Add Account Model specs (#341) 2019-12-03 10:09:45 +05:30