Commit Graph

26 Commits

Author SHA1 Message Date
Vishnu Narayanan
08b9134486 feat: speed up circleci and github actions (#12849)
# 🚀 Speed up CI/CD test execution with parallelization

## TL;DR

- **Problem**: CI tests took 36-42 minutes per commit, blocking
developer workflow
- **Solution**: Implemented 16-way parallelization + optimized slow
tests + fixed Docker builds
- **Impact**: **1,358 hours/month saved** (7.7 FTE) across GitHub
Actions + CircleCI
  - GitHub Actions tests: 36m → 7m (82% faster)
  - Backend tests: 28m → 4m (87% faster with 16-way parallelization)
  - CircleCI tests: 42m → 7m (83% faster)
  - Docker builds: 34m → 5m (85% faster)
- **Result**: 5-6x faster feedback loops, 100% success rate on recent
runs

---

## Problem
CI test runs were taking **36-42 minutes per commit push** (GitHub
Actions: 36m avg, CircleCI: 42m P95), creating a significant bottleneck
in the development workflow.

## Solution
This PR comprehensively restructures both CI test pipelines to leverage
16-way parallelization and optimize test execution, reducing test
runtime from **36-42 minutes to ~7 minutes** - an **82% improvement**.

---

## 📊 Real Performance Data (Both CI Systems)

### GitHub Actions

#### Before (develop branch - 5 recent runs)
```
Individual runs: 35m 29s | 36m 1s | 40m 0s | 36m 4s | 34m 18s
Average: 36m 22s
```

#### After (feat/speed_up_ci branch - 9 successful runs)
```
Individual runs: 6m 39s | 7m 2s | 6m 53s | 6m 26s | 6m 52s | 6m 42s | 6m 45s | 6m 40s | 6m 37s
Average: 6m 44s
Range: 6m 26s - 7m 2s
```

**Improvement**:  **81.5% faster** (29m 38s saved per run)


#### Backend Tests Specific Impact
With 16-way parallelization, backend tests show dramatic improvement:
- **Before**: 27m 52s (sequential execution)
- **After**: 3m 44s (longest of 16 parallel runners)
  - Average across runners: 2m 30s
  - Range: 1m 52s - 3m 44s
- **Improvement**:  **86.6% faster** (24m 8s saved)
---

### CircleCI

#### Before (develop branch - CircleCI Insights)
```
Duration (P95): 41m 44s
Runs: 70 (last 30 days)
Success Rate: 84%
```

#### After (feat/speed_up_ci branch - Last 2 pipeline runs)
```
Run 1 (1h ago): 7m 7s
  ├─ lint: 4m 12s
  ├─ frontend-tests: 5m 36s
  ├─ backend-tests: 6m 23s
  ├─ coverage: 20s
  └─ build: 1s

Run 2 (2h ago): 7m 21s
  ├─ lint: 3m 47s
  ├─ frontend-tests: 5m 4s
  ├─ backend-tests: 6m 33s
  ├─ coverage: 19s
  └─ build: 1s

Average: 7m 14s
Success Rate: 100% 
```

**Improvement**:  **82.7% faster** (34m 30s saved per run)

---

## 🐳 Related Work: Docker Build Optimization

As part of the broader CI/CD optimization effort, Docker build
performance was improved separately in **PR #12859**.

### Docker Build Fix (Merged Separately)
**Problem**: Multi-architecture Docker builds (amd64/arm64) were taking
~34 minutes due to cache thrashing

**Solution**: Added separate cache scopes per platform in
`.github/workflows/test_docker_build.yml`:
```yaml
cache-from: type=gha,scope=${{ matrix.platform }}
cache-to: type=gha,mode=max,scope=${{ matrix.platform }}
```

**Results** (measured from November 2025 data):
- **Before**: 34.2 minutes/run average (15,547 minutes across 454 runs)
- **After**: 5 minutes/run
- **Improvement**: 85% faster, 29.2 minutes saved per run
- **Frequency**: 25.2 runs/day
- **Monthly savings**: **369 hours** (46 developer-days)

This prevents different architectures from invalidating each other's
caches and contributes 27% of total CI/CD time savings.

---

## 🎯 Key Findings

### Both CI Systems Now Perform Similarly
- **CircleCI**: 7m 14s average
- **GitHub Actions**: 6m 44s average
- **Difference**: Only 30 seconds apart (remarkably consistent!)

### Combined Performance
- **Average improvement across both systems**: **82.1% faster**
- **Time saved per commit**: ~32 minutes
- **Developer feedback loop**: 36-42 minutes → ~7 minutes

### Success Rate Improvement
- **CircleCI**: 84% → 100% (on feat/speed_up_ci branch)
- **GitHub Actions**: 100% (all 9 recent runs successful)
- Fixed all test isolation issues that caused intermittent failures

### Impact at Scale (Based on Real November 2025 Data)
- **CI runs per day**: **30.8 average** for tests, **25.2** for Docker
builds
  - Measured from GitHub Actions Usage Metrics (18 days)
  - Weekdays: 38-54 runs/day
  - Peak: up to 68 runs in a single day
- **This PR (test suite only)**:
  - **Daily time saved**: **15.3 hours** (GitHub Actions + CircleCI)
- **Monthly time saved**: **458 hours** (57 developer-days) on GitHub
Actions
  - Additional **531 hours** (66 developer-days) on CircleCI
- **Combined with Docker optimization** (PR #12859): **1,358
hours/month** (see Summary)
- **Developer experience**: 5-6x faster iteration cycles

---

## Code Changes

### 1. **Backend Test Parallelization (16x)**
Both CI systems now use 16-way parallelization with identical
round-robin test distribution:

```bash
# Distribute tests evenly across 16 runners
SPEC_FILES=($(find spec -name '*_spec.rb' | sort))
for i in "${!SPEC_FILES[@]}"; do
  if [ $(( i % 16 )) -eq $RUNNER_INDEX ]; then
    TESTS="$TESTS ${SPEC_FILES[$i]}"
  fi
done
```

**Why round-robin over timing-based?**
- CircleCI's timing-based splitting grouped similar tests together
- This caused race conditions with OAuth callback tests (Linear,
Shopify, Notion)
- Round-robin ensures even distribution and better test isolation
- Both CI systems now behave identically

### 2. **Frontend Test Optimization**
Enabled Vitest thread parallelization in `vite.config.ts`:

```typescript
pool: 'threads',
poolOptions: {
  threads: {
    singleThread: false,
  },
},
```

### 3. **CI Architecture Restructuring**
Split monolithic CI jobs into parallel stages:
- **Lint** (backend + frontend) - runs independently for fast feedback
- **Frontend tests** - runs in parallel with backend
- **Backend tests** - 16-way parallelized across runners
- **Coverage** - aggregates results from test jobs
- **Build** (CircleCI only) - final job for GitHub status check
compatibility

### 4. **Critical Test Optimization**
**report_builder_spec.rb**: Changed `before` to `before_all`
- Reduced execution time from **19 minutes to 1.2 minutes** (16x
speedup)
- Setup now runs once instead of 21 times
- Single biggest performance improvement after parallelization

---

## Test Stability Fixes (10 spec files)

Parallelization exposed latent test isolation issues that were fixed:

### Object Identity Comparisons (6 files)
Tests were comparing Ruby object instances instead of IDs:
- `spec/models/integrations/hook_spec.rb` - Use `.pluck(:id)` for
comparisons
- `spec/enterprise/models/captain/scenario_spec.rb` - Compare IDs
instead of objects
- `spec/models/notification_spec.rb` - Compare IDs for sort order
validation
- `spec/models/account_spec.rb` - Compare IDs in scope queries
- `spec/services/widget/token_service_spec.rb` - Compare class names
instead of class objects
- `spec/models/concerns/avatarable_shared.rb` - Use `respond_to` checks
for ActiveStorage

### Database Query Caching
- `spec/jobs/delete_object_job_spec.rb` - Added `.reload` to force fresh
database queries

### Test Expectations Timing
- `spec/jobs/mutex_application_job_spec.rb` - Removed flaky unlock
expectation after error block
  - Related to original PR #8770
- Expectation after error block never executes in parallel environments

### Timezone Handling
- `spec/mailers/account_notification_mailer_spec.rb` - Fixed date
parsing at timezone boundaries
  - Changed test time from 23:59:59Z to 12:00:00Z

### Test Setup
- `spec/builders/v2/report_builder_spec.rb` - Optimized with
`before_all`

---

##  CircleCI GitHub Integration Fix

### Problem
GitHub PR checks were stuck on "Waiting for status" even when all
CircleCI jobs passed. GitHub was expecting a job named `build` but the
workflow only had a workflow named "build".

### Solution
Added an explicit `build` job that runs after all other jobs:

```yaml
build:
  steps:
    - run:
        name: Legacy build aggregator
        command: echo "All main jobs passed"
  requires:
    - lint
    - coverage
```

This ensures GitHub's required status checks work correctly.


---

##  Testing & Validation

-  **GitHub Actions**: 9 successful runs, consistent 6m 26s - 7m 2s
runtime
-  **CircleCI**: 2 successful runs, consistent 7m 7s - 7m 21s runtime
-  Both CI systems produce identical, consistent results
-  GitHub PR status checks complete correctly
-  Success rate improved from 84% to 100% (recent runs)
-  No test regressions introduced
-  All flaky tests fixed (callback controllers, mutex jobs, etc.)

---

## 🎉 Summary

This PR delivers an **82% improvement** in test execution time across
both CI systems:

- **GitHub Actions tests**: 36m → 7m (81.5% faster)
- Backend tests specifically: 28m → 4m (86.6% faster)
- **CircleCI tests**: 42m → 7m (82.7% faster)
- **Developer feedback loop**: 5-6x faster
- **Test stability**: 84% → 100% success rate

### 📊 Total CI/CD Impact (All Optimizations)

Based on real November 2025 data, combining this PR with Docker build
optimization (PR #12859):

**Monthly Time Savings**: **1,358 hours/month** = **170
developer-days/month** = **7.7 FTE**

| System | Runs/Day | Before | After | Savings | Monthly Impact |
|--------|----------|---------|--------|---------|----------------|
| **GitHub Actions Tests** | 30.8 | 36.5m | 6.7m | 29.8m/run | 458 hrs
(34%) |
| **GitHub Actions Docker** | 25.2 | 34.2m | 5.0m | 29.2m/run | 369 hrs
(27%) |
| **CircleCI Tests** | 30.8 | 41.7m | 7.2m | 34.5m/run | 531 hrs (39%) |

*Data source: GitHub Actions Usage Metrics (November 2025, 18 days),
CircleCI Insights (30 days)*

The combined optimizations save the equivalent of **nearly 8 full-time
developers** worth of CI waiting time every month, significantly
improving developer velocity and reducing CI costs. All test isolation
issues exposed by parallelization have been fixed, ensuring reliable and
consistent results across both CI platforms.

woot woot !!!

---------
2025-11-19 15:32:48 +05:30
Shivam Mishra
76c110e60e fix: resolution count does not have account scope (#12370) 2025-09-04 18:04:00 +05:30
Shivam Mishra
ef4e287f0d fix: wrong resolution count in timeseries reports (#12261)
There was a fundamental difference in how resolution counts were
calculated between the agent summary and timeseries reports, causing
confusion for users when the numbers didn't match.

The agent summary report counted all `conversation_resolved` events
within a time period by querying the `reporting_events` table directly.
However, the timeseries report had an additional constraint that
required the conversation to currently be in resolved status
(`conversations.status = 1`). This meant that if an agent resolved a
conversation that was later reopened, the resolution action would be
counted in the summary but not in the timeseries.

This fix aligns both reports to count resolution events rather than
conversations in resolved state. When an agent resolves a conversation,
they should receive credit for that action regardless of what happens to
the conversation afterward. The same logic now applies to bot
resolutions as well.

The change removes the `conversations: { status: :resolved }` condition
from both `scope_for_resolutions_count` and
`scope_for_bot_resolutions_count` methods in CountReportBuilder, and
updates the corresponding test expectations to reflect that all
resolution events are counted.


## About timezone

When a timezone is specified via `timezone_offset` parameter, the
reporting system:

1. Converts timestamps to the target timezone before grouping
2. Groups data by local day/week/month boundaries in that timezone, but
the primary boundaries are sent by the frontend and used as-is
3. Returns timestamps representing midnight in the target timezone

This means the same events can appear in different day buckets depending
on the timezone used. For summary reports, it works fine, since the user
only needs the total count between two timestamps and the frontend sends
the timestamps adjusted for timezone.

## Testing Locally

Run the following command, this will erase all data for that account and
put in 1000 conversations over last 3 months, parameters of this can be
tweaked in `Seeders::Reports::ReportDataSeeder`

I'd suggest updating the values to generate data over 30 days, with
10000 conversations, it will take it's sweet time to run but then the
data will be really rich, great for testing.

```
ACCOUNT_ID=2 ENABLE_ACCOUNT_SEEDING=true bundle exec rake db:seed:reports_data
```

Pro Tip: Don't run the app when the seeder is active, we manually create
the reporting events anyway. So once done just use `redis-cli FLUSHALL`
to clear all sidekiq jobs. Will be easier on the system

Use the following scripts to test it

- https://gist.github.com/scmmishra/1263a922f5efd24df8e448a816a06257
- https://gist.github.com/scmmishra/ca0b861fa0139e2cccdb72526ea844b2
- https://gist.github.com/scmmishra/5fe73d1f48f35422fd1fd142ea3498f3
- https://gist.github.com/scmmishra/3b7b1f9e2ff149007170e5c329432f45
- https://gist.github.com/scmmishra/f245fa2f44cd973e5d60aac64f979162

---------

Co-authored-by: Sivin Varghese <64252451+iamsivin@users.noreply.github.com>
Co-authored-by: Pranav <pranav@chatwoot.com>
Co-authored-by: Muhsin Keloth <muhsinkeramam@gmail.com>
2025-09-03 15:47:16 +05:30
Shivam Mishra
ac3bce3932 fix: missing metrics and labels from label summary (#11718) 2025-06-12 17:58:56 +05:30
Shivam Mishra
35f06f30e7 feat: label reports overview (#11194)
Co-authored-by: Sivin Varghese <64252451+iamsivin@users.noreply.github.com>
Co-authored-by: Muhsin Keloth <muhsinkeramam@gmail.com>
2025-06-11 14:35:46 +05:30
Sojan Jose
bc42aec68e chore: upgrade ruby version to 3.4.4 (#11524)
- Chore upgrade ruby version to 3.4.4 before we migrate to rails 7.2
over #11037
2025-05-21 19:40:07 +05:30
Muhsin Keloth
d827e66453 feat: Instagram Inbox using Instagram Business Login (#11054)
This PR introduces basic minimum version of **Instagram Business
Login**, making Instagram inbox setup more straightforward by removing
the Facebook Page dependency. This update enhances user experience and
aligns with Meta’s recommended best practices.

Fixes
https://linear.app/chatwoot/issue/CW-3728/instagram-login-how-to-implement-the-changes


## Why Introduce Instagram as a Separate Inbox?


Currently, our Instagram integration requires linking an Instagram
account to a Facebook Page, making setup complex. To simplify this
process, Instagram now offers **Instagram Business Login**, which allows
users to authenticate directly with their Instagram credentials.

The **Instagram API with Instagram Login** enables businesses and
creators to send and receive messages without needing a Facebook Page
connection. While an Instagram Business or Creator account is still
required, this approach provides a more straightforward integration
process.

| **Existing Approach (Facebook Login for Business)** | **New Approach
(Instagram Business Login)** |
| --- | --- |
| Requires linking Instagram to a Facebook Page | No Facebook Page
required |
| Users log in via Facebook credentials | Users log in via Instagram
credentials |
| Configuration is more complex | Simpler setup |

Meta recommends using **Instagram Business Login** as the preferred
authentication method due to its easier configuration and improved
developer experience.

---

## Implementation Plan

The core messaging functionality is already in place, but the transition
to **Instagram Business Login** requires adjustments.

### Changes & Considerations

- **API Adjustments**: The Instagram API uses `graph.instagram`, whereas
Koala (our existing library) interacts with `graph.facebook`. We may
need to modify API calls accordingly.
- **Three Main Modules**:
  1. **Instagram Business Login** – Handle authentication flow.
2. **Permissions & Features** – Ensure necessary API scopes are granted.
  3. **Webhooks** – Enable real-time message retrieval.

![CleanShot 2025-03-10 at 21 32
28@2x](https://github.com/user-attachments/assets/1b019001-8d16-4e59-aca2-ced81e98f538)


---

## Instagram Login Flow

1. User clicks **"Create Inbox"** for Instagram.
2. App redirects to the [Instagram Authorization
URL](https://developers.facebook.com/docs/instagram-platform/instagram-api-with-instagram-login/business-login#embed-the-business-login-url).
3. After authentication, Instagram returns an authorization code.
5. The app exchanges the code for a **long-lived token** (valid for 60
days).
6. Tokens are refreshed periodically to maintain access.
7. Once completed, the app creates an inbox and redirects to the
Chatwoot dashboard.

---

## How to Test the Instagram Inbox

1. Create a new app on [Meta's Developer
Portal](https://developers.facebook.com/apps/).
2. Select **Business** as the app type and configure it.
3. Add the Instagram product and connect a business account.
4. Copy Instagram app ID and Instagram app secret
5. Add the Instagram app ID and Instagram app secret to your app config
via `{Chatwoot installation
url}/super_admin/app_config?config=instagram`
6. Configure Webhooks:
   - Callback URL: `{your_chatwoot_url}/webhooks/instagram`
   - Verify Token: `INSTAGRAM_VERIFY_TOKEN`
- Subscribe to `messages`, `messaging_seen`, and `message_reactions`
events.
7. Set up **Instagram Business Login**:
   - Redirect URL: `{your_chatwoot_url}/instagram/callback`
8. Test inbox creation via the Chatwoot dashboard.


## Troubleshooting & Common Errors

### Insufficient Developer Role Error

- Ensure the Instagram user is added as a developer:
- **Meta Dashboard → App Roles → Roles → Add People → Enter Instagram
ID**

### API Access Deactivated

- Ensure the **Privacy Policy URL** is valid and correctly set.

### Invalid request: Request parameters are invalid: Invalid
redirect_uri

- Please configure the Frontend URL. The Frontend URL does not match the
authorization URL.
---


## To-Do List

- [x] Basic integration setup completed.  
- [x] Enable sending messages via [Messaging
API](https://developers.facebook.com/docs/instagram-platform/instagram-api-with-instagram-login/messaging-api).
- [x] Implement automatic webhook subscriptions on inbox creation.  
- [x] Handle **canceled authorization errors**.  
- [x] Handle all the errors
https://developers.facebook.com/docs/instagram-platform/instagram-graph-api/reference/error-codes
- [x] Dynamically fetch **account IDs** instead of hardcoding them.  
- [x] Prevent duplicate Instagram channel creation for the same account.
- [x] Use **Global Config** instead of environment variables.  
- [x] Explore **Human Agent feature** for message handling.  
- [x] Write and refine **test cases** for all scenarios.  
- [x] Implement **token refresh mechanism** (tokens expire after 60
days).
Fixes https://github.com/chatwoot/chatwoot/issues/10440

---------

Co-authored-by: Sivin Varghese <64252451+iamsivin@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Co-authored-by: Shivam Mishra <scm.mymail@gmail.com>
2025-04-08 10:47:41 +05:30
Pranav
c52282307a feat(v4): Update team, agent summary builder to include resolution metrics (#10607)
Following https://github.com/chatwoot/chatwoot/pull/10604, this PR
introduces similar reporting features for Agents and Teams.

Updates in this PR:
- Added additional methods to the base class to avoid repetition.
- Improve reporting for Teams and Agents to include resolution count.
2024-12-20 19:16:56 +05:30
Pranav
4fd9bddb9d feat(v4): Add API to fetch aggregate reports for inboxes (#10604)
The Inbox Overview section is being updated to offer a more detailed
report, showing an overall view of the account grouped by inboxes. To
view detailed reports and access specific graphs for individual inboxes,
click on the inbox name to navigate to its dedicated report page.

---------

Co-authored-by: Sojan Jose <sojan@pepalo.com>
2024-12-19 14:47:19 -08:00
Pranav
87d92f73d4 feat: Improve Report API performance (#9476)
- Re-write the methods for clarity
- Remove the dependency on the ReportHelper class.
- Remove n+1 queries in the average metric time series data.
2024-05-22 17:34:24 -07:00
Sojan Jose
881d4bf644 feat: Add backend APIs for the bot metrics (#9031)
Co-authored-by: Pranav <pranav@chatwoot.com>
2024-03-01 08:20:20 -08:00
Vishnu Narayanan
415bb23c37 fix: Handle invalid metric in ReportsController (#8086)
Co-authored-by: Pranav Raj S <pranav@chatwoot.com>

Fixes CW-2643
2023-10-12 17:01:23 +05:30
Sojan Jose
6ab964b161 chore: fix flaky spec (#7686)
- fix flaky reporting specs
- fix flaky export spec
2023-08-25 00:11:41 -07:00
Pranav Raj S
d7566c453d chore: Take the count directly rather than grouping the conversations (#7535) 2023-07-19 12:12:30 -07:00
Shivam Mishra
2e79a32db7 fix: calculation for resolution count (#7293)
* fix: calculation for resolution count

* test: resolution count bug fix

- ensure enqueued jobs are run
- fix the dates check, conversations resolved today should show up in today

* feat: ensure conversations are resolved

* test: do not count extra events if the conversation is not resolved currently

* fix: typo
2023-06-14 13:50:10 +05:30
Sojan Jose
7ab7bac6bf chore: Enable the new Rubocop rules (#7122)
fixes: https://linear.app/chatwoot/issue/CW-1574/renable-the-disabled-rubocop-rules
2023-05-19 14:37:10 +05:30
Shivam Mishra
6b2736aa63 fix: inconsistency in report and summary for metric counts (#6817)
* feat: include timezone offset in summary calculation

* fix: exlcude end in date range

* test: explicit end of day

* fix: test for report builder

* fix: reports.spec.js

---------

Co-authored-by: Tejaswini Chile <tejaswini@chatwoot.com>
2023-04-20 12:55:04 +05:30
Jan Matuszewski
d46f96e45c Fix performance of report builder spec (#6024) 2023-01-17 09:27:50 +05:30
Sojan Jose
6a6a37a67b chore: Ability to Disable Gravatars (#5027)
fixes: #3853

- Introduced DISABLE_GRAVATAR Global Config, which will stop chatwoot from making API requests to gravatar
- Cleaned up avatar-related logic and centralized it into the avatarable concern
- Added specs for the missing cases
- Added migration for existing installations to move the avatar to attachment, rather than making the API that results in 404.
2022-07-21 19:27:12 +02:00
Sojan Jose
4260441f8c Chore: clean up Reporting Events (#4044)
Tech debt clean up

Fixes #4057

Co-authored-by: Aswin Dev P S <aswin@chatwoot.com>
2022-02-28 18:16:12 +05:30
Aswin Dev P.S
e6f8895c1b feat: Group by filter in reports (#3973) 2022-02-15 17:10:49 +05:30
Fayaz Ahmed
1c6a539c0a feat: Add Reports for teams (#3116)
Co-authored-by: Pranav Raj S <pranav@chatwoot.com>
2021-10-06 23:53:51 +05:30
Aswin Dev P.S
2ddd508aee Add missing conversation id for first response event (#2922)
- Add missing conversation id for first response event
- Fixing the flaky test

Fixes #2746

Co-authored-by: Tejaswini <tejaswini@chatwoot.com>
2021-09-02 00:16:09 +05:30
Tejaswini Chile
65f3e83afd feat: APIs to filter reports (#2889)
Fixes #2823
2021-08-27 22:46:32 +05:30
Subin T P
e56132c506 Non blocking event dispatch (#652)
- Performance improvements for event dispatch
2020-03-29 19:18:30 +05:30
Subin T P
8f6f07177d Enhancement: Move reporting metrics to postgres (#606) 2020-03-18 16:53:35 +05:30