Files
leadchat/.circleci/config.yml
Vinay Keerthi 59cbf57e20 feat: Advanced Search Backend (#12917)
## Description

Implements comprehensive search functionality with advanced filtering
capabilities for Chatwoot (Linear: CW-5956).

This PR adds:
1. **Time-based filtering** for contacts and conversations (SQL-based
search)
2. **Advanced message search** with multiple filters
(OpenSearch/Elasticsearch-based)
- **`from` filter**: Filter messages by sender (format: `contact:42` or
`agent:5`)
   - **`inbox_id` filter**: Filter messages by specific inbox
- **Time range filters**: Filter messages using `since` and `until`
parameters (Unix timestamps in seconds)
- **90-day limit enforcement**: Automatically limits searches to the
last 90 days to prevent performance issues

The implementation extends the existing `Enterprise::SearchService`
module for advanced features and adds time filtering to the base
`SearchService` for SQL-based searches.

## API Documentation

### Base URL
All search endpoints follow this pattern:
```
GET /api/v1/accounts/{account_id}/search/{resource}
```

### Authentication
All requests require authentication headers:
```
api_access_token: YOUR_ACCESS_TOKEN
```

---

## 1. Search All Resources

**Endpoint:** `GET /api/v1/accounts/{account_id}/search`

Returns results from all searchable resources (contacts, conversations,
messages, articles).

### Parameters
| Parameter | Type | Description | Required |
|-----------|------|-------------|----------|
| `q` | string | Search query | Yes |
| `page` | integer | Page number (15 items per page) | No |
| `since` | integer | Unix timestamp (contacts/conversations only) | No
|
| `until` | integer | Unix timestamp (contacts/conversations only) | No
|

### Example Request
```bash
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search?q=customer" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

### Example Response
```json
{
  "payload": {
    "contacts": [...],
    "conversations": [...],
    "messages": [...],
    "articles": [...]
  }
}
```

---

## 2. Search Contacts

**Endpoint:** `GET /api/v1/accounts/{account_id}/search/contacts`

Search contacts by name, email, phone number, or identifier with
optional time filtering.

### Parameters
| Parameter | Type | Description | Required |
|-----------|------|-------------|----------|
| `q` | string | Search query | Yes |
| `page` | integer | Page number (15 items per page) | No |
| `since` | integer | Unix timestamp - filter by last_activity_at | No |
| `until` | integer | Unix timestamp - filter by last_activity_at | No |

### Example Requests

**Basic search:**
```bash
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/contacts?q=john" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Search contacts active in the last 7 days:**
```bash
SINCE=$(date -v-7d +%s)
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/contacts?q=john&since=${SINCE}" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Search contacts active between 30 and 7 days ago:**
```bash
SINCE=$(date -v-30d +%s)
UNTIL=$(date -v-7d +%s)
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/contacts?q=john&since=${SINCE}&until=${UNTIL}" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

### Example Response
```json
{
  "payload": {
    "contacts": [
      {
        "id": 42,
        "email": "john@example.com",
        "name": "John Doe",
        "phone_number": "+1234567890",
        "identifier": "user_123",
        "additional_attributes": {},
        "created_at": 1701234567
      }
    ]
  }
}
```

---

## 3. Search Conversations

**Endpoint:** `GET /api/v1/accounts/{account_id}/search/conversations`

Search conversations by display ID, contact name, email, phone number,
or identifier with optional time filtering.

### Parameters
| Parameter | Type | Description | Required |
|-----------|------|-------------|----------|
| `q` | string | Search query | Yes |
| `page` | integer | Page number (15 items per page) | No |
| `since` | integer | Unix timestamp - filter by last_activity_at | No |
| `until` | integer | Unix timestamp - filter by last_activity_at | No |

### Example Requests

**Basic search:**
```bash
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/conversations?q=billing" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Search conversations active in the last 24 hours:**
```bash
SINCE=$(date -v-1d +%s)
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/conversations?q=billing&since=${SINCE}" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Search conversations from last month:**
```bash
SINCE=$(date -v-30d +%s)
UNTIL=$(date +%s)
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/conversations?q=billing&since=${SINCE}&until=${UNTIL}" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

### Example Response
```json
{
  "payload": {
    "conversations": [
      {
        "id": 123,
        "display_id": 45,
        "inbox_id": 1,
        "status": "open",
        "messages": [...],
        "meta": {...}
      }
    ]
  }
}
```

---

## 4. Search Messages (Advanced)

**Endpoint:** `GET /api/v1/accounts/{account_id}/search/messages`

Advanced message search with multiple filters powered by
OpenSearch/Elasticsearch.

### Prerequisites
- OpenSearch/Elasticsearch must be running (`OPENSEARCH_URL` env var
configured)
- Account must have `advanced_search` feature flag enabled
- Messages must be indexed in OpenSearch

### Parameters
| Parameter | Type | Description | Required |
|-----------|------|-------------|----------|
| `q` | string | Search query | Yes |
| `page` | integer | Page number (15 items per page) | No |
| `from` | string | Filter by sender: `contact:{id}` or `agent:{id}` |
No |
| `inbox_id` | integer | Filter by specific inbox ID | No |
| `since` | integer | Unix timestamp - searches from this time (max 90
days ago) | No |
| `until` | integer | Unix timestamp - searches until this time | No |

### Important Notes
- **90-Day Limit**: If `since` is not provided, searches default to the
last 90 days
- If `since` exceeds 90 days, returns `422` error: "Search is limited to
the last 90 days"
- All time filters use message `created_at` timestamp

### Example Requests

**Basic message search:**
```bash
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/messages?q=refund" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Search messages from a specific contact:**
```bash
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/messages?q=refund&from=contact:42" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Search messages from a specific agent:**
```bash
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/messages?q=refund&from=agent:5" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Search messages in a specific inbox:**
```bash
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/messages?q=refund&inbox_id=3" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Search messages from the last 7 days:**
```bash
SINCE=$(date -v-7d +%s)
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/messages?q=refund&since=${SINCE}" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Search messages between specific dates:**
```bash
SINCE=$(date -v-30d +%s)
UNTIL=$(date -v-7d +%s)
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/messages?q=refund&since=${SINCE}&until=${UNTIL}" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Combine all filters:**
```bash
SINCE=$(date -v-14d +%s)
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/messages?q=refund&from=contact:42&inbox_id=3&since=${SINCE}" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Attempt to search beyond 90 days (returns error):**
```bash
SINCE=$(date -v-120d +%s)
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/messages?q=refund&since=${SINCE}" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

### Example Response (Success)
```json
{
  "payload": {
    "messages": [
      {
        "id": 789,
        "content": "I need a refund for my purchase",
        "message_type": "incoming",
        "created_at": 1701234567,
        "conversation_id": 123,
        "inbox_id": 3,
        "sender": {
          "id": 42,
          "type": "contact"
        }
      }
    ]
  }
}
```

### Example Response (90-day limit exceeded)
```json
{
  "error": "Search is limited to the last 90 days"
}
```
**Status Code:** `422 Unprocessable Entity`

---

## 5. Search Articles

**Endpoint:** `GET /api/v1/accounts/{account_id}/search/articles`

Search help center articles by title or content.

### Parameters
| Parameter | Type | Description | Required |
|-----------|------|-------------|----------|
| `q` | string | Search query | Yes |
| `page` | integer | Page number (15 items per page) | No |

### Example Request
```bash
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/articles?q=installation" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

### Example Response
```json
{
  "payload": {
    "articles": [
      {
        "id": 456,
        "title": "Installation Guide",
        "slug": "installation-guide",
        "portal_slug": "help",
        "account_id": 1,
        "category_name": "Getting Started",
        "status": "published",
        "updated_at": 1701234567
      }
    ]
  }
}
```

---

## Technical Implementation

### SQL-Based Search (Contacts, Conversations, Articles)
- Uses PostgreSQL `ILIKE` queries by default
- Optional GIN index support via `search_with_gin` feature flag for
better performance
- Time filtering uses `last_activity_at` for contacts/conversations
- Returns paginated results (15 per page)

### Advanced Search (Messages)
- Powered by OpenSearch/Elasticsearch via Searchkick gem
- Requires `OPENSEARCH_URL` environment variable
- Requires `advanced_search` account feature flag
- Enforces 90-day lookback limit via
`Limits::MESSAGE_SEARCH_TIME_RANGE_LIMIT_DAYS`
- Validates inbox access permissions before filtering
- Returns paginated results (15 per page)

---

## Type of change

- [x] New feature (non-breaking change which adds functionality)
- [x] Enhancement (improves existing functionality)

---

## How Has This Been Tested?

### Unit Tests
- **Contact Search Tests**: 3 new test cases for time filtering
(`since`, `until`, combined)
- **Conversation Search Tests**: 3 new test cases for time filtering
- **Message Search Tests**: 10+ test cases covering:
  - Individual filters (`from`, `inbox_id`, time range)
  - Combined filters
  - Permission validation for inbox access
  - Feature flag checks
  - 90-day limit enforcement
  - Error handling for exceeded time limits

### Test Commands
```bash
# Run all search controller tests
bundle exec rspec spec/controllers/api/v1/accounts/search_controller_spec.rb

# Run search service tests (includes enterprise specs)
bundle exec rspec spec/services/search_service_spec.rb
```

### Manual Testing Setup
A rake task is provided to create 50,000 test messages across multiple
inboxes:

```bash
# 1. Create test data
bundle exec rake search:setup_test_data

# 2. Start OpenSearch
mise elasticsearch-start

# 3. Reindex messages
rails runner "Message.search_index.import Message.all"

# 4. Enable feature flag
rails runner "Account.first.enable_features('advanced_search')"

# 5. Test via API or Rails console
```

---

## Checklist

- [x] My code follows the style guidelines of this project
- [x] I have performed a self-review of my code
- [x] I have commented on my code, particularly in hard-to-understand
areas
- [x] I have made corresponding changes to the documentation (this PR
description)
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] New and existing unit tests pass locally with my changes
- [ ] Any dependent changes have been merged and published in downstream
modules

---

## Additional Notes

### Requirements
- **OpenSearch/Elasticsearch**: Required for advanced message search
  - Set `OPENSEARCH_URL` environment variable
  - Example: `export OPENSEARCH_URL=http://localhost:9200`
- **Feature Flags**:
  - `advanced_search`: Account-level flag for message advanced search
- `search_with_gin` (optional): Account-level flag for GIN-based SQL
search

### Performance Considerations
- 90-day limit prevents expensive long-range queries on large datasets
- GIN indexes recommended for high-volume search on SQL-based resources
- OpenSearch/Elasticsearch provides faster full-text search for messages

### Breaking Changes
- None. All new parameters are optional and backward compatible.

### Frontend Integration
- Frontend PR tracking advanced search UI will consume these endpoints
- Time range pickers should convert JavaScript `Date` to Unix timestamps
(seconds)
- Date conversion: `Math.floor(date.getTime() / 1000)`

### Error Handling
- Invalid `from` parameter format is silently ignored (filter not
applied)
- Time range exceeding 90 days returns `422` with error message
- Missing `q` parameter returns `422` (existing behavior)
- Unauthorized inbox access is filtered out (no error, just excluded
from results)

---------

Co-authored-by: iamsivin <iamsivin@gmail.com>
Co-authored-by: Sivin Varghese <64252451+iamsivin@users.noreply.github.com>
Co-authored-by: Pranav <pranav@chatwoot.com>
Co-authored-by: Muhsin Keloth <muhsinkeramam@gmail.com>
2026-01-07 15:30:49 +05:30

375 lines
12 KiB
YAML

version: 2.1
orbs:
node: circleci/node@6.1.0
qlty-orb: qltysh/qlty-orb@0.0
# Shared defaults for setup steps
defaults: &defaults
working_directory: ~/build
machine:
image: ubuntu-2204:2024.05.1
resource_class: large
environment:
RAILS_LOG_TO_STDOUT: false
COVERAGE: true
LOG_LEVEL: warn
jobs:
# Separate job for linting (no parallelism needed)
lint:
<<: *defaults
steps:
- checkout
# Install minimal system dependencies for linting
- run:
name: Install System Dependencies
command: |
sudo apt-get update
DEBIAN_FRONTEND=noninteractive sudo apt-get install -y \
libpq-dev \
build-essential \
git \
curl \
libssl-dev \
zlib1g-dev \
libreadline-dev \
libyaml-dev \
openjdk-11-jdk \
jq \
software-properties-common \
ca-certificates \
imagemagick \
libxml2-dev \
libxslt1-dev \
file \
g++ \
gcc \
autoconf \
gnupg2 \
patch \
ruby-dev \
liblzma-dev \
libgmp-dev \
libncurses5-dev \
libffi-dev \
libgdbm6 \
libgdbm-dev \
libvips
- run:
name: Install RVM and Ruby 3.4.4
command: |
sudo apt-get install -y gpg
gpg --keyserver hkp://keyserver.ubuntu.com --recv-keys 409B6B1796C275462A1703113804BB82D39DC0E3 7D2BAF1CF37B13E2069D6956105BD0E739499BDB
\curl -sSL https://get.rvm.io | bash -s stable
echo 'source ~/.rvm/scripts/rvm' >> $BASH_ENV
source ~/.rvm/scripts/rvm
rvm install "3.4.4"
rvm use 3.4.4 --default
gem install bundler -v 2.5.16
- run:
name: Install Application Dependencies
command: |
source ~/.rvm/scripts/rvm
bundle install
- node/install:
node-version: '23.7'
- node/install-pnpm
- node/install-packages:
pkg-manager: pnpm
override-ci-command: pnpm i
# Swagger verification
- run:
name: Verify swagger API specification
command: |
bundle exec rake swagger:build
if [[ `git status swagger/swagger.json --porcelain` ]]
then
echo "ERROR: The swagger.json file is not in sync with the yaml specification. Run 'rake swagger:build' and commit 'swagger/swagger.json'."
exit 1
fi
mkdir -p ~/tmp
curl -L https://repo1.maven.org/maven2/org/openapitools/openapi-generator-cli/6.3.0/openapi-generator-cli-6.3.0.jar > ~/tmp/openapi-generator-cli-6.3.0.jar
java -jar ~/tmp/openapi-generator-cli-6.3.0.jar validate -i swagger/swagger.json
# Bundle audit
- run:
name: Bundle audit
command: bundle exec bundle audit update && bundle exec bundle audit check -v
# Rubocop linting
- run:
name: Rubocop
command: bundle exec rubocop --parallel
# ESLint linting
- run:
name: eslint
command: pnpm run eslint
# Separate job for frontend tests
frontend-tests:
<<: *defaults
steps:
- checkout
- node/install:
node-version: '23.7'
- node/install-pnpm
- node/install-packages:
pkg-manager: pnpm
override-ci-command: pnpm i
- run:
name: Run frontend tests (with coverage)
command: pnpm run test:coverage
- run:
name: Move coverage files if they exist
command: |
if [ -d "coverage" ]; then
mkdir -p ~/build/coverage
cp -r coverage ~/build/coverage/frontend || true
fi
when: always
- persist_to_workspace:
root: ~/build
paths:
- coverage
# Backend tests with parallelization
backend-tests:
<<: *defaults
parallelism: 16
steps:
- checkout
- node/install:
node-version: '23.7'
- node/install-pnpm
- node/install-packages:
pkg-manager: pnpm
override-ci-command: pnpm i
- run:
name: Add PostgreSQL repository and update
command: |
sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt/ $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
sudo apt-get update -y
- run:
name: Install System Dependencies
command: |
sudo apt-get update
DEBIAN_FRONTEND=noninteractive sudo apt-get install -y \
libpq-dev \
redis-server \
postgresql-common \
postgresql-16 \
postgresql-16-pgvector \
build-essential \
git \
curl \
libssl-dev \
zlib1g-dev \
libreadline-dev \
libyaml-dev \
openjdk-11-jdk \
jq \
software-properties-common \
ca-certificates \
imagemagick \
libxml2-dev \
libxslt1-dev \
file \
g++ \
gcc \
autoconf \
gnupg2 \
patch \
ruby-dev \
liblzma-dev \
libgmp-dev \
libncurses5-dev \
libffi-dev \
libgdbm6 \
libgdbm-dev \
libvips
- run:
name: Install RVM and Ruby 3.4.4
command: |
sudo apt-get install -y gpg
gpg --keyserver hkp://keyserver.ubuntu.com --recv-keys 409B6B1796C275462A1703113804BB82D39DC0E3 7D2BAF1CF37B13E2069D6956105BD0E739499BDB
\curl -sSL https://get.rvm.io | bash -s stable
echo 'source ~/.rvm/scripts/rvm' >> $BASH_ENV
source ~/.rvm/scripts/rvm
rvm install "3.4.4"
rvm use 3.4.4 --default
gem install bundler -v 2.5.16
- run:
name: Install Application Dependencies
command: |
source ~/.rvm/scripts/rvm
bundle install
# Install and configure OpenSearch
- run:
name: Install OpenSearch
command: |
# Download and install OpenSearch 2.11.0 (compatible with Elasticsearch 7.x clients)
wget https://artifacts.opensearch.org/releases/bundle/opensearch/2.11.0/opensearch-2.11.0-linux-x64.tar.gz
tar -xzf opensearch-2.11.0-linux-x64.tar.gz
sudo mv opensearch-2.11.0 /opt/opensearch
- run:
name: Configure and Start OpenSearch
command: |
# Configure OpenSearch for single-node testing
cat > /opt/opensearch/config/opensearch.yml \<< EOF
cluster.name: chatwoot-test
node.name: node-1
network.host: 0.0.0.0
http.port: 9200
discovery.type: single-node
plugins.security.disabled: true
EOF
# Set ownership and permissions
sudo chown -R $USER:$USER /opt/opensearch
# Start OpenSearch in background
/opt/opensearch/bin/opensearch -d -p /tmp/opensearch.pid
- run:
name: Wait for OpenSearch to be ready
command: |
echo "Waiting for OpenSearch to start..."
for i in {1..30}; do
if curl -s http://localhost:9200/_cluster/health | grep -q '"status"'; then
echo "OpenSearch is ready!"
exit 0
fi
echo "Waiting... ($i/30)"
sleep 2
done
echo "OpenSearch failed to start"
exit 1
# Configure environment and database
- run:
name: Database Setup and Configure Environment Variables
command: |
pg_pass=$(head /dev/urandom | tr -dc A-Za-z0-9 | head -c 15 ; echo '')
sed -i "s/REPLACE_WITH_PASSWORD/${pg_pass}/g" ${PWD}/.circleci/setup_chatwoot.sql
chmod 644 ${PWD}/.circleci/setup_chatwoot.sql
mv ${PWD}/.circleci/setup_chatwoot.sql /tmp/
sudo -i -u postgres psql -f /tmp/setup_chatwoot.sql
cp .env.example .env
sed -i '/^FRONTEND_URL/d' .env
sed -i -e '/REDIS_URL/ s/=.*/=redis:\/\/localhost:6379/' .env
sed -i -e '/POSTGRES_HOST/ s/=.*/=localhost/' .env
sed -i -e '/POSTGRES_USERNAME/ s/=.*/=chatwoot/' .env
sed -i -e "/POSTGRES_PASSWORD/ s/=.*/=$pg_pass/" .env
echo -en "\nINSTALLATION_ENV=circleci" >> ".env"
echo -en "\nOPENSEARCH_URL=http://localhost:9200" >> ".env"
# Database setup
- run:
name: Run DB migrations
command: bundle exec rails db:chatwoot_prepare
# Run backend tests (parallelized)
- run:
name: Run backend tests
command: |
mkdir -p ~/tmp/test-results/rspec
mkdir -p ~/tmp/test-artifacts
mkdir -p ~/build/coverage/backend
# Use round-robin distribution (same as GitHub Actions) for better test isolation
# This prevents tests with similar timing from being grouped on the same runner
SPEC_FILES=($(find spec -name '*_spec.rb' | sort))
TESTS=""
for i in "${!SPEC_FILES[@]}"; do
if [ $(( i % $CIRCLE_NODE_TOTAL )) -eq $CIRCLE_NODE_INDEX ]; then
TESTS="$TESTS ${SPEC_FILES[$i]}"
fi
done
bundle exec rspec -I ./spec --require coverage_helper --require spec_helper --format progress \
--format RspecJunitFormatter \
--out ~/tmp/test-results/rspec.xml \
-- $TESTS
no_output_timeout: 30m
# Store test results for better splitting in future runs
- store_test_results:
path: ~/tmp/test-results
- run:
name: Move coverage files if they exist
command: |
if [ -d "coverage" ]; then
mkdir -p ~/build/coverage
cp -r coverage ~/build/coverage/backend || true
fi
when: always
- persist_to_workspace:
root: ~/build
paths:
- coverage
# Collect coverage from all jobs
coverage:
<<: *defaults
steps:
- checkout
- attach_workspace:
at: ~/build
# Qlty coverage publish
- qlty-orb/coverage_publish:
files: |
coverage/frontend/lcov.info
- run:
name: List coverage directory contents
command: |
ls -R ~/build/coverage || echo "No coverage directory"
- store_artifacts:
path: coverage
destination: coverage
build:
<<: *defaults
steps:
- run:
name: Legacy build aggregator
command: |
echo "All main jobs passed; build job kept only for GitHub required check compatibility."
workflows:
version: 2
build:
jobs:
- lint
- frontend-tests
- backend-tests
- coverage:
requires:
- frontend-tests
- backend-tests
- build:
requires:
- lint
- coverage