Files
leadchat/app/services/search_service.rb
Vinay Keerthi 59cbf57e20 feat: Advanced Search Backend (#12917)
## Description

Implements comprehensive search functionality with advanced filtering
capabilities for Chatwoot (Linear: CW-5956).

This PR adds:
1. **Time-based filtering** for contacts and conversations (SQL-based
search)
2. **Advanced message search** with multiple filters
(OpenSearch/Elasticsearch-based)
- **`from` filter**: Filter messages by sender (format: `contact:42` or
`agent:5`)
   - **`inbox_id` filter**: Filter messages by specific inbox
- **Time range filters**: Filter messages using `since` and `until`
parameters (Unix timestamps in seconds)
- **90-day limit enforcement**: Automatically limits searches to the
last 90 days to prevent performance issues

The implementation extends the existing `Enterprise::SearchService`
module for advanced features and adds time filtering to the base
`SearchService` for SQL-based searches.

## API Documentation

### Base URL
All search endpoints follow this pattern:
```
GET /api/v1/accounts/{account_id}/search/{resource}
```

### Authentication
All requests require authentication headers:
```
api_access_token: YOUR_ACCESS_TOKEN
```

---

## 1. Search All Resources

**Endpoint:** `GET /api/v1/accounts/{account_id}/search`

Returns results from all searchable resources (contacts, conversations,
messages, articles).

### Parameters
| Parameter | Type | Description | Required |
|-----------|------|-------------|----------|
| `q` | string | Search query | Yes |
| `page` | integer | Page number (15 items per page) | No |
| `since` | integer | Unix timestamp (contacts/conversations only) | No
|
| `until` | integer | Unix timestamp (contacts/conversations only) | No
|

### Example Request
```bash
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search?q=customer" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

### Example Response
```json
{
  "payload": {
    "contacts": [...],
    "conversations": [...],
    "messages": [...],
    "articles": [...]
  }
}
```

---

## 2. Search Contacts

**Endpoint:** `GET /api/v1/accounts/{account_id}/search/contacts`

Search contacts by name, email, phone number, or identifier with
optional time filtering.

### Parameters
| Parameter | Type | Description | Required |
|-----------|------|-------------|----------|
| `q` | string | Search query | Yes |
| `page` | integer | Page number (15 items per page) | No |
| `since` | integer | Unix timestamp - filter by last_activity_at | No |
| `until` | integer | Unix timestamp - filter by last_activity_at | No |

### Example Requests

**Basic search:**
```bash
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/contacts?q=john" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Search contacts active in the last 7 days:**
```bash
SINCE=$(date -v-7d +%s)
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/contacts?q=john&since=${SINCE}" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Search contacts active between 30 and 7 days ago:**
```bash
SINCE=$(date -v-30d +%s)
UNTIL=$(date -v-7d +%s)
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/contacts?q=john&since=${SINCE}&until=${UNTIL}" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

### Example Response
```json
{
  "payload": {
    "contacts": [
      {
        "id": 42,
        "email": "john@example.com",
        "name": "John Doe",
        "phone_number": "+1234567890",
        "identifier": "user_123",
        "additional_attributes": {},
        "created_at": 1701234567
      }
    ]
  }
}
```

---

## 3. Search Conversations

**Endpoint:** `GET /api/v1/accounts/{account_id}/search/conversations`

Search conversations by display ID, contact name, email, phone number,
or identifier with optional time filtering.

### Parameters
| Parameter | Type | Description | Required |
|-----------|------|-------------|----------|
| `q` | string | Search query | Yes |
| `page` | integer | Page number (15 items per page) | No |
| `since` | integer | Unix timestamp - filter by last_activity_at | No |
| `until` | integer | Unix timestamp - filter by last_activity_at | No |

### Example Requests

**Basic search:**
```bash
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/conversations?q=billing" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Search conversations active in the last 24 hours:**
```bash
SINCE=$(date -v-1d +%s)
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/conversations?q=billing&since=${SINCE}" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Search conversations from last month:**
```bash
SINCE=$(date -v-30d +%s)
UNTIL=$(date +%s)
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/conversations?q=billing&since=${SINCE}&until=${UNTIL}" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

### Example Response
```json
{
  "payload": {
    "conversations": [
      {
        "id": 123,
        "display_id": 45,
        "inbox_id": 1,
        "status": "open",
        "messages": [...],
        "meta": {...}
      }
    ]
  }
}
```

---

## 4. Search Messages (Advanced)

**Endpoint:** `GET /api/v1/accounts/{account_id}/search/messages`

Advanced message search with multiple filters powered by
OpenSearch/Elasticsearch.

### Prerequisites
- OpenSearch/Elasticsearch must be running (`OPENSEARCH_URL` env var
configured)
- Account must have `advanced_search` feature flag enabled
- Messages must be indexed in OpenSearch

### Parameters
| Parameter | Type | Description | Required |
|-----------|------|-------------|----------|
| `q` | string | Search query | Yes |
| `page` | integer | Page number (15 items per page) | No |
| `from` | string | Filter by sender: `contact:{id}` or `agent:{id}` |
No |
| `inbox_id` | integer | Filter by specific inbox ID | No |
| `since` | integer | Unix timestamp - searches from this time (max 90
days ago) | No |
| `until` | integer | Unix timestamp - searches until this time | No |

### Important Notes
- **90-Day Limit**: If `since` is not provided, searches default to the
last 90 days
- If `since` exceeds 90 days, returns `422` error: "Search is limited to
the last 90 days"
- All time filters use message `created_at` timestamp

### Example Requests

**Basic message search:**
```bash
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/messages?q=refund" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Search messages from a specific contact:**
```bash
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/messages?q=refund&from=contact:42" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Search messages from a specific agent:**
```bash
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/messages?q=refund&from=agent:5" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Search messages in a specific inbox:**
```bash
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/messages?q=refund&inbox_id=3" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Search messages from the last 7 days:**
```bash
SINCE=$(date -v-7d +%s)
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/messages?q=refund&since=${SINCE}" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Search messages between specific dates:**
```bash
SINCE=$(date -v-30d +%s)
UNTIL=$(date -v-7d +%s)
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/messages?q=refund&since=${SINCE}&until=${UNTIL}" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Combine all filters:**
```bash
SINCE=$(date -v-14d +%s)
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/messages?q=refund&from=contact:42&inbox_id=3&since=${SINCE}" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Attempt to search beyond 90 days (returns error):**
```bash
SINCE=$(date -v-120d +%s)
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/messages?q=refund&since=${SINCE}" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

### Example Response (Success)
```json
{
  "payload": {
    "messages": [
      {
        "id": 789,
        "content": "I need a refund for my purchase",
        "message_type": "incoming",
        "created_at": 1701234567,
        "conversation_id": 123,
        "inbox_id": 3,
        "sender": {
          "id": 42,
          "type": "contact"
        }
      }
    ]
  }
}
```

### Example Response (90-day limit exceeded)
```json
{
  "error": "Search is limited to the last 90 days"
}
```
**Status Code:** `422 Unprocessable Entity`

---

## 5. Search Articles

**Endpoint:** `GET /api/v1/accounts/{account_id}/search/articles`

Search help center articles by title or content.

### Parameters
| Parameter | Type | Description | Required |
|-----------|------|-------------|----------|
| `q` | string | Search query | Yes |
| `page` | integer | Page number (15 items per page) | No |

### Example Request
```bash
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/articles?q=installation" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

### Example Response
```json
{
  "payload": {
    "articles": [
      {
        "id": 456,
        "title": "Installation Guide",
        "slug": "installation-guide",
        "portal_slug": "help",
        "account_id": 1,
        "category_name": "Getting Started",
        "status": "published",
        "updated_at": 1701234567
      }
    ]
  }
}
```

---

## Technical Implementation

### SQL-Based Search (Contacts, Conversations, Articles)
- Uses PostgreSQL `ILIKE` queries by default
- Optional GIN index support via `search_with_gin` feature flag for
better performance
- Time filtering uses `last_activity_at` for contacts/conversations
- Returns paginated results (15 per page)

### Advanced Search (Messages)
- Powered by OpenSearch/Elasticsearch via Searchkick gem
- Requires `OPENSEARCH_URL` environment variable
- Requires `advanced_search` account feature flag
- Enforces 90-day lookback limit via
`Limits::MESSAGE_SEARCH_TIME_RANGE_LIMIT_DAYS`
- Validates inbox access permissions before filtering
- Returns paginated results (15 per page)

---

## Type of change

- [x] New feature (non-breaking change which adds functionality)
- [x] Enhancement (improves existing functionality)

---

## How Has This Been Tested?

### Unit Tests
- **Contact Search Tests**: 3 new test cases for time filtering
(`since`, `until`, combined)
- **Conversation Search Tests**: 3 new test cases for time filtering
- **Message Search Tests**: 10+ test cases covering:
  - Individual filters (`from`, `inbox_id`, time range)
  - Combined filters
  - Permission validation for inbox access
  - Feature flag checks
  - 90-day limit enforcement
  - Error handling for exceeded time limits

### Test Commands
```bash
# Run all search controller tests
bundle exec rspec spec/controllers/api/v1/accounts/search_controller_spec.rb

# Run search service tests (includes enterprise specs)
bundle exec rspec spec/services/search_service_spec.rb
```

### Manual Testing Setup
A rake task is provided to create 50,000 test messages across multiple
inboxes:

```bash
# 1. Create test data
bundle exec rake search:setup_test_data

# 2. Start OpenSearch
mise elasticsearch-start

# 3. Reindex messages
rails runner "Message.search_index.import Message.all"

# 4. Enable feature flag
rails runner "Account.first.enable_features('advanced_search')"

# 5. Test via API or Rails console
```

---

## Checklist

- [x] My code follows the style guidelines of this project
- [x] I have performed a self-review of my code
- [x] I have commented on my code, particularly in hard-to-understand
areas
- [x] I have made corresponding changes to the documentation (this PR
description)
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] New and existing unit tests pass locally with my changes
- [ ] Any dependent changes have been merged and published in downstream
modules

---

## Additional Notes

### Requirements
- **OpenSearch/Elasticsearch**: Required for advanced message search
  - Set `OPENSEARCH_URL` environment variable
  - Example: `export OPENSEARCH_URL=http://localhost:9200`
- **Feature Flags**:
  - `advanced_search`: Account-level flag for message advanced search
- `search_with_gin` (optional): Account-level flag for GIN-based SQL
search

### Performance Considerations
- 90-day limit prevents expensive long-range queries on large datasets
- GIN indexes recommended for high-volume search on SQL-based resources
- OpenSearch/Elasticsearch provides faster full-text search for messages

### Breaking Changes
- None. All new parameters are optional and backward compatible.

### Frontend Integration
- Frontend PR tracking advanced search UI will consume these endpoints
- Time range pickers should convert JavaScript `Date` to Unix timestamps
(seconds)
- Date conversion: `Math.floor(date.getTime() / 1000)`

### Error Handling
- Invalid `from` parameter format is silently ignored (filter not
applied)
- Time range exceeding 90 days returns `422` with error message
- Missing `q` parameter returns `422` (existing behavior)
- Unauthorized inbox access is filtered out (no error, just excluded
from results)

---------

Co-authored-by: iamsivin <iamsivin@gmail.com>
Co-authored-by: Sivin Varghese <64252451+iamsivin@users.noreply.github.com>
Co-authored-by: Pranav <pranav@chatwoot.com>
Co-authored-by: Muhsin Keloth <muhsinkeramam@gmail.com>
2026-01-07 15:30:49 +05:30

209 lines
6.8 KiB
Ruby

class SearchService
pattr_initialize [:current_user!, :current_account!, :params!, :search_type!]
def account_user
@account_user ||= current_account.account_users.find_by(user: current_user)
end
def perform
case search_type
when 'Message'
{ messages: filter_messages }
when 'Conversation'
{ conversations: filter_conversations }
when 'Contact'
{ contacts: filter_contacts }
when 'Article'
{ articles: filter_articles }
else
{ contacts: filter_contacts, messages: filter_messages, conversations: filter_conversations, articles: filter_articles }
end
end
private
def accessable_inbox_ids
@accessable_inbox_ids ||= @current_user.assigned_inboxes.pluck(:id)
end
def search_query
@search_query ||= params[:q].to_s.strip
end
def filter_conversations
conversations_query = current_account.conversations.where(inbox_id: accessable_inbox_ids)
.joins('INNER JOIN contacts ON conversations.contact_id = contacts.id')
.where("cast(conversations.display_id as text) ILIKE :search OR contacts.name ILIKE :search OR contacts.email
ILIKE :search OR contacts.phone_number ILIKE :search OR contacts.identifier ILIKE :search", search: "%#{search_query}%")
if current_account.feature_enabled?('advanced_search')
conversations_query = apply_time_filter(conversations_query,
'conversations.last_activity_at')
end
@conversations = conversations_query.order('conversations.created_at DESC')
.page(params[:page])
.per(15)
end
def filter_messages
@messages = if use_gin_search
filter_messages_with_gin
elsif should_run_advanced_search?
advanced_search_with_fallback
else
filter_messages_with_like
end
end
def advanced_search_with_fallback
advanced_search
rescue Faraday::ConnectionFailed, Searchkick::Error, Elasticsearch::Transport::Transport::Error => e
Rails.logger.warn("Elasticsearch unavailable, falling back to SQL search: #{e.message}")
use_gin_search ? filter_messages_with_gin : filter_messages_with_like
end
def should_run_advanced_search?
ChatwootApp.advanced_search_allowed? && current_account.feature_enabled?('advanced_search')
end
def advanced_search; end
def filter_messages_with_gin
base_query = message_base_query
base_query = apply_message_filters(base_query)
if search_query.present?
# Use the @@ operator with to_tsquery for better GIN index utilization
# Convert search query to tsquery format with prefix matching
# Use this if we wanna match splitting the words
# split_query = search_query.split.map { |term| "#{term} | #{term}:*" }.join(' & ')
# This will do entire sentence matching using phrase distance operator
tsquery = search_query.split.join(' <-> ')
# Apply the text search using the GIN index
base_query.where('content @@ to_tsquery(?)', tsquery)
.reorder('created_at DESC')
.page(params[:page])
.per(15)
else
base_query.reorder('created_at DESC')
.page(params[:page])
.per(15)
end
end
def filter_messages_with_like
base_query = message_base_query
base_query = apply_message_filters(base_query)
base_query.where('messages.content ILIKE :search', search: "%#{search_query}%")
.reorder('created_at DESC')
.page(params[:page])
.per(15)
end
def message_base_query
query = current_account.messages.where('created_at >= ?', 3.months.ago)
query = query.where(inbox_id: accessable_inbox_ids) unless should_skip_inbox_filtering?
query
end
def apply_message_filters(query)
return query unless current_account.feature_enabled?('advanced_search')
query = apply_time_filter(query, 'messages.created_at')
query = apply_sender_filter(query)
apply_inbox_id_filter(query)
end
def apply_sender_filter(query)
sender_type, sender_id = parse_from_param(params[:from])
return query unless sender_type && sender_id
query.where(sender_type: sender_type, sender_id: sender_id)
end
def parse_from_param(from_param)
return [nil, nil] unless from_param&.match?(/\A(contact|agent):\d+\z/)
type, id = from_param.split(':')
sender_type = type == 'agent' ? 'User' : 'Contact'
[sender_type, id.to_i]
end
def apply_inbox_id_filter(query)
return query if params[:inbox_id].blank?
inbox_id = params[:inbox_id].to_i
return query if inbox_id.zero?
return query unless validate_inbox_access(inbox_id)
query.where(inbox_id: inbox_id)
end
def validate_inbox_access(inbox_id)
return true if should_skip_inbox_filtering?
accessable_inbox_ids.include?(inbox_id)
end
def should_skip_inbox_filtering?
account_user.administrator? || user_has_access_to_all_inboxes?
end
def user_has_access_to_all_inboxes?
accessable_inbox_ids.sort == current_account.inboxes.pluck(:id).sort
end
def use_gin_search
current_account.feature_enabled?('search_with_gin')
end
def filter_contacts
contacts_query = current_account.contacts.where(
"name ILIKE :search OR email ILIKE :search OR phone_number
ILIKE :search OR identifier ILIKE :search", search: "%#{search_query}%"
)
contacts_query = apply_time_filter(contacts_query, 'last_activity_at') if current_account.feature_enabled?('advanced_search')
@contacts = contacts_query.resolved_contacts(
use_crm_v2: current_account.feature_enabled?('crm_v2')
).order_on_last_activity_at('desc').page(params[:page]).per(15)
end
def filter_articles
articles_query = current_account.articles.text_search(search_query)
articles_query = apply_time_filter(articles_query, 'updated_at') if current_account.feature_enabled?('advanced_search')
@articles = articles_query.page(params[:page]).per(15)
end
def apply_time_filter(query, column_name)
return query if params[:since].blank? && params[:until].blank?
query = query.where("#{column_name} >= ?", cap_since_time(params[:since])) if params[:since].present?
query = query.where("#{column_name} <= ?", cap_until_time(params[:until])) if params[:until].present?
query
end
def cap_since_time(since_param)
max_lookback = 90.days.ago
requested_time = Time.zone.at(since_param.to_i)
# Silently cap to max_lookback if requested time is too far back
[requested_time, max_lookback].max
end
def cap_until_time(until_param)
max_future = 90.days.from_now
requested_time = Time.zone.at(until_param.to_i)
[requested_time, max_future].min
end
end
SearchService.prepend_mod_with('SearchService')