feat: Advanced Search Backend (#12917)

## Description

Implements comprehensive search functionality with advanced filtering
capabilities for Chatwoot (Linear: CW-5956).

This PR adds:
1. **Time-based filtering** for contacts and conversations (SQL-based
search)
2. **Advanced message search** with multiple filters
(OpenSearch/Elasticsearch-based)
- **`from` filter**: Filter messages by sender (format: `contact:42` or
`agent:5`)
   - **`inbox_id` filter**: Filter messages by specific inbox
- **Time range filters**: Filter messages using `since` and `until`
parameters (Unix timestamps in seconds)
- **90-day limit enforcement**: Automatically limits searches to the
last 90 days to prevent performance issues

The implementation extends the existing `Enterprise::SearchService`
module for advanced features and adds time filtering to the base
`SearchService` for SQL-based searches.

## API Documentation

### Base URL
All search endpoints follow this pattern:
```
GET /api/v1/accounts/{account_id}/search/{resource}
```

### Authentication
All requests require authentication headers:
```
api_access_token: YOUR_ACCESS_TOKEN
```

---

## 1. Search All Resources

**Endpoint:** `GET /api/v1/accounts/{account_id}/search`

Returns results from all searchable resources (contacts, conversations,
messages, articles).

### Parameters
| Parameter | Type | Description | Required |
|-----------|------|-------------|----------|
| `q` | string | Search query | Yes |
| `page` | integer | Page number (15 items per page) | No |
| `since` | integer | Unix timestamp (contacts/conversations only) | No
|
| `until` | integer | Unix timestamp (contacts/conversations only) | No
|

### Example Request
```bash
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search?q=customer" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

### Example Response
```json
{
  "payload": {
    "contacts": [...],
    "conversations": [...],
    "messages": [...],
    "articles": [...]
  }
}
```

---

## 2. Search Contacts

**Endpoint:** `GET /api/v1/accounts/{account_id}/search/contacts`

Search contacts by name, email, phone number, or identifier with
optional time filtering.

### Parameters
| Parameter | Type | Description | Required |
|-----------|------|-------------|----------|
| `q` | string | Search query | Yes |
| `page` | integer | Page number (15 items per page) | No |
| `since` | integer | Unix timestamp - filter by last_activity_at | No |
| `until` | integer | Unix timestamp - filter by last_activity_at | No |

### Example Requests

**Basic search:**
```bash
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/contacts?q=john" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Search contacts active in the last 7 days:**
```bash
SINCE=$(date -v-7d +%s)
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/contacts?q=john&since=${SINCE}" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Search contacts active between 30 and 7 days ago:**
```bash
SINCE=$(date -v-30d +%s)
UNTIL=$(date -v-7d +%s)
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/contacts?q=john&since=${SINCE}&until=${UNTIL}" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

### Example Response
```json
{
  "payload": {
    "contacts": [
      {
        "id": 42,
        "email": "john@example.com",
        "name": "John Doe",
        "phone_number": "+1234567890",
        "identifier": "user_123",
        "additional_attributes": {},
        "created_at": 1701234567
      }
    ]
  }
}
```

---

## 3. Search Conversations

**Endpoint:** `GET /api/v1/accounts/{account_id}/search/conversations`

Search conversations by display ID, contact name, email, phone number,
or identifier with optional time filtering.

### Parameters
| Parameter | Type | Description | Required |
|-----------|------|-------------|----------|
| `q` | string | Search query | Yes |
| `page` | integer | Page number (15 items per page) | No |
| `since` | integer | Unix timestamp - filter by last_activity_at | No |
| `until` | integer | Unix timestamp - filter by last_activity_at | No |

### Example Requests

**Basic search:**
```bash
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/conversations?q=billing" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Search conversations active in the last 24 hours:**
```bash
SINCE=$(date -v-1d +%s)
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/conversations?q=billing&since=${SINCE}" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Search conversations from last month:**
```bash
SINCE=$(date -v-30d +%s)
UNTIL=$(date +%s)
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/conversations?q=billing&since=${SINCE}&until=${UNTIL}" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

### Example Response
```json
{
  "payload": {
    "conversations": [
      {
        "id": 123,
        "display_id": 45,
        "inbox_id": 1,
        "status": "open",
        "messages": [...],
        "meta": {...}
      }
    ]
  }
}
```

---

## 4. Search Messages (Advanced)

**Endpoint:** `GET /api/v1/accounts/{account_id}/search/messages`

Advanced message search with multiple filters powered by
OpenSearch/Elasticsearch.

### Prerequisites
- OpenSearch/Elasticsearch must be running (`OPENSEARCH_URL` env var
configured)
- Account must have `advanced_search` feature flag enabled
- Messages must be indexed in OpenSearch

### Parameters
| Parameter | Type | Description | Required |
|-----------|------|-------------|----------|
| `q` | string | Search query | Yes |
| `page` | integer | Page number (15 items per page) | No |
| `from` | string | Filter by sender: `contact:{id}` or `agent:{id}` |
No |
| `inbox_id` | integer | Filter by specific inbox ID | No |
| `since` | integer | Unix timestamp - searches from this time (max 90
days ago) | No |
| `until` | integer | Unix timestamp - searches until this time | No |

### Important Notes
- **90-Day Limit**: If `since` is not provided, searches default to the
last 90 days
- If `since` exceeds 90 days, returns `422` error: "Search is limited to
the last 90 days"
- All time filters use message `created_at` timestamp

### Example Requests

**Basic message search:**
```bash
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/messages?q=refund" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Search messages from a specific contact:**
```bash
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/messages?q=refund&from=contact:42" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Search messages from a specific agent:**
```bash
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/messages?q=refund&from=agent:5" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Search messages in a specific inbox:**
```bash
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/messages?q=refund&inbox_id=3" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Search messages from the last 7 days:**
```bash
SINCE=$(date -v-7d +%s)
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/messages?q=refund&since=${SINCE}" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Search messages between specific dates:**
```bash
SINCE=$(date -v-30d +%s)
UNTIL=$(date -v-7d +%s)
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/messages?q=refund&since=${SINCE}&until=${UNTIL}" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Combine all filters:**
```bash
SINCE=$(date -v-14d +%s)
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/messages?q=refund&from=contact:42&inbox_id=3&since=${SINCE}" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

**Attempt to search beyond 90 days (returns error):**
```bash
SINCE=$(date -v-120d +%s)
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/messages?q=refund&since=${SINCE}" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

### Example Response (Success)
```json
{
  "payload": {
    "messages": [
      {
        "id": 789,
        "content": "I need a refund for my purchase",
        "message_type": "incoming",
        "created_at": 1701234567,
        "conversation_id": 123,
        "inbox_id": 3,
        "sender": {
          "id": 42,
          "type": "contact"
        }
      }
    ]
  }
}
```

### Example Response (90-day limit exceeded)
```json
{
  "error": "Search is limited to the last 90 days"
}
```
**Status Code:** `422 Unprocessable Entity`

---

## 5. Search Articles

**Endpoint:** `GET /api/v1/accounts/{account_id}/search/articles`

Search help center articles by title or content.

### Parameters
| Parameter | Type | Description | Required |
|-----------|------|-------------|----------|
| `q` | string | Search query | Yes |
| `page` | integer | Page number (15 items per page) | No |

### Example Request
```bash
curl -X GET "https://app.chatwoot.com/api/v1/accounts/1/search/articles?q=installation" \
  -H "api_access_token: YOUR_ACCESS_TOKEN"
```

### Example Response
```json
{
  "payload": {
    "articles": [
      {
        "id": 456,
        "title": "Installation Guide",
        "slug": "installation-guide",
        "portal_slug": "help",
        "account_id": 1,
        "category_name": "Getting Started",
        "status": "published",
        "updated_at": 1701234567
      }
    ]
  }
}
```

---

## Technical Implementation

### SQL-Based Search (Contacts, Conversations, Articles)
- Uses PostgreSQL `ILIKE` queries by default
- Optional GIN index support via `search_with_gin` feature flag for
better performance
- Time filtering uses `last_activity_at` for contacts/conversations
- Returns paginated results (15 per page)

### Advanced Search (Messages)
- Powered by OpenSearch/Elasticsearch via Searchkick gem
- Requires `OPENSEARCH_URL` environment variable
- Requires `advanced_search` account feature flag
- Enforces 90-day lookback limit via
`Limits::MESSAGE_SEARCH_TIME_RANGE_LIMIT_DAYS`
- Validates inbox access permissions before filtering
- Returns paginated results (15 per page)

---

## Type of change

- [x] New feature (non-breaking change which adds functionality)
- [x] Enhancement (improves existing functionality)

---

## How Has This Been Tested?

### Unit Tests
- **Contact Search Tests**: 3 new test cases for time filtering
(`since`, `until`, combined)
- **Conversation Search Tests**: 3 new test cases for time filtering
- **Message Search Tests**: 10+ test cases covering:
  - Individual filters (`from`, `inbox_id`, time range)
  - Combined filters
  - Permission validation for inbox access
  - Feature flag checks
  - 90-day limit enforcement
  - Error handling for exceeded time limits

### Test Commands
```bash
# Run all search controller tests
bundle exec rspec spec/controllers/api/v1/accounts/search_controller_spec.rb

# Run search service tests (includes enterprise specs)
bundle exec rspec spec/services/search_service_spec.rb
```

### Manual Testing Setup
A rake task is provided to create 50,000 test messages across multiple
inboxes:

```bash
# 1. Create test data
bundle exec rake search:setup_test_data

# 2. Start OpenSearch
mise elasticsearch-start

# 3. Reindex messages
rails runner "Message.search_index.import Message.all"

# 4. Enable feature flag
rails runner "Account.first.enable_features('advanced_search')"

# 5. Test via API or Rails console
```

---

## Checklist

- [x] My code follows the style guidelines of this project
- [x] I have performed a self-review of my code
- [x] I have commented on my code, particularly in hard-to-understand
areas
- [x] I have made corresponding changes to the documentation (this PR
description)
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] New and existing unit tests pass locally with my changes
- [ ] Any dependent changes have been merged and published in downstream
modules

---

## Additional Notes

### Requirements
- **OpenSearch/Elasticsearch**: Required for advanced message search
  - Set `OPENSEARCH_URL` environment variable
  - Example: `export OPENSEARCH_URL=http://localhost:9200`
- **Feature Flags**:
  - `advanced_search`: Account-level flag for message advanced search
- `search_with_gin` (optional): Account-level flag for GIN-based SQL
search

### Performance Considerations
- 90-day limit prevents expensive long-range queries on large datasets
- GIN indexes recommended for high-volume search on SQL-based resources
- OpenSearch/Elasticsearch provides faster full-text search for messages

### Breaking Changes
- None. All new parameters are optional and backward compatible.

### Frontend Integration
- Frontend PR tracking advanced search UI will consume these endpoints
- Time range pickers should convert JavaScript `Date` to Unix timestamps
(seconds)
- Date conversion: `Math.floor(date.getTime() / 1000)`

### Error Handling
- Invalid `from` parameter format is silently ignored (filter not
applied)
- Time range exceeding 90 days returns `422` with error message
- Missing `q` parameter returns `422` (existing behavior)
- Unauthorized inbox access is filtered out (no error, just excluded
from results)

---------

Co-authored-by: iamsivin <iamsivin@gmail.com>
Co-authored-by: Sivin Varghese <64252451+iamsivin@users.noreply.github.com>
Co-authored-by: Pranav <pranav@chatwoot.com>
Co-authored-by: Muhsin Keloth <muhsinkeramam@gmail.com>
This commit is contained in:
Vinay Keerthi
2026-01-07 15:30:49 +05:30
committed by GitHub
parent 566de02385
commit 59cbf57e20
49 changed files with 3816 additions and 385 deletions

View File

@@ -9,6 +9,7 @@ module Limits
COMPANY_NAME_LENGTH_LIMIT = 100
COMPANY_DESCRIPTION_LENGTH_LIMIT = 1000
MAX_CUSTOM_FILTERS_PER_USER = 1000
MESSAGE_SEARCH_TIME_RANGE_LIMIT_DAYS = 90
def self.conversation_message_per_minute_limit
ENV.fetch('CONVERSATION_MESSAGE_PER_MINUTE_LIMIT', '200').to_i

View File

@@ -0,0 +1,183 @@
# rubocop:disable Metrics/BlockLength
namespace :search do
desc 'Create test messages for advanced search manual testing across multiple inboxes'
task setup_test_data: :environment do
puts '🔍 Setting up test data for advanced search...'
account = Account.first
unless account
puts '❌ No account found. Please create an account first.'
exit 1
end
agents = account.users.to_a
unless agents.any?
puts '❌ No agents found. Please create users first.'
exit 1
end
puts "✅ Using account: #{account.name} (ID: #{account.id})"
puts "✅ Found #{agents.count} agent(s)"
# Create missing inbox types for comprehensive testing
puts "\n📥 Checking and creating inboxes..."
# API inbox
unless account.inboxes.exists?(channel_type: 'Channel::Api')
puts ' Creating API inbox...'
account.inboxes.create!(
name: 'Search Test API',
channel: Channel::Api.create!(account: account)
)
end
# Web Widget inbox
unless account.inboxes.exists?(channel_type: 'Channel::WebWidget')
puts ' Creating WebWidget inbox...'
account.inboxes.create!(
name: 'Search Test WebWidget',
channel: Channel::WebWidget.create!(account: account, website_url: 'https://example.com')
)
end
# Email inbox
unless account.inboxes.exists?(channel_type: 'Channel::Email')
puts ' Creating Email inbox...'
account.inboxes.create!(
name: 'Search Test Email',
channel: Channel::Email.create!(
account: account,
email: 'search-test@example.com',
imap_enabled: false,
smtp_enabled: false
)
)
end
inboxes = account.inboxes.to_a
puts "✅ Using #{inboxes.count} inbox(es):"
inboxes.each { |i| puts " - #{i.name} (ID: #{i.id}, Type: #{i.channel_type})" }
# Create 10 test contacts
contacts = []
10.times do |i|
contacts << account.contacts.find_or_create_by!(
email: "test-customer-#{i}@example.com"
) do |c|
c.name = Faker::Name.name
end
end
puts "✅ Created/found #{contacts.count} test contacts"
target_messages = 50_000
messages_per_conversation = 100
total_conversations = target_messages / messages_per_conversation
puts "\n📝 Creating #{target_messages} messages across #{total_conversations} conversations..."
puts " Distribution: #{inboxes.count} inboxes × #{total_conversations / inboxes.count} conversations each"
start_time = 2.years.ago
end_time = Time.current
time_range = end_time - start_time
created_count = 0
failed_count = 0
conversations_per_inbox = total_conversations / inboxes.count
conversation_statuses = [:open, :resolved]
inboxes.each do |inbox|
conversations_per_inbox.times do
# Pick random contact and agent for this conversation
contact = contacts.sample
agent = agents.sample
# Create or find ContactInbox
contact_inbox = ContactInbox.find_or_create_by!(
contact: contact,
inbox: inbox
) do |ci|
ci.source_id = "test_#{SecureRandom.hex(8)}"
end
# Create conversation
conversation = inbox.conversations.create!(
account: account,
contact: contact,
inbox: inbox,
contact_inbox: contact_inbox,
status: conversation_statuses.sample
)
# Create messages for this conversation (50 incoming, 50 outgoing)
50.times do
random_time = start_time + (rand * time_range)
# Incoming message from contact
begin
Message.create!(
content: Faker::Movie.quote,
account: account,
inbox: inbox,
conversation: conversation,
message_type: :incoming,
sender: contact,
created_at: random_time,
updated_at: random_time
)
created_count += 1
rescue StandardError => e
failed_count += 1
puts "❌ Failed to create message: #{e.message}" if failed_count <= 5
end
# Outgoing message from agent
begin
Message.create!(
content: Faker::Movie.quote,
account: account,
inbox: inbox,
conversation: conversation,
message_type: :outgoing,
sender: agent,
created_at: random_time + rand(60..600),
updated_at: random_time + rand(60..600)
)
created_count += 1
rescue StandardError => e
failed_count += 1
puts "❌ Failed to create message: #{e.message}" if failed_count <= 5
end
print "\r🔄 Progress: #{created_count}/#{target_messages} messages created..." if (created_count % 500).zero?
end
end
end
puts "\n\n✅ Successfully created #{created_count} messages!"
puts "❌ Failed: #{failed_count}" if failed_count.positive?
puts "\n📊 Summary:"
puts " - Total messages: #{Message.where(account: account).count}"
puts " - Total conversations: #{Conversation.where(account: account).count}"
min_date = Message.where(account: account).minimum(:created_at)&.strftime('%Y-%m-%d')
max_date = Message.where(account: account).maximum(:created_at)&.strftime('%Y-%m-%d')
puts " - Date range: #{min_date} to #{max_date}"
puts "\nBreakdown by inbox:"
inboxes.each do |inbox|
msg_count = Message.where(inbox: inbox).count
conv_count = Conversation.where(inbox: inbox).count
puts " - #{inbox.name} (#{inbox.channel_type}): #{msg_count} messages, #{conv_count} conversations"
end
puts "\nBreakdown by sender type:"
puts " - Incoming (from contacts): #{Message.where(account: account, message_type: :incoming).count}"
puts " - Outgoing (from agents): #{Message.where(account: account, message_type: :outgoing).count}"
puts "\n🔧 Next steps:"
puts ' 1. Ensure OpenSearch is running: mise elasticsearch-start'
puts ' 2. Reindex messages: rails runner "Message.search_index.import Message.all"'
puts " 3. Enable feature: rails runner \"Account.find(#{account.id}).enable_features('advanced_search')\""
puts "\n💡 Then test the search with filters via API or Rails console!"
end
end
# rubocop:enable Metrics/BlockLength