Commit Graph

3 Commits

Author SHA1 Message Date
Tanmay Deep Sharma
75f75ce786 fix: sanitize integer fields to prevent Elasticsearch mapping errors (#13276)
## Linear task:

https://linear.app/chatwoot/issue/CW-6318/searchkickimporterror-type-=-mapper-parsing-exception-reason-=-failed

## Description

Fixes Elasticsearch `mapper_parsing_exception` errors that occur when
`campaign_id` contain non-numeric string values

## Type of change

- [x] Bug fix (non-breaking change which fixes an issue)

## How Has This Been Tested?

- Unit tests
- use a local OpenSearch 3.4.0 cluster to verify actual indexing
behavior.


## Checklist:

- [ ] My code follows the style guidelines of this project
- [ ] I have performed a self-review of my code
- [ ] I have commented on my code, particularly in hard-to-understand
areas
- [ ] I have made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] New and existing unit tests pass locally with my changes
- [ ] Any dependent changes have been merged and published in downstream
modules


<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Removes `campaign_id` from the message search index payload to
simplify `additional_attributes`, keeping only `automation_rule_id`.
> 
> - `Messages::SearchDataPresenter#additional_attributes_data` now
returns only `automation_rule_id`
> - Specs updated to stop asserting `campaign_id` and continue
validating `automation_rule_id` and email subject handling
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
5a9c8eb794a044e3f258b644f67a6731de9e904c. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
2026-01-22 18:29:49 +05:30
Pranav
40c75941f5 fix: Avoid introducing new attributes in search (#12791)
Fix `Limit of total fields [1000] has been exceeded`


https://linear.app/chatwoot/issue/CW-5861/searchkickimporterror-type-=-illegal-argument-exception-reason-=-limit#comment-6b6e41bd
2025-11-04 13:28:51 -08:00
Vinay Keerthi
d9b840f161 fix: Optimize Message search_data to prevent OpenSearch field explosion (#12786)
## Description

Refactored the `Message#search_data` method to prevent exceeding
OpenSearch's 1000 field limit during reindex operations.

**Problem:** The previous implementation serialized entire ActiveRecord
objects (Inbox, Sender, Conversation) with all their JSONB fields,
causing dynamic field explosion in OpenSearch. This resulted in
`Searchkick::ImportError` with "Limit of total fields [1000] has been
exceeded".

**Solution:** Whitelisted only necessary fields for search and
filtering, and flattened JSONB `custom_attributes` into key-value pair
arrays to prevent unbounded field creation.

Linked to: CW-5861

## Type of change

- [x] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality not to work as expected)
- [x] This change requires a documentation update

## How Has This Been Tested?

- Verified rubocop passes with no offenses
- Code review of search field usage from
`enterprise/app/services/enterprise/search_service.rb`
- Analyzed actual search queries to determine required indexed fields

**Still needed:**
- Full reindex test on staging/production environment
- Verify search functionality still works after reindex
- Confirm field count is under 1000 limit

## Changes Made

### Before
- Indexed 1000+ fields (entire AR objects with JSONB)
- `inbox` = full Inbox object (23+ fields + JSONB)
- `sender` = full Contact/User/AgentBot object (10+ fields + JSONB)
- `conversation` = full push_event_data
- Dynamic JSONB keys creating unlimited fields

### After
- ~35-40 controlled fields
- Whitelisted search fields: `content`, `attachment_transcribed_text`,
`email_subject`
- Filter fields: `account_id`, `inbox_id`, `conversation_id`,
`sender_id`, `sender_type`, etc.
- Flattened `custom_attributes`: `[{key, value, value_type}]` format
- Helper methods: `search_conversation_data`, `search_inbox_data`,
`search_sender_data`, `search_additional_data`

## Checklist:

- [x] My code follows the style guidelines of this project
- [x] I have performed a self-review of my code
- [x] I have commented on my code, particularly in hard-to-understand
areas
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] New and existing unit tests pass locally with my changes
- [ ] Any dependent changes have been merged and published in downstream
modules

## Post-merge Steps

After merging, the following steps are required:

1. **Reindex all messages:**
   ```bash
   bundle exec rails runner "Message.reindex"
   ```

2. **Verify field count:**
   ```bash
   bundle exec rails runner "
     client = Searchkick.client
     index_name = Message.searchkick_index.name
     mapping = client.indices.get_mapping(index: index_name)
     fields = mapping.dig(index_name, 'mappings', 'properties')
     puts 'Total fields: ' + fields.keys.count.to_s
   "
   ```

3. **Test search functionality** to ensure queries still work as
expected

---------

Co-authored-by: Vishnu Narayanan <iamwishnu@gmail.com>
Co-authored-by: Pranav <pranav@chatwoot.com>
2025-11-03 17:37:51 -08:00