Commit Graph

10 Commits

Author SHA1 Message Date
eloijrseganfredo
14b4c83dc6 fix: Prevent AudioTranscriptionJob from crashing on OpenAI 401 error (#13653)
Describe the bug
In v4.8.0, when an audio message is received, the system enqueues
Messages::AudioTranscriptionJob even if OpenAI and Captain are disabled.
This causes a Faraday::UnauthorizedError (401) which crashes the Sidekiq
job and breaks the pipeline for that message.

To Reproduce
1. Disable OpenAI/Captain integrations.
2. Send an audio message to an inbox.
3. Check Sidekiq logs and observe the 401 crash in
AudioTranscriptionService.

What this PR does
Adds a rescue Faraday::UnauthorizedError block inside
AudioTranscriptionService#perform. Instead of crashing the worker, it
logs a warning and gracefully exits, allowing the job to complete
successfully.

Note: This fixes the backend crash. However, there is still a frontend
reactivity issue where the audio player UI requires a page refresh (F5)
before the media loads, which has been reported in Issue #11013.
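
The rescue described above can be sketched roughly as follows. This is a minimal illustration and not the code from the PR: `TranscriptionSketch`, its injected `client` and `logger`, and the log message are hypothetical, and the `Faraday::UnauthorizedError` stub exists only so the sketch runs without the faraday gem installed.

```ruby
require 'logger'

# Stand-in so the sketch runs without the faraday gem.
# The real Faraday::UnauthorizedError ships with Faraday.
unless defined?(Faraday::UnauthorizedError)
  module Faraday
    class UnauthorizedError < StandardError; end
  end
end

# Hypothetical service: the class name and the injected client/logger
# are assumptions made for illustration.
class TranscriptionSketch
  def initialize(client:, logger: Logger.new($stderr))
    @client = client
    @logger = logger
  end

  # Returns the transcript, or nil when the provider rejects the credentials.
  def perform(audio_path)
    @client.call(audio_path)
  rescue Faraday::UnauthorizedError => e
    # Log a warning and exit quietly so the worker does not crash.
    @logger.warn("Skipping transcription, provider returned 401: #{e.message}")
    nil
  end
end

unauthorized_client = ->(_path) { raise Faraday::UnauthorizedError, 'invalid API key' }
result = TranscriptionSketch.new(client: unauthorized_client).perform('speech.mp3')

working_client = ->(_path) { 'hello world' }
transcript = TranscriptionSketch.new(client: working_client).perform('speech.mp3')
```

Because the error is rescued inside `perform`, the calling job sees a normal return value and completes instead of being retried or marked as failed.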

---------

Co-authored-by: Eloi Junior Seganfredo <eloi@seganfredo.local>
Co-authored-by: Aakash Bakhle <48802744+aakashb95@users.noreply.github.com>
Co-authored-by: Muhsin Keloth <muhsinkeramam@gmail.com>
2026-02-27 14:12:03 +04:00
Sojan Jose
61eaa098ae fix(messages): reduce audio transcription 400 retry noise (#13487)
## Summary
This PR reduces duplicate failure noise for audio transcription jobs
that fail with permanent HTTP 400 responses, and fixes a file-format
edge case causing intermittent 400s.

Sentry issue: [CHATWOOT-99E /
6660541334](https://chatwoot-p3.sentry.io/issues/6660541334/)

## Confirmed root cause
For some attachments, the stored filename had no extension (example:
`speech`, content type `audio/mpeg`).
When the temporary transcription upload file was created without an
extension, OpenAI returned:
`Unrecognized file format` (HTTP 400).

## Scope of changes
1. `Messages::AudioTranscriptionJob`
- Keeps `discard_on Faraday::BadRequestError` to avoid retry storms on
permanent request errors.
- Adds explicit Rails warning logs for discarded jobs with
attachment/job/status context.

2. `Messages::AudioTranscriptionService`
- Keeps guaranteed temp file cleanup via `ensure`.
- Ensures temp upload files include an extension when the original
filename has none, derived from blob `content_type`.
- This addresses intermittent failures like extensionless `audio/mpeg`
files.
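
A rough sketch of the extension fallback described in item 2, under stated assumptions: the `EXTENSIONS_BY_CONTENT_TYPE` table and both helper names are hypothetical, and the actual service may derive the extension differently.

```ruby
require 'tempfile'

# Hypothetical mapping for illustration; a real implementation might
# consult a MIME type registry instead of a hard-coded table.
EXTENSIONS_BY_CONTENT_TYPE = {
  'audio/mpeg' => '.mp3',
  'audio/wav' => '.wav',
  'audio/ogg' => '.ogg'
}.freeze

# Keep the filename's own extension when present, otherwise fall back
# to one derived from the content type (empty string if unknown).
def upload_extension(filename, content_type)
  ext = File.extname(filename)
  return ext unless ext.empty?

  EXTENSIONS_BY_CONTENT_TYPE.fetch(content_type, '')
end

# Yields a binary-mode temp file whose name carries the extension,
# and always removes it afterwards via `ensure`.
def with_upload_tempfile(filename, content_type)
  ext = upload_extension(filename, content_type)
  base = File.basename(filename, '.*')
  file = Tempfile.new([base, ext])
  file.binmode
  yield file
ensure
  file&.close!
end

temp_path = with_upload_tempfile('speech', 'audio/mpeg') { |f| f.path }
```

Passing `[base, ext]` to `Tempfile.new` keeps the extension at the end of the generated name, which is what a format-sniffing upload endpoint would see.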

## Reproduction
Enable audio transcription for an account and process an audio
attachment whose stored filename has no extension (for example `speech`)
but valid audio content type (`audio/mpeg`).
Before this fix, OpenAI transcription could return HTTP 400
`Unrecognized file format` for that attachment while similar attachments
with extensions succeeded.

## Testing
Ran:
`bundle exec rubocop
enterprise/app/jobs/messages/audio_transcription_job.rb
enterprise/app/services/messages/audio_transcription_service.rb`

Result: both modified files pass lint with no offenses.
2026-02-17 13:25:13 +05:30
Aakash Bakhle
1de8d3e56d feat: legacy features to ruby llm (#12994) 2025-12-11 14:17:28 +05:30
Sojan Jose
cc86b8c7f1 fix: stream attachment handling in workers (#12870)
We’ve been watching Sidekiq workers climb from ~600 MB at boot to
1.4–1.5 GB after an hour whenever attachment-heavy jobs run. This PR is
an experiment to curb that growth by streaming attachments instead of
loading the whole blob into Ruby: reply-mailer inline attachments,
Telegram uploads, and audio transcriptions now read/write in chunks. If
this keeps RSS stable in production we’ll keep it; otherwise we’ll roll
it back and keep digging.
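
As a rough illustration of the streaming idea, the sketch below copies a blob to a temp file chunk by chunk instead of materialising it as one Ruby string. `FakeBlob` and `stream_to_tempfile` are hypothetical stand-ins; the fake only mimics the block form of `ActiveStorage::Blob#download`, which yields the file in chunks.

```ruby
require 'tempfile'

# Stand-in for a stored blob. It mimics the shape of
# ActiveStorage::Blob#download when called with a block.
class FakeBlob
  def initialize(data, chunk_size: 4)
    @data = data
    @chunk_size = chunk_size
  end

  # Yields the data in fixed-size chunks rather than returning it whole.
  def download
    offset = 0
    while offset < @data.bytesize
      yield @data.byteslice(offset, @chunk_size)
      offset += @chunk_size
    end
  end
end

# Writes each chunk as it arrives, so only one chunk is held in memory
# at a time. The caller is responsible for closing the returned file.
def stream_to_tempfile(blob)
  file = Tempfile.new(['attachment', '.bin'])
  file.binmode
  blob.download { |chunk| file.write(chunk) }
  file.rewind
  file
end

file = stream_to_tempfile(FakeBlob.new('0123456789'))
contents = file.read
file.close!
```

Peak memory in this shape is bounded by the chunk size rather than the attachment size, which is the property the experiment is trying to measure.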
2025-12-05 13:02:53 -08:00
Aakash Bakhle
eed2eaceb0 feat: Migrate ruby llm captain (#12981)
Co-authored-by: aakashb95 <aakash@chatwoot.com>
Co-authored-by: Shivam Mishra <scm.mymail@gmail.com>
2025-12-04 18:26:10 +05:30
Pranav
0c2ab7f5e7 feat(ee): Setup advanced, performant message search (#12193)
We now support searching within the actual message content, email
subject lines, and audio transcriptions. This enables a faster, more
accurate search experience going forward. Unlike the standard message
search, which is limited to the last 3 months, this search has no time
restrictions.

The search engine also accounts for small variations in queries. Minor
spelling mistakes, such as searching for slck instead of Slack, will
still return the correct results. It also ignores differences in accents
and diacritics, so searching for Deja vu will match content containing
Déjà vu.
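
The matching behaviour can be illustrated in plain Ruby. This is only a sketch of the idea, not how the search engine implements it: `fold`, `edit_distance`, `fuzzy_match?`, and the one-edit threshold are all assumptions made for the example.

```ruby
# Strip accents: decompose to NFD, drop combining marks, then lowercase.
def fold(text)
  text.unicode_normalize(:nfd).gsub(/\p{Mn}/, '').downcase
end

# Classic Levenshtein distance, computed one row at a time.
def edit_distance(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end

# Treat two terms as matching when, after folding, they differ by at
# most `max_edits` single-character edits.
def fuzzy_match?(query, term, max_edits: 1)
  edit_distance(fold(query), fold(term)) <= max_edits
end

typo_matches = fuzzy_match?('slck', 'Slack')
accent_matches = fold('Deja vu') == fold('Déjà vu')
```

Here `slck` is one insertion away from `slack`, and both spellings of "Déjà vu" fold to the same string, which mirrors the two examples in the description.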


We can also refine searches in the future by criteria such as:
- Searching within a specific inbox
- Filtering by sender or recipient
- Limiting to messages sent by an agent


Fixes https://github.com/chatwoot/chatwoot/issues/11656
Fixes https://github.com/chatwoot/chatwoot/issues/10669
Fixes https://github.com/chatwoot/chatwoot/issues/5910



---

Rake task to reindex all messages:

```sh
bundle exec rake search:all
```

Rake task to reindex messages from a single account:
```sh
bundle exec rake search:account ACCOUNT_ID=1
```
2025-08-28 10:10:28 +05:30
Pranav
ea4477ccde fix: Update ActiveStorage::FileNotFoundError error and fix the captain condition in audio transcription (#11779)
Update the error to `ActiveStorage::FileNotFoundError`. Fix the
condition to enable audio transcription and add a spec for it.
2025-06-20 13:20:55 -07:00
Pranav
dc77b5bb2b feat: Enable audio transcriptions for self hosted instances (#11755)
- Enable audio transcriptions feature for self hosted instances
2025-06-17 16:54:43 -07:00
Pranav
9b43a0f72b fix: Retry job if file not found (#11683)
Removed StandardError rescue blocks and added retry_on for
ResponseBuilderJob and AudioTranscriptionJob
2025-06-05 22:53:11 -05:00
Pranav
8bc00f707b feat(ee): Add transcription support for audio messages (#11670)
<img width="419" alt="Screenshot 2025-06-03 at 4 25 37 PM"
src="https://github.com/user-attachments/assets/4b6ddd11-9b91-4981-a571-83746cc4d40b"
/>


Fixes https://github.com/chatwoot/chatwoot/issues/10182

---------

Co-authored-by: Sojan Jose <sojan@pepalo.com>
2025-06-05 18:29:37 -05:00