fix: stream attachment handling in workers (#12870)

We’ve been watching Sidekiq workers climb from ~600 MB at boot to
1.4–1.5 GB after an hour whenever attachment-heavy jobs run. This PR is
an experiment to curb that growth by streaming attachments instead of
loading the whole blob into Ruby: reply-mailer inline attachments,
Telegram uploads, and audio transcriptions now read/write in chunks. If
this keeps RSS stable in production we’ll keep it; otherwise we’ll roll
it back and keep digging.
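The core idea, sketched below with plain files standing in for ActiveStorage blobs (the names here are illustrative, not Chatwoot's API): `IO.copy_stream` moves data in fixed-size chunks, so the Ruby heap stays flat regardless of attachment size, whereas `download` buffers the entire blob as one Ruby string.

```ruby
require 'tempfile'
require 'tmpdir'

# Stand-in for a large attachment blob (1 MB of bytes).
source = Tempfile.new('blob')
source.write('a' * 1_000_000)
source.rewind

dest_path = File.join(Dir.tmpdir, "streamed-#{Process.pid}")
File.open(dest_path, 'wb') do |dest|
  # Copies in fixed-size chunks instead of materializing a 1 MB Ruby
  # string, which is what File.write(dest, source.read) would do.
  IO.copy_stream(source, dest)
end

copied_size = File.size(dest_path)
puts copied_size # 1000000

source.close!
File.delete(dest_path)
```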
Author: Sojan Jose
Date: 2025-12-05 13:02:53 -08:00
Committed by: GitHub
Parent: a971ff00f8
Commit: cc86b8c7f1
12 changed files with 203 additions and 74 deletions


@@ -27,10 +27,16 @@ class Messages::AudioTranscriptionService < Llm::LegacyBaseOpenAiService
   end
 
   def fetch_audio_file
-    temp_dir = Rails.root.join('tmp/uploads')
+    temp_dir = Rails.root.join('tmp/uploads/audio-transcriptions')
     FileUtils.mkdir_p(temp_dir)
-    temp_file_path = File.join(temp_dir, attachment.file.filename.to_s)
-    File.write(temp_file_path, attachment.file.download, mode: 'wb')
+    temp_file_path = File.join(temp_dir, "#{attachment.file.blob.key}-#{attachment.file.filename}")
+    File.open(temp_file_path, 'wb') do |file|
+      attachment.file.blob.open do |blob_file|
+        IO.copy_stream(blob_file, file)
+      end
+    end
     temp_file_path
   end
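One detail worth noting in the hunk above: the temp filename is now prefixed with `attachment.file.blob.key`, so concurrent jobs whose attachments happen to share a user-facing filename no longer clobber each other's temp files. A small sketch (the keys here are made up; real ActiveStorage keys are long random strings):

```ruby
require 'fileutils'
require 'tmpdir'

temp_dir = File.join(Dir.tmpdir, 'audio-transcriptions')
FileUtils.mkdir_p(temp_dir)

# Two hypothetical attachments with the same filename but distinct blob keys.
jobs = [
  { key: 'x1abc', filename: 'voice.ogg' },
  { key: 'y2def', filename: 'voice.ogg' }
]

# Old scheme: filename only, so both jobs resolve to the same path.
old_paths = jobs.map { |j| File.join(temp_dir, j[:filename]) }
# New scheme: "<blob key>-<filename>" keeps the paths distinct.
new_paths = jobs.map { |j| File.join(temp_dir, "#{j[:key]}-#{j[:filename]}") }

puts old_paths.uniq.length # 1
puts new_paths.uniq.length # 2
```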
@@ -40,18 +46,24 @@ class Messages::AudioTranscriptionService < Llm::LegacyBaseOpenAiService
     temp_file_path = fetch_audio_file
-    response = @client.audio.transcribe(
-      parameters: {
-        model: 'whisper-1',
-        file: File.open(temp_file_path),
-        temperature: 0.4
-      }
-    )
+    response_text = nil
+    File.open(temp_file_path, 'rb') do |file|
+      response = @client.audio.transcribe(
+        parameters: {
+          model: 'whisper-1',
+          file: file,
+          temperature: 0.4
+        }
+      )
+      response_text = response['text']
+    end
     FileUtils.rm_f(temp_file_path)
-    update_transcription(response['text'])
-    response['text']
+    update_transcription(response_text)
+    response_text
   end
 
   def update_transcription(transcribed_text)
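This hunk also fixes a quieter leak: the old code passed `File.open(temp_file_path)` directly as a parameter, so the descriptor was never closed until GC finalized the File object. The block form closes the handle deterministically, even if the body raises. A minimal demonstration:

```ruby
require 'tmpdir'

path = File.join(Dir.tmpdir, "fd-demo-#{Process.pid}")
File.write(path, 'hello')

# Non-block form: the caller owns the handle and must close it;
# forgetting to (as the old code did) leaks a descriptor per job.
leaked = File.open(path)
leaked.close

# Block form: Ruby closes the handle when the block exits,
# whether it returns normally or raises.
handle = nil
File.open(path, 'rb') do |f|
  handle = f
  f.read
end
puts handle.closed? # true

File.delete(path)
```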