feat: Add support for image files in Captain (#11730)

# Pull Request Template

## Linear links:

- https://linear.app/chatwoot/issue/CW-4479/if-image-is-sent-by-the-customer-send-it-to-openai

## Description

This pull request adds “Captain image support” to Chatwoot. It introduces multimodal message handling so that when a customer sends an image, Captain can forward the file to OpenAI’s vision endpoint and generate a caption or analysis of it.

## Type of change

Please delete options that are not relevant.

- [x] New feature (non-breaking change which adds functionality)

## How Has This Been Tested?

<img width="891" alt="image" src="https://github.com/user-attachments/assets/c7cc98ed-cc44-4865-a53a-83d129e2fe2c" />

## Checklist:

- [ ] My code follows the style guidelines of this project
- [ ] I have performed a self-review of my code
- [ ] I have commented on my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass locally with my changes
- [ ] Any dependent changes have been merged and published in downstream modules

---------

Co-authored-by: Pranav <pranav@chatwoot.com>
2025-06-26 07:46:09 +05:30
parent 257cd07ee6
commit 811eb66615
7 changed files with 418 additions and 37 deletions
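
For orientation, the multimodal message shape that the spec in this diff asserts on can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the helper name `build_multimodal_content` and its parameters are hypothetical, and only the part layout (`type: 'text'` and `type: 'image_url'` with a nested `url`) is taken from the spec's expectations.

```ruby
# Hypothetical helper (not from the Chatwoot codebase): builds the array of
# content parts that the spec expects as the last message history entry.
def build_multimodal_content(text, image_urls)
  parts = []
  # Add the text part only when the message actually has text.
  parts << { type: 'text', text: text } unless text.nil? || text.empty?
  # Add one image_url part per attachment URL.
  image_urls.each do |url|
    parts << { type: 'image_url', image_url: { url: url } }
  end
  parts
end

content = build_multimodal_content(
  'Can you help with this error?',
  ['https://example.com/error.jpg']
)

# A history entry carrying an array of parts instead of a plain string.
# The :role key is an assumption; the spec only checks :content.
history_entry = { role: 'user', content: content }
```

Keeping the content as an array of typed parts lets text-only messages and image messages share one code path; whether the actual implementation does this is not shown in this excerpt.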
--- a/spec/enterprise/jobs/captain/conversation/response_builder_job_spec.rb
+++ b/spec/enterprise/jobs/captain/conversation/response_builder_job_spec.rb
@@ -30,5 +30,30 @@ RSpec.describe Captain::Conversation::ResponseBuilderJob, type: :job do
      account.reload
      expect(account.usage_limits[:captain][:responses][:consumed]).to eq(1)
    end
+
+    context 'when message contains an image' do
+      let(:message_with_image) { create(:message, conversation: conversation, message_type: :incoming, content: 'Can you help with this error?') }
+      let(:image_attachment) { message_with_image.attachments.create!(account: account, file_type: :image, external_url: 'https://example.com/error.jpg') }
+
+      before do
+        image_attachment
+      end
+
+      it 'includes image URL directly in the message content for OpenAI vision analysis' do
+        # Expect generate_response to receive multimodal content that includes the image URL
+        expect(mock_llm_chat_service).to receive(:generate_response) do |**kwargs|
+          history = kwargs[:message_history]
+          last_entry = history.last
+          expect(last_entry[:content]).to be_an(Array)
+          expect(last_entry[:content].any? { |part| part[:type] == 'text' && part[:text] == 'Can you help with this error?' }).to be true
+          expect(last_entry[:content].any? do |part|
+            part[:type] == 'image_url' && part[:image_url][:url] == 'https://example.com/error.jpg'
+          end).to be true
+          { 'response' => 'I can see the error in your image. It appears to be a database connection issue.' }
+        end
+
+        described_class.perform_now(conversation, assistant)
+      end
+    end
  end
 end