feat: Add company backfill migration for existing contacts (Part 1) (#12657)
## Description

Implements company backfill migration infrastructure for existing contacts. This is **Part 1 of 2** for the company model production rollout as described in [CW-5726](https://linear.app/chatwoot/issue/CW-5726/company-model-setting-it-up-on-production).

Creates jobs and services to associate existing contacts with companies based on their email domains, filtering out free email providers (gmail, yahoo, etc.) and disposable addresses.

**What's included:**
- Business email detector service with ValidEmail2 (uses `disposable_domain?` to avoid DNS lookups)
- Per-account batch job to process contacts for one account
- Orchestrator job to iterate all accounts
- Rake task: `bundle exec rake companies:backfill`

~~*NOTE*: I'm using a hard-coded approach to determine if something is a "business" email by filtering out emails that are usually personal. I've also added domains that are common to some of our customers' regions. This should be simpler. I looked into `Valid_Email2` and I couldn't find anything to dictate whether an email is a personal email or a business one. I don't think the approach used in the frontend is valid here.~~

UPDATE: Using `email_provider_info` gem instead.

**Pending - Part 2 (separate PR):**
Real-time company creation for new contacts

## Type of change

- [x] New feature (non-breaking change which adds functionality)

## How Has This Been Tested?

```bash
# Run all new tests
bundle exec rspec spec/enterprise/services/companies/business_email_detector_service_spec.rb \
  spec/enterprise/jobs/migration/company_account_batch_job_spec.rb \
  spec/enterprise/jobs/migration/company_backfill_job_spec.rb

# Run RuboCop
bundle exec rubocop enterprise/app/services/companies/business_email_detector_service.rb \
  enterprise/app/jobs/migration/company_account_batch_job.rb \
  enterprise/app/jobs/migration/company_backfill_job.rb \
  lib/tasks/companies.rake
```

**Performance optimization:**
- Uses `disposable_domain?` instead of `disposable?` to avoid DNS MX lookups (discovered via tcpdump analysis - `disposable?` was making network calls for every email, causing 100x slowdown)

## Checklist:

- [x] My code follows the style guidelines of this project
- [x] I have performed a self-review of my code
- [x] I have commented on my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my feature works
- [x] New and existing unit tests pass locally with my changes
- [ ] Any dependent changes have been merged and published in downstream modules

---------

Co-authored-by: Sojan Jose <sojan@pepalo.com>
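The backfill flow described above (an orchestrator walks all accounts, a per-account batch maps each contact to a company by email domain) can be sketched in plain Ruby. This is only an illustration under stated assumptions: `Contact`, `EXCLUDED_DOMAINS`, `business_domain`, `backfill_account`, and `backfill_all` are hypothetical names, the hard-coded domain list stands in for the `ValidEmail2` and `email_provider_info` checks, and the real jobs work on ActiveRecord models that this page does not show.

```ruby
# Hypothetical in-memory stand-in for a contact record.
Contact = Struct.new(:id, :account_id, :email, keyword_init: true)

# Illustrative list only; the PR delegates this decision to gems.
EXCLUDED_DOMAINS = %w[gmail.com yahoo.com mailinator.com].freeze

# Returns the lowercased domain for a business email, or nil otherwise.
def business_domain(email)
  _local, separator, domain = email.to_s.rpartition('@')
  return nil if separator.empty? || domain.empty?

  domain = domain.downcase
  EXCLUDED_DOMAINS.include?(domain) ? nil : domain
end

# Per-account step: group contact ids under the domain they belong to.
def backfill_account(contacts)
  contacts.each_with_object({}) do |contact, companies|
    domain = business_domain(contact.email)
    next if domain.nil?

    (companies[domain] ||= []) << contact.id
  end
end

# Orchestrator step: run the per-account step once for every account.
def backfill_all(contacts)
  contacts.group_by(&:account_id).transform_values { |batch| backfill_account(batch) }
end

sample = [
  Contact.new(id: 1, account_id: 1, email: 'a@acme.com'),
  Contact.new(id: 2, account_id: 1, email: 'b@gmail.com'),
  Contact.new(id: 3, account_id: 1, email: 'c@ACME.com'),
  Contact.new(id: 4, account_id: 2, email: 'd@example.org')
]
backfill_all(sample)
# returns { 1 => { 'acme.com' => [1, 3] }, 2 => { 'example.org' => [4] } }
```

Splitting the work per account keeps each unit small and retryable, which is presumably why the PR pairs an orchestrator job with a per-account batch job.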
@@ -0,0 +1,99 @@
require 'rails_helper'

RSpec.describe Companies::BusinessEmailDetectorService, type: :service do
  let(:service) { described_class.new(email) }

  describe '#perform' do
    context 'when email is from a business domain' do
      let(:email) { 'user@acme.com' }
      let(:valid_email_address) { instance_double(ValidEmail2::Address, valid?: true, disposable_domain?: false) }

      before do
        allow(ValidEmail2::Address).to receive(:new).with(email).and_return(valid_email_address)
        allow(EmailProviderInfo).to receive(:call).with(email).and_return(nil)
      end

      it 'returns true' do
        expect(service.perform).to be(true)
      end
    end

    context 'when email is from gmail' do
      let(:email) { 'user@gmail.com' }
      let(:valid_email_address) { instance_double(ValidEmail2::Address, valid?: true, disposable_domain?: false) }

      before do
        allow(ValidEmail2::Address).to receive(:new).with(email).and_return(valid_email_address)
        allow(EmailProviderInfo).to receive(:call).with(email).and_return('gmail')
      end

      it 'returns false' do
        expect(service.perform).to be(false)
      end
    end

    context 'when email is from Brazilian free provider' do
      let(:email) { 'user@uol.com.br' }
      let(:valid_email_address) { instance_double(ValidEmail2::Address, valid?: true, disposable_domain?: false) }

      before do
        allow(ValidEmail2::Address).to receive(:new).with(email).and_return(valid_email_address)
        allow(EmailProviderInfo).to receive(:call).with(email).and_return('uol')
      end

      it 'returns false' do
        expect(service.perform).to be(false)
      end
    end

    context 'when email is disposable' do
      let(:email) { 'user@mailinator.com' }
      let(:disposable_email_address) { instance_double(ValidEmail2::Address, valid?: true, disposable_domain?: true) }

      it 'returns false' do
        allow(ValidEmail2::Address).to receive(:new).with(email).and_return(disposable_email_address)
        expect(service.perform).to be(false)
      end
    end

    context 'when email is invalid format' do
      let(:email) { 'invalid-email' }
      let(:invalid_email_address) { instance_double(ValidEmail2::Address, valid?: false) }

      it 'returns false' do
        allow(ValidEmail2::Address).to receive(:new).with(email).and_return(invalid_email_address)
        expect(service.perform).to be(false)
      end
    end

    context 'when email is nil' do
      let(:email) { nil }

      it 'returns false' do
        expect(service.perform).to be(false)
      end
    end

    context 'when email is empty string' do
      let(:email) { '' }

      it 'returns false' do
        expect(service.perform).to be(false)
      end
    end

    context 'when email domain is uppercase' do
      let(:email) { 'user@GMAIL.COM' }
      let(:valid_email_address) { instance_double(ValidEmail2::Address, valid?: true, disposable_domain?: false) }

      before do
        allow(ValidEmail2::Address).to receive(:new).with(email).and_return(valid_email_address)
        allow(EmailProviderInfo).to receive(:call).with(email).and_return('gmail')
      end

      it 'returns false (case insensitive)' do
        expect(service.perform).to be(false)
      end
    end
  end
end
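
The cases in this spec imply an order of checks: blank input, format validity, disposable domain, then known free provider. A minimal sketch of a detector with that shape follows. It is not the service under test: `BusinessEmailDetectorSketch`, `FREE_PROVIDERS`, `DISPOSABLE_DOMAINS`, and the regular expression are hypothetical stand-ins for `ValidEmail2::Address` and `EmailProviderInfo`, chosen so the example runs without either gem.

```ruby
# Hypothetical stand-in for the detector exercised by the spec above.
class BusinessEmailDetectorSketch
  # Illustrative data only; the real service asks the gems instead.
  FREE_PROVIDERS = { 'gmail.com' => 'gmail', 'uol.com.br' => 'uol' }.freeze
  DISPOSABLE_DOMAINS = %w[mailinator.com].freeze

  def initialize(email)
    @email = email
  end

  # Returns true only for a well-formed address on a domain that is
  # neither disposable nor a known free provider.
  def perform
    return false if @email.nil? || @email.strip.empty?
    return false unless valid_format?
    return false if disposable_domain?

    free_provider.nil?
  end

  private

  # Lowercasing makes the lookups case-insensitive.
  def domain
    @email.split('@').last.downcase
  end

  # Simplified format check; assumed, not the ValidEmail2 rule set.
  def valid_format?
    @email.match?(/\A[^@\s]+@[^@\s]+\.[^@\s]+\z/)
  end

  # Pure list lookup, mirroring the no-network intent of disposable_domain?.
  def disposable_domain?
    DISPOSABLE_DOMAINS.include?(domain)
  end

  # Mirrors a provider lookup that yields nil for unknown domains.
  def free_provider
    FREE_PROVIDERS[domain]
  end
end
```

Checking the disposable list before the provider lookup matches the disposable-address case in the spec, which stubs only `ValidEmail2::Address` and never reaches a provider stub.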