Persian to Arabic: AI Translation Comparison

Persian (Farsi) and Arabic have a deep historical relationship spanning over a millennium of cultural, literary, and religious contact. Persian has approximately 110 million speakers across Iran, Afghanistan (as Dari), and Tajikistan (as Tajik), while Arabic has roughly 400 million speakers. Although Persian and Arabic share the Arabic script (Persian adds four letters, and Tajik is written in Cyrillic), they belong to different language families: Persian is Indo-European (Iranian branch) with SOV word order, while Arabic is Semitic (Afroasiatic) and alternates between VSO and SVO. Persian borrowed massively from Arabic, with an estimated 40 to 60 percent of its vocabulary of Arabic origin, but many of these loanwords have shifted in meaning and pronunciation. This pair matters for Middle Eastern diplomacy, religious scholarship, trade, media, and the large Iranian diaspora in Arabic-speaking countries.

This comparison evaluates five leading AI translation systems on Persian-to-Arabic accuracy, naturalness, and suitability for different use cases.

Translation comparisons are based on automated metrics and editorial evaluation. Quality varies by language pair and content type.

Accuracy Comparison Table

System	BLEU Score	COMET Score	Editorial Rating (1-10)	Best For
Google Translate	30.8	0.836	7.2	General-purpose, speed
DeepL	33.5	0.853	7.7	Formal content
GPT-4	36.2	0.868	8.2	Cultural context, register
Claude	32.8	0.848	7.5	Long-form content
NLLB-200	28.1	0.816	6.8	Budget, self-hosted
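
As a quick consistency check on the table above, the sketch below ranks the five systems by each metric and tests whether BLEU, COMET, and the editorial rating produce the same ordering. The scores are copied from the table; the variable and function names are illustrative.

```python
# Scores copied from the accuracy table: (BLEU, COMET, editorial rating).
SCORES = {
    "Google Translate": (30.8, 0.836, 7.2),
    "DeepL": (33.5, 0.853, 7.7),
    "GPT-4": (36.2, 0.868, 8.2),
    "Claude": (32.8, 0.848, 7.5),
    "NLLB-200": (28.1, 0.816, 6.8),
}
METRICS = ("BLEU", "COMET", "Editorial")


def rank_by(metric_index: int) -> list[str]:
    """Return system names sorted from best to worst on one metric."""
    return sorted(SCORES, key=lambda name: SCORES[name][metric_index], reverse=True)


rankings = {metric: rank_by(i) for i, metric in enumerate(METRICS)}
for metric, order in rankings.items():
    print(f"{metric:>9}: {' > '.join(order)}")

# True when all three metrics order the systems identically.
metrics_agree = len({tuple(order) for order in rankings.values()}) == 1
print("All metrics agree on ranking:", metrics_agree)
```

On these numbers the three orderings coincide (GPT-4, DeepL, Claude, Google Translate, NLLB-200), so the choice of metric does not change the ranking for this language pair.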

For background on these metrics, see Translation Quality Metrics: BLEU, COMET, and Human Evaluation Explained.

Example Translations

Formal Business Email

Source: “Jenab-e Aghaye Hosseini, ba masarrat be ettela-e shoma miresanim ke darkhast-e shoma mored-e tasvib gharar gerefteh ast. Lotfan madarek-e zamimeh ra molaheze farmayid.”

System	Translation
Google	As-sayyed al-muhtaram Hosseini, nasurru bi-iblagikum anna talabakum qad tamma al-muwafaqa alayhi. Yurja al-ittila’ ala al-watha’iq al-murfaqa.
DeepL	Hadrat as-sayyed Hosseini, nasurru an nubilagkum anna talabakum qad hadhiya bil-muwafaqa. Yurja muraja’at al-mustanadat al-murfaqa.
GPT-4	Hadrat as-sayyed Hosseini al-muhtaram, yusiruna wa yatiibu lana an nubilagkum bi-anna talabakum qad naala al-muwafaqa ar-rasmiyya. Narjoo minakum at-tafaddul bil-ittila’ ala al-watha’iq al-murfaqa.
Claude	As-sayyed Hosseini al-muhtaram, nasurru bi-iblagikum anna talabakum qad tamma al-muwafaqa alayhi. Yurja al-ittila’ ala al-watha’iq al-murfaqa.
NLLB-200	As-sayyed Hosseini, talabakum tamma al-muwafaqa alayhi. Al-watha’iq murfaqa.

Assessment: GPT-4 produces the most elaborately formal Modern Standard Arabic with yusiruna wa yatiibu lana (it pleases and delights us), naala al-muwafaqa ar-rasmiyya (received official approval), and the polite Narjoo minakum at-tafaddul. This matches the elaborate Persian formal register of the original. NLLB-200 is too terse for either culture’s business norms.

Casual Conversation

Source: “Salam! Dirooz raftam oon restaurant-e jadideh. Kheyli ali bood! Hatman bero.”

System	Translation
Google	Marhaba! Ams ruht ila al-mat’am al-jadid. Kan mumtazan jiddan! Lazim truh.
DeepL	Ahlan! Ams zurt al-mat’am al-jadid. Kan ra’i’an! Yajib an tadhhab bi-kull ta’kid.
GPT-4	Hala! Ams ruht ala al-mat’am al-jdid. Wallahi kan khtiyaar! Lazim truh, jad!
Claude	Marhaba! Ams dhahbt ila al-mat’am al-jadid. Kan jayyidan jiddan! Yajib an tadhhab.
NLLB-200	Marhaba. Dhahbt ila al-mat’am al-jadid. Kan jayyidan. Yajib an tadhhab.

Assessment: GPT-4 captures casual Arabic best with Levantine dialect features (Hala, ruht ala, khtiyaar, jad). Google also produces colloquial forms (ruht, Lazim truh), though mixed with MSA (mumtazan jiddan). DeepL stays in MSA and adds the emphatic bi-kull ta’kid (absolutely), which is stiffer than the source. NLLB-200 and Claude default to flat MSA with the bland jayyidan.
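
The register split described above can be approximated with a simple marker count. This is a rough heuristic, not a dialect classifier: the marker lists are taken only from the transliterated examples in this section, and the function name guess_register is hypothetical.

```python
import re

# Markers drawn from the transliterated examples above (illustrative, not exhaustive).
COLLOQUIAL_MARKERS = {"hala", "ruht", "lazim", "truh", "wallahi", "khtiyaar", "jad"}
MSA_MARKERS = {"dhahbt", "zurt", "yajib", "tadhhab", "jayyidan", "jiddan"}


def guess_register(text: str) -> str:
    """Label a transliterated sentence as colloquial, MSA, mixed, or unknown."""
    # Split on anything that is not a letter or apostrophe, so "al-jadid" -> "al", "jadid".
    words = set(re.findall(r"[a-z’']+", text.lower()))
    colloquial = len(words & COLLOQUIAL_MARKERS)
    msa = len(words & MSA_MARKERS)
    if colloquial and msa:
        return "mixed"
    if colloquial:
        return "colloquial"
    if msa:
        return "MSA"
    return "unknown"


samples = {
    "Google": "Marhaba! Ams ruht ila al-mat’am al-jadid. Kan mumtazan jiddan! Lazim truh.",
    "GPT-4": "Hala! Ams ruht ala al-mat’am al-jdid. Wallahi kan khtiyaar! Lazim truh, jad!",
    "Claude": "Marhaba! Ams dhahbt ila al-mat’am al-jadid. Kan jayyidan jiddan! Yajib an tadhhab.",
}
for system, sentence in samples.items():
    print(f"{system}: {guess_register(sentence)}")
```

With these marker lists, the GPT-4 output is labeled colloquial, the Claude output MSA, and the Google output mixed. A production check would need real Arabic-script text and a far larger lexicon or a trained dialect identifier.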

Technical Content

Source: “In model-e yadgiri-ye amigh az me’mari-ye transformer ba saz-o-kar-e tavajoh baraye pardazesh-e dadeha-ye motevali estefadeh mikonad.”

System	Translation
Google	Yastakhdimu hadha an-namudhaj at-ta’allum al-‘amiq binya transformer ma’a aliyyat al-intibah li-mu’alajat al-bayanat at-tasalsuliyya.
DeepL	Yastakhdimu namudhaj at-ta’allum al-‘amiq binya transformer mujahazza bi-aliyyat al-intibah li-mu’alajat al-bayanat at-tatabu’iyya.
GPT-4	Hadha al-deep learning model yastakhdimu transformer architecture ma’a attention mechanism li-mu’alajat sequential data.
Claude	Yastakhdimu namudhaj at-ta’allum al-‘amiq binya transformer ma’a aliyyat al-intibah li-mu’alajat al-bayanat at-tasalsuliyya.
NLLB-200	Yastakhdimu namudhaj at-ta’allum al-‘amiq binya al-muhawwil ma’a aliyyat al-intibah li-mu’alajat al-bayanat.

Assessment: GPT-4 keeps most terms in English, a practice common in Arabic tech writing. NLLB-200 translates transformer as al-muhawwil, which Arabic ML practitioners avoid, and drops the word for sequential (motevali) altogether. The other systems retain transformer as a loanword while rendering the surrounding terms in Arabic. See State of Machine Translation in 2026 for broader analysis.
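
One way to catch the terminology issue noted above is a do-not-translate check that flags outputs where a protected term from the source is missing. This is a minimal sketch, assuming a hypothetical glossary and the helper name missing_terms; the sample strings are the transliterated outputs from the table.

```python
# Terms that should survive translation unchanged (hypothetical glossary).
PROTECTED_TERMS = ("transformer",)


def missing_terms(source: str, translation: str, terms=PROTECTED_TERMS) -> list[str]:
    """Return protected terms present in the source but absent from the translation."""
    src, tgt = source.lower(), translation.lower()
    return [term for term in terms if term in src and term not in tgt]


source = (
    "In model-e yadgiri-ye amigh az me’mari-ye transformer ba saz-o-kar-e "
    "tavajoh baraye pardazesh-e dadeha-ye motevali estefadeh mikonad."
)
outputs = {
    "Claude": (
        "Yastakhdimu namudhaj at-ta’allum al-‘amiq binya transformer ma’a "
        "aliyyat al-intibah li-mu’alajat al-bayanat at-tasalsuliyya."
    ),
    "NLLB-200": (
        "Yastakhdimu namudhaj at-ta’allum al-‘amiq binya al-muhawwil ma’a "
        "aliyyat al-intibah li-mu’alajat al-bayanat."
    ),
}
for system, text in outputs.items():
    print(system, "missing:", missing_terms(source, text))
```

The substring match only works when the term stays in Latin script on both sides. If the target text transliterates the term into Arabic script, the glossary would need to list the accepted Arabic spellings as well.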

Strengths and Weaknesses

Google Translate

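Strengths: Fast and free. Benefits from Google’s investment in both Persian and Arabic NLP. Weaknesses: Leans toward MSA and can mix registers in casual text, as in the example above where colloquial ruht and Lazim truh sit alongside MSA mumtazan jiddan. Less natural than GPT-4 on dialectal Arabic and formal register nuance.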

DeepL

Strengths: Better formal MSA output. Handles Persian-Arabic shared vocabulary conversion reasonably. Weaknesses: Limited dialectal Arabic support. May miss Persian-specific Arabic loanword meaning shifts.

GPT-4

Strengths: Best cultural context and register handling. Can target specific Arabic dialects. Handles the complex shared vocabulary most accurately. Weaknesses: Higher cost. May require explicit prompting for dialect selection.
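
Since dialect selection may need explicit prompting, the sketch below builds a prompt that states the target variety and register up front. The prompt wording, the variety labels, and the function name build_prompt are assumptions for illustration; no particular model API is assumed.

```python
# Hypothetical labels for target varieties; extend as needed.
VARIETIES = {
    "msa": "Modern Standard Arabic",
    "levantine": "Levantine Arabic",
    "egyptian": "Egyptian Arabic",
    "gulf": "Gulf Arabic",
}


def build_prompt(persian_text: str, variety: str = "msa", register: str = "formal") -> str:
    """Compose a translation prompt that pins down dialect and register."""
    if variety not in VARIETIES:
        raise ValueError(f"unknown variety: {variety!r}")
    return (
        f"Translate the following Persian text into {VARIETIES[variety]}. "
        f"Use a {register} register and keep the tone of the original. "
        "Return only the translation.\n\n"
        f"Persian text:\n{persian_text}"
    )


print(build_prompt("Salam! Dirooz raftam oon restaurant-e jadideh.", "levantine", "casual"))
```

Rejecting unknown variety keys early keeps a typo from silently falling back to an unintended dialect.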

Claude

Strengths: Consistent long-form quality. Good for academic and analytical content. Weaknesses: Less distinctive than GPT-4 on cultural adaptation and dialectal nuance.

NLLB-200

Strengths: Free and self-hostable. Both languages are well-represented in NLLB-200. Weaknesses: Lowest scores in this comparison (BLEU 28.1, COMET 0.816, editorial rating 6.8). Over-literal translations. Missing courtesy markers. No dialectal support.
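
For readers who want to try the self-hosted route, here is a minimal sketch using the Hugging Face transformers library. The checkpoint name is an assumption (the distilled 600M model is chosen only because it is small), and the language codes are the FLORES-200 codes for Western Persian and Modern Standard Arabic. Input must be Persian in Arabic script, not the Latin transliteration used in this article.

```python
# FLORES-200 language codes used by NLLB-200.
SRC_LANG = "pes_Arab"  # Western (Iranian) Persian, Arabic script
TGT_LANG = "arb_Arab"  # Modern Standard Arabic
MODEL_NAME = "facebook/nllb-200-distilled-600M"  # assumed checkpoint; larger ones exist


def translate(text: str, max_new_tokens: int = 256) -> str:
    """Translate Persian (Arabic script) to MSA with a locally loaded NLLB-200 model."""
    # Imported lazily so this module loads even where transformers is not installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    # Note: the model is reloaded on every call; cache it for real workloads.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, src_lang=SRC_LANG)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        # Force the decoder to start with the target-language token.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(TGT_LANG),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
```

Calling translate() downloads the model weights on first use. Because NLLB-200 only exposes a single code for MSA here, this setup cannot target a spoken dialect, which matches the weakness listed above.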

Recommendations

Use Case	Recommended System
Personal communication	Google Translate
Diplomatic correspondence	GPT-4
Religious scholarship	GPT-4 or Claude
Technical content	DeepL
Long-form content	Claude
High-volume processing	NLLB-200 (self-hosted)

Key Takeaways

GPT-4 leads for Persian-to-Arabic (BLEU 36.2, COMET 0.868, editorial rating 8.2), with the best cultural sensitivity and register handling across both Modern Standard Arabic and dialectal varieties.
The massive Arabic vocabulary layer in Persian provides useful bridges but creates traps where borrowed words have shifted meaning over centuries.
SOV-to-VSO/SVO restructuring is a fundamental grammatical challenge, and the systems handle it with varying degrees of naturalness.
The choice between MSA and dialectal Arabic output significantly affects usability and should be guided by the target audience.

Next Steps

Try it yourself: Compare these systems on your own text in the Translation AI Playground: Compare Models Side-by-Side.
Related comparison: See Tamil to Telugu: AI Translation Comparison.
Check the leaderboard: Browse our full Translation Accuracy Leaderboard by Language Pair.
Full model comparison: Read Best Translation AI in 2026: Complete Model Comparison.