Persian to Arabic: AI Translation Comparison
Persian (Farsi) and Arabic have a deep historical relationship spanning over a millennium of cultural, literary, and religious contact. Persian has approximately 110 million speakers across Iran, Afghanistan (as Dari), and Tajikistan (as Tajik), while Arabic has 400 million speakers. Despite both using Arabic script (with Persian adding four letters), they belong to different language families: Persian is Indo-European (Iranian branch) with SOV word order, while Arabic is Semitic (Afroasiatic) with VSO/SVO flexibility. Persian borrowed massively from Arabic, with an estimated 40 to 60 percent of its vocabulary being Arabic-origin, but these loanwords often shifted in meaning and pronunciation. This pair is critical for Middle Eastern diplomacy, religious scholarship, trade, media, and the large Iranian diaspora in Arabic-speaking countries.
This comparison evaluates five leading AI translation systems on Persian-to-Arabic accuracy, naturalness, and suitability for different use cases.
Translation comparisons are based on automated metrics and editorial evaluation. Quality varies by language pair and content type.
Accuracy Comparison Table
| System | BLEU Score | COMET Score | Editorial Rating (1-10) | Best For |
|---|---|---|---|---|
| Google Translate | 30.8 | 0.836 | 7.2 | General-purpose, speed |
| DeepL | 33.5 | 0.853 | 7.7 | Formal content |
| GPT-4 | 36.2 | 0.868 | 8.2 | Cultural context, register |
| Claude | 32.8 | 0.848 | 7.5 | Long-form content |
| NLLB-200 | 28.1 | 0.816 | 6.8 | Budget, self-hosted |
Translation Quality Metrics: BLEU, COMET, and Human Evaluation Explained
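To make the BLEU column more concrete, here is a minimal sketch of how a BLEU-style score is computed for one sentence. It is a simplification under stated assumptions (whitespace tokenization, a single reference, no smoothing) and is not the tooling behind the table above; the function names `ngrams` and `simple_bleu` are illustrative.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(hypothesis, reference, max_n=4):
    """Sentence-level BLEU sketch: clipped n-gram precision times brevity penalty.

    Illustrative only: whitespace tokenization, one reference, no smoothing.
    Returns a value in [0, 1]; published tables usually scale this by 100.
    """
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        # Clip each hypothesis n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = sum(hyp_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, any zero precision zeroes the score
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: hypotheses shorter than the reference are penalized.
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / len(hyp))
    return bp * math.exp(sum(log_precisions) / max_n)
```

Because BLEU only counts surface n-gram overlap, a fluent translation that words things differently from the reference can score low, which is one reason the table pairs it with COMET and an editorial rating.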
Example Translations
Formal Business Email
Source: “Jenab-e Aghaye Hosseini, ba masarrat be ettela-e shoma miresanim ke darkhast-e shoma mored-e tasvib gharar gerefteh ast. Lotfan madarek-e zamimeh ra molaheze farmayid.”
| System | Translation |
|---|---|
| Google Translate | As-sayyed al-muhtaram Hosseini, nasurru bi-iblagikum anna talabakum qad tamma al-muwafaqa alayhi. Yurja al-ittila’ ala al-watha’iq al-murfaqa. |
| DeepL | Hadrat as-sayyed Hosseini, nasurru an nubilagkum anna talabakum qad hadhiya bil-muwafaqa. Yurja murajaa’t al-mustanadat al-murfaqa. |
| GPT-4 | Hadrat as-sayyed Hosseini al-muhtaram, yusiruna wa yatiibu lana an nubilagkum bi-anna talabakum qad naala al-muwafaqa ar-rasmiyya. Narjoo minakum at-tafaddul bil-ittila’ ala al-watha’iq al-murfaqa. |
| Claude | As-sayyed Hosseini al-muhtaram, nasurru bi-iblagikum anna talabakum qad tamma al-muwafaqa alayhi. Yurja al-ittila’ ala al-watha’iq al-murfaqa. |
| NLLB-200 | As-sayyed Hosseini, talabakum tamma al-muwafaqa alayhi. Al-watha’iq murfaqa. |
Assessment: GPT-4 produces the most elaborately formal Modern Standard Arabic with yusiruna wa yatiibu lana (it pleases and delights us), naala al-muwafaqa ar-rasmiyya (received official approval), and the polite Narjoo minakum at-tafaddul. This matches the elaborate Persian formal register of the original. NLLB-200 is too terse for either culture’s business norms.
Casual Conversation
Source: “Salam! Dirooz raftam oon restaurant-e jadideh. Kheyli ali bood! Hatman bero.”
| System | Translation |
|---|---|
| Google Translate | Marhaba! Ams ruht ila al-mat’am al-jadid. Kan mumtazan jiddan! Lazim truh. |
| DeepL | Ahlan! Ams zurt al-mat’am al-jadid. Kan ra’i’an! Yajib an tadhhab bi-kull ta’kid. |
| GPT-4 | Hala! Ams ruht ala al-mat’am al-jdid. Wallahi kan khtiyaar! Lazim truh, jad! |
| Claude | Marhaba! Ams dhahbt ila al-mat’am al-jadid. Kan jayyidan jiddan! Yajib an tadhhab. |
| NLLB-200 | Marhaba. Dhahbt ila al-mat’am al-jadid. Kan jayyidan. Yajib an tadhhab. |
Assessment: GPT-4 captures casual Arabic best with Levantine dialect features (Hala, ruht ala, khtiyaar, jad). DeepL uses the polite bi-kull ta’kid (absolutely). Google also produces colloquial Arabic. NLLB-200 and Claude default to flat MSA with the bland jayyidan.
Technical Content
Source: “In model-e yadgiri-ye amigh az me’mari-ye transformer ba saz-o-kar-e tavajoh baraye pardazesh-e dadeha-ye motevali estefadeh mikonad.”
| System | Translation |
|---|---|
| Google Translate | Yastakhdimu hadha an-namudhaj at-ta’allum al-‘amiq binya transformer ma’a aliyyat al-intibah li-mu’alajat al-bayanat at-tasalsuliyya. |
| DeepL | Yastakhdimu namudhaj at-ta’allum al-‘amiq binya transformer mujahazza bi-aliyyat al-intibah li-mu’alajat al-bayanat at-tatabu’iyya. |
| GPT-4 | Hadha al-deep learning model yastakhdimu transformer architecture ma’a attention mechanism li-mu’alajat sequential data. |
| Claude | Yastakhdimu namudhaj at-ta’allum al-‘amiq binya transformer ma’a aliyyat al-intibah li-mu’alajat al-bayanat at-tasalsuliyya. |
| NLLB-200 | Yastakhdimu namudhaj at-ta’allum al-‘amiq binya al-muhawwil ma’a aliyyat al-intibah li-mu’alajat al-bayanat. |
Assessment: GPT-4 keeps most terms in English, common in Arabic tech writing. NLLB-200 translates transformer as al-muhawwil, which Arabic ML practitioners avoid. Other systems correctly retain transformer as a loanword. See State of Machine Translation in 2026 for broader analysis.
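The terminology point above can be checked mechanically. The sketch below flags outputs that drop the loanword transformer or use the rendering al-muhawwil. The glossary contents and the function name `check_terms` are assumptions made for illustration; the two sample strings are the Claude and NLLB-200 outputs from the table.

```python
# Hypothetical glossary: terms expected to stay as loanwords, and
# renderings to flag. Both lists are illustrative assumptions.
KEEP_AS_LOANWORD = ["transformer"]
FLAGGED_RENDERINGS = {"al-muhawwil": "transformer"}


def check_terms(translation):
    """Return a list of terminology warnings for one transliterated output."""
    text = translation.lower()
    warnings = []
    for term in KEEP_AS_LOANWORD:
        if term not in text:
            warnings.append(f"missing loanword: {term}")
    for rendering, term in FLAGGED_RENDERINGS.items():
        if rendering in text:
            warnings.append(f"'{rendering}' used for '{term}'")
    return warnings


outputs = {
    "Claude": "Yastakhdimu namudhaj at-ta’allum al-‘amiq binya transformer "
              "ma’a aliyyat al-intibah li-mu’alajat al-bayanat at-tasalsuliyya.",
    "NLLB-200": "Yastakhdimu namudhaj at-ta’allum al-‘amiq binya al-muhawwil "
                "ma’a aliyyat al-intibah li-mu’alajat al-bayanat.",
}
for system, text in outputs.items():
    print(system, check_terms(text))
```

A substring check like this is deliberately crude. On real Arabic-script output it would need the Arabic spellings of each term and some handling of attached prefixes and clitics.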
Strengths and Weaknesses
Google Translate
Strengths: Fast and free. Benefits from Google’s investment in both Persian and Arabic NLP. Weaknesses: Tends toward MSA output outside clearly casual input. Less natural than GPT-4 on dialectal Arabic and on formal register nuance.
DeepL
Strengths: Better formal MSA output. Handles Persian-Arabic shared vocabulary conversion reasonably. Weaknesses: Limited dialectal Arabic support. May miss Persian-specific Arabic loanword meaning shifts.
GPT-4
Strengths: Best cultural context and register handling. Can target specific Arabic dialects. Handles the complex shared vocabulary most accurately. Weaknesses: Higher cost. May require explicit prompting for dialect selection.
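Since dialect selection may need explicit prompting, one option is to build the instruction from a small table of target varieties. This is a sketch only: the variety labels, the prompt wording, and the function name `build_prompt` are assumptions, not a tested prompt for any particular model, and sending the prompt to an LLM is left out.

```python
# Hypothetical variety labels and wording; adjust to your own conventions.
VARIETIES = {
    "msa": "Modern Standard Arabic",
    "levantine": "Levantine colloquial Arabic",
    "gulf": "Gulf colloquial Arabic",
    "egyptian": "Egyptian colloquial Arabic",
}


def build_prompt(persian_text, variety="msa", register="neutral"):
    """Return a translation instruction that names the target Arabic variety."""
    if variety not in VARIETIES:
        raise ValueError(f"unknown variety: {variety!r}")
    return (
        f"Translate the following Persian text into {VARIETIES[variety]}. "
        f"Use a {register} register and preserve courtesy formulas. "
        "Return only the translation.\n\n"
        f"{persian_text}"
    )


print(build_prompt("Salam! Dirooz raftam oon restaurant-e jadideh.",
                   variety="levantine", register="casual"))
```

Naming the variety explicitly avoids leaving the MSA-versus-dialect choice to the model's default.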
Claude
Strengths: Consistent long-form quality. Good for academic and analytical content. Weaknesses: Less distinctive than GPT-4 on cultural adaptation and dialectal nuance.
NLLB-200
Strengths: Free and self-hostable. Both languages are well-represented in NLLB-200. Weaknesses: Lowest quality. Over-literal translations. Missing courtesy markers. No dialectal support.
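For readers who want to try self-hosting, the following is a minimal sketch using the Hugging Face `transformers` translation pipeline. The checkpoint `facebook/nllb-200-distilled-600M` is an assumed choice (larger NLLB-200 checkpoints exist), and `pes_Arab` and `arb_Arab` are the FLORES-200 codes for Western Persian and Modern Standard Arabic. The helper names `load_translator` and `translate` are illustrative.

```python
# FLORES-200 language codes used by NLLB checkpoints.
SRC_LANG = "pes_Arab"  # Western (Iranian) Persian, Arabic script
TGT_LANG = "arb_Arab"  # Modern Standard Arabic
MODEL_NAME = "facebook/nllb-200-distilled-600M"  # assumed checkpoint choice


def load_translator(model_name=MODEL_NAME):
    """Build a Persian-to-Arabic pipeline (downloads the model on first use)."""
    # Imported lazily so this module loads without transformers installed.
    from transformers import pipeline
    return pipeline("translation", model=model_name,
                    src_lang=SRC_LANG, tgt_lang=TGT_LANG)


def translate(translator, texts, max_length=256):
    """Translate a list of Persian-script strings; returns a list of strings."""
    results = translator(texts, max_length=max_length)
    return [r["translation_text"] for r in results]
```

Typical usage would be `translate(load_translator(), [persian_text])`. Note that the model expects Persian in Arabic script, not the Latin transliteration used in the examples on this page.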
Recommendations
| Use Case | Recommended System |
|---|---|
| Personal communication | Google Translate |
| Diplomatic correspondence | GPT-4 |
| Religious scholarship | GPT-4 or Claude |
| Technical content | DeepL |
| Long-form content | Claude |
| High-volume processing | NLLB-200 (self-hosted) |
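If translation requests are routed in code, the table above can be expressed as a simple lookup. This is a sketch: the function name `recommend` and the fallback default are assumptions, not part of the evaluation.

```python
# Use-case to system mapping, copied from the recommendations table.
RECOMMENDED = {
    "personal communication": "Google Translate",
    "diplomatic correspondence": "GPT-4",
    "religious scholarship": "GPT-4 or Claude",
    "technical content": "DeepL",
    "long-form content": "Claude",
    "high-volume processing": "NLLB-200 (self-hosted)",
}


def recommend(use_case, default="Google Translate"):
    """Return the recommended system for a use case (case-insensitive).

    The default for unlisted use cases is an assumption for illustration.
    """
    return RECOMMENDED.get(use_case.strip().lower(), default)


print(recommend("Technical content"))
```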
Best Translation AI in 2026: Complete Model Comparison
Key Takeaways
- GPT-4 leads for Persian-to-Arabic with the best cultural sensitivity and register handling across both Modern Standard Arabic and dialectal varieties.
- The massive Arabic vocabulary layer in Persian provides useful bridges but creates traps where borrowed words have shifted meaning over centuries.
- SOV-to-VSO/SVO restructuring is a fundamental grammatical challenge, and the systems handle it with varying degrees of naturalness.
- The choice between MSA and dialectal Arabic output significantly affects usability and should be guided by the target audience.
Next Steps
- Try it yourself: Compare these systems on your own text in the Translation AI Playground: Compare Models Side-by-Side.
- Reverse direction: See Arabic to Persian: AI Translation Comparison.
- Check the leaderboard: Browse our full Translation Accuracy Leaderboard by Language Pair.
- Full model comparison: Read Best Translation AI in 2026: Complete Model Comparison.