Persian to English: AI Translation Comparison
Persian (Farsi) is spoken by approximately 110 million people across Iran, Afghanistan (as Dari), and Tajikistan (as Tajik), with significant diaspora communities in the United States, Canada, Germany, and the United Kingdom. It is an Indo-Iranian language written in a modified Arabic script (Perso-Arabic), with Tajik using Cyrillic. Persian features relatively simple grammar for an Indo-European language — no grammatical gender, no case marking on nouns, and a consistent SOV word order — but has a rich literary tradition, complex politeness registers (taarof), and extensive Arabic vocabulary integration. Translation demand is driven by diaspora communication, academic and literary research, legal and immigration documentation, business relations, and media.
This comparison evaluates five leading AI translation systems on Persian-to-English accuracy, naturalness, and suitability for different use cases.
Translation comparisons are based on automated metrics and editorial evaluation. Quality varies by language pair and content type.
Accuracy Comparison Table
| System | BLEU Score | COMET Score | Editorial Rating (1-10) | Best For |
|---|---|---|---|---|
| Google Translate | 33.4 | 0.828 | 7.1 | General-purpose, free access |
| DeepL | 31.2 | 0.811 | 6.8 | Basic functionality |
| GPT-4 | 36.8 | 0.851 | 7.8 | Contextual understanding, literary quality |
| Claude | 35.1 | 0.839 | 7.4 | Long-form, academic content |
| NLLB-200 | 30.5 | 0.804 | 6.6 | Free, self-hosted option |
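One quick check on a table like this is whether the three measures rank the systems in the same order. The sketch below is illustrative only: the scores are copied from the table above, while the names `SCORES`, `METRICS`, and `rank_by` are hypothetical helpers introduced for this example.

```python
# Scores copied from the accuracy table: (BLEU, COMET, editorial rating).
SCORES = {
    "Google Translate": (33.4, 0.828, 7.1),
    "DeepL": (31.2, 0.811, 6.8),
    "GPT-4": (36.8, 0.851, 7.8),
    "Claude": (35.1, 0.839, 7.4),
    "NLLB-200": (30.5, 0.804, 6.6),
}
METRICS = ("BLEU", "COMET", "Editorial")


def rank_by(metric_index):
    """Return system names sorted from best to worst on one metric."""
    return sorted(SCORES, key=lambda name: SCORES[name][metric_index], reverse=True)


# One ranking per metric, then check whether all rankings are the same.
rankings = {metric: rank_by(i) for i, metric in enumerate(METRICS)}
metrics_agree = len({tuple(order) for order in rankings.values()}) == 1

for metric, order in rankings.items():
    print(f"{metric:>9}: {' > '.join(order)}")
print("All metrics give the same ordering:", metrics_agree)
```

For the figures in this table, all three measures produce the same ordering, so the choice of metric does not change which system comes out ahead.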
Example Translations
Formal Legal Document
Source: “Dadgah-e ali-ye keshvar ra’y-e khood ra dar mowred-e e’teraz-e matrah-shodeh be hoghoogh-e malekiyyat-e fekri sader kard.”
| System | Translation |
|---|---|
| Google Translate | The Supreme Court of the country issued its ruling on the objection raised regarding intellectual property rights. |
| DeepL | The country’s Supreme Court has issued its ruling on the objection to intellectual property rights. |
| GPT-4 | The Supreme Court has issued its ruling on the appeal filed regarding intellectual property rights. |
| Claude | The Supreme Court of the country has issued its ruling on the objection raised concerning intellectual property rights. |
| NLLB-200 | The Supreme Court of the country issued its ruling on the objection raised about intellectual property rights. |
Assessment: GPT-4 correctly renders “e’teraz-e matrah-shodeh” as “the appeal filed” rather than “the objection raised” — in a Supreme Court context, “appeal” is the appropriate English legal term. DeepL’s “objection to intellectual property rights” misrepresents the meaning (it is an objection concerning IP rights, not against them). Claude’s “concerning” is a natural legal English preposition for this context.
Casual Conversation
Source: “Salam, chetori? Kheili vaghte nadidemit. Biya berim ye chai bokhorim.”
| System | Translation |
|---|---|
| Google Translate | Hello, how are you? I haven’t seen you for a long time. Come, let’s go have a tea. |
| DeepL | Hello, how are you? It’s been a long time since I’ve seen you. Come, let’s go have tea. |
| GPT-4 | Hey, how are you? It’s been forever since I’ve seen you. Come on, let’s go grab a tea. |
| Claude | Hello, how are you? I haven’t seen you for a very long time. Come, let’s go have a tea. |
| NLLB-200 | Hello, how are you? I haven’t seen you for a long time. Come, let’s go drink a tea. |
Assessment: GPT-4 captures the casual warmth best with “Hey,” “It’s been forever,” and “grab a tea.” Persian casual speech is warmer and more effusive than English, and GPT-4’s translation best bridges this cultural gap. The phrase “chai bokhorim” (literally “let’s drink tea”) is a social invitation: “grab a tea” captures the casualness, while “drink a tea” (NLLB-200) sounds unnatural in English. Persian tea culture is central to social interaction.
Technical Content
Source: “In sakhtar-e narm-afzari az me’mari-ye microservice estefadeh mikonad va ba komak-e containerha moghayas-paziri-ye bala ra taamin mikonad.”
| System | Translation |
|---|---|
| Google Translate | This software structure uses microservice architecture and provides high scalability with the help of containers. |
| DeepL | This software architecture uses microservices and provides high scalability using containers. |
| GPT-4 | This software architecture utilizes a microservices-based design and achieves high scalability through containerization. |
| Claude | This software structure uses microservice architecture and provides high scalability with the help of containers. |
| NLLB-200 | This software structure uses microservice architecture and provides high scalability with the help of containers. |
Assessment: GPT-4 stands out with “containerization” (the correct technical term rather than “containers”) and “achieves high scalability through” (more natural than “provides…with the help of”). DeepL correctly uses “microservices” (plural) and has clean sentence flow. Google, Claude, and NLLB-200 produce identical but less polished translations with the awkward “with the help of.”
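The observation that several systems return the same sentence can be checked mechanically. The following is a minimal sketch, assuming the five outputs from the technical example are stored in a dictionary; `OUTPUTS`, `normalize`, and `group_identical` are hypothetical names used only for illustration.

```python
from collections import defaultdict

# Outputs copied from the technical-content example table.
OUTPUTS = {
    "Google Translate": "This software structure uses microservice architecture and provides high scalability with the help of containers.",
    "DeepL": "This software architecture uses microservices and provides high scalability using containers.",
    "GPT-4": "This software architecture utilizes a microservices-based design and achieves high scalability through containerization.",
    "Claude": "This software structure uses microservice architecture and provides high scalability with the help of containers.",
    "NLLB-200": "This software structure uses microservice architecture and provides high scalability with the help of containers.",
}


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def group_identical(outputs):
    """Group system names whose normalized translations match exactly."""
    groups = defaultdict(list)
    for system, text in outputs.items():
        groups[normalize(text)].append(system)
    return list(groups.values())


groups = group_identical(OUTPUTS)
for systems in groups:
    label = "identical" if len(systems) > 1 else "unique"
    print(f"{label}: {', '.join(systems)}")
```

Running the same grouping over a larger sample of sentences would give a rough sense of how often two systems actually coincide, which three examples cannot establish on their own.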
Strengths and Weaknesses
Google Translate
Strengths: Free and accessible. Handles Perso-Arabic script well. Substantial Persian web training data.
Weaknesses: Misses taarof (politeness) nuances. Literal translations. Less natural than GPT-4 or Claude.
DeepL
Strengths: Reasonable sentence restructuring. Acceptable for general content.
Weaknesses: Lower accuracy for Persian specifically. Occasional meaning distortion. Does not handle Dari or Tajik variants.
GPT-4
Strengths: Best overall quality. Excellent literary and contextual understanding. Handles taarof and register shifts. Strong technical and legal terminology.
Weaknesses: Higher cost. May occasionally mix Dari or Tajik influences with Iranian Persian.
Claude
Strengths: Strong quality for long documents. Good academic register. Consistent and reliable.
Weaknesses: Less dynamic with casual Persian. Sometimes overly literal with literary expressions.
NLLB-200
Strengths: Free and self-hostable. Covers Persian, Dari, and Tajik as separate languages. Reasonable baseline.
Weaknesses: Identical output to Google for many inputs. No register adaptation. Less fluent than GPT-4 or Claude.
Recommendations
| Use Case | Recommended System |
|---|---|
| Quick personal translation | Google Translate (free) |
| Legal and immigration docs | GPT-4 with human review |
| Literary and academic texts | GPT-4 or Claude |
| Business communication | GPT-4 |
| High-volume processing | NLLB-200 (self-hosted) |
| Diaspora communication | Google Translate or GPT-4 |
| News and media | Google Translate or Claude |
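For teams that route documents automatically, the recommendations above can be encoded as a simple lookup. This is only a sketch of one possible encoding: the table contents come from this article, while `RECOMMENDATIONS`, `NEEDS_HUMAN_REVIEW`, and `recommend` are hypothetical names and not part of any translation API.

```python
# Use case -> recommended systems, copied from the recommendations table.
RECOMMENDATIONS = {
    "quick personal translation": ["Google Translate"],
    "legal and immigration docs": ["GPT-4"],
    "literary and academic texts": ["GPT-4", "Claude"],
    "business communication": ["GPT-4"],
    "high-volume processing": ["NLLB-200"],
    "diaspora communication": ["Google Translate", "GPT-4"],
    "news and media": ["Google Translate", "Claude"],
}

# The table pairs this use case with human review.
NEEDS_HUMAN_REVIEW = {"legal and immigration docs"}


def recommend(use_case):
    """Return (systems, needs_human_review) for a use case from the table."""
    key = use_case.strip().lower()
    if key not in RECOMMENDATIONS:
        raise KeyError(f"No recommendation listed for: {use_case!r}")
    return RECOMMENDATIONS[key], key in NEEDS_HUMAN_REVIEW


systems, review = recommend("Legal and immigration docs")
print(systems, "human review required" if review else "no review flag")
```

Keeping the human-review flag separate from the system list makes it harder to drop the review step when the preferred system changes.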
Key Takeaways
- GPT-4 leads for Persian-to-English with the highest scores across all metrics and particularly strong performance on literary, legal, and contextually nuanced content.
- Persian’s taarof (elaborate politeness) system creates translation challenges, as overly literal translations of polite phrases sound bizarre in English, while omitting them loses cultural meaning.
- The Persian-Dari-Tajik continuum means training data from all three varieties contributes to AI quality, but can also introduce cross-variant confusion, particularly for Dari-specific or Tajik-specific terminology.
- Persian’s relatively simple grammar (no gender, no case) makes it more amenable to AI translation than many languages of similar resource level, contributing to scores that approach high-resource pair quality.
Next Steps
- Try it yourself: Compare these systems on your own text in the Translation AI Playground: Compare Models Side-by-Side.
- Check the leaderboard: Browse our full Translation Accuracy Leaderboard by Language Pair.
- Casual translation: See our guide to Best AI Translation Tools for Casual Use.
- Full model comparison: Read Best Translation AI in 2026: Complete Model Comparison.