Persian to English: AI Translation Comparison

Persian (Farsi) is spoken by approximately 110 million people across Iran, Afghanistan (as Dari), and Tajikistan (as Tajik), with significant diaspora communities in the United States, Canada, Germany, and the United Kingdom. It is an Indo-Iranian language written in a modified Arabic script (Perso-Arabic), with Tajik using Cyrillic. Persian features relatively simple grammar for an Indo-European language — no grammatical gender, no case marking on nouns, and a consistent SOV word order — but has a rich literary tradition, complex politeness registers (taarof), and extensive Arabic vocabulary integration. Translation demand is driven by diaspora communication, academic and literary research, legal and immigration documentation, business relations, and media.

This comparison evaluates five leading AI translation systems on Persian-to-English accuracy, naturalness, and suitability for different use cases.

Translation comparisons are based on automated metrics and editorial evaluation. Quality varies by language pair and content type.

Accuracy Comparison Table

System	BLEU Score	COMET Score	Editorial Rating (1-10)	Best For
Google Translate	33.4	0.828	7.1	General-purpose, free access
DeepL	31.2	0.811	6.8	Basic general-content translation
GPT-4	36.8	0.851	7.8	Contextual understanding, literary quality
Claude	35.1	0.839	7.4	Long-form, academic content
NLLB-200	30.5	0.804	6.6	Free, self-hosted option
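One way to read the table is to check whether the three measures rank the systems the same way. The short Python sketch below does that using only the scores listed above; the dictionary layout and the function name `rank_by` are illustrative choices, not part of any scoring pipeline used for this comparison.

```python
# Scores copied from the accuracy comparison table above.
scores = {
    "Google Translate": {"bleu": 33.4, "comet": 0.828, "editorial": 7.1},
    "DeepL": {"bleu": 31.2, "comet": 0.811, "editorial": 6.8},
    "GPT-4": {"bleu": 36.8, "comet": 0.851, "editorial": 7.8},
    "Claude": {"bleu": 35.1, "comet": 0.839, "editorial": 7.4},
    "NLLB-200": {"bleu": 30.5, "comet": 0.804, "editorial": 6.6},
}


def rank_by(metric):
    """Return system names ordered from highest to lowest score on one metric."""
    return sorted(scores, key=lambda name: scores[name][metric], reverse=True)


rankings = {metric: rank_by(metric) for metric in ("bleu", "comet", "editorial")}

# True when every metric produces exactly the same ordering of systems.
metrics_agree = len({tuple(order) for order in rankings.values()}) == 1

for metric, order in rankings.items():
    print(f"{metric:>9}: {' > '.join(order)}")
print("All metrics give the same ordering:", metrics_agree)
```

For these figures the automated metrics and the editorial rating agree on the ordering, which is why the table can be summarized with a single ranking.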


Example Translations

Formal Legal Document

Source: “Dadgah-e ali-ye keshvar ra’y-e khood ra dar mowred-e e’teraz-e matrah-shodeh be hoghoogh-e malekiyyat-e fekri sader kard.”

System	Translation
Google	The Supreme Court of the country issued its ruling on the objection raised regarding intellectual property rights.
DeepL	The country’s Supreme Court has issued its ruling on the objection to intellectual property rights.
GPT-4	The Supreme Court has issued its ruling on the appeal filed regarding intellectual property rights.
Claude	The Supreme Court of the country has issued its ruling on the objection raised concerning intellectual property rights.
NLLB-200	The Supreme Court of the country issued its ruling on the objection raised about intellectual property rights.

Assessment: GPT-4 correctly renders “e’teraz-e matrah-shodeh” as “the appeal filed” rather than “the objection raised” — in a Supreme Court context, “appeal” is the appropriate English legal term. DeepL’s “objection to intellectual property rights” misrepresents the meaning (it is an objection concerning IP rights, not against them). Claude’s “concerning” is a natural legal English preposition for this context.

Casual Conversation

Source: “Salam, chetori? Kheili vaghte nadidamet. Biya berim ye chai bokhorim.”

System	Translation
Google	Hello, how are you? I haven’t seen you for a long time. Come, let’s go have a tea.
DeepL	Hello, how are you? It’s been a long time since I’ve seen you. Come, let’s go have tea.
GPT-4	Hey, how are you? It’s been forever since I’ve seen you. Come on, let’s go grab a tea.
Claude	Hello, how are you? I haven’t seen you for a very long time. Come, let’s go have a tea.
NLLB-200	Hello, how are you? I haven’t seen you for a long time. Come, let’s go drink a tea.

Assessment: GPT-4 captures the casual warmth best with “Hey,” “It’s been forever,” and “grab a tea.” Persian casual speech is warmer and more effusive than English, and GPT-4’s translation best bridges this cultural gap. Tea is central to Persian social life, so “chai bokhorim” (literally “let’s drink tea”) functions as a social invitation: “grab a tea” captures the casualness, while “drink a tea” (NLLB-200) sounds unnatural in English.

Technical Content

Source: “In sakhtar-e narm-afzari az me’mari-ye microservice estefadeh mikonad va ba komak-e containerha moghayas-paziri-ye bala ra taamin mikonad.”

System	Translation
Google	This software structure uses microservice architecture and provides high scalability with the help of containers.
DeepL	This software architecture uses microservices and provides high scalability using containers.
GPT-4	This software architecture utilizes a microservices-based design and achieves high scalability through containerization.
Claude	This software structure uses microservice architecture and provides high scalability with the help of containers.
NLLB-200	This software structure uses microservice architecture and provides high scalability with the help of containers.

Assessment: GPT-4 stands out with “containerization” (a more idiomatic technical term than the literal “containers”) and “achieves high scalability through” (more natural than “provides…with the help of”). DeepL correctly uses “microservices” (plural) and has clean sentence flow. Google, Claude, and NLLB-200 produce identical but less polished translations with the awkward “with the help of.”
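The observation that several systems returned the same sentence can be checked mechanically. The following Python sketch groups systems by exact output after normalizing whitespace; the helper name `group_identical` is hypothetical, and the strings are the technical-content translations quoted in the table above.

```python
from collections import defaultdict

# Technical-content outputs copied from the table above.
outputs = {
    "Google": "This software structure uses microservice architecture and provides high scalability with the help of containers.",
    "DeepL": "This software architecture uses microservices and provides high scalability using containers.",
    "GPT-4": "This software architecture utilizes a microservices-based design and achieves high scalability through containerization.",
    "Claude": "This software structure uses microservice architecture and provides high scalability with the help of containers.",
    "NLLB-200": "This software structure uses microservice architecture and provides high scalability with the help of containers.",
}


def group_identical(system_outputs):
    """Return lists of system names whose outputs match exactly."""
    groups = defaultdict(list)
    for system, text in system_outputs.items():
        # Collapse runs of whitespace so spacing differences do not hide a match.
        groups[" ".join(text.split())].append(system)
    return [names for names in groups.values() if len(names) > 1]


duplicates = group_identical(outputs)
print(duplicates)
```

Exact matching is deliberately strict. A looser comparison (for example, ignoring case or punctuation) would be a separate design choice and is not assumed here.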

Strengths and Weaknesses

Google Translate

Strengths: Free and accessible. Handles Perso-Arabic script well. Substantial Persian web training data.

Weaknesses: Misses taarof (politeness) nuances. Literal translations. Less natural than GPT-4 or Claude.

DeepL

Strengths: Reasonable sentence restructuring. Acceptable for general content.

Weaknesses: Lower accuracy for Persian specifically. Occasional meaning distortion. Does not handle Dari or Tajik variants.

GPT-4

Strengths: Best overall quality. Excellent literary and contextual understanding. Handles taarof and register shifts. Strong technical and legal terminology.

Weaknesses: Higher cost. May occasionally mix Dari or Tajik influences with Iranian Persian.

Claude

Strengths: Strong quality for long documents. Good academic register. Consistent and reliable.

Weaknesses: Less dynamic with casual Persian. Sometimes overly literal with literary expressions.

NLLB-200

Strengths: Free and self-hostable. Covers Persian, Dari, and Tajik as separate languages. Reasonable baseline.

Weaknesses: Identical output to Google for many inputs. No register adaptation. Less fluent than GPT-4 or Claude.
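For readers who want to try the self-hosted route, here is a minimal Python sketch, assuming the Hugging Face `transformers` library (with PyTorch) and the publicly released `facebook/nllb-200-distilled-600M` checkpoint. The function name `translate_to_english`, the checkpoint choice, and the `max_length` value are assumptions made for illustration; this is not the setup used to produce the scores in this comparison. Note that NLLB-200 expects Persian and Dari input in Perso-Arabic script and Tajik in Cyrillic, not the romanized text used for the examples in this article.

```python
# FLORES-200 language codes used by NLLB-200 for the varieties named above.
NLLB_CODES = {
    "persian": "pes_Arab",  # Iranian (Western) Persian, Perso-Arabic script
    "dari": "prs_Arab",     # Dari, Perso-Arabic script
    "tajik": "tgk_Cyrl",    # Tajik, Cyrillic script
    "english": "eng_Latn",
}


def translate_to_english(text, variety="persian",
                         model_name="facebook/nllb-200-distilled-600M"):
    """Translate native-script text to English with a local NLLB-200 checkpoint."""
    # Imported lazily so the code table above is usable without the dependency.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang=NLLB_CODES[variety])
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    inputs = tokenizer(text, return_tensors="pt")
    generated = model.generate(
        **inputs,
        # Force the decoder to start with the English language token.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(NLLB_CODES["english"]),
        max_length=128,  # assumed cap, suitable for short sentences only
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
```

A call such as `translate_to_english("سلام، چطوری؟")` downloads the checkpoint on first use. Loading the model inside the function keeps the sketch short; for high-volume processing the tokenizer and model would normally be loaded once and reused.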

Recommendations

Use Case	Recommended System
Quick personal translation	Google Translate (free)
Legal and immigration docs	GPT-4 with human review
Literary and academic texts	GPT-4 or Claude
Business communication	GPT-4
High-volume processing	NLLB-200 (self-hosted)
Diaspora communication	Google Translate or GPT-4
News and media	Google Translate or Claude
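If these recommendations feed a tool that picks a system automatically, the table can be expressed as a small lookup. The Python sketch below is one possible encoding; the names `RECOMMENDATIONS`, `NEEDS_HUMAN_REVIEW`, and `recommend` are hypothetical, and the entries simply restate the table above.

```python
# Use case -> recommended systems, restating the recommendations table above.
RECOMMENDATIONS = {
    "quick personal translation": ["Google Translate"],
    "legal and immigration docs": ["GPT-4"],
    "literary and academic texts": ["GPT-4", "Claude"],
    "business communication": ["GPT-4"],
    "high-volume processing": ["NLLB-200"],
    "diaspora communication": ["Google Translate", "GPT-4"],
    "news and media": ["Google Translate", "Claude"],
}

# Use cases where the table pairs the system with human review.
NEEDS_HUMAN_REVIEW = {"legal and immigration docs"}


def recommend(use_case):
    """Return (systems, needs_human_review) for a use case from the table."""
    key = use_case.strip().lower()
    if key not in RECOMMENDATIONS:
        raise KeyError(f"No recommendation listed for: {use_case!r}")
    return RECOMMENDATIONS[key], key in NEEDS_HUMAN_REVIEW


systems, review = recommend("Legal and immigration docs")
print(systems, "human review required" if review else "no review flag")
```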


Key Takeaways

GPT-4 leads for Persian-to-English with the highest scores across all metrics and particularly strong performance on literary, legal, and contextually nuanced content.
Persian’s taarof (elaborate politeness) system creates translation challenges, as overly literal translations of polite phrases sound bizarre in English, while omitting them loses cultural meaning.
The Persian-Dari-Tajik continuum means training data from all three varieties contributes to AI quality, but can also introduce cross-variant confusion, particularly for Dari-specific or Tajik-specific terminology.
Persian’s relatively simple grammar (no gender, no case) makes it more amenable to AI translation than many languages of similar resource level, contributing to scores that approach high-resource pair quality.

Next Steps

Try it yourself: Compare these systems on your own text in the Translation AI Playground: Compare Models Side-by-Side.
Check the leaderboard: Browse our full Translation Accuracy Leaderboard by Language Pair.
Casual translation: See our guide to Best AI Translation Tools for Casual Use.
Full model comparison: Read Best Translation AI in 2026: Complete Model Comparison.