Pashto to Dari: AI Translation Comparison
Pashto and Dari are the two official languages of Afghanistan. Pashto has approximately 60 million speakers (in Afghanistan and Pakistan) and Dari about 29 million (primarily in Afghanistan). This pair is critically important for Afghan governance, inter-ethnic communication, humanitarian operations, and the Afghan diaspora. Both are Iranian languages within the Indo-European family: Pashto belongs to the Eastern Iranian branch, while Dari (Afghan Persian) belongs to the Western Iranian branch. They share some vocabulary through centuries of contact and common Arabic/Islamic loanwords, but differ significantly in grammar. Both are SOV, but Pashto has ergative-absolutive alignment in past tenses, grammatical gender, and a retroflex consonant series, while Dari has no grammatical gender and simpler verb morphology. Both use modified Arabic scripts, but with different sets of additional characters. Parallel corpora are limited, although Afghan government bilingual documents provide some aligned text.
This comparison evaluates five leading AI translation systems on Pashto-to-Dari accuracy, naturalness, and suitability for different use cases.
Translation comparisons are based on automated metrics and editorial evaluation. Quality varies by language pair and content type.
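As a small illustration of the script difference mentioned in the introduction, the sketch below flags text that contains letters used in Pashto orthography but not in standard Dari spelling. The letter list and the function names `pashto_letter_count` and `looks_like_pashto` are assumptions made for this example; the list is illustrative rather than exhaustive, and a short Pashto sentence may happen to contain none of these letters.

```python
# Letters used in Pashto orthography but not in standard Dari spelling.
# Assumption: illustrative list for this sketch, not an exhaustive inventory.
PASHTO_ONLY = {
    "\u067c",  # TEH WITH RING (retroflex t)
    "\u0681",  # HAH WITH HAMZA ABOVE
    "\u0685",  # HAH WITH THREE DOTS ABOVE
    "\u0689",  # DAL WITH RING (retroflex d)
    "\u0693",  # REH WITH RING (retroflex r)
    "\u0696",  # REH WITH DOT BELOW AND DOT ABOVE
    "\u069a",  # SEEN WITH DOT BELOW AND DOT ABOVE
    "\u06ab",  # KAF WITH RING
    "\u06bc",  # NOON WITH RING (retroflex n)
    "\u06cd",  # YEH WITH TAIL
    "\u06d0",  # ARABIC LETTER E
}


def pashto_letter_count(text: str) -> int:
    """Count characters that belong to the Pashto-specific letter set."""
    return sum(1 for ch in text if ch in PASHTO_ONLY)


def looks_like_pashto(text: str) -> bool:
    """Heuristic only: True if at least one Pashto-specific letter appears."""
    return pashto_letter_count(text) > 0
```

A check like this can serve as a cheap sanity test that a system actually produced Dari rather than echoing the Pashto input, but it is not a substitute for proper language identification.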
Accuracy Comparison Table
| System | BLEU Score | COMET Score | Editorial Rating (1-10) | Best For |
|---|---|---|---|---|
| Google Translate | 21.8 | 0.785 | 6.3 | Speed, basic use |
| DeepL | 19.5 | 0.768 | 5.8 | Formal documents |
| GPT-4 | 28.3 | 0.828 | 7.4 | Government, cultural content |
| Claude | 25.7 | 0.810 | 6.9 | Long-form content |
| NLLB-200 | 22.4 | 0.790 | 6.4 | Low-resource pairs |
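To give a feel for what the BLEU column measures, here is a minimal, unsmoothed sentence-level sketch: clipped n-gram precision up to 4-grams, combined by geometric mean and scaled by a brevity penalty. It is an illustration only. The article does not say which tool or settings produced the scores above, published BLEU figures are normally computed at corpus level with a standard tokenizer, and the whitespace tokenization and the function names `ngrams` and `sentence_bleu` are assumptions of this example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(hypothesis: str, reference: str, max_n: int = 4) -> float:
    """Simplified BLEU on a 0-100 scale for one hypothesis and one reference."""
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        total = sum(hyp_counts.values())
        # Clip each hypothesis n-gram count by its count in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        if total == 0 or matched == 0:
            return 0.0  # unsmoothed: one empty n-gram order zeroes the score
        log_precisions.append(math.log(matched / total))
    # Brevity penalty: penalize hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return 100 * bp * math.exp(sum(log_precisions) / max_n)
```

Because the score depends on exact surface overlap, a fluent paraphrase can score low, which is one reason the table also reports COMET and an editorial rating.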
Example Translations
Formal Business Email
Source: “محترم صاحب، موږ خوشحاله یو چې تاسو ته خبر درکړو چې ستاسو غوښتنلیک منظور شوی دی. مهرباني وکړئ ضمیمه شوي اسناد وګورئ.”
| System | Translation |
|---|---|
| Google Translate | جناب محترم، خوشحالیم که به شما اطلاع دهیم که درخواست شما تایید شده است. لطفاً اسناد ضمیمه شده را بررسی کنید. |
| DeepL | جناب محترم، مسرور هستیم که اعلام نماییم درخواست شما مورد تایید قرار گرفته است. لطفاً اسناد پیوست را مرور فرمایید. |
| GPT-4 | جناب محترم، با کمال افتخار و مسرت به اطلاع شما می‌رسانیم که درخواست شما پس از بررسی دقیق مورد تایید و تصویب قرار گرفته است. خواهشمندیم اسناد ضمیمه شده را به دقت مطالعه فرمایید. |
| Claude | جناب محترم، خوشحالیم که به شما اطلاع دهیم که درخواست شما تایید شده است. لطفاً اسناد ضمیمه را بررسی فرمایید. |
| NLLB-200 | جناب، درخواست شما تایید شد. اسناد را ببینید. |
Assessment: GPT-4 produces the most refined formal Dari with با کمال افتخار و مسرت (with complete pride and pleasure) and خواهشمندیم (we humbly request), matching the Pashto formal register. The shared Arabic-derived vocabulary (محترم, منظور, اسناد) transfers well between both languages. NLLB-200 produces bare-minimum Dari, dropping the courtesy phrases and the word for "attached", though it still benefits from the shared vocabulary base.
Casual Conversation
Source: “اې! نوي رستوران ته تللی یې؟ خواړه ډېر خوندور وو! باید ورشې.”
| System | Translation |
|---|---|
| Google Translate | هی! رستوران نو رفتی؟ غذا خیلی خوشمزه بود! باید بروی. |
| DeepL | سلام! رستوران جدید رفتی؟ غذاش خیلی خوب بود! حتماً برو. |
| GPT-4 | اوهوی! رستوران نو رفتی؟ والله غذاش بسیار لذیذ بود! حتماً باید بروی! |
| Claude | هی! رستوران جدید رفتی؟ غذا خیلی خوب بود! حتماً برو. |
| NLLB-200 | سلام. رستوران جدید خوب است. بروید. |
Assessment: GPT-4 captures the Pashto casual tone with colloquial Dari, including اوهوی (hey!) and والله (I swear). Google Translate and Claude produce functional casual Dari. NLLB-200 uses the formal بروید (go, respectful plural) instead of casual برو, turns the question into a flat statement (رستوران جدید خوب است, "the new restaurant is good"), and strips all enthusiasm.
Technical Content
Source: “د ژورې زده کړې ماډل د پرله پسې ډاټا پروسس کولو لپاره د پام میکانیزم سره د transformer جوړښت کاروي.”
| System | Translation |
|---|---|
| Google Translate | مدل یادگیری عمیق از معماری ترنسفورمر با مکانیزم توجه برای پردازش داده‌های متوالی استفاده می‌کند. |
| DeepL | مدل یادگیری عمیق معماری ترنسفورمر با مکانیزم‌های توجه را برای پردازش داده‌های پی‌درپی به کار می‌گیرد. |
| GPT-4 | این مدل یادگیری عمیق از معماری Transformer مجهز به مکانیزم‌های توجه برای پردازش کارآمد داده‌های متوالی بهره می‌برد. |
| Claude | مدل یادگیری عمیق از معماری Transformer با مکانیزم توجه برای پردازش داده‌های متوالی استفاده می‌کند. |
| NLLB-200 | مدل یادگیری از ساختار ترنسفورمر و توجه برای داده‌ها استفاده می‌کند. |
Assessment: The Pashto source uses native terms (ژورې زده کړې for deep learning, پام for attention), which the major systems correctly map to Dari ML terminology. GPT-4 adds کارآمد (efficient), which is not in the source, and uses بهره می‌برد (makes use of), producing natural technical Dari. NLLB-200 drops عمیق (deep) and oversimplifies. The shared modified Arabic script helps with loanword transfer.
Strengths and Weaknesses
Google Translate
Strengths: Fast, free, some coverage from Afghan multilingual content. Weaknesses: Limited Pashto training data. Pashto ergative alignment is poorly handled.
DeepL
Strengths: Reasonable Dari output quality when Pashto input is parsed correctly. Weaknesses: Pashto is not a core DeepL language. Inconsistent quality.
GPT-4
Strengths: Best overall quality. Understands Afghan cultural context and inter-ethnic communication needs. Weaknesses: Higher cost. Still limited by scarce parallel data.
Claude
Strengths: Reasonable long-form quality. Consistent Dari output. Weaknesses: Limited by scarce Pashto-Dari parallel data.
NLLB-200
Strengths: Free, self-hostable. NLLB-200 includes both Pashto and Dari. Relatively competitive due to shared Iranian language features and Arabic loanwords. Weaknesses: Low absolute quality. Output is terse, tends to drop detail, and often misses the source register.
Recommendations
| Use Case | Recommended System |
|---|---|
| Afghan government documents | GPT-4 with human review |
| Basic comprehension | Google Translate |
| Cultural and media content | GPT-4 |
| Humanitarian reports | Claude |
| Bulk processing on budget | NLLB-200 (self-hosted) |
| Legal and official documents | Human translator recommended |
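For the self-hosted NLLB-200 option, a batch translation loop might look like the sketch below, which uses the Hugging Face `transformers` library. The checkpoint `facebook/nllb-200-distilled-600M` is an assumption chosen because it is small; the article does not state which NLLB-200 variant was evaluated. `pbt_Arab` (Southern Pashto) and `prs_Arab` (Dari) are the FLORES-200 language codes, and the helper names `batched` and `translate_all` are hypothetical.

```python
from typing import Iterable, Iterator, List

# Assumption: a small public NLLB-200 checkpoint, not necessarily the one evaluated above.
MODEL_NAME = "facebook/nllb-200-distilled-600M"
SRC_LANG = "pbt_Arab"  # FLORES-200 code for Southern Pashto
TGT_LANG = "prs_Arab"  # FLORES-200 code for Dari


def batched(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` items."""
    batch: List[str] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def translate_all(sentences: Iterable[str], batch_size: int = 8) -> List[str]:
    """Translate Pashto sentences to Dari in batches."""
    # Imported lazily so the batching helper works without the heavy dependency.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, src_lang=SRC_LANG)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)
    # NLLB selects the output language via the first generated token.
    target_id = tokenizer.convert_tokens_to_ids(TGT_LANG)

    results: List[str] = []
    for batch in batched(sentences, batch_size):
        inputs = tokenizer(batch, return_tensors="pt", padding=True, truncation=True)
        output_ids = model.generate(
            **inputs, forced_bos_token_id=target_id, max_new_tokens=256
        )
        results.extend(tokenizer.batch_decode(output_ids, skip_special_tokens=True))
    return results
```

Calling `translate_all(pashto_sentences)` downloads the model on first use. Given the quality gap shown in the accuracy table, output from this route is best treated as a draft for comprehension or triage.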
Key Takeaways
- GPT-4 leads for Pashto-to-Dari on all three measures (BLEU 28.3, COMET 0.828, editorial rating 7.4), with the best understanding of Afghan inter-ethnic communication context.
- Shared Iranian heritage and extensive Arabic loanwords help all systems, but Pashto’s ergative alignment in past tenses and its distinct letters for retroflex consonants create unique challenges.
- This pair is critically important for Afghan governance and humanitarian operations, where translation quality can have real-world consequences.
- For government policy documents, legal texts, and humanitarian communications, professional human translation by Afghan bilingual translators is strongly recommended.
Next Steps
- Try it yourself: Compare these systems on your own text in the Translation AI Playground: Compare Models Side-by-Side.
- Related comparison: See Tamil to Sinhala: AI Translation Comparison.
- Check the leaderboard: Browse our full Translation Accuracy Leaderboard by Language Pair.
- Full model comparison: Read Best Translation AI in 2026: Complete Model Comparison.