English to Catalan: AI Translation Comparison
Catalan is spoken by approximately 10 million people across Catalonia, Valencia, the Balearic Islands, Andorra (where it is the sole official language), parts of Aragon, Roussillon in France, and the city of Alghero in Sardinia. It is a distinct Romance language, not a dialect of Spanish, with its own grammar, phonology, and literary tradition dating to the medieval period. Catalan uses the Latin script, has a two-gender system, employs pronominal clitics that combine in complex clusters, and distinguishes between “ser” and “estar” (like Spanish) but with different usage rules. Translation demand comes from the Catalan autonomous government (Generalitat), education, publishing, technology localization, and Barcelona’s status as a major international business and tourism hub.
This comparison evaluates five leading AI translation systems on English-to-Catalan accuracy, naturalness, and suitability for different use cases.
Translation comparisons are based on automated metrics and editorial evaluation. Quality varies by language pair and content type.
Accuracy Comparison Table
| System | BLEU Score | COMET Score | Editorial Rating (1-10) | Best For |
|---|---|---|---|---|
| Google Translate | 34.8 | 0.854 | 7.1 | General-purpose, free access |
| DeepL | 37.2 | 0.871 | 7.6 | Business and formal documents |
| GPT-4 | 36.5 | 0.865 | 7.4 | Contextual accuracy, cultural content |
| Claude | 35.1 | 0.856 | 7.2 | Long-form content |
| NLLB-200 | 32.3 | 0.836 | 6.6 | Free option, self-hosted |
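One quick way to read the table is to check whether the three measures rank the systems in the same order. The snippet below is a minimal Python sketch that does only that, using the scores copied from the table; the dictionary layout and the `ranking` helper are illustrative assumptions, not part of any evaluation toolkit.

```python
# Scores copied from the accuracy comparison table.
scores = {
    "Google Translate": {"bleu": 34.8, "comet": 0.854, "editorial": 7.1},
    "DeepL": {"bleu": 37.2, "comet": 0.871, "editorial": 7.6},
    "GPT-4": {"bleu": 36.5, "comet": 0.865, "editorial": 7.4},
    "Claude": {"bleu": 35.1, "comet": 0.856, "editorial": 7.2},
    "NLLB-200": {"bleu": 32.3, "comet": 0.836, "editorial": 6.6},
}


def ranking(metric: str) -> list[str]:
    """Return system names ordered from best to worst on one metric."""
    return sorted(scores, key=lambda name: scores[name][metric], reverse=True)


rankings = {metric: ranking(metric) for metric in ("bleu", "comet", "editorial")}

# True when every metric produces exactly the same ordering of systems.
metrics_agree = len({tuple(order) for order in rankings.values()}) == 1

print(rankings["bleu"])
print("All metrics agree on the order:", metrics_agree)
```

For this table the three orderings coincide (DeepL, GPT-4, Claude, Google Translate, NLLB-200). The sketch says nothing about whether the gaps between neighbouring systems are statistically meaningful, which the table alone cannot show.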
Example Translations
Formal Business Email
Source: “We are writing to confirm your registration for the Mobile World Congress in Barcelona. Your exhibitor badge and event schedule are attached.”
| System | Translation |
|---|---|
| Google Translate | Us escrivim per confirmar la vostra inscripció al Mobile World Congress de Barcelona. La vostra acreditació d’expositor i el programa de l’esdeveniment estan adjunts. |
| DeepL | Ens adrecem a vostè per confirmar la seva inscripció al Mobile World Congress de Barcelona. L’acreditació d’expositor i el programa de l’esdeveniment es troben adjunts. |
| GPT-4 | Us escrivim per confirmar la vostra inscripció al Mobile World Congress de Barcelona. Trobareu l’acreditació d’expositor i el programa de l’esdeveniment en els documents adjunts. |
| Claude | Us escrivim per confirmar la vostra inscripció al Mobile World Congress de Barcelona. La vostra acreditació d’expositor i el programa de l’esdeveniment estan adjunts. |
| NLLB-200 | Us escrivim per confirmar la vostra inscripció al Mobile World Congress de Barcelona. La vostra acreditació d’expositor i el programa de l’esdeveniment estan adjunts. |
Assessment: DeepL uses the formal singular “vostè” with “seva,” while Google, GPT-4, Claude, and NLLB-200 use the second-person plural forms (“us,” “vostra”), which are more traditional in Catalan formal correspondence. Google, Claude, and NLLB-200 produce identical output for this sentence pair. GPT-4’s “Trobareu” (You will find) is a natural Catalan way to introduce attachments. All systems correctly handle the apostrophe contractions (“l’acreditació,” “l’esdeveniment”), which follow Catalan elision rules.
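To see at a glance how much the five outputs overlap, a word-level similarity score can be computed for each pair. This is a minimal Python sketch using the standard library's `difflib`; the `word_similarity` helper is a hypothetical name, and the score is a rough overlap measure, not BLEU or COMET.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Outputs copied from the formal business email table.
outputs = {
    "Google Translate": "Us escrivim per confirmar la vostra inscripció al Mobile World Congress de Barcelona. La vostra acreditació d’expositor i el programa de l’esdeveniment estan adjunts.",
    "DeepL": "Ens adrecem a vostè per confirmar la seva inscripció al Mobile World Congress de Barcelona. L’acreditació d’expositor i el programa de l’esdeveniment es troben adjunts.",
    "GPT-4": "Us escrivim per confirmar la vostra inscripció al Mobile World Congress de Barcelona. Trobareu l’acreditació d’expositor i el programa de l’esdeveniment en els documents adjunts.",
    "Claude": "Us escrivim per confirmar la vostra inscripció al Mobile World Congress de Barcelona. La vostra acreditació d’expositor i el programa de l’esdeveniment estan adjunts.",
    "NLLB-200": "Us escrivim per confirmar la vostra inscripció al Mobile World Congress de Barcelona. La vostra acreditació d’expositor i el programa de l’esdeveniment estan adjunts.",
}


def word_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two sentences, compared word by word."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()


# Score every unordered pair of systems.
pairs = {
    (x, y): word_similarity(outputs[x], outputs[y])
    for x, y in combinations(outputs, 2)
}

# Print the pairs from most to least similar.
for (x, y), score in sorted(pairs.items(), key=lambda item: -item[1]):
    print(f"{x} vs {y}: {score:.2f}")
```

Pairs that score 1.0 are word-for-word identical, so the interesting comparisons for this example are the ones involving DeepL and GPT-4.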
Casual Conversation
Source: “Are you going to the Castellers performance on Sunday? It was insane last year. Let’s grab vermut at the plaça after.”
| System | Translation |
|---|---|
| Google Translate | Aniràs a l’actuació dels Castellers diumenge? L’any passat va ser una bogeria. Anem a prendre un vermut a la plaça després. |
| DeepL | Aniràs a veure els Castellers diumenge? L’any passat va ser increïble. Anem a prendre un vermut a la plaça després. |
| GPT-4 | Aniràs a veure els Castellers diumenge? L’any passat va ser una passada. Prenem un vermut a la plaça després? |
| Claude | Aniràs a l’actuació dels Castellers diumenge? L’any passat va ser increïble. Anem a prendre un vermut a la plaça després. |
| NLLB-200 | Anireu a l’actuació dels Castellers diumenge? L’any passat va ser increïble. Anem a prendre un vermut a la plaça després. |
Assessment: GPT-4’s “va ser una passada” (it was wild/amazing) is the most natural casual Catalan expression. GPT-4 also phrases the invitation as a question, “Prenem un vermut” (Shall we have a vermut?), which is more conversational. DeepL and GPT-4 both use “a veure els Castellers” (to see the Castellers), which is more natural than “a l’actuació dels Castellers” (to the performance of the Castellers). NLLB-200 uses the plural or formal “Anireu” instead of the informal singular “Aniràs.” “Vermut a la plaça” (vermouth at the square) is a key Catalan social tradition that all systems preserve.
Technical Content
Source: “The smart city platform aggregates IoT sensor data from Barcelona’s Superblock neighborhoods to optimize traffic flow and air quality monitoring.”
| System | Translation |
|---|---|
| Google Translate | La plataforma de ciutat intel·ligent agrega dades de sensors IoT dels barris de Superilles de Barcelona per optimitzar el flux de trànsit i la monitorització de la qualitat de l’aire. |
| DeepL | La plataforma de ciutat intel·ligent agrega dades de sensors IoT dels barris de les Superilles de Barcelona per optimitzar el flux de trànsit i el seguiment de la qualitat de l’aire. |
| GPT-4 | La plataforma de smart city agrega dades de sensors IoT dels barris de Superilles de Barcelona per optimitzar el flux de trànsit i la monitorització de la qualitat de l’aire. |
| Claude | La plataforma de ciutat intel·ligent agrega dades de sensors IoT dels barris de Superilles de Barcelona per optimitzar el flux de trànsit i la monitorització de la qualitat de l’aire. |
| NLLB-200 | La plataforma de ciutat intel·ligent agrega dades de sensors IoT dels barris de Superilles de Barcelona per optimitzar el flux de trànsit i el seguiment de la qualitat de l’aire. |
Assessment: All systems correctly use “Superilles” (the Catalan name for Barcelona’s Superblock urban planning initiative). Google, DeepL, Claude, and NLLB-200 translate “smart city” as “ciutat intel·ligent,” which is the official Catalan term, while GPT-4 retains the English. DeepL and NLLB-200 use “seguiment” (monitoring/tracking), which is the preferred Catalan term over the Castilian-influenced “monitorització.” The interpunct in “intel·ligent” is a distinctive Catalan orthographic feature handled correctly by all systems.
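In a localization pipeline the interpunct is worth checking mechanically, because later processing steps may replace the middle dot (U+00B7) with a look-alike character. The following is a minimal Python sketch of such a check, not something any of the compared systems is known to do. The function name `normalize_geminate_l` and the set of look-alike characters are assumptions for illustration, and the heuristic only fires between two l's inside a word.

```python
import re

MIDDLE_DOT = "\u00b7"  # the character Catalan uses in the geminate "l·l"

# Assumed look-alikes: full stop, bullet (U+2022), hyphenation point (U+2027).
# The match requires a letter + "l" before and "l" + a letter after, so
# abbreviations such as "S.L.L." are left alone.
_LOOKALIKE = re.compile(r"(?<=\wl)[.\u2022\u2027](?=l\w)", re.IGNORECASE)


def normalize_geminate_l(text: str) -> str:
    """Rewrite look-alike separators in "l?l" to the standard middle dot."""
    # U+0140 and U+013F are single-character compatibility forms of "l·" / "L·".
    text = text.replace("\u0140", "l" + MIDDLE_DOT)
    text = text.replace("\u013f", "L" + MIDDLE_DOT)
    return _LOOKALIKE.sub(MIDDLE_DOT, text)


print(normalize_geminate_l("La plataforma de ciutat intel.ligent"))
# prints: La plataforma de ciutat intel·ligent
```

Hyphens are deliberately excluded from the look-alike set, since a hyphen between two l's can be legitimate where a clitic pronoun attaches to a verb.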
Strengths and Weaknesses
Google Translate
Strengths: Free and accessible. Good general quality. Benefits from Generalitat translation resources and Catalan web content.
Weaknesses: Sometimes produces Castilian-influenced forms. Pronominal clitic combination errors.
DeepL
Strengths: Best overall quality. Natural formal register. Good vocabulary choices that avoid Castilian calques. Strong business content.
Weaknesses: Premium pricing. Occasionally mixes central Catalan with Valencian forms.
GPT-4
Strengths: Best casual and cultural content. Good understanding of Catalan traditions. Natural colloquial register.
Weaknesses: Higher cost. Sometimes retains English terms unnecessarily. Occasional Castilian influence.
Claude
Strengths: Consistent quality for long documents. Reliable formal register.
Weaknesses: Less idiomatic than DeepL or GPT-4. Limited Catalan cultural knowledge.
NLLB-200
Strengths: Free and self-hostable. Reasonable quality for a mid-sized language. Good formal vocabulary.
Weaknesses: Formal register default. Lower overall quality. Pronominal clitic errors.
Recommendations
| Use Case | Recommended System |
|---|---|
| Government / institutional | DeepL |
| Business correspondence | DeepL |
| Tourism / cultural content | GPT-4 |
| Technology localization | DeepL or GPT-4 |
| High-volume, cost-sensitive | NLLB-200 (self-hosted) |
| Quick personal translation | Google Translate (free) |
| Long-form content | Claude |
Key Takeaways
- DeepL leads for formal English-to-Catalan translation with the best vocabulary choices and least Castilian influence. GPT-4 excels at casual and culturally rooted content.
- Catalan’s pronominal clitic system, where multiple unstressed pronouns combine and change form, remains the biggest challenge for AI translation, particularly in complex sentences with multiple objects.
- The distinction between Catalan and Spanish is critical: AI systems trained primarily on Spanish data sometimes produce Castilian-influenced Catalan, which is noticeable to native speakers.
- With approximately 10 million speakers and strong institutional support from the Generalitat, Catalan is relatively well-resourced for a minority language, and translation quality reflects this advantage.
Next Steps
- Try it yourself: Compare these systems on your own text in the Translation AI Playground: Compare Models Side-by-Side.
- Reverse direction: See how systems handle Catalan to English translation.
- Check the leaderboard: Browse our full Translation Accuracy Leaderboard by Language Pair.
- Compare with Spanish: See Google Translate vs DeepL vs AI Models.