Catalan to French: AI Translation Comparison

Catalan is spoken by approximately 10 million people across Catalonia, Valencia, the Balearic Islands, Andorra (where it is the sole official language), the Pyrénées-Orientales (Eastern Pyrenees) department of France (Northern Catalonia), and the city of Alghero in Sardinia. A Western Romance language, Catalan shares significant structural similarities with both French and Spanish, though it is linguistically closest to Occitan. The Catalan-French pair is relatively high-resource compared with most European minority language pairs: both are well-documented Romance languages with substantial digital corpora, and Northern Catalonia’s bilingual communities provide natural parallel text. Key translation challenges include Catalan’s personal article (used before given names, with no counterpart in standard French), the periphrastic past tense (“anar” + infinitive, used for the simple past), enclitic and proclitic pronoun placement rules, and dialectal variation among Central Catalan, Valencian, and Balearic forms. Translation demand is driven by cross-border commerce between Catalonia and southern France, EU institutional needs (Catalan has semi-official status in the EU), tourism, academic exchange, cultural industries, and the Catalan Government’s active internationalization efforts.

This comparison evaluates five leading AI translation systems on Catalan-to-French accuracy, naturalness, and suitability for different use cases.

Translation comparisons are based on automated metrics and editorial evaluation. Quality varies by language pair and content type.

Accuracy Comparison Table

System	BLEU Score	COMET Score	Editorial Rating (1-10)	Best For
Google Translate	29.4	0.798	7.0	General-purpose, quick translation
DeepL	28.1	0.789	6.8	Business and formal documents
GPT-4	30.7	0.811	7.4	Literary, contextual, and nuanced content
Claude	29.8	0.803	7.1	Long-form documents, academic content
NLLB-200	28.6	0.792	6.9	Free, self-hosted, consistent quality
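The table supports a quick consistency check: do the three scoring methods rank the systems in the same order, and how wide is the gap between first and last place? The sketch below copies the scores from the table and uses hypothetical helper names (`ranking`, `spread`). It is an illustration only, not part of the evaluation pipeline behind these numbers.

```python
# Scores copied from the comparison table above.
SCORES = {
    "Google Translate": {"bleu": 29.4, "comet": 0.798, "editorial": 7.0},
    "DeepL": {"bleu": 28.1, "comet": 0.789, "editorial": 6.8},
    "GPT-4": {"bleu": 30.7, "comet": 0.811, "editorial": 7.4},
    "Claude": {"bleu": 29.8, "comet": 0.803, "editorial": 7.1},
    "NLLB-200": {"bleu": 28.6, "comet": 0.792, "editorial": 6.9},
}
METRICS = ("bleu", "comet", "editorial")


def ranking(metric):
    """Return system names ordered from best to worst on one metric."""
    return sorted(SCORES, key=lambda name: SCORES[name][metric], reverse=True)


def spread(metric):
    """Return the gap between the best and worst score on one metric."""
    values = [row[metric] for row in SCORES.values()]
    return round(max(values) - min(values), 3)


rankings = {metric: ranking(metric) for metric in METRICS}
for metric, order in rankings.items():
    print(f"{metric:<9} spread={spread(metric):<6} " + " > ".join(order))
```

On these figures all three metrics produce the same ordering (GPT-4, Claude, Google Translate, NLLB-200, DeepL), and the top-to-bottom gap is 2.6 BLEU points.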


Example Translations

Formal Business Email

Source: “Benvolgut senyor Puig, Li escrivim per confirmar l’acord comercial signat oficialment dilluns passat entre les nostres dues empreses. Restem a la seva disposició per a qualsevol consulta addicional.”

System	Translation
Google	Cher Monsieur Puig, Nous vous écrivons pour confirmer l’accord commercial signé officiellement lundi dernier entre nos deux entreprises. Nous restons à votre disposition pour toute consultation supplémentaire.
DeepL	Cher Monsieur Puig, Nous vous écrivons pour confirmer l’accord commercial officiellement signé lundi dernier entre nos deux entreprises. Nous restons à votre disposition pour toute demande complémentaire.
GPT-4	Cher Monsieur Puig, Nous vous écrivons afin de confirmer l’accord commercial officiellement signé lundi dernier entre nos deux entreprises. Nous demeurons à votre entière disposition pour toute question ou demande complémentaire.
Claude	Cher Monsieur Puig, Nous vous écrivons pour confirmer l’accord commercial signé officiellement lundi dernier entre nos deux entreprises. Nous restons à votre disposition pour toute consultation supplémentaire.
NLLB-200	Cher Monsieur Puig, Nous vous écrivons pour confirmer l’accord commercial signé officiellement lundi dernier entre nos deux entreprises. Nous restons à votre disposition pour toute consultation supplémentaire.

Assessment: The close Romance kinship between Catalan and French allows all systems to produce competent formal translations. GPT-4 elevates the register with “afin de” (in order to, more formal than “pour”), “demeurons” (remain, more literary than “restons”), “entière disposition” (entirely at your disposal, standard formal French), and “question ou demande complémentaire” (question or additional request, more comprehensive). DeepL uses “demande” (request), which is more natural in French business correspondence than the calque “consultation.” The structural similarity between “Restem a la seva disposició” and “Nous restons à votre disposition” reflects the pair’s shared Romance heritage.

Casual Conversation

Source: “Ei, què tal? Ahir vaig anar a la platja i l’aigua estava genial. Demà hi torno. T’hi apuntes? Podem agafar unes birres després.”

System	Translation
Google	Salut, comment ça va? Hier je suis allé à la plage et l’eau était géniale. Demain j’y retourne. Tu veux venir? On peut prendre des bières après.
DeepL	Salut, ça va? Hier je suis allé à la plage et l’eau était super. Demain j’y retourne. Ça te dit? On peut prendre des bières après.
GPT-4	Salut, ça va? Hier je suis allé à la plage, l’eau était trop bien. Demain j’y retourne. Ça te dit de venir? On pourrait se prendre des bières après.
Claude	Salut, comment ça va? Hier je suis allé à la plage et l’eau était géniale. Demain j’y retourne. Tu veux venir? On peut prendre des bières après.
NLLB-200	Salut, comment ça va? Hier je suis allé à la plage et l’eau était géniale. Demain j’y retourne. Tu veux venir? On peut prendre des bières après.

Assessment: GPT-4 best captures the casual register with “trop bien” (so good, contemporary French slang), “ça te dit de venir?” (feel like coming?, distinctly casual French), and “on pourrait se prendre des bières” (we could grab some beers, with the reflexive “se prendre” adding casual flavor). DeepL also uses the colloquial “ça te dit?” but leaves the invitation less explicit by omitting “de venir.” The Catalan periphrastic past “vaig anar” (I went, literally “I go” plus the infinitive “to go”) is correctly rendered as “je suis allé” by all systems, demonstrating solid handling of this Catalan-specific construction.
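To see how much the five casual outputs actually diverge, one option is a rough word-level similarity score between every pair of systems. The sketch below uses Python’s standard `difflib.SequenceMatcher`; the helper names (`tokens`, `similarity`) are hypothetical, and this is a surface-overlap heuristic rather than BLEU or COMET. Accents are stripped before comparison so that diacritic differences do not affect the score.

```python
import unicodedata
from difflib import SequenceMatcher
from itertools import combinations

# Casual-conversation outputs copied from the table above.
OUTPUTS = {
    "Google": "Salut, comment ça va? Hier je suis allé à la plage et l’eau était géniale. Demain j’y retourne. Tu veux venir? On peut prendre des bières après.",
    "DeepL": "Salut, ça va? Hier je suis allé à la plage et l’eau était super. Demain j’y retourne. Ça te dit? On peut prendre des bières après.",
    "GPT-4": "Salut, ça va? Hier je suis allé à la plage, l’eau était trop bien. Demain j’y retourne. Ça te dit de venir? On pourrait se prendre des bières après.",
    "Claude": "Salut, comment ça va? Hier je suis allé à la plage et l’eau était géniale. Demain j’y retourne. Tu veux venir? On peut prendre des bières après.",
    "NLLB-200": "Salut, comment ça va? Hier je suis allé à la plage et l’eau était géniale. Demain j’y retourne. Tu veux venir? On peut prendre des bières après.",
}


def tokens(text):
    """Lowercase, strip accents, and split on whitespace."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.split()


def similarity(a, b):
    """Word-level similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, tokens(a), tokens(b)).ratio()


pairs = {
    (a, b): similarity(OUTPUTS[a], OUTPUTS[b])
    for a, b in combinations(OUTPUTS, 2)
}
identical = [pair for pair, score in pairs.items() if score == 1.0]

for (a, b), score in sorted(pairs.items(), key=lambda item: -item[1]):
    print(f"{a:<9} vs {b:<9} {score:.2f}")
print("Identical outputs:", identical)
```

On this example the Google, Claude, and NLLB-200 outputs are word-for-word identical, so only DeepL and GPT-4 differentiate themselves stylistically.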

Technical Content

Source: “El sistema d’energia renovable combina aerogeneradors marins amb panells solars terrestres per generar electricitat per a la xarxa nacional, reduint així la dependència dels combustibles fòssils.”

System	Translation
Google	Le système d’énergie renouvelable combine des éoliennes marines avec des panneaux solaires terrestres pour générer de l’électricité pour le réseau national, réduisant ainsi la dépendance aux combustibles fossiles.
DeepL	Le système d’énergie renouvelable associe des éoliennes en mer à des panneaux solaires terrestres pour produire de l’électricité destinée au réseau national, réduisant ainsi la dépendance aux énergies fossiles.
GPT-4	Le système d’énergie renouvelable combine des aérogénérateurs offshore et des panneaux photovoltaïques terrestres afin de produire de l’électricité pour le réseau national, réduisant ainsi la dépendance aux combustibles fossiles.
Claude	Le système d’énergie renouvelable combine des éoliennes marines avec des panneaux solaires terrestres pour générer de l’électricité pour le réseau national, réduisant ainsi la dépendance aux combustibles fossiles.
NLLB-200	Le système d’énergie renouvelable combine des éoliennes marines avec des panneaux solaires terrestres pour générer de l’électricité pour le réseau national, réduisant ainsi la dépendance aux combustibles fossiles.

Assessment: GPT-4 uses the most precise technical French with “aérogénérateurs” (the formal technical term for wind turbines, where the other systems use the everyday “éoliennes”), “offshore” (standard in French energy discourse), and “panneaux photovoltaïques” (more technically precise than “solaires”). DeepL uses “associe” (combines/pairs) and “destinée au” (intended for), which read as more idiomatic technical prose. The high density of Catalan-French cognates in technical vocabulary makes this domain consistently well handled across all systems.

Strengths and Weaknesses

Google Translate

Strengths: Strong baseline quality. Good Catalan support as an EU semi-official language. Free and accessible.

Weaknesses: Generic register. Produces correct but stylistically flat translations. Limited vocabulary variation.

DeepL

Strengths: Natural French business prose. Good vocabulary choices. Efficient condensation.

Weaknesses: Occasionally loses source nuances. Less reliable on dialectal Catalan (Valencian, Balearic).

GPT-4

Strengths: Best register adaptation. Superior vocabulary precision in both casual and technical contexts. Handles dialectal variation well.

Weaknesses: Higher cost. May occasionally introduce terms not in the source. Slower for bulk processing.

Claude

Strengths: Reliable for long documents. Consistent quality. Good academic register.

Weaknesses: Conservative translations. Less creative with casual content. Similar to Google in stylistic flatness.

NLLB-200

Strengths: Free and self-hostable. Competitive quality for this high-resource pair. Consistent output.

Weaknesses: No register adaptation. Produces functional but unremarkable translations. Limited vocabulary variation.
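For readers weighing the self-hosted route, here is a minimal sketch of running NLLB-200 locally with the Hugging Face `transformers` library (PyTorch backend). The checkpoint name is an assumption: this comparison does not state which NLLB-200 variant was evaluated, and larger checkpoints may behave differently. `cat_Latn` and `fra_Latn` are the FLORES-200 codes NLLB uses for Catalan and French; the function name `translate_ca_fr` is hypothetical.

```python
# Assumed checkpoint: the comparison does not say which NLLB-200 variant was tested.
MODEL_NAME = "facebook/nllb-200-distilled-600M"
SRC_LANG = "cat_Latn"  # FLORES-200 code for Catalan
TGT_LANG = "fra_Latn"  # FLORES-200 code for French


def translate_ca_fr(text: str, max_length: int = 256) -> str:
    """Translate Catalan text to French with a locally hosted NLLB-200 model."""
    # Imported lazily so this module loads even where transformers is absent.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    # For brevity the model is loaded on every call; a real service
    # would load the tokenizer and model once and reuse them.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, src_lang=SRC_LANG)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

    inputs = tokenizer(text, return_tensors="pt")
    generated = model.generate(
        **inputs,
        # Force the decoder to begin with the French language token.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(TGT_LANG),
        max_length=max_length,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
```

Calling `translate_ca_fr("Ahir vaig anar a la platja.")` downloads the checkpoint on first use, so the first call is slow. Output quality with this smaller distilled checkpoint may not match the scores reported above.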

Recommendations

Use Case	Recommended System
Quick personal translation	Google Translate (free)
Cross-border business correspondence	GPT-4 or DeepL
Literary translation	GPT-4 with human review
Academic and research	Claude
EU institutional documents	Claude or GPT-4
High-volume processing	NLLB-200 (self-hosted)
Tourism content	GPT-4
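Teams that route translation jobs automatically could encode the table above as a simple lookup. The sketch below is one hypothetical way to do that; the names `RECOMMENDATIONS`, `NEEDS_HUMAN_REVIEW`, and `recommend` are illustrative, and the mapping simply mirrors the table rather than adding any new guidance.

```python
# Mapping copied from the recommendations table above (order preserved).
RECOMMENDATIONS = {
    "quick personal translation": ["Google Translate"],
    "cross-border business correspondence": ["GPT-4", "DeepL"],
    "literary translation": ["GPT-4"],
    "academic and research": ["Claude"],
    "eu institutional documents": ["Claude", "GPT-4"],
    "high-volume processing": ["NLLB-200"],
    "tourism content": ["GPT-4"],
}

# Use cases the table explicitly pairs with human review.
NEEDS_HUMAN_REVIEW = {"literary translation"}


def recommend(use_case: str) -> dict:
    """Return the recommended systems for a use case named in the table."""
    key = use_case.strip().lower()
    if key not in RECOMMENDATIONS:
        raise KeyError(f"No recommendation listed for: {use_case!r}")
    return {
        "systems": RECOMMENDATIONS[key],
        "human_review": key in NEEDS_HUMAN_REVIEW,
    }


print(recommend("Literary translation"))
```

Raising an error for unlisted use cases, rather than falling back to a default system, keeps the lookup from implying a recommendation the table does not make.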


Key Takeaways

The Catalan-French pair benefits from strong Romance language similarity and relatively abundant parallel corpora, producing higher baseline quality across all systems compared to most minority language pairs.
GPT-4 leads with the best register adaptation and vocabulary precision, though the margin over competitors is narrower than for more structurally divergent language pairs.
The primary differentiator between systems is not basic accuracy but stylistic quality: casual register handling, vocabulary sophistication, and the ability to produce French that reads as natively written rather than translated.
NLLB-200 provides a strong free option for this pair, where the structural similarity between source and target languages compensates for any training data limitations.

Next Steps

Try it yourself: Compare these systems on your own text in the Translation AI Playground: Compare Models Side-by-Side.
Check the leaderboard: Browse our full Translation Accuracy Leaderboard by Language Pair.
Understand the metrics: Learn what BLEU and COMET scores mean in Translation Quality Metrics.
Explore rare languages: Read Best AI Translation for Rare and Low-Resource Languages.