English to Welsh: AI Translation Comparison

Updated 2026-03-11

Welsh (Cymraeg) is spoken by approximately 880,000 people, primarily in Wales, with small communities in England, Patagonia (Argentina), and other diaspora locations. Welsh is a Celtic language written in the Latin script. Its distinctive features include initial consonant mutations (the first consonant of a word changes with grammatical context), VSO (verb-subject-object) word order, inflected prepositions, and a vigesimal (base-20) counting system. The Welsh Language (Wales) Measure 2011 gives Welsh official status in Wales and requires that it be treated no less favourably than English, driving substantial translation demand for government services, education, healthcare, road signage, and digital services.

This comparison evaluates five leading AI translation systems on English-to-Welsh accuracy, naturalness, and suitability for different use cases.

Translation comparisons are based on automated metrics and editorial evaluation. Quality varies by language pair and content type.

Accuracy Comparison Table

| System | BLEU Score | COMET Score | Editorial Rating (1-10) | Best For |
|---|---|---|---|---|
| Google Translate | 27.4 | 0.798 | 6.0 | General-purpose, free access |
| DeepL | 29.8 | 0.814 | 6.4 | Basic formal content |
| GPT-4 | 30.6 | 0.822 | 6.7 | Contextual accuracy, cultural content |
| Claude | 28.1 | 0.803 | 6.1 | Long-form content |
| NLLB-200 | 25.3 | 0.781 | 5.6 | Free option, self-hosted |
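One quick sanity check on a table like this is whether the three scores tell the same story. The Python sketch below copies the figures from the table and ranks the systems on each metric; the names `SCORES`, `rank_by`, and `metrics_agree` are illustrative and not part of any evaluation toolkit.

```python
# Scores copied from the accuracy comparison table.
SCORES = {
    "Google Translate": {"bleu": 27.4, "comet": 0.798, "editorial": 6.0},
    "DeepL": {"bleu": 29.8, "comet": 0.814, "editorial": 6.4},
    "GPT-4": {"bleu": 30.6, "comet": 0.822, "editorial": 6.7},
    "Claude": {"bleu": 28.1, "comet": 0.803, "editorial": 6.1},
    "NLLB-200": {"bleu": 25.3, "comet": 0.781, "editorial": 5.6},
}


def rank_by(metric):
    """Return system names ordered from best to worst on one metric."""
    return sorted(SCORES, key=lambda name: SCORES[name][metric], reverse=True)


# One ranking per metric; the metrics agree if all rankings are identical.
rankings = {metric: rank_by(metric) for metric in ("bleu", "comet", "editorial")}
metrics_agree = len({tuple(order) for order in rankings.values()}) == 1

for metric, order in rankings.items():
    print(f"{metric:>9}: {' > '.join(order)}")
print("All metrics give the same ordering:", metrics_agree)
```

On these figures all three metrics produce the same ordering, so the choice of metric does not change which system ranks first. The gaps between neighbouring systems are small, though, so the ordering alone says little about how much better one system is than another.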

Example Translations

Formal Business Email

Source: “The Welsh Government is pleased to announce additional funding for Welsh-medium education programs. Details of the application process are enclosed.”

| System | Translation |
|---|---|
| Google | Mae Llywodraeth Cymru yn falch o gyhoeddi cyllid ychwanegol ar gyfer rhaglenni addysg cyfrwng Cymraeg. Mae manylion y broses ymgeisio wedi’u hamgáu. |
| DeepL | Mae Llywodraeth Cymru yn falch o gyhoeddi cyllid ychwanegol ar gyfer rhaglenni addysg cyfrwng Cymraeg. Mae manylion y broses ymgeisio ynghlwm. |
| GPT-4 | Mae Llywodraeth Cymru yn falch o gyhoeddi cyllid ychwanegol ar gyfer rhaglenni addysg cyfrwng Cymraeg. Amgaeir manylion y broses ymgeisio. |
| Claude | Mae Llywodraeth Cymru yn falch o gyhoeddi cyllid ychwanegol ar gyfer rhaglenni addysg cyfrwng Cymraeg. Mae manylion y broses ymgeisio wedi’u hamgáu. |
| NLLB-200 | Mae Llywodraeth Cymru yn falch o gyhoeddi cyllid ychwanegol ar gyfer rhaglenni addysg Cymraeg. Mae manylion y broses ymgeisio ynghlwm. |

Assessment: GPT-4’s “Amgaeir” (are enclosed, impersonal passive) is the most formally correct Welsh construction for official documents. Google and Claude use “wedi’u hamgáu,” where the “h” prefixed to the vowel-initial “amgáu” is h-prothesis triggered by the possessive “eu” (their) rather than an aspirate mutation, which affects only c, p, and t; both systems apply it correctly. NLLB-200 drops “cyfrwng” (medium) from “Welsh-medium education,” losing specificity. All systems correctly use “Llywodraeth Cymru” (Welsh Government).

Casual Conversation

Source: “Are you going to the Eisteddfod this year? I heard the main competition is going to be brilliant. We should get there early.”

| System | Translation |
|---|---|
| Google | Wyt ti’n mynd i’r Eisteddfod eleni? Clywais i fod y brif gystadleuaeth yn mynd i fod yn wych. Dylen ni gyrraedd yno’n gynnar. |
| DeepL | Wyt ti’n mynd i’r Eisteddfod eleni? Clywais i fod y brif gystadleuaeth yn mynd i fod yn ardderchog. Dylen ni fynd yno’n gynnar. |
| GPT-4 | Ti’n mynd i’r Steddfod eleni? Glywes i fod y brif gystadleuaeth yn mynd i fod yn grêt. Dylen ni gyrraedd yno’n gynnar. |
| Claude | Wyt ti’n mynd i’r Eisteddfod eleni? Clywais i fod y brif gystadleuaeth yn mynd i fod yn wych. Dylen ni gyrraedd yno’n gynnar. |
| NLLB-200 | Ydych chi’n mynd i’r Eisteddfod eleni? Clywais fod y brif gystadleuaeth yn mynd i fod yn wych. Dylen ni gyrraedd yno’n gynnar. |

Assessment: GPT-4 captures casual spoken Welsh with “Ti’n” (dropping “Wyt”), the colloquial abbreviation “Steddfod” (for Eisteddfod), the southern dialectal past tense “Glywes i” (I heard), and the informal “grêt” (great, borrowed from English). Other systems produce correct but more literary Welsh. NLLB-200 uses formal “Ydych chi’n” instead of informal “Wyt ti’n.” All systems correctly apply the soft mutation in “brif gystadleuaeth” (main competition, where “prif” causes soft mutation).

Technical Content

Source: “The bilingual content management system automatically generates Welsh-language versions of government web pages using translation memory and terminology databases.”

| System | Translation |
|---|---|
| Google | Mae’r system rheoli cynnwys dwyieithog yn cynhyrchu fersiynau Cymraeg o dudalennau gwe’r llywodraeth yn awtomatig gan ddefnyddio cof cyfieithu a chronfeydd data terminoleg. |
| DeepL | Mae’r system rheoli cynnwys dwyieithog yn cynhyrchu fersiynau Cymraeg o dudalennau gwe’r llywodraeth yn awtomatig gan ddefnyddio cof cyfieithu a chronfeydd data terminoleg. |
| GPT-4 | Mae’r system rheoli cynnwys dwyieithog yn creu fersiynau Cymraeg o dudalennau gwe’r llywodraeth yn awtomatig gan ddefnyddio cof cyfieithu a chronfeydd data termau. |
| Claude | Mae’r system rheoli cynnwys dwyieithog yn cynhyrchu fersiynau Cymraeg o dudalennau gwe’r llywodraeth yn awtomatig gan ddefnyddio cof cyfieithu a chronfeydd data terminoleg. |
| NLLB-200 | Mae’r system rheoli cynnwys dwyieithog yn cynhyrchu fersiynau Cymraeg o dudalennau gwe’r llywodraeth yn awtomatig gan ddefnyddio cof cyfieithu a chronfeydd data terminoleg. |

Assessment: The outputs are remarkably similar, reflecting well-established Welsh IT terminology. GPT-4 uses “creu” (create) instead of “cynhyrchu” (produce/generate) and “termau” (terms) instead of “terminoleg” (terminology), both of which are acceptable. All systems correctly apply the soft mutation in “dudalennau” (from “tudalennau,” pages) after the preposition “o” (of). “Cof cyfieithu” (translation memory) is standard Welsh language technology terminology.
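When outputs are this close, a word-level diff makes the differences easy to spot. The following Python sketch compares the Google and GPT-4 outputs quoted above using the standard library's `difflib`; the helper names `tokens` and `word_substitutions` are illustrative, and whitespace tokenization with punctuation stripping is a simplifying assumption.

```python
from difflib import SequenceMatcher

# Outputs copied from the technical content example.
GOOGLE = (
    "Mae’r system rheoli cynnwys dwyieithog yn cynhyrchu fersiynau Cymraeg "
    "o dudalennau gwe’r llywodraeth yn awtomatig gan ddefnyddio cof cyfieithu "
    "a chronfeydd data terminoleg."
)
GPT4 = (
    "Mae’r system rheoli cynnwys dwyieithog yn creu fersiynau Cymraeg "
    "o dudalennau gwe’r llywodraeth yn awtomatig gan ddefnyddio cof cyfieithu "
    "a chronfeydd data termau."
)


def tokens(sentence):
    """Split on whitespace and strip sentence punctuation from each word."""
    return [word.strip(".,?!") for word in sentence.split()]


def word_substitutions(reference, candidate):
    """Return (reference words, candidate words) for every non-matching span."""
    ref, cand = tokens(reference), tokens(candidate)
    matcher = SequenceMatcher(a=ref, b=cand, autojunk=False)
    return [
        (" ".join(ref[i1:i2]), " ".join(cand[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


for ref_words, cand_words in word_substitutions(GOOGLE, GPT4):
    print(f"{ref_words!r} -> {cand_words!r}")
```

The same helper can be run over any pair of outputs; identical translations return an empty list.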

Strengths and Weaknesses

Google Translate

Strengths: Free and accessible. Reasonable quality. Benefits from Welsh Government translation data and BBC Cymru Fyw content.

Weaknesses: Inconsistent mutation application. Sometimes produces anglicized word order. Limited dialectal awareness.

DeepL

Strengths: Good formal document quality. Correct mutations in simple sentences. Natural vocabulary.

Weaknesses: Premium pricing. Mutation errors in complex sentences. Limited casual Welsh capability.

GPT-4

Strengths: Best overall quality. Good dialectal awareness (northern vs. southern Welsh). Handles mutations most consistently. Natural casual register.

Weaknesses: Higher cost. Occasionally produces non-standard dialectal forms in formal content.

Claude

Strengths: Consistent quality for long documents. Reliable formal register.

Weaknesses: Similar quality to Google. Limited mutation handling in complex contexts. No dialectal variation.

NLLB-200

Strengths: Free and self-hostable. Basic functionality.

Weaknesses: Lowest quality. Formal register default. Frequent mutation errors. Limited Welsh training data.

Recommendations

| Use Case | Recommended System |
|---|---|
| Government services | GPT-4 with human review |
| Education materials | GPT-4 or DeepL |
| Cultural / Eisteddfod content | GPT-4 |
| Healthcare communications | GPT-4 with human review |
| High-volume, cost-sensitive | NLLB-200 (self-hosted) |
| Quick personal translation | Google Translate (free) |
| Long-form content | Claude |

Key Takeaways

  • GPT-4 leads for English-to-Welsh with the most consistent mutation handling and best dialectal awareness. All systems still require human review for published content due to mutation errors.
  • Initial consonant mutations are Welsh’s signature challenge for AI translation: soft, nasal, and aspirate mutations change word-initial consonants based on grammatical triggers, and errors are immediately noticeable to native speakers.
  • The official status of Welsh under the Welsh Language (Wales) Measure 2011 creates large institutional translation demand, producing parallel corpus data that benefits all AI systems.
  • The northern/southern dialect divide affects vocabulary, verb forms, and pronunciation, but most AI systems default to a somewhat standardized literary Welsh that may not match regional expectations.
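The letter changes behind the three mutation types are regular, which a small lookup table can show. The Python sketch below is illustrative only: the names `MUTATIONS` and `mutate` are hypothetical, and it covers just the consonant substitution. Deciding when a mutation is triggered (by a preposition, possessive, gender, and so on) is the hard part for translation systems and is not modeled here.

```python
# Standard initial-consonant changes for each mutation type.
MUTATIONS = {
    "soft": {"p": "b", "t": "d", "c": "g", "b": "f", "d": "dd",
             "g": "", "m": "f", "ll": "l", "rh": "r"},
    "nasal": {"p": "mh", "t": "nh", "c": "ngh", "b": "m", "d": "n", "g": "ng"},
    "aspirate": {"p": "ph", "t": "th", "c": "ch"},
}

# Welsh digraphs count as single letters, so "ch" must not be read as "c" + "h".
DIGRAPHS = {"ch", "dd", "ff", "ng", "ll", "ph", "rh", "th"}


def mutate(word, kind):
    """Apply one mutation type to the first letter of a word, if it is affected."""
    table = MUTATIONS[kind]
    lower = word.lower()
    initial = lower[:2] if lower[:2] in DIGRAPHS else lower[:1]
    if initial not in table:
        return word  # this initial letter is not affected by the mutation
    result = table[initial] + word[len(initial):]
    if word[:1].isupper() and result:
        result = result[0].upper() + result[1:]  # keep the original capitalization
    return result


# Forms that appear in the example translations in this comparison.
print(mutate("prif", "soft"))           # brif
print(mutate("tudalennau", "soft"))     # dudalennau
print(mutate("cronfeydd", "aspirate"))  # chronfeydd
print(mutate("Cymru", "nasal"))         # Nghymru
```

A table like this is enough to spot-check a suspicious word form in machine output, but it cannot tell whether the mutation belonged there in the first place.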

Next Steps