This training helps:

Anyone who creates and manages webpages and content.

  • Developers
  • Content management system managers
  • People who manage a website
  • … and more

Review of “Understanding missing document language”

About document language

Document language is a short piece of code that identifies the language of a web page.

This is used by:

  • Browsers
  • Assistive technology
  • Third-party APIs like media players with captions or grammar checkers

Document with its language set to US English

When the document language is available, it allows these tools to present the information correctly.

For example, a website set to “US English” will be correctly interpreted by grammar checkers and assistive technology.

How missing document language can impact assistive technology users

If language is not set, a device will read content based on its default settings. For example, if your computer is set to United States English, screen readers, voice assistants, and other assistive technologies will default to speaking with U.S. English pronunciation.
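The fallback described above can be pictured with a minimal Python sketch. The function name speech_language is hypothetical, and real assistive technology is more involved than this; the sketch only shows the idea that the device default wins when the page declares nothing.

```python
def speech_language(document_lang, device_default):
    """Pick the language a hypothetical reader would use for pronunciation."""
    # Fall back to the device setting when the page declares no language.
    return document_lang if document_lang else device_default

print(speech_language("es-MX", "en-US"))  # es-MX
print(speech_language(None, "en-US"))     # en-US
```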

Person confused at monitor’s announcement

Play the audio below for an example. In this scenario, a screen reader announces Spanish in a U.S. English accent that’s hard to understand.

[Screen reader pronunciation of Spanish in the intended Spanish accent] “La selección masculina de fútbol de México es el equipo formado por jugadores de nacionalidad mexicana que representa a la Federación Mexicana de Fútbol (FMF).”

[Screen reader pronunciation of Spanish in an English accent] “La seleck-eon masculine-a de foot-bowl de Mexico es el equip-o for-mad-o por juga-dores de national-y-dad mexican-a que represent-a a la Federation Mexican-a de foot-bowl (FMF).”

The format for document languages

In all cases, you will set the document language in the <html> element.

It typically consists of two or three parts:

Attribute: lang

Language: An ISO 639 language code. For example, “en” for English and “es” for Spanish.

Region (optional): A region code, connected to the language code with a hyphen. Examples are “US” for American English or “MX” for Mexican Spanish.
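To see how the parts fit together, here is a minimal Python sketch that splits a lang value into its language and region parts. The function name split_lang_value is hypothetical, and the sketch assumes only the two-part form covered in this lesson; longer values with additional parts are out of scope.

```python
def split_lang_value(value):
    """Split a lang value such as "es-MX" into (language, region)."""
    # partition() returns the text before the first hyphen, the hyphen, and the rest.
    language, _, region = value.partition("-")
    # An empty region means the value held a language code only.
    return language, region or None

print(split_lang_value("en"))     # ('en', None)
print(split_lang_value("es-MX"))  # ('es', 'MX')
```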

Setting up document language

Language code only

<html lang="en">

The most common way to assign language is to use the language value alone like “en.”

Language and region code

<html lang="en-US">

The same language can be written or spoken differently depending on location, so it may benefit from adding a region code like “US.”

Note: The region code is optional, but we encourage using it wherever possible. It helps content be presented in the correct regional dialect.

Tips to fix missing document language

Recall the scenario with the audio recording at the start of this lesson. It is clear that the Spanish version of the site doesn’t have the correct document language. The next step is to figure out why and address it.

Steps:

1. Check the format

A simple mistake like omitting the language can cause issues. Do a manual inspection since automated tools can’t detect all issues.

a. Is the document language missing?

Automated tools can detect if the document language code is missing. This happens when there is no “lang” listed.

Code is not present

<html>

Code is present

<html lang="ar">
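As a rough illustration of what an automated check for a missing document language might do, the Python sketch below uses the standard library's html.parser module to read the lang value from the first html start tag. The names HtmlLangFinder and document_lang are hypothetical, and this is not how any particular testing tool works.

```python
from html.parser import HTMLParser

class HtmlLangFinder(HTMLParser):
    """Records the lang value of the first <html> start tag, if any."""

    def __init__(self):
        super().__init__()
        self.found_html = False
        self.lang = None

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling this.
        if tag == "html" and not self.found_html:
            self.found_html = True
            self.lang = dict(attrs).get("lang")

def document_lang(markup):
    """Return the document language, or None when it is missing or empty."""
    finder = HtmlLangFinder()
    finder.feed(markup)
    return finder.lang or None

print(document_lang("<html><body></body></html>"))            # None
print(document_lang('<html lang="ar"><body></body></html>'))  # ar
```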

b. Is the element correct?

Sometimes, the document language is placed in the wrong location like the body element, instead of the html element. Not all automated tools can flag this error, so a manual review is recommended.

Wrong element

<body lang="es">

Correct element

<html lang="es">
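A manual review can be backed up with a small script. The Python sketch below, with the hypothetical names LangLocations and lang_placement, lists which elements carry a lang attribute and reports when the html element is not one of them.

```python
from html.parser import HTMLParser

class LangLocations(HTMLParser):
    """Collects the name of every element that has a lang attribute."""

    def __init__(self):
        super().__init__()
        self.tags_with_lang = []

    def handle_starttag(self, tag, attrs):
        if any(name == "lang" for name, _ in attrs):
            self.tags_with_lang.append(tag)

def lang_placement(markup):
    parser = LangLocations()
    parser.feed(markup)
    if "html" in parser.tags_with_lang:
        return "ok"
    if parser.tags_with_lang:
        return "lang found only on: " + ", ".join(parser.tags_with_lang)
    return "missing"

print(lang_placement('<html><body lang="es"></body></html>'))  # lang found only on: body
print(lang_placement('<html lang="es"><body></body></html>'))  # ok
```

A lang attribute on other elements is valid for passages written in a different language, so the sketch only flags pages where the html element has none.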

c. Is there a hyphen for regional codes?

If you’re using a regional code, it needs to be connected to the language code with a hyphen.

Missing hyphen

<html lang="esMX">

With hyphen

<html lang="es-MX">
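The hyphen check can be sketched in Python with two regular expressions. The function name check_format is hypothetical, and the patterns assume the simple forms used in this lesson: a two- or three-letter language code, optionally followed by a hyphen and a two-letter region code. The sketch checks the shape only; it cannot tell whether a code actually exists.

```python
import re

# Two or three letters, optionally followed by a hyphen and a two-letter region.
WELL_FORMED = re.compile(r"^[A-Za-z]{2,3}(-[A-Za-z]{2})?$")
# Two letters immediately followed by two more, with no hyphen between them.
MISSING_HYPHEN = re.compile(r"^([A-Za-z]{2})([A-Za-z]{2})$")

def check_format(value):
    if WELL_FORMED.match(value):
        return "ok"
    squeezed = MISSING_HYPHEN.match(value)
    if squeezed:
        return "missing hyphen? try " + squeezed.group(1) + "-" + squeezed.group(2)
    return "unrecognized format"

print(check_format("es-MX"))  # ok
print(check_format("esMX"))   # missing hyphen? try es-MX
```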

2. Validate the code

A simple mistake like entering the language code incorrectly can cause issues. Do a manual inspection since automated tools can’t detect all issues.

a. Identify the language code

For this activity, pretend the language code was set to “ed” instead of “es.”

Invalid code

<html lang="ed">

Valid code

<html lang="es">

b. Confirm the correct language code

In a separate window, navigate to a language code database. We compare two tools. Feel free to choose whichever you prefer.

Language subtag lookup app

Pro: This app’s functions narrow and match results. It’s better at finding and verifying codes.

Con: It shows the language and region code separately.

language code page has a table for language code and English name

Language codes table

Pro: The language and region codes appear in proper format. The page lets you view all the region codes in one place.

Con: The only way to navigate is to scroll or use Ctrl + F to find a code, which can lead to unrelated results.

c. Use the tool to find and verify the code

1. Look up the language code

Use the “Look up” function and type in the language or region code like “ed.”

If the code is invalid, the tool shows no matching results, and the code may be misspelled. Proceed to step 2.

If it is a valid code, it will show matching results for both language codes and region codes. For example, “es” matches with the language code for Spanish and the region code for Spain.

Note: “Look up” will only search codes like “ed,” not the full name like “Spanish.”

Invalid code

Screenshot of the results for “ed” when you use Look Up. The subtag ed was not found and no valid subtags were found

Valid code

Screenshot of the results for “es” when you use Look Up. Language code “es” and geographic region “ES” are shown

2. Find the language code

Use “Find” to search for codes by the name of the language, like “Spanish.”

Screenshot of the results for “spanish” when you use Find. It shows multiple results for language code

3. Optional: Find the specific region

Use “Find” to search for codes by the name of the region, like “Mexico.”

To confirm whether a result is a region or a language code, select it to see its type, such as “region.” Now you know the region subtag to use is “MX.”

Screenshot of the results for “Mexico” when you use Find. It shows the expanded result for MX, the region
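The difference between “Look up” and “Find” can be mimicked with a short Python sketch. The tables below are a tiny hand-picked sample for illustration only, and the function names look_up and find are hypothetical stand-ins for the tool's features, not its actual code.

```python
# A tiny hand-picked sample for illustration only; real code lists are far larger.
SAMPLE_LANGUAGES = {
    "ar": "Arabic",
    "en": "English",
    "es": "Spanish",
    "pt": "Portuguese",
    "zh": "Chinese",
}
SAMPLE_REGIONS = {
    "ES": "Spain",
    "HK": "Hong Kong",
    "MX": "Mexico",
    "US": "United States",
}

def look_up(code):
    """Mimic "Look up": return every sample entry whose code matches."""
    matches = []
    if code.lower() in SAMPLE_LANGUAGES:
        matches.append(("language", SAMPLE_LANGUAGES[code.lower()]))
    if code.upper() in SAMPLE_REGIONS:
        matches.append(("region", SAMPLE_REGIONS[code.upper()]))
    return matches

def find(name):
    """Mimic "Find": search the sample entries by English name."""
    needle = name.lower()
    results = [("language", c) for c, n in SAMPLE_LANGUAGES.items() if needle in n.lower()]
    results += [("region", c) for c, n in SAMPLE_REGIONS.items() if needle in n.lower()]
    return results

print(look_up("ed"))     # []
print(look_up("es"))     # [('language', 'Spanish'), ('region', 'Spain')]
print(find("Spanish"))   # [('language', 'es')]
print(find("Mexico"))    # [('region', 'MX')]
```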

3. Add the correct language attribute

After you identify the issue with your code, add the correct language code.

Correct Examples

For the previous example, use “es” for the Spanish language and, optionally, “MX” as the region code for Mexico.

Language code only

<html lang="es">

Optional: Language and region code

<html lang="es-MX">
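If you need to apply the fix to many files, a script can help. The Python sketch below uses the hypothetical function name set_html_lang and a regular expression to replace or add the lang attribute on the first html start tag. Regular expressions are fragile on real-world markup, so treat this as an illustration for simple pages and review the result by hand.

```python
import re

def set_html_lang(markup, lang):
    """Return markup with lang set on the first <html> start tag."""
    def fix(match):
        attrs = match.group(1)
        # Drop any existing lang attribute, then add the corrected one.
        attrs = re.sub(
            r"\s+lang\s*=\s*(\"[^\"]*\"|'[^']*'|[^\s>]+)",
            "",
            attrs,
            flags=re.IGNORECASE,
        )
        return '<html lang="' + lang + '"' + attrs + ">"
    return re.sub(r"<html\b([^>]*)>", fix, markup, count=1, flags=re.IGNORECASE)

print(set_html_lang("<html><body></body></html>", "es-MX"))
print(set_html_lang('<html lang="ed" dir="ltr">', "es"))
```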

Frequently asked questions

Is any part of the code case-sensitive?

The code is not case-sensitive. Many use lowercase for the language and uppercase for the region. (Example: es-MX)
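The common casing convention can be applied with a minimal Python sketch. The function name normalize_case is hypothetical, and the sketch assumes the part after the hyphen is a region code.

```python
def normalize_case(value):
    """Lowercase the language part and uppercase the region part."""
    language, _, region = value.partition("-")
    if region:
        return language.lower() + "-" + region.upper()
    return language.lower()

print(normalize_case("ES-mx"))  # es-MX
print(normalize_case("EN"))     # en
```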

Does the language or region code always have to be two letters?

No, but screen readers do not provide broad support for all language code formats. For instance, yue and zh-yue both mean Cantonese; however, zh has broader language support than yue.

Do website builder platforms like WordPress automatically set the document language?

Some platforms check code structure to detect if the language code is correct. It’s good practice to manually check the databases to make sure.

Takeaways

A simple mistake like omitting the document language or using the wrong one can cause content to be mispronounced or misinterpreted. Here are some questions to help you ensure the document language is correct.

Questions to ask yourself:

  • What is the primary language of the content?
  • Does the code reference the correct language?
  • Does the code use the right format?

Practice exercises

1. Provide the region code for the following website in Mandarin Chinese (only one region code).

Hong Kong: <html lang="zh-____">

Scenario:

An engineer wants to create speech synthesizers that are localized to specific regions to be more inclusive to dialects worldwide.

They achieve this by using the region subtag. Use the “Validate the code” steps with the Language codes table tool to provide the region code for Hong Kong screen reader localization.

The correct answer is HK. This is the value assigned to this region. The region code is optional, but it is highly recommended to add it where possible.

2. Find and apply the correct language code for this text written for a blockquote in non-U.S. English:

Scenario:

You are tasked with the job of fixing missing document language on a website. No one on the team can identify the language. Use a language detection tool to determine the correct language for the blockquote, and enter the correct language code.

The text is written in Portuguese, and the language code is “pt.” Applying this code helps other technologies interpret the text and translate it if needed.
