Cristina's®     Capt. Peter - web design

Character Sets and Language Codes

The character set tells the browser which characters to use,
and the language tags tell both browsers and search engine spiders
which languages you use on your webpage.
This page is part of Don Pedro's Website Design Handbook

Last updated: Aug. 31, 2010

This page:
Character Tags
Special Characters for Different Languages
Symbol Codes
Language Tags
At the bottom of the page, there is a link to a print-ready version.

Character Tags

The character set tag for most European languages looks like this:
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">

There is a newer recommendation to use "ISO-8859-15" instead of "ISO-8859-1" (it adds the euro sign, among other characters), but I haven't seen anybody using that yet (Aug. 2006).

The "charset" tags for HTML 4.0, XML, and XHTML differ slightly:
  • HTML 4.0 uses <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  • XML uses, at the start of a document or an entity: <?xml version="1.0" encoding="utf-8"?>. XML describes documents, not web pages, and is therefore widely used to make content available on more than just the Web. To make a web page with XML, you must also include display instructions for the browser (CSS or HTML 4.0).
  • XHTML uses <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>. Note the end slash (/).
Whatever character set you declare, HTML browsers and XML processors should behave as if they used Unicode internally; the document itself does not have to be transferred in Unicode. Info on character set codes: Unicode Consortium
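
As a rough sketch of where the tag goes, the fragment below shows a minimal HTML 4.01 page with the character set declared as the first thing in the head section. The title and body text are made up for illustration, and the DOCTYPE shown is only one of several valid choices.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
  <!-- Declare the character set before any text that may contain special characters -->
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  <title>Example page (placeholder title)</title>
</head>
<body>
  <p>Hello, world!</p>
</body>
</html>
```

Whatever charset you declare, save the file in that same encoding in your editor; otherwise accented characters may show up garbled.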

Special Characters for Different Languages

Some languages, especially East European ones, have their own alphabets with special accented characters. These require special HTML codes (character codes written with plain ASCII characters only) to display correctly. Not every browser recognizes every code, so check the codes in the browsers your visitors are using (for that you need "Visitor Statistics").

Whenever possible use the "Friendly code". See list below:
Česky/Slovník (Czech, Slovak)
Français (French)
Deutsch (German)
Polski (Polish)
Русский (Russian). See note below.
Español (Spanish)
Turkish. See note below.

When you use any of the "numbered codes", remember that they always have the same format: an ampersand, the number sign, the number, and a semicolon, e.g. &#123;. Sometimes you must add the number sign (#) yourself.

Please note: Russian and Turkish characters are part of the Unicode character set. To use them, you need to declare a Unicode charset in the <HEAD> section of your webpage:

<meta http-equiv="content-type" content="text/html; charset=utf-8">
Don't copy and paste this line; type it exactly as shown.

Example: If you write the German name "Jürgen" without the special code, i.e. as "Jurgen", it's not the same name anymore. I don't think you would be happy if someone misspelled your own name. Would you?
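
To make the difference concrete, here is a small sketch of the same name written three ways. &uuml; is the "friendly code" and &#252; the "numbered code" for the letter ü; the surrounding paragraph tags are only there for illustration.

```html
<!-- Friendly code: easy to read in the source -->
<p>J&uuml;rgen</p>

<!-- Numbered code: the same letter, written with its number -->
<p>J&#252;rgen</p>

<!-- Wrong: the umlaut is dropped, so it is a different name -->
<p>Jurgen</p>
```

The first two paragraphs should both display as Jürgen.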

Symbol Codes

When you want to display symbols that are also used in the HTML code itself, it's recommended that you use the "HTML equivalent" code. When these are not enough, use the numbers from the ISO standards list. See the list of special characters.

<     &lt;     (less than)
>     &gt;     (greater than)
"     &quot;   (quotation mark)
&     &amp;    (ampersand)
#     &#35;    (number sign)
«     &laquo;  (left angle quote)
»     &raquo;  (right angle quote)
=     &#61;    (equals)
/     &#47;    (forward slash)
\     &#92;    (backslash)
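
A short sketch of why these codes matter: to show a piece of HTML code as text on a page, each symbol the browser would otherwise interpret as code is replaced by its equivalent. The sample sentence is invented for illustration.

```html
<!-- The browser displays: Use <b>bold</b> & "quotes" -->
<p>Use &lt;b&gt;bold&lt;/b&gt; &amp; &quot;quotes&quot;</p>

<!-- Without the codes, <b> is treated as a real tag and the word is shown in bold type -->
<p>Use <b>bold</b></p>
```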

When you combine several text formatting tags, the sequence is the same as in many Asian languages, i.e. you go from less specific to more specific.
Example: "house, big, yellow" or in Computerese:
<font color="#FFF700"><b><u>HELLO !</u></b></font>
which gives: HELLO ! Note that the end tags always come in the opposite order of the start tags.
B = Bold
U = Underlined
Font = Character

The code is always in English with American spelling (i.e. "color" = American spelling, "colour" = British spelling). The # sign tells the browser that a hexadecimal code for the colour follows. You can, of course, write the colour name instead, but it's not recommended.
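
As a small illustration of the two ways to give a colour, the lines below set the same yellow once by hexadecimal code and once by name. The hexadecimal code is read as three pairs of digits for red, green, and blue; the sample text is made up.

```html
<!-- Hexadecimal code: FF red, FF green, 00 blue gives yellow -->
<font color="#FFFF00">Yellow text</font>

<!-- Colour name: works, but the hexadecimal code is preferred -->
<font color="yellow">Yellow text</font>
```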

Language Tags

The language tag looks like this:
<meta http-equiv="Content-Language" content="en">

"en" is for English. If you use one or several full sentences in another language, mixed with English, then it's better to include that language as well. For instance, if the page uses both English and German, your language tag should end with: content="en,de">, where "de" is the code for German. See list of language codes.

There is a new recommendation from W3C to specify a primary language in the meta tags and then to specify the secondary language or languages in the text where you use them. If you are using XML, XHTML, or CSS, you can get more specific details on how to do this.
 

If you use only one or a few words from a different language, then you can keep both language and character tags for English. But instead of, for instance, the letter "ö" (Swedish, Finnish, German) you use the HTML equivalent code from the special characters list - (i.e.: ö = &ouml;).

In such a case you surround the "foreign" words with the "span" tag and the language code, to indicate to the search engines that you are using a different language and that it's not a spelling error. Example:
<span lang="es">En espa&ntilde;ol</span>

Alternatively, you can use the <div> tag instead of the <span> tag, for instance for a whole block of text.

In the code you use &ntilde; instead of "ñ". (See list of special characters.) That way you keep the search engines from "thinking" you have misspelled some words. Furthermore, many search engines keep country-specific and/or language-specific databases. When you specify the language for the foreign words you use, your webpage will be included in those special directories as well.
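
Putting the pieces together, the fragment below sketches an English page that contains one Spanish word and one German paragraph. The sentences and the title are invented for illustration; "es" and "de" are the language codes for Spanish and German.

```html
<html>
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
  <!-- Primary language of the page -->
  <meta http-equiv="Content-Language" content="en">
  <title>Language example (placeholder title)</title>
</head>
<body>
  <!-- A few foreign words inside an English sentence: use span -->
  <p>The sign said <span lang="es">Ma&ntilde;ana</span>, so we came back the next day.</p>

  <!-- A whole block in another language: use div -->
  <div lang="de">
    <p>Sch&ouml;ne Gr&uuml;&szlig;e aus Hamburg.</p>
  </div>
</body>
</html>
```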


Related page:
| Meta Tags and Search Engines |


Get print-ready version (2 pages small font, 3 pages normal)

© by Cristina and Peter Forsberg.
You are allowed to print out the text for your personal needs.
You are also allowed to copy and distribute the printout for educational purposes when free of charge,
as long as you give the source: www.donpedrowebdesign.com/charset.html
