Man, this was a Pain in the Ass to convert to HTML5 ~ CSS3 as there are 9,916 lines of code to go through ( a bit long)... But I did it. 25.04.13
Figures.. overnight, no less than 6 hrs and 'they' changed again, and i had 200 more errors.. Fuckers. Done? 26.04.13 am

↜―┄┅⋱\†/⋰┅┄―↝

Character set and 'special characters'

HTML uses the character set called the Universal Character Set (UCS), defined in the ISO/IEC 10646 specification. UTF-8 is one encoding of UCS (a way of storing its characters as bytes), not another name for it. The standard defines a repertoire of many thousands of characters used by communities all over the world, and it is kept character-for-character equivalent to Unicode.

If you need to use a special character, like an accented character or a currency sign, and can't type it directly, you can use a special notation. There are two ways to write these special characters:

  1. Use the decimal representation, in the format &#nnn;, where nnn is the decimal code of the character (there is also a hexadecimal form, &#xhhh;).
  2. Use a character entity, in the format &ccc;, where ccc is the character string that represents the specific character.
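
A quick way to check the two notations against each other. This is only a minimal sketch in Python (the standard `html` module is my choice for illustration, it is not something this page uses); the sample characters é and € are arbitrary.

```python
import html
from html.entities import codepoint2name

# Decimal notation: &#nnn; where nnn is the decimal code of the character.
decimal_form = "&#233;"
# Entity notation: &ccc; where ccc is the entity name.
named_form = "&eacute;"

# Both notations decode to the same character.
decoded_decimal = html.unescape(decimal_form)
decoded_named = html.unescape(named_form)
print(decoded_decimal, decoded_named)

# Going the other way: build both notations from a character.
char = "€"
code = ord(char)                          # decimal code of the character
decimal_ref = f"&#{code};"                # decimal notation
named_ref = f"&{codepoint2name[code]};"   # entity notation (only if a name exists)
print(decimal_ref, named_ref)
```

Not every character has an entity name, so `codepoint2name` raises `KeyError` for most code points; the decimal form always works.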

for Declaring DocType and CharSet see...

*fyi: a very nice looking, but big and slow, unicode chart: unicode-table.com


For Cyrillic ( utf-8 ) ( Russian, Ukrainian: iso-8859-5 / windows-1251 )

☡ 'open' Cyrillic (кириллический) chart


My collection 'special characters' :
   utf-8, glyphs or unicode entities that i like and use

☡ see utf-8 chart


list-style-type : list

  1. list-style-type: circle;
  2. list-style-type: disc;
  3. list-style-type: square;
  4. list-style-type: decimal;
  5. list-style-type: decimal-leading-zero;
  6. list-style-type: lower-greek;
    1. this is a sub-list or nested child to above 'lower-greek'
  7. list-style-type: lower-roman;
  8. list-style-type: upper-roman;
  9. list-style-type: armenian;
  10. list-style-type: lower-latin;
  11. list-style-type: upper-latin;
  12. list-style-type: lower-alpha;
  13. list-style-type: upper-alpha;
  14. list-style-type: inherit; (doesn't seem to work unless inheritance is from <ol> or higher)
  15. list-style-type: cjk-ideographic;
  16. list-style-type: hiragana;
  17. list-style-type: hebrew;
  18. list-style-type: none;
  19. list image - test for < li > **

NOTE: In order to 'Nest' lists (see lower-greek above), the parent <li> is Not terminated with </li> right away. Put the whole nested list inside it (its own <li> items are terminated as usual), close the nested list with </ol> or </ul>, then close the parent </li>, and finally close the outer </ol> or </ul>.

<ul style="list-style-image: uri('../images/dlf/real.gif'); font-size:10px;">

also see at w3schools

this is the path to the above image... is it there? Yep, so path is good.
** OK... here is the 'right' syntax for image: < li style="list-style-image: url('../images/dlf/real.gif'); " >

I swear... I've looked at 3 diff places at w3c.org and all three have diff syntax. No quotes, single and then double quotes... also 3 other sites, w3schools.com, css3.com and some other site found at top of search at bing.com...
Not one worked.
My above example Obviously works! F* them.... fyi, the function name in CSS is 'url'. '<uri>' is only what the CSS 2 spec calls the value type in its syntax listings, which is probably where the 'uri' spelling comes from, so don't count on 'uri(...)' anywhere. Use the tried and true 'url', inline or in a stylesheet.


Table of printable Latin-1 Hex Character codes

Latin 1 - Basic Characters used in western key maps 33 ( !) to 255 (ÿ)
- not unmodified keys 0 (zero) 34 (")

*NOTE: codes 127 - 159 are not printable characters in Latin-1: 127 is DEL and 128 - 159 are the C1 'control codes'. HTML5 flags numeric references to them as errors, and browsers show the windows-1252 characters for most of 128 - 159 instead.

☡ see chart


some pretty cool 'icon' type

Unicode U2600 characters (pdf):
[cup'a joe?] ♨ (utf-8) - ♨ (hex - &#x2668;)

this is a chart I created of the Unicode U+2600 block (Miscellaneous Symbols).. these look like 'icons' or glyphs but are actual unicode characters (letters), just like 'A, B, C, Я, Ж, Ю, 1, 2, 3.. '
{ ☡ show ~ hide chart }

this expand / collapse (hidden / unhidden, hide / unhide, visibility:hidden / visible, show / hide) <div> is very simple..
i put the code in my CSS Snippets page


Character Converters:

Unicode ☡ ~ Hex (&#x2621;, ☡) conversion form: {here}

a really Good and complete character converter.. All encodings.

a couple more sites that have 'lookup' chart/form for characters by number
utf-8 - chartable.de
dev.networkerror.org , build charts to 10,000 characters


UTF-8: from 0020 [&#x0020;] ( ascii 32 = space ) to FB06 (ﬆ)

This is my UTF-8 Character Chart... UTF-8 is an encoding of Unicode, and Unicode code points (characters) are written with the prefix U+
to use one in html, drop the 'U+', add the prefix  &#x  to the number and end it with a semicolon ( U+2668 becomes &#x2668; )
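
The U+ to &#x rule is mechanical, so here is a minimal Python sketch of it. The function name `codepoint_to_entity` is hypothetical (made up for this example), and `html.unescape` from the standard library is only used to check the result.

```python
import html

def codepoint_to_entity(codepoint: str) -> str:
    """Turn 'U+2668' into '&#x2668;': drop 'U+', add '&#x' in front and ';' at the end."""
    if not codepoint.upper().startswith("U+"):
        raise ValueError(f"expected something like 'U+2668', got {codepoint!r}")
    hex_digits = codepoint[2:]
    int(hex_digits, 16)  # raises ValueError if the rest is not a hex number
    return f"&#x{hex_digits};"

entity = codepoint_to_entity("U+2668")
print(entity)                 # the text you would type into the html
print(html.unescape(entity))  # the character a browser would show
```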

☡ see utf-8 chart


Named HTML - Unicode entities ('named' refers to the html names eg, &lt; (LessThan))

This is the most complete list of HTML 'Named' characters I have. 'unicode', utf-8 named characters.
It was a lot of work, converting from the whatwg unicode JSON file to a usable html table... phew...

*this is a very complete list, but very long, about 2,307 lines (rows)... ...but, if you're here, it's already loaded.

so, ☡ see the 'named' character chart... click, eh.
Open in single basic page


glyph (glĭf)

noun: A symbol, such as a stylized figure or arrow on a public sign, that imparts information nonverbally.

emoticon [смайлы (smiles)] :
e·mo·ti·con (ĭ-mō'tĭ-kŏn')
noun: A sideways facial glyph used in e-mail to indicate an emotion or attitude, as to indicate intended humor [ :-) ].


Character Set Charts:

<meta http-equiv="content-type" content="text/html;charset=utf-8">

for Character Set definitions see these tables. these sets are used in the html (php, xml) page 'head' > <meta> definitions:


*New :  ♪ Musical Notes..


ISO-8859-1 Characters with HTML 'Names'

☡ see chart


Symbols, mathematical symbols, and Greek letters

☡ see chart


Markup-significant and internationalization characters

☡ see chart



08.06.11:
Add 'musical notes' to the 'special characters' section... you have a rough draft txt doc in ? /Language {programing}/css/ I think.. Landis

musical quarter note: &#9833; = ♩
musical eighth note: &#9834; = ♪
musical beamed eighth notes (single bar): &#9835; = ♫
musical beamed sixteenth notes (double bar): &#9836; = ♬
musical flat note: &#9837; = ♭
musical natural note: &#9838; = ♮
musical sharp note: &#9839; = ♯
You can copy the &#98xx; (ampersand, pound, 4 digit number, semi-colon) from here, But, if you copy the code from 'page source' you will have to change the amp (&amp;) 'code' to an actual single character Ampersand (&).
characters by Ron at answers.yahoo - [code] by Landis.

[code string] &#9833; &#9834; &#9835; &#9836; ...... &#9833;&#9834;&#9835;&#9836; [/code string]
renders this

...... ♬ ♫ ♪ ♩ ...... ♩ ♪ ♫ ♬ ......
or
...... ♩ ♪ ♫ ♬ ....... ♬ ♫ ♪ ♩ ......
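
The &amp; problem mentioned above is easy to see in a few lines. A minimal Python sketch, using the standard `html` module to stand in for what a browser does when it renders the codes (that stand-in is my assumption, not how this page was built):

```python
import html

# The codes as they should be typed into the html.
notes = "&#9833; &#9834; &#9835; &#9836;"
rendered = html.unescape(notes)
print(rendered)

# Copied from 'page source', each ampersand arrives escaped as &amp;
from_source = "&amp;#9833; &amp;#9834;"
still_codes = html.unescape(from_source)   # decodes only the &amp; part
print(still_codes)

# Change &amp; back to a single Ampersand and the codes work again.
fixed = from_source.replace("&amp;", "&")
print(html.unescape(fixed))
```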

*note: how i made this 'post-it' and Much more on it and about it, margins, borders, shadows, position, radius, transparency
even rotation can be found on my css tips page


This converter was originally taken from Shaun Moss's website. His site appears to be offline so I rescued this from the google cache. Changed some stuff too. I use this to convert the email addresses I put on web pages to things the email harvester bots won't understand.


Unicode Characters to HTML Entities Converter

A utility to convert Unicode characters to decimal and hexadecimal HTML entities.
(by Shaun Moss, adapted from: ASCII to HEX to Unicode Converter by Mike Golding)

The partial conversions do not convert characters with a code of 127 or less, (i.e. plain old ASCII characters), which can appear in HTML code as they are.
More about character sets.


Unicode characters:
 

HTML entities:
decimal, full:
decimal, partial: *
hexadecimal, full:
hexadecimal, partial:

* This is the result I prefer for HTML code. I have noticed some quirkiness with the hexadecimal codes - sometimes the HTML entity is not converted to the character and appears on the page unconverted (e.g. &#xC6D4; instead of 월). Also, although some text editors will allow you to type Latin 1 characters (e.g. é, ç) into your HTML code, these characters do not always render properly (not sure why, if you know please email me). For maximum reliability use decimal HTML entities for any non-ASCII characters.
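
The four outputs of the converter above (decimal or hexadecimal, full or partial) can be sketched in a few lines of Python. This is not Shaun Moss's code, just my own minimal version of the same idea; the function name `to_entities` and the sample strings are made up for the example, and it works on whole Unicode code points, which may differ from how the original script handled characters above U+FFFF.

```python
def to_entities(text: str, hexadecimal: bool = False, partial: bool = False) -> str:
    """Convert characters to numeric HTML entities.

    partial=True leaves codes of 127 or less (plain old ASCII) as they are.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if partial and code <= 127:
            out.append(ch)                 # plain ASCII can appear in HTML as is
        elif hexadecimal:
            out.append(f"&#x{code:X};")    # hexadecimal entity
        else:
            out.append(f"&#{code};")       # decimal entity
    return "".join(out)

sample = "é 월"
decimal_full = to_entities(sample)
decimal_partial = to_entities(sample, partial=True)
hex_full = to_entities(sample, hexadecimal=True)
hex_partial = to_entities(sample, hexadecimal=True, partial=True)
print(decimal_full, decimal_partial, hex_full, hex_partial, sep="\n")

# 'decimal, full' is the one that hides an email address from simple harvesters.
hidden_email = to_entities("a@b.c")
print(hidden_email)
```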



Character Set Encoding Definitions:

these are the sets used in the 'head' > <meta> definitions of 'web documents' (html, php, xml pages):
examples:
<meta http-equiv="content-type" content="text/html;charset=utf-8">
(utf-x are universal unicode set)
<meta http-equiv="content-type" content="text/html;charset=iso-8859-1">
(iso-8859-x are international region specific sets)
<meta http-equiv="content-type" content="text/html;charset=windows-1252">
(windows-125x are Microsucks Windows specific sets)


Unicode - UTF-x Character sets

☡ see utf-8 chart


ISO 8859-x Character Sets

☡ see ISO encoding chart


MS Windows Character Sets

These are character sets specific to Windows. They are similar, but not equal, to the ISO 8859 character sets. While ISO 8859 character sets do not specify characters in the 128..159 range, the Windows character sets do. Characters in the 0..127 range are identical to US-ASCII. Most but not all of the character assignments in the 160..255 range are the same as in ISO 8859.

Number Name
1250 Latin 2
1251 Cyrillic
1252 Latin 1
1253 Greek
1254 Latin 5
1255 Hebrew
1256 Arabic
1257 Baltic
1258 Viet Nam
874 Thai
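
To see the 'similar, but not equal' part in practice, here is a minimal Python sketch comparing a Windows set with its ISO cousin. The codec names (`cp1252`, `iso-8859-1`, `cp1251`, `iso-8859-5`) are Python's names for these sets, and the sample bytes are my own picks.

```python
# Byte 128 sits in the 128..159 range that ISO 8859 leaves to control codes.
byte_128 = bytes([0x80])
in_windows_1252 = byte_128.decode("cp1252")     # a printable character
in_iso_8859_1 = byte_128.decode("iso-8859-1")   # an invisible C1 control code
print(repr(in_windows_1252), repr(in_iso_8859_1))

# Byte 233 is in the 160..255 range, where Latin 1 and windows-1252 agree.
byte_233 = bytes([0xE9])
same_in_both = byte_233.decode("cp1252") == byte_233.decode("iso-8859-1")
print(same_in_both)

# The Cyrillic pair does not agree: the same letter lands on different bytes.
ya_windows = "Я".encode("cp1251")
ya_iso = "Я".encode("iso-8859-5")
print(ya_windows, ya_iso)
```

This is why a page saved as windows-1251 but declared as iso-8859-5 (or the other way around) shows the wrong Cyrillic letters.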

Declaring character sets in XML

Every XML document, external parsed entity or external DTD should begin with an XML or text declaration like this (it is required whenever the encoding is anything other than UTF-8 or UTF-16):

<?xml version="1.0" encoding="iso-8859-1" ?>

In the encoding attribute, you must declare the character set you will use for the rest of the document. You should use the IANA/MIME-Code from the table above.
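
What the encoding attribute does for a parser can be sketched with Python's standard `xml.etree.ElementTree` (my choice for illustration; the tiny `<p>` documents are made up for the example):

```python
import xml.etree.ElementTree as ET

# Latin 1 bytes WITH a declaration: the parser knows byte E9 means é.
latin1_doc = b'<?xml version="1.0" encoding="iso-8859-1" ?><p>caf\xe9</p>'
declared_text = ET.fromstring(latin1_doc).text
print(declared_text)

# UTF-8 bytes without a declaration: fine, UTF-8 is the default.
utf8_doc = "<p>café</p>".encode("utf-8")
default_text = ET.fromstring(utf8_doc).text
print(default_text)

# Latin 1 bytes WITHOUT a declaration: read as UTF-8, byte E9 is not valid there.
undeclared_latin1 = b"<p>caf\xe9</p>"
try:
    ET.fromstring(undeclared_latin1)
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False
print(undeclared_ok)
```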


Declaring character sets in HTML

In the head of an HTML document you should declare the character set you use for the document:

you Must declare the document type Before the head. the only thing that can come before this is server side includes such as php or asp. list of Doc Types at W3C and list of encoding, CharSet at W3C

example: <?php include '../header.html'; ?>

for HTML5 which as of 04.2013 is not completely implemented, but it's what I have been writing to, updating 2,200 pages on this site alone, sucks!

<!DocType HTML>

<html>

  <head>
    <meta charset="utf-8">
( that's it! ) (fyi, this is the Only thing that the duh's at w3c have made 'easier', cleaner in html5 and css3).

*note: i personally put the character encoding declaration Above the 'Title' tag so i can use utf-8 (unicode) in the title. if you look at my title, you'll see things like '☡' i use as Landis2 and ♨ as well as Русский, Russian Cyrillic characters, none of which would be possible without utf-8 being declared first. hence the old method of using &#characters... just sayin'

these days, because of the way i 'include' my head elements in a header page then call it into pages with
<?php ' '; ?> code, the 'Title' tag comes long after the charset. the common head elements are in header0x.html and are included into page0x.php, and each page0x.php needs its own 'Title'.. blah, blah...
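
Putting the charset early matters for another reason too: the HTML5 spec asks for the whole <meta charset> element to sit within the first 1024 bytes of the page. With included headers it is easy to lose track of that, so here is a minimal Python sketch of a check. The function name `declared_charset`, the regular expression and the two sample pages are all made up for this example, and the regex is a rough approximation, not a real HTML parser.

```python
import re

# Rough pattern for <meta charset="..."> (quotes optional, case ignored).
META_CHARSET = re.compile(rb'<meta\s+charset\s*=\s*["\']?([A-Za-z0-9_-]+)', re.IGNORECASE)

def declared_charset(page: bytes, limit: int = 1024):
    """Return the charset declared within the first `limit` bytes, or None."""
    match = META_CHARSET.search(page[:limit])
    return match.group(1).decode("ascii").lower() if match else None

# charset declared right away, above the title
early_page = (
    "<!DOCTYPE html>\n<html>\n  <head>\n"
    '    <meta charset="utf-8">\n'
    "    <title>☡ ♨ Русский</title>\n  </head>\n</html>\n"
).encode("utf-8")

# same declaration, but pushed past the first 1024 bytes by a long comment
late_page = (
    b"<!DOCTYPE html><html><head><!--" + b" " * 2000 + b"-->"
    b'<meta charset="utf-8"></head></html>'
)

print(declared_charset(early_page))
print(declared_charset(late_page))
```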


for HTML 4 / 4.01

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
 "http://www.w3.org/TR/html4/loose.dtd">
<html> <head>

My preferred Character Set is UTF-8, an encoding of the whole Unicode character set, which makes it more 'internationalized' vs the 'regional' windows 125x or iso 8859-x sets

<meta http-equiv="content-type" content="text/html;charset=utf-8">
</head><body>

for windows crap, which will mess up pages written in a utf-8 editor

<head>
  <meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
...
</head><body>

Without this declaration (and, BTW, without an additional DOCTYPE declaration), the W3C Validator will not be able to validate your HTML document.


IANA Character Set definitions

The Internet Assigned Numbers Authority (IANA) maintains a list of character sets and codes for them. This list is:

IANA-CHARSETS Official Names for Character Sets, http://www.iana.org/assignments/character-sets


preceding 3 tables by: Stefan Heymann. Last Update 2009-07-05

one or more tables are   Copyright © 1996 - 1999 Rob Schlüter,  schluter@knoware.nl (last updated 1998/12/13) Remainder and Majority is copyright© Landis Reed ☡ MMXII

this is code from the original 1992 C+ coding of UTF-8... why i've included it here:
i was reading through the code and it took me a few looks to 'convince myself' the Tx's are all the same size..
just found it interesting, like one of those pieces of 'art' that mess with your eyes using perspective... that's it..
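
The 1992 code itself is not reproduced in this text, so what follows is only my own minimal Python sketch of the bit layout UTF-8 uses, not the original C. The constant names (`TX`, `T2`, `T3`, `T4`, `MASK_X`) are picked for this example, and the sketch does not reject surrogates or codes above U+10FFFF.

```python
# Prefix bits of each byte type (names chosen for this sketch).
TX = 0b10000000      # continuation byte:             10xxxxxx
T2 = 0b11000000      # lead byte, 2-byte sequence:    110xxxxx
T3 = 0b11100000      # lead byte, 3-byte sequence:    1110xxxx
T4 = 0b11110000      # lead byte, 4-byte sequence:    11110xxx
MASK_X = 0b00111111  # the 6 payload bits of a continuation byte

def utf8_encode(code: int) -> bytes:
    """Encode one Unicode code point as UTF-8 bytes."""
    if code < 0x80:          # plain ASCII, one byte: 0xxxxxxx
        return bytes([code])
    if code < 0x800:         # two bytes
        return bytes([T2 | (code >> 6), TX | (code & MASK_X)])
    if code < 0x10000:       # three bytes
        return bytes([
            T3 | (code >> 12),
            TX | ((code >> 6) & MASK_X),
            TX | (code & MASK_X),
        ])
    return bytes([           # four bytes
        T4 | (code >> 18),
        TX | ((code >> 12) & MASK_X),
        TX | ((code >> 6) & MASK_X),
        TX | (code & MASK_X),
    ])

# Compare with Python's built-in encoder for 1, 2, 3 and 4 byte characters.
for ch in "Aé☡😀":
    print(ch, utf8_encode(ord(ch)).hex(), ch.encode("utf-8").hex())
```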