TIS-620

Thai Industrial Standard 620-2533, commonly known by the abbreviation TIS-620, is the most widely used character set and character encoding for the Thai script. The standard was adopted by the Thai Industrial Standards Institute (TISI), an agency of the Royal Thai Government, and is the sole official standard in the Kingdom of Thailand.

The descriptive title of the standard is "Standard for Thai Character Codes for Use in Computers" (Thai: รหัสสำหรับ อักขระไทย ที่ใช้กับ คอมพิวเตอร์).

The suffix "2533" is the year of publication in the Buddhist calendar (1990 CE). This edition supersedes the earlier version, TIS-620-2529 (1986).

Structure

TIS-620 is a conventional ASCII extension that is fully compatible with 7-bit ASCII and encodes the Thai characters in the 8-bit range from hex A1 to FB. Because of the complex placement of Thai vowels and tone marks, TIS-620 is used only for information interchange. Correct display additionally requires a rendering engine for Thai text.
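As a small illustration of the ASCII compatibility described above, the following sketch encodes a mixed Latin/Thai string with the `tis_620` codec from Python's standard library. The helper name `to_tis620_hex` and the sample string are chosen only for this example.

```python
def to_tis620_hex(text: str) -> str:
    """Encode text as TIS-620 and return the bytes as space-separated hex."""
    # bytes.hex() with a separator argument requires Python 3.8 or later.
    return text.encode("tis_620").hex(" ").upper()


if __name__ == "__main__":
    # Latin letters and the space keep their ASCII values;
    # each Thai letter becomes a single byte in the A1-FB range.
    print(to_tis620_hex("Thai ไทย"))
```

The first five bytes of the output are plain ASCII, and the last three are the Thai letters, one byte each, in the order they were typed.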

Variants

A nearly identical version of TIS-620 was adopted in 1999 as ISO 8859-11. The only difference is that ISO 8859-11 defines the code A0 (hex) as a no-break space, whereas TIS-620 reserves it but leaves it undefined. (In practice this small difference is usually ignored.)
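The difference at A0 can be probed with a short sketch that decodes a single byte under a given codec. The helper `decode_single_byte` is hypothetical; `iso8859_11` and `tis_620` are codec names from Python's standard library. How a particular library treats A0 under its TIS-620 codec is an implementation choice, so the sketch reports an unmapped byte as `None` rather than assuming a result.

```python
from typing import Optional


def decode_single_byte(value: int, codec: str) -> Optional[str]:
    """Decode one byte with the given codec; return None if it is unmapped."""
    try:
        return bytes([value]).decode(codec)
    except UnicodeDecodeError:
        return None


if __name__ == "__main__":
    for codec in ("iso8859_11", "tis_620"):
        # Print the repr so that a no-break space is visible as '\xa0'.
        print(codec, repr(decode_single_byte(0xA0, codec)))
```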

The ISO 8859-11 character set was also registered with Ecma International as ISO-IR-166, but that variant additionally includes explicit escape sequences to mark the beginning and end of a Thai word. (Thai is written without spaces between words.)

Windows code page 874 is also based on TIS-620 but adds a few more characters.
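To see which characters the Windows code page adds, one option is to decode the bytes 80 to 9F, which TIS-620 leaves unassigned, with Python's `cp874` codec. This is only a sketch: the function name `extra_cp874_characters` is made up for the example, and the result reflects Python's mapping table for the code page.

```python
def extra_cp874_characters() -> dict:
    """Map byte values in 0x80-0x9F to the characters cp874 assigns to them."""
    extras = {}
    for value in range(0x80, 0xA0):
        try:
            char = bytes([value]).decode("cp874")
        except UnicodeDecodeError:
            continue  # this byte is unassigned in cp874 as well
        extras[value] = char
    return extras


if __name__ == "__main__":
    for value, char in extra_cp874_characters().items():
        print(f"{value:02X} -> U+{ord(char):04X} {char}")
```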

The order of the characters in TIS-620 was also carried over into Unicode (ISO 10646). The Thai block in Unicode spans U+0E00 to U+0E7F, and the TIS-620 characters map to U+0E01 through U+0E5B. Thai bytes in TIS-620 are therefore easy to convert to UTF-16: subtract hex A0 from the byte value and prefix the result with 0E. For example, the byte A1 (ก) becomes U+0E01. ASCII bytes keep their value.
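The conversion rule can be sketched in a few lines. The function names below are hypothetical, and the choice of big-endian UTF-16 without a byte order mark is an assumption made so that the "prefix 0E" step is directly visible in the output bytes.

```python
def tis620_byte_to_code_point(value: int) -> int:
    """Map one Thai TIS-620 byte to its Unicode code point."""
    # Only A1-DA and DF-FB carry Thai characters in TIS-620.
    if not (0xA1 <= value <= 0xDA or 0xDF <= value <= 0xFB):
        raise ValueError(f"byte {value:#04x} is not a Thai TIS-620 character")
    # Subtract A0, then place the result in the 0Exx row.
    return 0x0E00 + (value - 0xA0)


def tis620_to_utf16(data: bytes) -> bytes:
    """Convert TIS-620 bytes to big-endian UTF-16 (no byte order mark)."""
    chars = []
    for value in data:
        if value < 0x80:
            chars.append(chr(value))  # ASCII range is unchanged
        else:
            chars.append(chr(tis620_byte_to_code_point(value)))
    return "".join(chars).encode("utf-16-be")


if __name__ == "__main__":
    # The byte A1 becomes the two bytes 0E 01.
    print(tis620_to_utf16(b"\xa1").hex(" "))
```

In real code a library codec is usually the better choice; the manual version is shown only to make the arithmetic explicit.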

TIS-620
	x0	x1	x2	x3	x4	x5	x6	x7	x8	x9	xA	xB	xC	xD	xE	xF
0x	unused
1x	unused
2x	SP	!	"	#	$	%	&	'	(	)	*	+	,	-	.	/
3x	0	1	2	3	4	5	6	7	8	9	:	;	<	=	>	?
4x	@	A	B	C	D	E	F	G	H	I	J	K	L	M	N	O
5x	P	Q	R	S	T	U	V	W	X	Y	Z	[	\	]	^	_
6x	`	a	b	c	d	e	f	g	h	i	j	k	l	m	n	o
7x	p	q	r	s	t	u	v	w	x	y	z	{	|	}	~
8x	unused
9x	unused
Ax		ก	ข	ฃ	ค	ฅ	ฆ	ง	จ	ฉ	ช	ซ	ฌ	ญ	ฎ	ฏ
Bx	ฐ	ฑ	ฒ	ณ	ด	ต	ถ	ท	ธ	น	บ	ป	ผ	ฝ	พ	ฟ
Cx	ภ	ม	ย	ร	ฤ	ล	ฦ	ว	ศ	ษ	ส	ห	ฬ	อ	ฮ	ฯ
Dx	ะ	ั	า	ำ	ิ	ี	ึ	ื	ุ	ู	ฺ					฿
Ex	เ	แ	โ	ใ	ไ	ๅ	ๆ	็	่	้	๊	๋	์	ํ	๎	๏
Fx	๐	๑	๒	๓	๔	๕	๖	๗	๘	๙	๚	๛

The browser display may need to be enlarged for all characters to be legible.

In the table above, 20 is the regular SPACE character. The values 00-1F, 7F, 80-9F, A0, DB-DE and FC-FF are not assigned to any character in TIS-620.
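The unassigned ranges listed above translate directly into a validity check. The following is a minimal sketch; the names `UNASSIGNED_RANGES` and `is_assigned` are invented for illustration and follow the table in this article, not any particular library.

```python
# Inclusive byte ranges that TIS-620 leaves unassigned, as listed above.
UNASSIGNED_RANGES = [
    (0x00, 0x1F),
    (0x7F, 0x7F),
    (0x80, 0x9F),
    (0xA0, 0xA0),
    (0xDB, 0xDE),
    (0xFC, 0xFF),
]


def is_assigned(value: int) -> bool:
    """Return True if the byte value has a character in the TIS-620 table."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in one byte")
    return not any(low <= value <= high for low, high in UNASSIGNED_RANGES)


if __name__ == "__main__":
    thai = [value for value in range(0xA1, 0x100) if is_assigned(value)]
    print(len(thai), "Thai characters are assigned")
```

A decoder could use such a check to reject or flag stray bytes before conversion.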

External links

Official reference (in Thai)
Mapping of TIS-620 to ISO 10646 (non-normative)

