Definition

What is ASCII (American Standard Code for Information Interchange)?

By

Rahul Awati
Peter Loshin, Former Senior Technology Editor

Published: 24 Jan 2025

ASCII (American Standard Code for Information Interchange) is one of the most common character encoding formats for text data in computers and on the internet. Standard ASCII assigns unique values to 128 characters: letters, numerals, punctuation and other special characters, plus control codes. Over the years, several extended ASCII sets have emerged that expand the original set of 128 characters with additional symbols and characters.

ASCII characters in the original ASCII table

The ASCII encoding system assigns each character its own unique binary code. In the original system of 128 characters, the binary codes were 7 bits long. Today, ASCII characters are usually stored as 8-bit codes to maintain compatibility with modern computers that use 8-bit bytes. The extra bit in these codes, the most significant bit, is usually set to 0.

ASCII characters include uppercase and lowercase letters A through Z, numerals 0 through 9 and basic punctuation symbols. Codes 32-126 are the printable ASCII characters: the space (code 32), letters, numbers, punctuation marks and special characters like ^, [, \, ~ and other miscellaneous symbols. Code 127 is the non-printing delete character.

The ASCII format also includes non-printing control characters (also called control codes) originally intended for the teletype printing terminals used to input and output data in the early days of computing. These characters, as described in Table 1, range from decimal 0 to decimal 31, plus the delete character (decimal 127), and include the null character (character 0), backspace (character 8), synchronous idle (character 22) and unit separator (character 31).

Table 1. Non-printing ASCII control codes are used to manage data flows.

ASCII character representation

ASCII characters in both original and extended formats may be represented in several ways:

As pairs of hexadecimal digits -- base-16 numbers, represented as 0 through 9 and A through F for the decimal values of 10-15.
As three-digit octal (base 8) numbers.
As decimal numbers from 0 to 127 (or 0 to 255 in the extended table).
As 7-bit or 8-bit binary.
As an HTML number.

Some characters can also be represented as their HTML names.

The ASCII encoding for the lowercase letter "m" is represented in these ways:

Character/symbol	Description	Hexadecimal	Octal	Decimal	Binary (7 bit)	Binary (8 bit)	HTML number
m	Lowercase m	6D	155	109	110 1101	0110 1101	&#109;

Similarly, the ASCII encoding for the semicolon (;) can be represented in the following ways:

Character/symbol	Description	Hexadecimal	Octal	Decimal	Binary (7 bit)	Binary (8 bit)	HTML number	HTML name
;	Semicolon	3B	073	59	011 1011	0011 1011	&#59;	&semi;
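The representations in the two tables above can be derived from the character's numeric code. The following Python sketch is one way to do that; the function name `describe` and the dictionary keys are illustrative choices, not part of any standard.

```python
def describe(ch: str) -> dict:
    """Return common representations of a single standard ASCII character."""
    code = ord(ch)  # numeric code of the character
    if code > 127:
        raise ValueError("not a standard (7-bit) ASCII character")
    return {
        "hex": format(code, "02X"),       # two hexadecimal digits
        "octal": format(code, "03o"),     # three octal digits
        "decimal": code,
        "binary7": format(code, "07b"),   # 7-bit binary, no grouping spaces
        "binary8": format(code, "08b"),   # 8-bit binary, leading bit is 0
        "html_number": f"&#{code};",      # HTML numeric character reference
    }


print(describe("m"))
print(describe(";"))
```

The binary strings are printed without the grouping spaces used in the tables.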

ASCII control codes (non-printing)

The ASCII values for 0 through 31 (binary: 0000 0000 through 0001 1111 in the 8-bit ASCII system) are non-printing control codes. They were originally intended for controlling the flow of data and include codes that do the following:

Show the end or beginning of data components.
Control or show the state of hardware used for data transmission.
Accommodate positioning of the cursor pointer in a data stream.
Indicate the start or end of text or transmission.
Control peripheral devices like printers.

Some of the ASCII control codes are shown in the following table.

Character/symbol	Description	Hexadecimal	Octal	Decimal	Binary (7 bit)	Binary (8 bit)	HTML number
NUL	Null character	00	000	0	000 0000	0000 0000
ACK	Acknowledge	06	006	6	000 0110	0000 0110
BS	Backspace	08	010	8	000 1000	0000 1000
CR	Carriage Return	0D	015	13	000 1101	0000 1101
DC1	Device Control 1 (oft. XON)	11	021	17	001 0001	0001 0001
ESC	Escape	1B	033	27	001 1011	0001 1011
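As a rough illustration, a program can tell control codes from printable characters by checking the numeric range. In the Python sketch below, the names `CONTROL_NAMES`, `is_control` and `label` are hypothetical helpers, and the lookup table covers only the six codes listed above. The check also treats code 127 (delete) as a control code, which is conventional but goes beyond the 0 through 31 range discussed in this section.

```python
# Abbreviations for the control codes listed in the table above (not the full set).
CONTROL_NAMES = {0: "NUL", 6: "ACK", 8: "BS", 13: "CR", 17: "DC1", 27: "ESC"}


def is_control(code: int) -> bool:
    """True for codes 0-31 and for 127 (DEL), which is also non-printing."""
    return 0 <= code <= 31 or code == 127


def label(code: int) -> str:
    """Return a readable label for a 7-bit ASCII code."""
    if not 0 <= code <= 127:
        raise ValueError("not a standard ASCII code")
    if is_control(code):
        # Fall back to a generic label for control codes not in the small table.
        return CONTROL_NAMES.get(code, f"control code {code}")
    return chr(code)


print(label(13), label(27), label(65))
```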

Extended ASCII characters

The standard ASCII character set is only 7 bits, and characters are represented as 8-bit bytes with the most significant bit set to 0. The extended ASCII character set includes 128 more 8-bit characters, where the most significant bit is set to 1. These characters occupy the binary values from 128 (1000 0000) through 255 (1111 1111).

Here are some examples of characters included in the extended ASCII table.

Character/symbol	Description	Hexadecimal	Octal	Decimal	Binary (8 bit)	HTML number	HTML name
€	Euro sign	80	200	128	1000 0000		&euro;
…	Horizontal ellipsis	85	205	133	1000 0101		&hellip;
‘	Left single quotation mark	91	221	145	1001 0001		&lsquo;
÷	Division sign	F7	367	247	1111 0111	&#247;	&divide;
À	Latin capital letter A with grave	C0	300	192	1100 0000	&#192;	&Agrave;
ÿ	Latin small letter y with diaeresis	FF	377	255	1111 1111	&#255;	&yuml;

There is no single extended ASCII character set. Unlike standard ASCII, extended ASCII exists in multiple versions, which may differ depending on the operating system or vendor. Extended ASCII character sets typically include letters with diacritical marks, graphical markings, mathematical symbols and other symbols.

Table 2 lists Microsoft's Windows-1252 (CP-1252) character encoding of the Latin alphabet. This is the default extended ASCII character set that Windows uses for American and British English and other Western European languages. Windows-1252 is a superset of ISO 8859-1 (ISO Latin-1) in terms of printable characters: it assigns printable characters, rather than control characters, to the 128 to 159 range.

Table 2. Microsoft Windows extended ASCII character encoding.
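Because several extended sets exist, the same byte value can decode to different characters depending on which set is assumed. The Python sketch below illustrates this with the standard library codecs `cp1252` (Windows-1252), `cp437` (the original IBM PC code page, chosen here only as a contrasting example) and `latin-1` (ISO 8859-1). The helper name `decode_with` is hypothetical.

```python
def decode_with(raw: bytes, codec: str) -> str:
    """Decode bytes using the named single-byte character set."""
    return raw.decode(codec)


raw = bytes([0x80, 0xF7])  # two values from the extended range (128-255)

for codec in ("cp1252", "cp437", "latin-1"):
    # ascii() escapes non-ASCII characters so the output is safe on any terminal.
    print(codec, ascii(decode_with(raw, codec)))
```

Under Windows-1252 the two bytes are the euro sign and the division sign; under code page 437 they are a capital C with cedilla and the almost-equal sign; under ISO 8859-1 the first byte is a control character.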

How does ASCII work?

ASCII offers a universally accepted and understood character set for basic data communications. The format codes a string of data as ASCII characters that can be interpreted and displayed as readable plain text for people and as data for computers.

Programmers use the design of the ASCII character set to simplify certain tasks. For example, using ASCII character codes, changing a single bit easily converts text from uppercase to lowercase.

The capital letter "A" is represented by the binary value:

0100 0001

The lowercase letter "a" is represented by the binary value:

0110 0001

The only difference is the third most significant bit of the 8-bit code, which has a value of 32. In decimal and hexadecimal, this corresponds to:

Character	Binary	Decimal	Hexadecimal
A	0100 0001	65	41
a	0110 0001	97	61

The difference between uppercase and lowercase characters is always 32 (0x20 in hexadecimal), so converting from uppercase to lowercase and back is a matter of adding or subtracting 32 from the ASCII character code.
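A minimal Python sketch of this single-bit case conversion follows. The function names `to_lower` and `to_upper` are illustrative, and the range checks are an added safeguard so that digits and punctuation pass through unchanged.

```python
CASE_BIT = 0x20  # decimal 32, the bit that separates "A" (0x41) from "a" (0x61)


def to_lower(ch: str) -> str:
    """Set the case bit for uppercase ASCII letters; leave everything else alone."""
    return chr(ord(ch) | CASE_BIT) if "A" <= ch <= "Z" else ch


def to_upper(ch: str) -> str:
    """Clear the case bit for lowercase ASCII letters; leave everything else alone."""
    return chr(ord(ch) & ~CASE_BIT) if "a" <= ch <= "z" else ch


print(to_lower("A"), to_upper("a"))
print("".join(to_upper(c) for c in "ascii 123"))
```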

Similarly, the ASCII codes for the digits 0 through 9 are as follows:

Character	Binary	Decimal	Hexadecimal
0	0011 0000	48	30
1	0011 0001	49	31
2	0011 0010	50	32
3	0011 0011	51	33
4	0011 0100	52	34
5	0011 0101	53	35
6	0011 0110	54	36
7	0011 0111	55	37
8	0011 1000	56	38
9	0011 1001	57	39

Using this encoding, developers can easily convert ASCII digits to numerical values by stripping off the four most significant bits of the binary ASCII values (0011). This calculation can also be done by dropping the first hexadecimal digit or by subtracting 48 from the decimal ASCII code.
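The digit conversion described above can be sketched in Python as follows. The names `digit_value` and `parse_decimal` are hypothetical; masking with `0x0F` keeps the four least significant bits, which is the same as stripping the leading `0011`.

```python
def digit_value(ch: str) -> int:
    """Convert one ASCII digit character to its numeric value."""
    if not "0" <= ch <= "9":
        raise ValueError(f"{ch!r} is not an ASCII digit")
    return ord(ch) & 0x0F  # same result as ord(ch) - 48


def parse_decimal(text: str) -> int:
    """Build an integer from a string of ASCII digits, one digit at a time."""
    value = 0
    for ch in text:
        value = value * 10 + digit_value(ch)
    return value


print(digit_value("7"), parse_decimal("2025"))
```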

Developers can also check the most significant bit of characters in a sequence to verify that a data stream, string or file contains ASCII values. The most significant bit of basic ASCII characters will always be 0; if that bit is 1, then the character is not an ASCII-encoded character.
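As a small illustration of this check, the Python sketch below tests the most significant bit of every byte. The function name `is_ascii` is an illustrative choice.

```python
def is_ascii(data: bytes) -> bool:
    """True if every byte has its most significant bit (0x80) cleared."""
    return all((byte & 0x80) == 0 for byte in data)


print(is_ascii(b"plain text"))
print(is_ascii("caf\u00e9".encode("utf-8")))  # the accented letter needs bytes above 127
```

Note that an empty input passes the check, since it contains no byte with the high bit set.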

Why is ASCII important?

ASCII was the first major character encoding standard for data processing by computers and other electronic devices. The standardized, universally accepted nature of ASCII codes let different systems communicate with each other to process data, share files and documents, and more. Developers can use the ASCII format to design interfaces that both humans and computers understand.

As a standardized format for representing information and facilitating communication, ASCII is important in numerous fields, including the following:

Computer programming.
Data transmission protocols.
Visual design.
Graphic design.

Today, most modern computer systems use Unicode, also known as the Unicode Worldwide Character Standard, so ASCII on its own is technically obsolete. However, because ASCII text is also valid Unicode Transformation Format 8 (UTF-8) text, ASCII-encoded data remains usable on these systems. The exceptions are some IBM mainframes that use the proprietary 8-bit code called Extended Binary Coded Decimal Interchange Code (EBCDIC).

ASCII variants in other languages

When it was first introduced, ASCII supported English language text only. When 8-bit computers became common during the 1970s, vendors and standards bodies began extending the ASCII character set to include 128 additional character values. Extended ASCII incorporates non-English characters, but it is still insufficient for comprehensive encoding of text in most world languages, including English. To overcome this limitation, different extended ASCII character sets have been developed.

Initially, other character encoding standards were adopted for other languages. In some cases, the standards were designed for other countries with different requirements. In other cases, the encodings were hardware manufacturers' proprietary designs.

What is the relationship between ASCII and Unicode?

Unicode is a character encoding standard that includes the ASCII characters as its first 128 code points. In 2003, the Internet Engineering Task Force (IETF) published RFC 3629, which standardized UTF-8 as an encoding for use in internet protocols. Unicode character encoding replaces ASCII encoding, but it is backward-compatible with ASCII: in UTF-8, the ASCII characters are encoded as the same single-byte values they have in ASCII.
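The backward compatibility can be checked directly: encoding ASCII-only text as ASCII and as UTF-8 should produce identical bytes. The Python sketch below uses the hypothetical helper name `same_bytes` for that comparison.

```python
def same_bytes(text: str) -> bool:
    """True if the text encodes to identical bytes in ASCII and UTF-8."""
    return text.encode("ascii") == text.encode("utf-8")


print(same_bytes("Hello, ASCII!"))
print(list("Az".encode("utf-8")))  # decimal values of the encoded bytes
```

The helper raises `UnicodeEncodeError` for text containing characters outside the ASCII range, because such text cannot be encoded as ASCII at all.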

Unicode defines codespaces for the implementation of character encodings for different languages. Characters can be mapped to encodings using either UTF or Universal Coded Character Set (UCS).

Depending on the character and the encoding form used, characters can be expressed in one to four 8-bit bytes (UTF-8), in one or two 16-bit units (UTF-16) or in a single 32-bit unit (UTF-32).
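To make these sizes concrete, the Python sketch below reports how many bytes a single character occupies in each encoding form. The function name `encoded_sizes` is hypothetical, and the little-endian codec variants are used only so that no byte order mark is added to the count.

```python
def encoded_sizes(ch: str) -> dict:
    """Return the number of bytes one character needs in UTF-8, UTF-16 and UTF-32."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-le")),  # -le variant: no byte order mark
        "utf-32": len(ch.encode("utf-32-le")),
    }


# An ASCII letter, an accented letter, the euro sign and an emoji.
for ch in ("A", "\u00e9", "\u20ac", "\U0001F600"):
    print(ascii(ch), encoded_sizes(ch))
```

A UTF-16 size of 4 bytes corresponds to two 16-bit units, and every UTF-32 size is 4 bytes, or one 32-bit unit.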

Both ASCII and Unicode provide standard ways to encode characters for use by computers and other devices. They differ in the number of characters supported and in the way each character is represented. Even with extended ASCII, no more than 256 characters can be represented. In contrast, Unicode supports codes for roughly 150,000 characters. This is why Unicode can be used to represent text from many different languages for computer processing, not just English. One of the reasons Unicode was introduced is its ability to support languages that use thousands of characters.

Unicode is also a universal encoding standard because it is platform-, program- and programming language-agnostic. The main drawback of Unicode is that it can only represent plain text, not rich text.

The UCS standard is an ISO (International Organization for Standardization) standard, ISO/IEC 10646. Since ISO/IEC 10646 defines the character encoding for UCS, Unicode supports the same code points and characters as ISO/IEC 10646 (specifically ISO/IEC 10646:2003).

ASCII advantages and disadvantages

More than half a century of use has made clear the advantages and disadvantages of ASCII.

Advantages

Universally accepted. ASCII character encoding is universally understood and accepted. It is also universally implemented in computing through the Unicode standard (except for IBM mainframe EBCDIC encoding).
Compact character encoding. Standard codes can be expressed in 7 or 8 bits. This means data that can be expressed in the standard ASCII character set requires only as many bytes to store or send as the number of characters in the data.
Efficient for programming. The character codes for letters and numbers are well adapted to programming techniques for manipulating text and using numbers for calculations or storage as raw data.

Disadvantages

Limited character set. Even with extended ASCII, only 256 distinct characters can be represented. The characters in a standard character set are enough for English language communications. However, it is difficult to accommodate languages that do not use the Latin alphabet, despite the support for diacritical marks and Greek letters in extended ASCII.
Inefficient character encoding. Representing characters from alphabets other than the English alphabet requires more overhead, such as escape codes.

Converting text to ASCII code in Windows

There is more than one way to display text as ASCII codes in Windows. To use the Windows PowerShell command Format-Hex to display ASCII encoding for a text file, take these steps:

Open the Windows PowerShell application. Click on the search box in the lower left of your Windows desktop. Type PowerShell and click on the PowerShell icon to start the application.
Format-Hex command. Enter the following command to display the ASCII encoding for a file called hello.txt in the c:\Users\userID\Documents directory: format-hex .\hello.txt
View output. ASCII encoding for the file hello.txt will be displayed as shown in Figure 1.

Figure 1. View ASCII encoding for a text file using the 'Format-Hex' command in PowerShell.

The top of the output shows that data is displayed in 16 columns, with one character per column. A running offset, in hexadecimal, is displayed along the left side of the output. In this case, the last line begins at offset 0x60, meaning that 96 (decimal) characters precede it. The ASCII encoding of the file's characters is shown in a grid 16 characters wide, as two-digit hexadecimal values. The original contents of the file are displayed to the right in 16-character groupings.

The original file has two spaces (ASCII 0x20) followed by CR (carriage return, ASCII 0x0D) and LF (line feed, ASCII 0x0A) characters. The CR-LF combination is used in Windows text files to mark the end of a line.
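The layout described above (a hexadecimal offset, 16 two-digit hexadecimal values per row and the readable text on the right) can be approximated in a few lines of Python. This is a sketch of the general hex dump technique, not a reproduction of the exact Format-Hex output; the function name `hex_dump` and the use of a period for non-printing bytes are assumptions.

```python
def hex_dump(data: bytes, width: int = 16) -> list[str]:
    """Format bytes as rows of: offset, hex values, printable text."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Show printable ASCII (32-126) as-is; replace everything else with ".".
        text_part = "".join(chr(b) if 32 <= b <= 126 else "." for b in chunk)
        # Pad the hex column so the text column lines up on short final rows.
        lines.append(f"{offset:08X}   {hex_part:<{width * 3 - 1}}  {text_part}")
    return lines


for line in hex_dump(b"Hello, world!  \r\n"):
    print(line)
```

In the sample input, the two spaces appear as `20 20` and the line ending as `0D 0A`, and the second row starts at offset 0x10 because the first row holds 16 bytes.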

Other options. Format-Hex can be used with other commands for easier command-line viewing of larger files. For example, this command is used to page through ASCII encoding of a large file:

Format-Hex .\hello-long.txt | more

The output will look similar to Figure 2, and you can view output one page at a time.

Figure 2. View ASCII encoding for a longer file using the 'Format-Hex' command with the more command.

The FTP ascii command

The File Transfer Protocol (FTP) has an ascii command that is used to enable the transfer of ASCII-encoded files. When transferring files in ASCII mode in FTP, the receiving host may change the file so it will be formatted as ASCII on the destination host.

When FTP transfers files using the binary mode, those files are not changed.
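One common change made in ASCII mode is line-ending translation: text travels with CR-LF line endings and is stored with whatever convention the destination uses. The Python sketch below imitates that translation on byte strings without opening any network connection. Both function names are hypothetical, and the choice of LF as the local line ending is an assumption for illustration.

```python
def to_network_ascii(local_text: bytes) -> bytes:
    """Normalize line endings to CR-LF, as used for text sent in ASCII mode."""
    # Collapse any existing CR-LF first so it is not doubled into CR-CR-LF.
    return local_text.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


def from_network_ascii(wire_text: bytes, local_newline: bytes = b"\n") -> bytes:
    """Convert CR-LF line endings to the receiving host's convention."""
    return wire_text.replace(b"\r\n", local_newline)


sent = to_network_ascii(b"line one\nline two\n")
print(sent)
print(from_network_ascii(sent))
```

A binary-mode transfer corresponds to skipping both functions, which is why binary files must not be sent in ASCII mode: any byte pair that happens to look like a line ending would be rewritten.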

ASCII art

ASCII characters can be combined graphically to create an image. ASCII art is a common technique for creating graphical images on text-only media like a computer terminal or text-only printer. For example, this shrug, an early relative of emoji, is drawn in the ASCII art style, although its macron (¯) and katakana (ツ) characters fall outside the standard ASCII set.

¯\_(ツ)_/¯

More elaborate images are possible when using more lines and more characters, especially from extended ASCII character sets.

History and future of ASCII

ASCII encoding is based on character encoding used for telegraph data and Morse code. The ASCII character encoding standard was designed in the early 1960s to provide a standard character set all computers could understand, facilitating data interchange between them. The American Standards Association, the predecessor of the American National Standards Institute (ANSI), first published it as a standard for computing in 1963.

The internet community adopted ASCII as a standard for network data when ASCII format for network interchange was published as RFC 20 in 1969. That request for comments (RFC) document specified the use of ASCII for network data, and the IETF accepted it as a full internet standard in 2015.

ASCII remains a universally accessible and acceptable standard for encoding computer and network data. Given the need to preserve data stored over the past decades, most experts predict it will remain foundational for computing, programming and electronic data interchange for many more years to come.
