Definition

bytecode

What is bytecode?

Bytecode is computer object code that an interpreter converts into binary machine code so it can be read by a computer's hardware processor. The interpreter is typically implemented as a virtual machine (VM) that translates the bytecode for the target platform. The machine code consists of a set of instructions that the processor understands.

Many computer languages, such as C and C++, require a separate compiler for each computer platform. That is, a separate compiler is needed for each combination of operating system (OS) and hardware architecture. For example, Microsoft Windows and Intel's microprocessors represent one platform, and macOS and the Apple M-series chips represent another.

With bytecode, the source code must be compiled only once. The platform-specific interpreter then converts it to machine code that can be executed by the OS and central processing unit, or CPU.

How does bytecode work in application delivery?

The creation and execution of bytecode is often part of an app delivery process. That process takes a program from source code to execution in the following three steps:

  1. A developer builds an application in a high-level, human-readable programming language such as Java, C# or Python. Most developers use some sort of integrated development environment to create the application files and then commit those files to a version control system. A high-level language helps to simplify and optimize the application development process. However, the language statements -- or source code -- cannot be read by a computer processor.
  2. A compiler converts the source code to bytecode, an intermediary code that bridges the gap between the high-level source code and low-level machine code. The compiler is a special type of program that translates statements in the source code to bytecode, machine code or another programming language. A compiler usually performs lexical analysis, syntax analysis and semantic analysis. It then generates intermediate representation (IR) code, which it uses to create the final output code.
  3. A special type of VM installed on each system where the application will run serves as an interpreter for converting the bytecode to machine code that targets a specific platform. Machine code is made up entirely of binary bits -- 1s and 0s -- in a format that a computer's processors can read and execute. For example, the VM-based interpreter on an Apple Mac computer would generate machine code that is specific to macOS and the computer's processor architecture, whether Intel or Apple M-series.
Diagram: How bytecode is used in application delivery. With bytecode, source code doesn't have to be recompiled for each target platform.
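
The three steps can be tried out on a small scale in Python, which compiles source code to bytecode before its VM runs it. The following is a minimal sketch rather than a real delivery pipeline: the one-line program and the `<example>` file name are made up for illustration, and everything happens in a single process instead of on separate build and target systems.

```python
# Step 1: human-readable source code (a hypothetical one-line program).
source = "total = sum(range(1, 11))"

# Step 2: the built-in compiler turns the source into a code object
# that holds the bytecode. "<example>" is a placeholder file name.
code = compile(source, "<example>", "exec")

# Step 3: the Python VM executes the bytecode on the current platform.
namespace = {}
exec(code, namespace)

print(namespace["total"])  # prints 55
```

The same `code` object could be run by a Python VM of the same version on a different OS or CPU without recompiling the source.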

What is the advantage of bytecode?

Bytecode eliminates the need to recompile source code for each target platform. Although the interpreters differ between platforms, the application's bytecode does not.
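
As a rough illustration of compiling once and shipping only the bytecode, the sketch below uses Python's standard `marshal` module to serialize a compiled code object to bytes and load it back. The source string and the `<shipped>` file name are hypothetical, and the round trip happens in one process. Note that Python's marshalled bytecode is tied to the Python version, so this only demonstrates the idea; it is not a portable distribution format.

```python
import marshal

# Compile once, on the build side (hypothetical one-line program).
source = "greeting = 'hello from bytecode'"
code = compile(source, "<shipped>", "exec")

# Serialize the compiled code object to bytes that could be shipped.
payload = marshal.dumps(code)

# On the target side, load the bytes and run them without the source.
loaded = marshal.loads(payload)
namespace = {}
exec(loaded, namespace)

print(namespace["greeting"])  # prints hello from bytecode
```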

This approach lets each system interpret the same bytecode files. The bytecode itself is in a binary format that consists of constants, references and numeric codes.
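
Python makes these three ingredients easy to inspect. The sketch below compiles a made-up statement and prints the constants, the referenced names and the raw numeric codes of the resulting bytecode. The exact opcodes and their numbers differ between Python versions, so no particular output is shown here.

```python
import dis

# Hypothetical statement: one name is read, one constant used, one name stored.
code = compile("total = price * 1.08", "<example>", "exec")

print(code.co_consts)       # constants, including 1.08
print(code.co_names)        # references to names: 'price' and 'total'
print(list(code.co_code))   # numeric codes: the raw bytecode, byte by byte

# The dis module decodes the numeric codes into readable instructions.
for instruction in dis.get_instructions(code):
    print(instruction.opcode, instruction.opname, instruction.argrepr)
```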

Diagram: How the Java virtual machine works. The Java virtual machine interprets bytecode and converts it to machine language that is platform-specific.

An example of bytecode

One of the most common examples of bytecode in action is the Java programming language. When an application is written in Java, the Java compiler converts the source code to bytecode and writes that bytecode to a class file, which has the .class extension.

The class file is then read and processed by a Java virtual machine (JVM) running on a target system. The JVM, which is part of the Java Runtime Environment, interprets the bytecode and converts it to machine language specific to the intended platform.
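
A Java class file is a binary file that begins with the magic number 0xCAFEBABE, followed by a minor and a major format version. The Python sketch below reads those first eight bytes. The function name `parse_class_header` is made up for this example, and the sample bytes are built by hand rather than taken from a real compiled class; major version 52 corresponds to Java SE 8.

```python
import struct

CLASS_MAGIC = 0xCAFEBABE  # first four bytes of every Java class file


def parse_class_header(data: bytes) -> tuple[int, int]:
    """Return (minor_version, major_version) from a class file header."""
    if len(data) < 8:
        raise ValueError("need at least 8 bytes for a class file header")
    # Big-endian: 4-byte magic, 2-byte minor version, 2-byte major version.
    magic, minor, major = struct.unpack(">IHH", data[:8])
    if magic != CLASS_MAGIC:
        raise ValueError("not a Java class file")
    return minor, major


# Hand-built sample header (an assumption for illustration, not a real file).
sample_header = bytes.fromhex("CAFEBABE" "0000" "0034")
print(parse_class_header(sample_header))  # prints (0, 52)
```

To inspect a real class file, the same function could be given the bytes read from a `.class` file opened in binary mode.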

The JVM interpreter usually processes the bytecode one instruction at a time, but a JVM can also include a just-in-time (JIT) compiler. A JIT compiler translates frequently executed bytecode into native machine code at runtime, so that code does not have to be reinterpreted on every pass, which helps improve application performance.
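
To show what processing bytecode one instruction at a time looks like, here is a toy stack-machine interpreter in Python. It is a sketch under stated assumptions: the four opcodes are invented for this example and do not match the JVM's instruction set, and the loop executes each instruction directly rather than emitting machine code or doing any JIT compilation.

```python
# Hypothetical opcodes for a toy stack machine (not a real VM's instruction set).
PUSH, ADD, MUL, HALT = 0x01, 0x02, 0x03, 0xFF


def run(bytecode: bytes) -> int:
    """Fetch, decode and execute one instruction at a time."""
    stack = []
    pc = 0  # program counter: offset of the next instruction
    while True:
        opcode = bytecode[pc]
        pc += 1
        if opcode == PUSH:
            # PUSH takes a one-byte operand that follows the opcode.
            stack.append(bytecode[pc])
            pc += 1
        elif opcode == ADD:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif opcode == MUL:
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif opcode == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {opcode:#04x} at offset {pc - 1}")


# Bytecode for (2 + 3) * 4.
program = bytes([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT])
print(run(program))  # prints 20
```

A JIT compiler would replace the repeated decode-and-dispatch work in this loop with native code for the parts of the program that run most often.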

Programming languages that use bytecode

The Lisp programming language, once commonly used for artificial intelligence applications, is an earlier language that uses bytecode as an intermediary step. Other languages that use bytecode or a similar approach include the following:

  • PHP (PHP: Hypertext Preprocessor)
  • Prolog
  • Raku
  • Scala
  • Unicon


This was last updated in June 2022
