In computing, a virtual machine (VM) is an emulation of a computer system. Virtual machines are based on computer architectures
and provide the functionality of a physical computer. Their implementations
may involve specialized hardware, software, or a combination.
There are different kinds of virtual machines, each with different functions:
- System virtual machines (also termed full virtualization VMs) provide a substitute for a real machine. They provide the functionality needed to execute entire operating systems. A hypervisor uses native execution to share and manage hardware, allowing for multiple environments which are isolated from one another, yet exist on the same physical machine. Modern hypervisors use hardware-assisted virtualization, relying on virtualization-specific hardware features, primarily in the host CPUs.
- Process virtual machines are designed to execute computer programs in a platform-independent environment.
Definitions
A "virtual machine" was originally defined by Popek and Goldberg as "an efficient, isolated duplicate of a real computer machine." Current use includes virtual machines that have no direct correspondence to any real hardware.
The physical, "real-world" hardware running the VM is generally referred
to as the 'host', and the virtual machine emulated on that machine is
generally referred to as the 'guest'. A host can emulate several guests,
each of which can emulate different operating systems and hardware
platforms.
System virtual machines
The desire to run multiple operating systems was the initial motive
for virtual machines, so as to allow time-sharing among several
single-tasking operating systems. In some respects, a system virtual
machine can be considered a generalization of the concept of virtual memory that historically preceded it. IBM's CP/CMS, the first system to allow full virtualization, implemented time-sharing by providing each user with a single-user operating system, the Conversational Monitor System
(CMS). Unlike virtual memory, a system virtual machine allowed the
user to write privileged instructions in their code. This approach had
certain advantages, such as adding input/output devices not allowed by
the standard system.
As virtual memory technology evolves for purposes of virtualization, new systems of memory overcommitment
may be applied to manage memory sharing among multiple virtual machines
on one computer operating system. It may be possible to share memory pages
that have identical contents among multiple virtual machines that run
on the same physical machine, which may result in mapping them to the
same physical page by a technique termed kernel same-page merging
(KSM). This is especially useful for read-only pages, such as those
holding code segments, which is the case for multiple virtual machines
running the same or similar software, software libraries, web servers,
middleware components, etc. The guest operating systems do not need to
be compliant with the host hardware, thus making it possible to run
different operating systems on the same computer (e.g., Windows, Linux, or prior versions of an operating system) to support future software.
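The page-sharing idea described above can be sketched in a few lines. This is a simplified illustration, not how any particular kernel implements same-page merging: the function name `merge_identical_pages`, the 4096-byte page size, and the use of a SHA-256 digest to find candidate duplicates are all assumptions made for the example. A real implementation must also handle copy-on-write when a guest later writes to a shared page, which is omitted here.

```python
import hashlib

PAGE_SIZE = 4096  # bytes; a common page size, assumed here for illustration


def merge_identical_pages(guest_pages):
    """Map each guest page to a shared simulated physical frame.

    guest_pages: dict mapping (vm_name, page_number) -> bytes of page content.
    Returns (mapping, frames): mapping sends each key to a frame index, and
    frames holds one entry per distinct page content.
    """
    frames = []           # simulated physical frames
    frames_by_digest = {}  # content digest -> candidate frame indices
    mapping = {}
    for key, content in guest_pages.items():
        digest = hashlib.sha256(content).digest()
        frame = None
        for idx in frames_by_digest.get(digest, []):
            if frames[idx] == content:  # full compare guards against collisions
                frame = idx
                break
        if frame is None:
            # First time this content is seen: allocate a new frame.
            frame = len(frames)
            frames.append(content)
            frames_by_digest.setdefault(digest, []).append(frame)
        mapping[key] = frame
    return mapping, frames


# Two hypothetical guests whose first page holds the same read-only code.
code_page = b"\x90" * PAGE_SIZE
pages = {
    ("vm1", 0): code_page,
    ("vm2", 0): code_page,
    ("vm2", 1): b"\x00" * PAGE_SIZE,
}
mapping, frames = merge_identical_pages(pages)
print(len(pages), "guest pages backed by", len(frames), "physical frames")
```

In this toy run the two identical code pages end up backed by the same frame, which is the memory saving the technique aims for.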
The use of virtual machines to support separate guest operating systems is popular in embedded systems. A typical use would be to run a real-time operating system
simultaneously with a preferred complex operating system, such as Linux
or Windows. Another use would be to run novel and unproven software that is still
in the developmental stage inside a sandbox.
Virtual machines have other advantages for operating system development,
which may include improved debugging access and faster reboots.
Multiple VMs, each running its own guest operating system, are frequently used for server consolidation.
Process virtual machines
A process VM, sometimes called an application virtual machine, or Managed Runtime Environment
(MRE), runs as a normal application inside a host OS and supports a
single process. It is created when that process is started and destroyed
when it exits. Its purpose is to provide a platform-independent
programming environment that abstracts away details of the underlying
hardware or operating system and allows a program to execute in the same
way on any platform.
A process VM provides a high-level abstraction – that of a high-level programming language (compared to the low-level ISA abstraction of the system VM). Process VMs are implemented using an interpreter; performance comparable to compiled programming languages can be achieved by the use of just-in-time compilation.
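As a minimal sketch of the interpreter approach, the following toy stack machine executes a list of instructions. The instruction set (`PUSH`, `ADD`, `MUL`, `HALT`) and the function name `run` are invented for illustration and do not correspond to any real virtual machine's bytecode.

```python
def run(program):
    """Interpret a list of (opcode, operand) tuples on a tiny stack machine."""
    stack = []
    pc = 0  # program counter: index of the next instruction to execute
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif op == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack


# Hypothetical program computing (1 + 2) * 7.
program = [
    ("PUSH", 1),
    ("PUSH", 2),
    ("ADD", None),
    ("PUSH", 7),
    ("MUL", None),
    ("HALT", None),
]
result = run(program)
print(result[-1])  # prints 21
```

The same `program` list would run unchanged on any host with a Python interpreter, which is the platform independence a process VM is meant to provide. A just-in-time compiler would translate such instruction sequences to native code instead of dispatching on each opcode.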
This type of VM has become popular with the Java programming language, which is implemented using the Java virtual machine. Other examples include the Parrot virtual machine and the .NET Framework, which runs on a VM called the Common Language Runtime. All of them can serve as an abstraction layer for any computer language.
A special case of process VMs are systems that abstract over the communication mechanisms of a (potentially heterogeneous) computer cluster.
Such a VM does not consist of a single process, but one process per
physical machine in the cluster. They are designed to ease the task of
programming concurrent applications by letting the programmer focus on
algorithms rather than the communication mechanisms provided by the
interconnect and the OS. They do not hide the fact that communication
takes place, and as such do not attempt to present the cluster as a
single machine.
Unlike other process VMs, these systems do not provide a specific
programming language, but are embedded in an existing language;
typically such a system provides bindings for several languages (e.g., C and Fortran). Examples are Parallel Virtual Machine (PVM) and Message Passing Interface
(MPI). They are not strictly virtual machines because the applications
running on top still have access to all OS services and are therefore
not confined to the system model.
History
Both system virtual machines and process virtual machines date to the 1960s and continue to be areas of active development.
System virtual machines grew out of time-sharing, as notably implemented in the Compatible Time-Sharing System (CTSS). Time-sharing allowed multiple users to use a computer concurrently:
each program appeared to have full access to the machine, but only one
program was executed at a time, with the system switching between
programs in time slices, saving and restoring state each time. This
evolved into virtual machines, notably via IBM's research systems: the M44/44X, which used partial virtualization, and the CP-40 and SIMMON, which used full virtualization, and were early examples of hypervisors. The first widely available virtual machine architecture was the CP-67/CMS (see History of CP/CMS
for details). An important distinction was between using multiple
virtual machines on one host system for time-sharing, as in M44/44X and
CP-40, and using one virtual machine on a host system for prototyping,
as in SIMMON. Emulators, with hardware emulation of earlier systems for compatibility, date back to the IBM System/360 in 1963, while software emulation (then called "simulation") predates it.
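The time-slicing scheme described above, in which one program runs at a time and its state is saved and restored on each switch, can be illustrated with a small round-robin loop. This is only a sketch: the names `job` and `time_share` are hypothetical, and Python generators stand in for the saved program state that a real time-sharing system would keep in memory.

```python
from collections import deque


def job(name, steps):
    """A hypothetical user program; each yield marks the end of a time slice."""
    for i in range(steps):
        yield f"{name} step {i}"


def time_share(jobs):
    """Run jobs round-robin, one slice at a time, and return the execution log."""
    ready = deque(jobs)
    log = []
    while ready:
        current = ready.popleft()
        try:
            log.append(next(current))  # restore state and run one slice
        except StopIteration:
            continue                   # job finished; do not requeue it
        ready.append(current)          # state stays saved in the suspended generator
    return log


log = time_share([job("A", 2), job("B", 3)])
print(log)
# ['A step 0', 'B step 0', 'A step 1', 'B step 1', 'B step 2']
```

Each job behaves as if it ran without interruption, although the log shows that the two are interleaved.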
Process virtual machines arose originally as abstract platforms for an intermediate language used as the intermediate representation of a program by a compiler; early examples date to around 1966. One such example was the O-code machine, a virtual machine that executes O-code (object code) emitted by the front end of the BCPL compiler. This abstraction allowed the compiler to be easily ported to a new architecture by implementing a new back end that took the existing O-code and compiled it to machine code for the underlying physical machine. The Euler language used a similar design, with the intermediate language named P (portable).[8] This was popularized around 1970 by Pascal, notably in the Pascal-P system (1973) and Pascal-S compiler (1975), in which the intermediate language was termed p-code and the resulting machine a p-code machine.
This has been influential, and virtual machines in this sense have
often been called p-code machines. In addition to being an
intermediate language, Pascal p-code was also executed directly by an
interpreter implementing the virtual machine, notably in UCSD Pascal (1978); this influenced later interpreters, notably the Java virtual machine (JVM). Another early example was SNOBOL4
(1967), which was written in the SNOBOL Implementation Language (SIL),
an assembly language for a virtual machine, which was then targeted to
physical machines by transpiling to their native assembler via a macro assembler.
Macros have since fallen out of favor, however, so this approach has
been less influential. Process virtual machines were a popular approach
to implementing early microcomputer software, including Tiny BASIC and adventure games, from one-off implementations such as Pyramid 2000 to a general-purpose engine like Infocom's z-machine, which Graham Nelson argues is "possibly the most portable virtual machine ever created".
Significant advances occurred in the implementation of Smalltalk-80,
particularly the Deutsch/Schiffmann implementation
which pushed just-in-time (JIT) compilation forward as an implementation approach that uses a process virtual machine.
Later notable Smalltalk VMs were VisualWorks, the Squeak Virtual Machine,
and Strongtalk.
A related language that produced a lot of virtual machine innovation was the Self programming language, which pioneered adaptive optimization and generational garbage collection. These techniques proved commercially successful in 1999 in the HotSpot Java virtual machine.
Other innovations include the register-based virtual machine, which
better matches the underlying hardware, as opposed to the stack-based virtual
machine, which is a closer match for the programming language; this was
pioneered in 1995 by the Dis virtual machine for the Limbo
language. OpenJ9 is an alternative to the HotSpot JVM in OpenJDK; it is an
open-source Eclipse project that claims better startup and lower resource
consumption than HotSpot.
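To make the stack-based versus register-based distinction concrete, the sketch below evaluates `r = a * b + c` on two toy machines. Both instruction sets and the function names `run_stack` and `run_register` are invented for illustration; they are not the instruction sets of Dis, the JVM, or any other real virtual machine.

```python
def run_stack(code, slots):
    """Stack machine: operands are implicit, taken from an evaluation stack."""
    slots = dict(slots)  # copy so the caller's dict is not modified
    stack = []
    for op, *args in code:
        if op == "load":
            stack.append(slots[args[0]])
        elif op == "store":
            slots[args[0]] = stack.pop()
        elif op in ("add", "mul"):
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if op == "add" else left * right)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return slots


def run_register(code, slots):
    """Register machine: every instruction names its destination and sources."""
    regs = dict(slots)
    for op, dst, src1, src2 in code:
        if op == "add":
            regs[dst] = regs[src1] + regs[src2]
        elif op == "mul":
            regs[dst] = regs[src1] * regs[src2]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return regs


stack_code = [
    ("load", "a"), ("load", "b"), ("mul",),
    ("load", "c"), ("add",), ("store", "r"),
]
register_code = [
    ("mul", "t", "a", "b"),  # t = a * b
    ("add", "r", "t", "c"),  # r = t + c
]

inputs = {"a": 2, "b": 3, "c": 4}
print(run_stack(stack_code, inputs)["r"], len(stack_code))           # prints 10 6
print(run_register(register_code, inputs)["r"], len(register_code))  # prints 10 2
```

The stack form follows directly from the expression tree, which is why it is a close match for the source language. The register form uses fewer but wider instructions, closer to what typical hardware executes.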
Full virtualization
In full virtualization, the virtual machine simulates enough hardware
to allow an unmodified "guest" OS (one designed for the same instruction set) to be run in isolation. This approach was pioneered in 1966 with the IBM CP-40 and CP-67, predecessors of the VM family.
Examples outside the mainframe field include Parallels Workstation, Parallels Desktop for Mac, VirtualBox, Virtual Iron, Oracle VM, Virtual PC, Virtual Server, Hyper-V, VMware Workstation, VMware Server (discontinued, formerly called GSX Server), VMware ESXi, QEMU, Adeos, Mac-on-Linux, Win4BSD, Win4Lin Pro, and Egenera vBlade technology.
Hardware-assisted virtualization
In hardware-assisted virtualization, the hardware provides
architectural support that facilitates building a virtual machine
monitor and allows guest OSes to be run in isolation.
Hardware-assisted virtualization was first introduced on the IBM System/370 in 1972, for use with VM/370, the first virtual machine operating system offered by IBM as an official product.
In 2005 and 2006, Intel and AMD provided additional hardware to support virtualization. Sun Microsystems (now Oracle Corporation) added similar features in their UltraSPARC T-Series processors in 2005. Examples of virtualization platforms adapted to such hardware include KVM, VMware Workstation, VMware Fusion, Hyper-V, Windows Virtual PC, Xen, Parallels Desktop for Mac, Oracle VM Server for SPARC, VirtualBox and Parallels Workstation.
In 2006, first-generation 32- and 64-bit x86 hardware support was
found to rarely offer performance advantages over software
virtualization.
Operating-system-level virtualization
In operating-system-level virtualization, a physical server is
virtualized at the operating system level, enabling multiple isolated
and secure virtualized servers to run on a single physical server. The
"guest" operating system environments share the same running instance of
the operating system as the host system. Thus, the same operating system kernel
is also used to implement the "guest" environments, and applications
running in a given "guest" environment view it as a stand-alone system.
The pioneer implementation was FreeBSD jails; other examples include Docker, Solaris Containers, OpenVZ, Linux-VServer, LXC, AIX Workload Partitions, Parallels Virtuozzo Containers, and iCore Virtual Accounts.