In computer science, a high-level programming language is a programming language with strong abstraction from the details of the computer. In contrast to low-level programming languages, it may use natural language elements, be easier to use, or may automate (or even hide entirely) significant areas of computing systems (e.g. memory management),
making the process of developing a program simpler and more
understandable than when using a lower-level language. The amount of
abstraction provided defines how "high-level" a programming language is.
In the 1960s, high-level programming languages using a compiler were commonly called autocodes. Examples of autocodes are COBOL and Fortran.
The first high-level programming language designed for computers was Plankalkül, created by Konrad Zuse. However, it was not implemented in his time, and his original contributions were largely isolated from other developments due to World War II, aside from the language's influence on the "Superplan" language by Heinz Rutishauser and, to some degree, on Algol. The first significantly widespread high-level language was Fortran, a machine-independent development of IBM's earlier Autocode systems. Algol, defined in 1958 and 1960 by committees of European and American computer scientists, introduced recursion as well as nested functions under lexical scope. It was also the first language with a clear distinction between value and name parameters and their corresponding semantics. Algol also introduced several structured programming concepts, such as the while-do and if-then-else constructs, and its syntax was the first to be described in a formal notation, Backus–Naur form (BNF). During roughly the same period, COBOL introduced records (also called structs) and Lisp introduced a fully general lambda abstraction in a programming language for the first time.
Features
"High-level language" refers to the higher level of abstraction from machine language. Rather than dealing with registers, memory addresses and call stacks, high-level languages deal with variables, arrays, objects, complex arithmetic or boolean expressions, subroutines and functions, loops, threads, locks, and other abstract computer science concepts, with a focus on usability over optimal program efficiency. Unlike low-level assembly languages, high-level languages have few, if any, language elements that translate directly into a machine's native opcodes.
Other features, such as string handling routines, object-oriented
language features, and file input/output, may also be present.
High-level programming languages also detach the programmer from the
machine. That is, unlike in low-level languages such as assembly or
machine language, a single high-level instruction can be amplified into
many machine operations and trigger a lot of data movement in the
background without the programmer's knowledge. The responsibility and
power of executing instructions have been handed over from the
programmer to the machine.
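As a rough illustration of this amplification, the following Python sketch counts how many element comparisons a single high-level `sorted` call performs behind the scenes. The `Counted` wrapper class is a hypothetical helper introduced only for this example; the exact number of comparisons depends on the sorting algorithm of the Python implementation, so it is printed rather than assumed.

```python
class Counted:
    """Hypothetical wrapper that counts how often the sort compares two values."""

    comparisons = 0  # shared counter, incremented on every "<" comparison

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Counted.comparisons += 1
        return self.value < other.value


items = [Counted(v) for v in (5, 3, 8, 1, 9, 2, 7)]

# One high-level statement; the comparisons and data movement happen out of sight.
ordered = sorted(items)

values = [c.value for c in ordered]
print(values)
print("comparisons performed:", Counted.comparisons)
```

The programmer writes one call and never sees the individual comparisons or element moves, which is the detachment from the machine described above.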
Abstraction penalty
High-level languages intend to provide features which standardize
common tasks, permit rich debugging, and maintain architectural
agnosticism, while low-level languages often produce more efficient
code through optimization for a specific system architecture.
Abstraction penalty is the cost that high-level programming techniques
pay for not taking advantage of certain low-level architectural
resources: they may be unable to optimize performance or to use certain
hardware. High-level programming exhibits features like more generic
data structures and operations, run-time interpretation, and
intermediate code files, which often result in execution of far more
operations than necessary, higher memory consumption, and larger binary
program size.
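The memory side of this penalty can be sketched in Python by comparing a generic list of integer objects with a specialised array of machine integers. This is a minimal sketch that assumes the CPython implementation (where `sys.getsizeof` is supported); the element count `n` and the variable names are arbitrary choices for the example, and the exact byte counts vary between versions and platforms.

```python
import sys
from array import array

n = 1000

# Generic structure: a list holds references to separately allocated int objects.
generic = list(range(n))

# Specialised structure: contiguous 64-bit signed integers, closer to the machine.
specialised = array("q", range(n))

# Count the list container plus every int object it refers to.
generic_bytes = sys.getsizeof(generic) + sum(sys.getsizeof(v) for v in generic)
specialised_bytes = sys.getsizeof(specialised)

print("generic list:     ", generic_bytes, "bytes")
print("specialised array:", specialised_bytes, "bytes")
```

The generic list can hold any object, and that flexibility is paid for with a pointer and a full object header per element.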
For this reason, code which needs to run particularly quickly and
efficiently may require the use of a lower-level language, even if a
higher-level language would make the coding easier. In many cases,
critical portions of a program written mostly in a high-level language
can be hand-coded in assembly language, leading to a much faster, more
efficient, or simply more reliably functioning optimised program.
However, with the growing complexity of modern microprocessor
architectures, well-designed compilers for high-level languages
frequently produce code comparable in efficiency to what most low-level
programmers can produce by hand, and the higher abstraction may allow
for more powerful techniques providing better overall results than their
low-level counterparts in particular settings.[9]
High-level languages are designed independently of any specific
computing system architecture. This makes it possible to execute a
program written in such a language on any computing system that
provides a compatible interpreter or JIT compiler for it. High-level
languages can be improved over time by their designers. In other cases,
new high-level languages evolve from one or more others with the goal
of aggregating the most popular constructs with new or improved
features. An example of this is Scala, which maintains interoperability
with Java: programs and libraries written in Java remain usable even if
a programming shop switches to Scala. This makes the transition easier
and extends the lifespan of such high-level code indefinitely. In
contrast, low-level programs rarely survive beyond the system
architecture for which they were written without major revision. This
is the engineering trade-off for the abstraction penalty.
Relative meaning
Examples of high-level programming languages in active use today include Python, Visual Basic, Delphi, Perl, PHP, ECMAScript, Ruby, C#, Java and many others.
The terms high-level and low-level are inherently relative. Some decades ago, the C language and similar languages were most often considered "high-level", as they supported concepts such as expression evaluation, parameterised recursive functions, and data types and structures, while assembly language was considered "low-level". Today, many programmers might refer to C as low-level, as it lacks a large runtime system
(no garbage collection, etc.), basically supports only scalar
operations, and provides direct memory addressing. It therefore
readily blends with assembly language and the machine level of CPUs and microcontrollers.
Assembly language may itself be regarded as a higher level (but often still one-to-one if used without macros) representation of machine code, as it supports concepts such as constants and (limited) expressions, sometimes even variables, procedures, and data structures. Machine code, in its turn, is inherently at a slightly higher level than the microcode or micro-operations used internally in many processors.
Execution modes
There are three general modes of execution for modern high-level languages:
- Interpreted: When code written in a language is interpreted, its syntax is read and then executed directly, with no compilation stage. A program called an interpreter reads each program statement, following the program flow, then decides what to do, and does it. A hybrid of an interpreter and a compiler will compile the statement into machine code and execute that; the machine code is then discarded, to be interpreted anew if the line is executed again. Interpreters are commonly the simplest implementations of the behavior of a language, compared to the other two variants listed here.
- Compiled: When code written in a language is compiled, its syntax is transformed into an executable form before running. There are two types of compilation:
  - Machine code generation: Some compilers compile source code directly into machine code. This is the original mode of compilation, and languages that are directly and completely transformed to machine-native code in this way may be called truly compiled languages. See assembly language.
  - Intermediate representations: When code written in a language is compiled to an intermediate representation, that representation can be optimized or saved for later execution without the need to re-read the source file. When the intermediate representation is saved, it may be in a form such as bytecode. The intermediate representation must then be interpreted or further compiled to execute it. Virtual machines that execute bytecode directly or transform it further into machine code have blurred the once clear distinction between intermediate representations and truly compiled languages.
- Source-to-source translated or transcompiled: Code written in a language may be translated into terms of a lower-level language for which native code compilers are already common. JavaScript and C are common targets for such translators. See CoffeeScript, Chicken Scheme, and Eiffel as examples. Specifically, the generated C and C++ code can be seen (as generated from the Eiffel language when using the EiffelStudio IDE) in the EIFGENs directory of any compiled Eiffel project. In Eiffel, the translation process is referred to as transcompiling, and the Eiffel compiler as a transcompiler or source-to-source compiler.
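The modes above can be contrasted on a toy expression language. The following Python sketch is illustrative only: the tuple-based syntax tree, the stack-machine instruction names (`PUSH`, `APPLY`), and all function names are hypothetical and not taken from any real language implementation. It evaluates the same expression by direct interpretation, by compilation to an intermediate representation that a small virtual machine then runs, and by source-to-source translation into Python text.

```python
import operator

# Toy language: a number, or a tuple (operator, left, right).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def interpret(node):
    """Interpreted: walk the syntax tree and evaluate it directly."""
    if isinstance(node, (int, float)):
        return node
    op, left, right = node
    return OPS[op](interpret(left), interpret(right))


def compile_to_bytecode(node, out=None):
    """Compiled to an intermediate representation: emit stack-machine instructions."""
    if out is None:
        out = []
    if isinstance(node, (int, float)):
        out.append(("PUSH", node))
    else:
        op, left, right = node
        compile_to_bytecode(left, out)
        compile_to_bytecode(right, out)
        out.append(("APPLY", op))
    return out


def run_bytecode(code):
    """A minimal virtual machine that executes the instructions above."""
    stack = []
    for instr, arg in code:
        if instr == "PUSH":
            stack.append(arg)
        else:  # APPLY: pop two operands, push the result
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[arg](left, right))
    return stack.pop()


def translate_to_python(node):
    """Source-to-source: emit Python source text for an existing implementation."""
    if isinstance(node, (int, float)):
        return repr(node)
    op, left, right = node
    return f"({translate_to_python(left)} {op} {translate_to_python(right)})"


expr = ("+", 2, ("*", 3, 4))

direct_result = interpret(expr)
bytecode = compile_to_bytecode(expr)
vm_result = run_bytecode(bytecode)
translated = translate_to_python(expr)
translated_result = eval(translated)  # reuse Python itself as the target language

print(direct_result, vm_result, translated_result)
print(bytecode)
print(translated)
```

All three routes produce the same value; they differ in when the source is analysed and in what form the program exists just before it runs.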
Note that languages are not strictly interpreted languages or compiled languages. Rather, implementations of language behavior use interpreting or compiling. For example, ALGOL 60 and Fortran
have both been interpreted (even though they were more typically
compiled). Similarly, Java shows the difficulty of trying to apply these
labels to languages, rather than to implementations; Java is compiled
to bytecode which is then executed by either interpreting (in a Java virtual machine (JVM)) or compiling (typically with a just-in-time compiler such as HotSpot,
again in a JVM). Moreover, the terms compiling, transcompiling, and
interpreting are not strictly limited to describing only the compiler
artifact (binary executable or IL assembly).
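Python offers a similar illustration of why these labels attach to implementations rather than languages. In the sketch below, which assumes the CPython implementation, the built-in `compile` function turns source text into a code object holding bytecode, and `exec` then hands that object to the virtual machine. The source string and the `<example>` file name are arbitrary choices for the example.

```python
source = "total = sum(n * n for n in range(4))"

# Compilation step: source text becomes a code object containing bytecode.
code = compile(source, "<example>", "exec")

# Execution step: the virtual machine runs the bytecode in a fresh namespace.
namespace = {}
exec(code, namespace)

print(type(code).__name__)          # the compiled artifact is a code object
print(len(code.co_code), "bytes of bytecode")
print(namespace["total"])
```

A language often described as "interpreted" is here compiled first and interpreted afterwards, within a single implementation.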
High-level language computer architecture
Alternatively, it is possible for a high-level language (HLL) to be
directly implemented by a computer: the computer directly executes the
HLL code. This is known as a high-level language computer architecture;
the computer architecture itself is designed to be targeted by a specific high-level language. The Burroughs large systems were target machines for ALGOL 60, for example.