Programming languages are described in terms of their syntax (form) and semantics (meaning), usually defined by a formal language. Languages usually provide features such as a type system, variables, and mechanisms for error handling. An implementation of a programming language is required in order to execute programs, namely an interpreter or a compiler. An interpreter directly executes the source code, while a compiler produces an executable program.
The term computer language is sometimes used interchangeably with programming language.[2] However, the usage of both terms varies among authors, including the exact scope of each. One usage describes programming languages as a subset of computer languages.[3] Similarly, languages used in computing that have a different goal than expressing computer programs are generically designated computer languages. For instance, markup languages are sometimes referred to as computer languages to emphasize that they are not meant to be used for programming.[4]One way of classifying computer languages is by the computations they are capable of expressing, as described by the theory of computation. The majority of practical programming languages are Turing complete,[5] and all Turing complete languages can implement the same set of algorithms. ANSI/ISO SQL-92 and Charity are examples of languages that are not Turing complete, yet are often called programming languages.[6][7] However, some authors restrict the term "programming language" to Turing complete languages.[1][8]
Another usage regards programming languages as theoretical constructs for programming abstract machines and computer languages as the subset thereof that runs on physical computers, which have finite hardware resources.[9] John C. Reynolds emphasizes that formal specification languages are just as much programming languages as are the languages intended for execution. He also argues that textual and even graphical input formats that affect the behavior of a computer are programming languages, despite the fact they are commonly not Turing-complete, and remarks that ignorance of programming language concepts is the reason for many flaws in input formats.[10]
In most practical contexts, a programming language involves a computer; consequently, programming languages are usually defined and studied this way.[11] Programming languages differ from natural languages in that natural languages are only used for interaction between people, while programming languages also allow humans to communicate instructions to machines.
The domain of the language is also worth consideration. Markup languages like XML, HTML, or troff, which define structured data, are not usually considered programming languages.[12][13][14] Programming languages may, however, share the syntax with markup languages if a computational semantics is defined. XSLT, for example, is a Turing complete language entirely using XML syntax.[15][16][17] Moreover, LaTeX, which is mostly used for structuring documents, also contains a Turing complete subset.[18][19]
Programming languages usually contain abstractions for defining and manipulating data structures or controlling the flow of execution. The practical necessity that a programming language supports adequate abstractions is expressed by the abstraction principle.[20] This principle is sometimes formulated as a recommendation to the programmer to make proper use of such abstractions.[21]
Lisp, implemented in 1958, was the first functional programming language.[32] Unlike Fortran, it supports recursion and conditional expressions,[33] and it also introduced dynamic memory management on a heap and automatic garbage collection.[34] For the next decades, Lisp dominated artificial intelligence applications.[35] In 1978, another functional language, ML, introduced inferred types and polymorphic parameters.[31][36]
Prolog, designed in 1972, was the first logic programming language, communicating with a computer using formal logic notation.[41][42] With logic programming, the programmer specifies a desired result and allows the interpreter to decide how to achieve it.[43][42]
During the 1980s, the invention of the personal computer transformed the roles for which programming languages were used.[44] New languages introduced in the 1980s included C++, a superset of C that can compile C programs but also supports classes and inheritance.[45] Ada and other new languages introduced support for concurrency.[46] The Japanese government invested heavily into the so-called fifth-generation languages that added support for concurrency to logic programming constructs, but these languages were outperformed by other concurrency-supporting languages.[47][48]
Some of the new programming languages are classified as visual programming languages like Scratch, LabVIEW and PWCT. Also, some of these languages mix between textual and visual programming usage like Ballerina.[60][61][62][63] Also, this trend lead to developing projects that help in developing new VPLs like Blockly by Google.[64] Many game engines like Unreal and Unity added support for visual scripting too.[65][66]
Every programming language includes fundamental elements for describing data and the operations or transformations applied to them, such as adding two numbers or selecting an item from a collection. These elements are governed by syntactic and semantic rules that define their structure and meaning, respectively.
A programming language's surface form is known as its syntax. Most programming languages are purely textual; they use sequences of text including words, numbers, and punctuation, much like written natural languages. On the other hand, some programming languages are graphical, using visual relationships between symbols to specify a program.
The syntax of a language describes the possible combinations of symbols that form a syntactically correct program. The meaning given to a combination of symbols is handled by semantics (either formal or hard-coded in a reference implementation). Since most languages are textual, this article discusses textual syntax.
Not all syntactically correct programs are semantically correct. Many syntactically correct programs are nonetheless ill-formed, per the language's rules; and may (depending on the language specification and the soundness of the implementation) result in an error on translation or execution. In some cases, such programs may exhibit undefined behavior. Even when a program is well-defined within a language, it may still have a meaning that is not intended by the person who wrote it.
The following C language fragment is syntactically correct, but performs operations that are not semantically defined (the operation *p >> 4 has no meaning for a value having a complex type and p->im is not defined because the value of p is the null pointer):
If the type declaration on the first line were omitted, the program would trigger an error on the undefined variable p during compilation. However, the program would still be syntactically correct since type declarations provide only semantic information.
The grammar needed to specify a programming language can be classified by its position in the Chomsky hierarchy. The syntax of most programming languages can be specified using a Type-2 grammar, i.e., they are context-free grammars.[67] Some languages, including Perl and Lisp, contain constructs that allow execution during the parsing phase. Languages that have constructs that allow the programmer to alter the behavior of the parser make syntax analysis an undecidable problem, and generally blur the distinction between parsing and execution.[68] In contrast to Lisp's macro system and Perl's BEGIN blocks, which may contain general computations, C macros are merely string replacements and do not require code execution.[69]
Static semantics defines restrictions on the structure of valid texts that are hard or impossible to express in standard syntactic formalisms.[1][failed verification] For compiled languages, static semantics essentially include those semantic rules that can be checked at compile time. Examples include checking that every identifier is declared before it is used (in languages that require such declarations) or that the labels on the arms of a case statement are distinct.[70] Many important restrictions of this type, like checking that identifiers are used in the appropriate context (e.g. not adding an integer to a function name), or that subroutine calls have the appropriate number and type of arguments, can be enforced by defining them as rules in a logic called a type system. Other forms of static analyses like data flow analysis may also be part of static semantics. Programming languages such as Java and C# have definite assignment analysis, a form of data flow analysis, as part of their respective static semantics.
Once data has been specified, the machine must be instructed to perform operations on the data. For example, the semantics may define the strategy by which expressions are evaluated to values, or the manner in which control structures conditionally execute statements. The dynamic semantics (also known as execution semantics) of a language defines how and when the various constructs of a language should produce a program behavior. There are many ways of defining execution semantics. Natural language is often used to specify the execution semantics of languages commonly used in practice. A significant amount of academic research goes into formal semantics of programming languages, which allows execution semantics to be specified in a formal manner. Results from this field of research have seen limited application to programming language design and implementation outside academia.
A type system defines how a programming language classifies values and expressions into types, how it can manipulate those types and how they interact. The goal of a type system is to verify and usually enforce a certain level of correctness in programs written in that language by detecting certain incorrect operations. Any decidable type system involves a trade-off: while it rejects many incorrect programs, it can also prohibit some correct, albeit unusual programs. In order to bypass this downside, a number of languages have type loopholes, usually unchecked casts that may be used by the programmer to explicitly allow a normally disallowed operation between different types. In most typed languages, the type system is used only to type check programs, but a number of languages, usually functional ones, infer types, relieving the programmer from the need to write type annotations. The formal design and study of type systems is known as type theory.
c80f0f1006