Executable files, assembly language, and programming languages

Executable files are files that the operating system (OS) can run ('execute') directly. On Windows their names usually end in .exe; on Unix-like systems they usually have no extension at all. The layout of an executable varies from one OS to another, but in every case it contains the actual list of instructions for your processor to execute, in the binary form the processor needs (machine code) rather than a text form that represents those instructions to humans.
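One way to see that executables are binary data with an OS-specific layout is to look at their first few bytes, which usually identify the format. The sketch below is only an illustration: the function names are made up for this example, the table covers just three common formats, and the "MZ" marker technically belongs to the old DOS header that Windows executables still begin with.

```python
# Leading "magic" bytes that identify some common executable formats.
MAGIC_NUMBERS = {
    b"\x7fELF": "ELF (Linux and many other Unix-like systems)",
    b"MZ": "PE (Windows .exe and .dll files)",
    b"\xcf\xfa\xed\xfe": "Mach-O 64-bit (macOS)",
}

def identify_executable(header: bytes) -> str:
    """Guess the executable format from the first bytes of a file."""
    for magic, name in MAGIC_NUMBERS.items():
        if header.startswith(magic):
            return name
    return "unknown"

def identify_file(path: str) -> str:
    """Read the first four bytes of a file and guess its format."""
    with open(path, "rb") as f:
        return identify_executable(f.read(4))
```

Calling identify_file on a compiled program should report its format, while a text file such as a source file should come back as "unknown".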

An assembly language is a textual way of representing those same instructions so that humans can write and read them. A program called an assembler converts the text (called source code) into native processor instructions. Assembly language is still very hard to work with, even for a very experienced programmer, which is why "higher-level" languages exist.
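To make the assembler's job concrete, here is a minimal sketch of one for an invented machine. The instruction names and opcode numbers are hypothetical and do not correspond to any real processor; the point is only that each line of text becomes a few bytes.

```python
# Hypothetical instruction set: each mnemonic maps to a one-byte opcode.
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "STORE": 0x03, "HALT": 0xFF}

def assemble(source: str) -> bytes:
    """Translate toy assembly text into bytes, one instruction per line."""
    output = bytearray()
    for line in source.splitlines():
        line = line.split(";")[0].strip()   # drop comments and whitespace
        if not line:
            continue                        # skip blank lines
        mnemonic, *operands = line.split()
        output.append(OPCODES[mnemonic.upper()])
        for operand in operands:
            output.append(int(operand, 0))  # each operand is a single byte
    return bytes(output)

program = """
    LOAD 10     ; read the value at address 10
    ADD 11      ; add the value at address 11
    STORE 12    ; write the result to address 12
    HALT
"""
machine_code = assemble(program)
print(machine_code.hex(" "))  # prints: 01 0a 02 0b 03 0c ff
```

A real assembler also handles labels, several operand formats, and the executable file layout, but the core idea is the same table lookup.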

The languages most people mean when they talk about programming languages, like C or Python, were designed specifically for humans to read and write easily, at the expense of being a direct description of machine instructions. Each instruction in assembly does something like "move this data from one nameless location in memory to another", whereas a line of code in a high-level programming language does something like "add this item to a list". The latter is built out of the former, but you don't have to deal with all the inscrutable details to get something done with it.
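The contrast can be sketched in a single language. In the example below the first part states the intent in one line, while the second part imitates the low-level view: a flat block of numbered cells where values are moved by hand. The memory layout (length in cell 0, items after it) and the function name are assumptions made up for this illustration, not how any particular language stores its lists.

```python
# High-level: one line states the intent.
shopping = ["eggs", "milk"]
shopping.append("bread")

# Low-level flavor: a flat block of numbered cells, updated by hand.
memory = [0] * 8        # pretend RAM with eight cells
memory[0] = 2           # cell 0 holds the number of items stored
memory[1] = 17          # cells 1 onward hold the items themselves
memory[2] = 42

def append_by_hand(memory: list, value: int) -> None:
    """Append a value using only loads and stores on numbered cells."""
    length = memory[0]            # load the current length
    memory[1 + length] = value    # store the value in the next free cell
    memory[0] = length + 1        # store the updated length

append_by_hand(memory, 99)
print(shopping)   # prints: ['eggs', 'milk', 'bread']
print(memory)     # prints: [3, 17, 42, 99, 0, 0, 0, 0]
```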

Higher-level languages require a program called a compiler or an interpreter to turn the source code into something the processor can run. A compiler does the same kind of job as an assembler, except that it has to do a lot more, because a statement in a language like C is not just a one-to-one translation of a machine instruction. The compiler has to analyse the source code and work out which instructions, in which order, will have the effect the code describes.
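As a rough illustration of why a compiler has more to do than an assembler, the sketch below compiles an arithmetic expression into instructions for an invented stack machine. The instruction names (PUSH, ADD, SUB, MUL) are hypothetical, and the parsing is borrowed from Python's own ast module instead of being written by hand. Note that the instruction order has to be worked out from the structure of the expression, not read off left to right.

```python
import ast

def compile_expression(source: str) -> list[str]:
    """Compile an arithmetic expression into toy stack-machine instructions."""
    ops = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}
    instructions = []

    def emit(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            instructions.append(f"PUSH {node.value}")
        elif isinstance(node, ast.BinOp) and type(node.op) in ops:
            emit(node.left)                       # code for the left operand
            emit(node.right)                      # code for the right operand
            instructions.append(ops[type(node.op)])  # then the operation
        else:
            raise SyntaxError("unsupported expression")

    emit(ast.parse(source, mode="eval").body)
    return instructions

print(compile_expression("2 + 3 * 4"))
# prints: ['PUSH 2', 'PUSH 3', 'PUSH 4', 'MUL', 'ADD']
```

The multiplication is emitted before the addition even though it appears later in the text, because the compiler respects operator precedence.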

Higher-level languages make software as we know it possible, because they're so much easier and more efficient for humans to read and write that you can quickly do things with them that would be nightmarishly complicated in an assembly language. A program that is 30 lines long in C might need 100 lines or more in an assembly language. And that's not even to mention that the assembly code is made of commands like MOVQ %rax, %rdi while the C code is made of lines like printf("%s\n", message). Even if you know what they mean, the former is vastly harder to read. And the same program in Python is probably only 10 lines long, if that.

Compilers versus interpreters

The difference between a compiler and an interpreter is that a compiler reads your source code and writes the resulting machine code to another file, which you run later. An interpreter reads your source code and carries it out on the fly, without producing a separate executable. Interpreters are a lot more convenient, but they mean that the program cannot be run on a machine that doesn't have the interpreter. Also, running code with an interpreter is usually a lot slower than running code generated by a compiler, because the interpreter does its translation while the program is running and doesn't have time to "think about" your code and carefully optimize it beforehand.
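The split between "translate now, run later" and "translate and run in one go" can be hinted at from within Python itself, using the built-in compile() and eval() functions. This is only an analogy, and the variable names are chosen for this example: in CPython, compile() produces bytecode for Python's own virtual machine, not native machine code, and nothing is written to a separate file.

```python
source = "2 + 3 * 4"

# Interpreter style: hand over the text and get it translated and run in one step.
result_interpreted = eval(source)

# Compiler style: translate once and keep the result...
code_object = compile(source, "<example>", "eval")

# ...then run the already-translated form later, as often as needed.
result_compiled = eval(code_object)

print(result_interpreted, result_compiled)  # prints: 14 14
```

Both routes give the same answer; what differs is when the translation work happens and whether its result is kept around for reuse.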

But what actually is a language?

Most programming languages are, technically speaking, just specifications of how a compiler or interpreter must treat code in order to be considered a compiler or interpreter of that language. For example, the Python 3 specification describes how Python works, and any interpreter that interprets code according to this specification can be considered a Python 3 interpreter, or an implementation of Python 3 (there are several, although CPython is the most common).
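Because the language and its implementations are separate things, a program can ask which implementation it happens to be running on. The short sketch below uses the standard platform and sys modules; the helper function name is made up for this example, and the output depends on the interpreter used to run it.

```python
import platform
import sys

def describe_implementation() -> str:
    """Report which implementation is running and which language version it provides."""
    name = platform.python_implementation()   # for example 'CPython' or 'PyPy'
    version = platform.python_version()       # language version it implements
    return f"{name} implementing Python {version}"

print(describe_implementation())
print(sys.implementation.name)   # lowercase identifier of the implementation
```

The same source file should behave the same way under any conforming implementation, even though this report would differ.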
