The only language computer hardware can understand is binary code consisting of 1s and 0s. Learn how compilers and interpreters are used to translate a computer program into binary code in this video lesson.
A program is a set of instructions that tells a computer what to do in order to come up with a solution to a particular problem. Programs are written using a programming language. A programming language is a formal language designed to communicate instructions to a computer. There are two major types of programming languages: low-level languages and high-level languages.
Low-level languages are referred to as 'low' because they are very close to how different hardware elements of a computer actually communicate with each other. Low-level languages are machine oriented and require extensive knowledge of computer hardware and its configuration. There are two categories of low-level languages: machine language and assembly language.
Machine language, or machine code, is the only language that is directly understood by the computer, and it does not need to be translated. All instructions use binary notation and are written as a string of 1s and 0s. A single program instruction in machine language is nothing more than such a string of binary digits.
Technically speaking, this is the only language computer hardware understands. However, binary notation is very difficult for humans to understand. This is where assembly languages come in.
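To get a feel for why binary notation is hard to read, here is a small Python sketch that prints two everyday values as strings of 1s and 0s. The helper name to_bits is made up for this illustration, and the bit patterns shown are plain data, not real instructions for any particular processor.

```python
# Illustration only: how ordinary values look in binary notation.
# These bit patterns are data, not instructions for any real processor.

def to_bits(value, width=8):
    """Return an integer as a fixed-width string of 1s and 0s."""
    return format(value, f"0{width}b")

print(to_bits(100))        # the number 100 -> 01100100
print(to_bits(ord("A")))   # the character code of the letter A -> 01000001
```

Even for two simple values, the result is difficult to tell apart at a glance, which is the problem assembly languages were designed to ease.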
An assembly language is the first step to improve programming structure and make machine language more readable by humans. An assembly language consists of a set of symbols and letters. A translator is required to translate the assembly language to machine language. This translator program is called the 'assembler.' Assembly language can be called a second-generation language since it no longer uses 1s and 0s to write instructions, but short terms like MOVE, ADD, SUB and END.
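The job of an assembler can be sketched in a few lines of Python. Everything about the machine below is an assumption made for illustration: the 4-bit opcodes, the 8-bit instruction width and the function name assemble are all invented, and a real assembler targets the actual instruction set of a specific processor.

```python
# A toy assembler for a made-up 8-bit machine (hypothetical instruction set).
# Each instruction is 4 bits of opcode followed by 4 bits of operand.

OPCODES = {        # mnemonic -> 4-bit opcode (invented for this sketch)
    "MOVE": 0b0001,
    "ADD":  0b0010,
    "SUB":  0b0011,
    "END":  0b1111,
}

def assemble(source):
    """Translate assembly text into a list of 8-bit binary strings."""
    machine_code = []
    for line in source.strip().splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        mnemonic = parts[0].upper()
        operand = int(parts[1]) if len(parts) > 1 else 0
        if mnemonic not in OPCODES:
            raise ValueError(f"unknown mnemonic: {mnemonic}")
        if not 0 <= operand <= 15:
            raise ValueError(f"operand out of range: {operand}")
        # Pack opcode into the high 4 bits and operand into the low 4 bits.
        word = (OPCODES[mnemonic] << 4) | operand
        machine_code.append(format(word, "08b"))
    return machine_code

program = """
MOVE 5
ADD 3
SUB 2
END
"""

for word in assemble(program):
    print(word)
```

Each readable line on the left side of the translation becomes one string of 1s and 0s, which is the only form the hardware would accept.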
Many of the earliest computer programs were written in assembly languages. Most programmers today don't use assembly languages very often, but they are still used in places such as the operating systems of electronic devices and in technical applications that need very precise timing or careful optimization of computer resources. While easier than machine code, assembly languages are still pretty difficult to understand. This is why high-level languages have been developed.
A high-level language is a programming language that uses English and mathematical symbols, like +, -, % and many others, in its instructions. When using the term 'programming languages,' most people are actually referring to high-level languages. High-level languages are the languages most often used by programmers to write programs. Examples of high-level languages are C++, Fortran, Java and Python.
To get a flavor of what a high-level language actually looks like, consider an ATM where someone wants to make a withdrawal of $100. This amount needs to be compared to the account balance to make sure there are enough funds. The instruction in a high-level computer language would look something like this:

    balance = 250  # assumed example account balance
    x = 100        # requested withdrawal
    if balance < x:
        print('Insufficient balance')
    else:
        print('Please take your money')
This is not exactly how real people communicate, but it is much easier to follow than a series of 1s and 0s in binary code.
There are a number of advantages to high-level languages. The first advantage is that high-level languages are much closer to the logic of a human language. A high-level language uses a set of rules that dictate how words and symbols can be put together to form a program. Learning a high-level language is not unlike learning another human language - you need to learn vocabulary and grammar so you can make sentences. To learn a programming language, you need to learn commands, syntax and logic, which correspond closely to vocabulary and grammar.
The second advantage is that the code of most high-level languages is portable and the same code can run on different hardware. Both machine code and assembly languages are hardware specific and not portable. This means that the machine code used to run a program on one specific computer needs to be modified to run on another computer. Portable code in a high-level language can run on multiple computer systems without modification. However, modifications to code in high-level languages may be necessary because of the operating system. For example, programs written for Windows typically don't run on a Mac.
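As a rough sketch of what portability looks like in practice, the Python code below runs unchanged on different systems while still adapting to the operating system underneath. The function names and the atm_demo folder are hypothetical, chosen only for this example; sys.platform, os.path.join and os.path.expanduser are part of Python's standard library.

```python
import os
import sys

def describe_platform():
    """Return a short label for the operating system the program runs on."""
    if sys.platform.startswith("win"):
        return "Windows"
    if sys.platform == "darwin":
        return "macOS"
    return "Linux or another system"

def settings_path(app_name):
    """Build a file path using the separator the current system expects."""
    return os.path.join(os.path.expanduser("~"), app_name, "settings.txt")

print(describe_platform())
print(settings_path("atm_demo"))
```

The source code is identical everywhere; only the printed label and the path separator change with the system it runs on.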
A high-level language cannot be understood directly by a computer, and it needs to be translated into machine code. There are two ways to do this, and they are related to how the program is executed: a high-level language can be compiled or interpreted.
A compiler is a computer program that translates a program written in a high-level language to the machine language of a computer. The high-level program is referred to as 'the source code.' A typical computer program processes some type of input data to produce output data. The compiler is used to translate source code into machine code or compiled code. This does not yet use any of the input data. When the compiled code is executed, referred to as 'running the program,' the program processes the input data to produce the desired output.
When using a compiler, the entire source code needs to be compiled before the program can be executed. The resulting machine code is typically stored in a compiled file, such as a file with an .exe extension on Windows. Once you have a compiled file, you can run the program over and over again without having to compile it again. If you have multiple inputs that require processing, you run the compiled code as many times as needed.
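The compile-once, run-many-times idea can be imitated with Python's built-in compile() and exec() functions. This is only an analogy: compile() produces Python bytecode rather than native machine code, and the variable names and the helper function run are assumptions made for this sketch.

```python
# Compile once, run many times, using Python's built-in compile().
# Note: compile() produces Python bytecode, not native machine code.

source_code = "result = balance - withdrawal"

# Step 1: translate the whole source up front. No input data is used yet.
compiled = compile(source_code, "<example>", "exec")

# Step 2: run the same compiled code against different input data.
def run(balance, withdrawal):
    variables = {"balance": balance, "withdrawal": withdrawal}
    exec(compiled, variables)   # executes the already-translated code
    return variables["result"]

print(run(500, 100))   # 400
print(run(80, 20))     # 60
```

The translation step happens exactly once, no matter how many inputs are processed afterward.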
An interpreter is a computer program that simulates a computer that understands a high-level language. This means that the interpreter translates the source code line by line during execution. Consider again a computer program that processes some type of input data to produce output data. The interpreter executes the code line by line, which results in the desired output data. The only result is the output data - there is no compiled code. When using an interpreter, every time you want to run the program, you need to interpret the code again line by line. There is no compiled code to use if you have multiple inputs that require processing.
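A very small interpreter can be sketched in Python to show the line-by-line approach. The four-command language used here (SET, ADD, SUB, PRINT) is invented for this illustration and is far simpler than any real interpreted language.

```python
# A toy line-by-line interpreter for a made-up language (hypothetical syntax).
# Supported statements: SET name number, ADD name number,
#                       SUB name number, PRINT name

def interpret(source):
    """Execute the program one line at a time and return the printed values."""
    variables = {}
    output = []
    for line in source.strip().splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        command = parts[0]
        if command == "SET":
            variables[parts[1]] = int(parts[2])
        elif command == "ADD":
            variables[parts[1]] += int(parts[2])
        elif command == "SUB":
            variables[parts[1]] -= int(parts[2])
        elif command == "PRINT":
            output.append(variables[parts[1]])
        else:
            raise ValueError(f"unknown command: {command}")
    return output

program = """
SET balance 500
SUB balance 100
PRINT balance
"""

print(interpret(program))   # [400]
```

Nothing is saved between runs: calling interpret again reads and translates every line of the program from scratch.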
To understand the difference between compiling and interpreting, let's examine the equivalent in human languages. For example, consider a movie made in Asia where all the characters speak Vietnamese. To market the movie to an international audience, the spoken text needs to be translated into English. A translator would sit down and carefully translate all the text and create subtitles for the movie. Anytime somebody wants to watch the movie, they can turn on the subtitles. This type of translation is the equivalent of compiling - everything is translated once and can be used many times afterward.
Now consider a delegate from Vietnam giving a speech in the United Nations in Vietnamese. In order for the attendees to understand the speech, there are a number of translators who provide a translation that is transmitted to the attendees' headphones. This translation occurs in close to real time. Every time the delegate speaks in Vietnamese, the translators get to work. This type of translation is the equivalent of interpreting - text is translated line by line as necessary, and the results are not used again.
Compiled vs. Interpreted
Compiled code tends to be faster since the translation is completed in one step prior to the actual execution. Interpreted code, on the other hand, is more flexible and can be run interactively. For example, using interpreted code, you can quickly try out a few lines of code to see if they work without having to go through the steps of compiling and executing the program.
One advantage of using compiled code is that it does not reveal the original source code. This makes it possible to distribute a program without revealing its inner workings. When you install a software application on your computer, you are typically installing a compiled version of the code.
You can run the software application, but you can't open up the source code in the original programming language. For many companies selling software applications, the original source code is a well-kept secret and gives them their competitive advantage over other companies. Examples of languages that are usually compiled include C, C++, COBOL and Fortran. Java and C# are also compiled, although typically to an intermediate bytecode that a virtual machine runs rather than directly to machine code. Examples of languages that are usually interpreted are Perl, Python and Ruby.
There are two major types of programming languages: low-level languages and high-level languages. Low-level languages are machine oriented and require extensive knowledge of computer hardware and its configuration. There are two categories of low-level languages: machine language and assembly language.
Machine language, or machine code, consists of binary code and is the only language that is directly understood by the computer. An assembly language consists of a set of symbols and letters and requires translation to machine language. Both machine code and assembly languages are hardware specific.
A high-level language is a programming language that uses English and mathematical symbols in its instructions. To execute a program in a high-level language, it can be compiled or interpreted. A compiler translates the entire program written in a high-level language to machine language prior to execution. An interpreter translates a program line by line during execution.
After you've reviewed this video lesson, you should be able to:
- Characterize the two types of programming languages
- Evaluate the two categories of low-level languages
- Differentiate between a compiler and an interpreter
- List examples of interpreted and compiled languages