What is an interpreted language? Is Java interpreted?

In this question I stated that Java is an interpreted language, since that is how I had always understood it. But I was corrected in this comment, which says that Java is no longer interpreted.

So I have some questions:

  • What is an interpreted language?
  • So Java used to be interpreted but no longer is? Is that it?
Author: Comunidade, 2014-06-27

3 answers

An interpreted language is one that needs a special program, called an interpreter, for its programs to run. Contrast this with a compiled language, whose programs go through a translation process that converts them from a [semi-]human language to machine language.

When writing something like:

x = y + z;

you are telling the computer that you want to assign to the variable x the sum of the variables y and z. However, although this is the intention, at first this is nothing more than a text file. That file has to serve as input for another program, which will do something with it so that, at some point, the intention you expressed in the code is carried out.

  • The simplest way is interpretation: the program parses the instruction and then does what it says. Simple and straightforward!
  • A more complex way is compilation: the program parses the instruction, translates it into machine language, and produces another program as output, whose behavior should be to do what you expressed in the code.

Between the two, there are several middle grounds:
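
To make the first option concrete, here is a minimal sketch of direct interpretation in Python. The function name interpret and the fixed statement shape `target = a + b;` are assumptions made only for this illustration; a real interpreter handles a full grammar.

```python
# Minimal sketch of direct interpretation: parse one statement, then do what it says.
# The statement shape `target = a + b;` and the name `interpret` are assumptions
# made for this illustration only.

def interpret(statement, variables):
    """Parse `target = a + b;` and execute it immediately against `variables`."""
    text = statement.strip().rstrip(";")
    target, expression = (part.strip() for part in text.split("=", 1))
    left, right = (part.strip() for part in expression.split("+", 1))
    # No translated program is produced: the effect happens right here.
    variables[target] = variables[left] + variables[right]
    return variables


env = {"y": 2, "z": 3}
interpret("x = y + z;", env)
print(env["x"])  # prints 5
```

Note that every call to interpret parses the text again, which is exactly the cost the other strategies try to avoid.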

  • One can parse the statement, convert it to machine code, and execute that machine code immediately, saving it in a kind of "cache" so that, if the same statement has to be executed again, the previously generated machine code is reused and the statement does not have to be analyzed again. This process is called Just-In-Time compilation (JIT). It is the strategy used, for example, in the JavaScript engine V8 (used in Chrome and Node.js).

  • One can analyze the statement and convert it not to machine code directly, but to another (usually binary) format that is simpler to interpret and/or compile. This is useful when you want to analyze the sources only once, but without "tying" the output to any specific platform. This strategy is used by Java, which through the javac tool converts programs from textual format to the bytecode format.

    In this case, it is said that the generated code will be executed by a "virtual machine": something similar to a real machine, with its own architecture, its own instruction set, everything that in principle would describe a machine. Only this machine is not physical: it is just an interpreter/compiler of the intermediate code for a specific architecture.

    • And to answer your question: formerly these bytecodes were used as input to an interpreter; today they are used as input to a JIT compiler, as described in the previous item.
  • One can parse the statement and convert it to an equivalent statement in another language, then send it to be compiled/interpreted by the tools of that second language. This is widely used when you want to program in a restricted environment (e.g. the browser, which only supports JavaScript) using a language other than the one supported in that environment.
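
The intermediate-format strategy from the list above can be sketched roughly as follows. This is only an illustration: the instruction names LOAD, ADD and STORE and the helper names compile_statement and run are made up here and are not real JVM opcodes or APIs.

```python
# Sketch: translate `x = y + z;` once into a made-up bytecode, then run it on a stack VM.
# The instruction names LOAD, ADD and STORE are hypothetical, not real JVM opcodes.

def compile_statement(statement):
    """Translate `target = a + b;` into a list of (opcode, operand) pairs."""
    text = statement.strip().rstrip(";")
    target, expression = (part.strip() for part in text.split("=", 1))
    left, right = (part.strip() for part in expression.split("+", 1))
    return [("LOAD", left), ("LOAD", right), ("ADD", None), ("STORE", target)]


def run(bytecode, variables):
    """The 'virtual machine': executes the instructions using an operand stack."""
    stack = []
    for opcode, operand in bytecode:
        if opcode == "LOAD":
            stack.append(variables[operand])
        elif opcode == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "STORE":
            variables[operand] = stack.pop()
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return variables


program = compile_statement("x = y + z;")    # the source text is analyzed once
print(run(program, {"y": 2, "z": 3})["x"])   # prints 5
print(run(program, {"y": 10, "z": 5})["x"])  # prints 15, same bytecode reused
```

The point of the split is that run never looks at the source text: any machine with an implementation of run can execute the same program list.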

Finally, it should be remembered that the line separating compiled from interpreted is not so well defined: even what we call "machine code" often still needs to be converted into what we call microinstructions, the ones that are actually sent to the CPU. Not every architecture has this distinction (in some the "machine code" is executed directly), but the most widely used ones do. In the end, the result of compilation does not target one specific architecture, but rather a set of similar architectures (e.g. x86 and x86-64, which cover a huge set of machines from the 80s to today).

 22
Author: mgibsonbr, 2014-06-27 12:20:17

Interpretation

An interpreted language executes the code directly from the source code.

Interpretation occurs in a similar way to compilation (translation): it also goes through lexical, syntactic and semantic analysis, but this is done on demand. The source code is read (line by line or otherwise), interpreted through these processes, and then something is executed according to what is written.

Java

Java worked like this in early versions.

There is still some confusion about this because there is still a process of "interpreting" the code generated by the compiler. But usually it is not considered interpreted code, since even this "interpretation" does not occur instruction by instruction.

To understand this better, note that Java code goes through the same analysis processes mentioned above; what changes is the way the code is traversed and what is done at the end, which is what differentiates interpretation from compilation:

  • Interpretation takes place on short snippets of the program, possibly line by line, and at the end executes whatever that snippet determines.

  • Compilation takes place on larger units (functions, classes, packages), trying to understand the whole, and at the end code is generated. There is a translation into another form.

    In this case it is Java virtual machine (JVM) code (I like to link the page in Portuguese, but do not forget to look at the English one, it is always better). It is like the machine code that a computer understands, but it is specific to the Java platform and not to a processor. So the program is compiled but cannot run directly on the processor, unlike languages such as C or Pascal, which usually generate code that the processor understands directly.

JITter

So this virtual machine code, called bytecode, is compiled as well, but that is an extremely simple process: the bytecode is in a format that is easy to read and understand by this new compiler, which is completely different from the language's source code compiler. It also does not have to worry about whether the code is correct, since that has been checked before. And above all, this compilation does not occur instruction by instruction.

This is done by a JIT (Just-In-Time) compiler, a compiler that generates the processor's machine code, the so-called native code. In the case of Java, this JITter transforms bytecode into native code, making some optimizations that are only possible when you know well the environment in which the program is running: not only the computer, operating system and settings, but also the other components (packages) being used together.

JIT compilation understands all of the intermediate code and generates native code on demand, as it is needed. But there are ways to force this compilation to happen a little earlier.

This JITter did not exist in early versions of Java. A JITter usually does not influence the semantics of the language, so any language previously interpreted or compiled to bytecode can be JITted later. In fact this is increasingly common. Examples include JavaScript, Lua, PHP, etc., which came to be JITted later in independent implementations.

The JITter usually only has to understand this standard bytecode and the code of the processor it will run on; it need not know anything about the language. But there are JITters that work on top of the source code, so in a way there is on-demand compilation (at the moment the code is about to run) as opposed to the better-known ahead-of-time compilation. But even this on-demand compilation is not interpretation, because it generates code to be run rather than running it directly.
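
As a rough sketch of the "generate code on demand, then reuse it" idea, the Python below translates a tiny made-up bytecode into a function the first time it is needed and caches the result. Everything here is an assumption for illustration: the LOAD/ADD/STORE format is invented, and a Python function built with compile() and exec() merely stands in for native machine code, which a real JITter would emit.

```python
# Sketch of the "generate once, reuse afterwards" idea behind a JIT.
# Assumptions: the bytecode format (LOAD/ADD/STORE) is invented, and a Python
# function built with compile()/exec() stands in for native machine code.

code_cache = {}   # bytecode (as a tuple) -> generated function
translations = 0  # counts how many times code generation actually ran


def jit(bytecode):
    """Return a callable for `bytecode`, generating it only on first use."""
    global translations
    key = tuple(bytecode)
    if key not in code_cache:
        translations += 1
        stack, lines = [], []
        for opcode, operand in bytecode:
            if opcode == "LOAD":
                stack.append(f"v[{operand!r}]")
            elif opcode == "ADD":
                right, left = stack.pop(), stack.pop()
                stack.append(f"({left} + {right})")
            elif opcode == "STORE":
                lines.append(f"    v[{operand!r}] = {stack.pop()}")
            else:
                raise ValueError(f"unknown opcode: {opcode}")
        source = "def generated(v):\n" + "\n".join(lines) + "\n    return v\n"
        namespace = {}
        exec(compile(source, "<jit>", "exec"), namespace)
        code_cache[key] = namespace["generated"]
    return code_cache[key]


program = [("LOAD", "y"), ("LOAD", "z"), ("ADD", None), ("STORE", "x")]
for y in range(3):
    print(jit(program)({"y": y, "z": 1})["x"])  # prints 1, then 2, then 3
print("translations:", translations)            # prints translations: 1
```

The loop runs the program three times, but the translation step runs only once; the later calls go straight to the cached function.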

Languages compiled without machine code

There are languages that strictly cannot be considered interpreted. They run on top of bytecode (sometimes called pseudocode) but are not JITted. Execution is faster than pure interpretation but not as fast as JITted execution, because in a way there is an "interpretation" of this bytecode, which is executed directly, without transformation into native code. Lua (pure, without LuaJIT) is an example today.

This is not new. One of the first mainstream languages that was very successful in various parts of the world, including Brazil, was Clipper (a dialect that survives in modern form is Harbour). It worked this way, but since it generated an executable, many programmers believed that it generated code just like C does. But it was just pcode packaged in an .exe. It is similar to what .NET does today: your programs appear to be native executables, but internally they contain bytecode.

But this technique has existed since the 50s.

There are languages that do not generate bytecode but an AST (Abstract Syntax Tree). It is a step before code generation. A compiler typically (in virtually all known implementations) generates an AST after lexical analysis and parsing, and the subsequent processes work on top of this tree. Standard Ruby uses (or used, I may be outdated) this AST to run. Interpretation still occurs on the AST, but it is not the normal process of interpretation. In any case there was a previous compilation step.
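
To give an idea of what running on top of an AST looks like, here is a small sketch that uses Python's standard ast module as a ready-made parser. The helpers evaluate and execute are hypothetical and handle only names, constants, `+` and simple assignment; this is not how Ruby itself is implemented.

```python
import ast

# Sketch: run `x = y + z` by walking its syntax tree instead of generating bytecode.
# Python's own `ast` module is used only as a convenient parser; `evaluate` and
# `execute` are hypothetical helpers that handle just names, constants, `+`
# and single-target assignment.

def evaluate(node, variables):
    """Compute the value of an expression node by recursing over the tree."""
    if isinstance(node, ast.Name):
        return variables[node.id]
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
        return evaluate(node.left, variables) + evaluate(node.right, variables)
    raise NotImplementedError(type(node).__name__)


def execute(source, variables):
    """Parse the text into a tree first, then execute each assignment in it."""
    tree = ast.parse(source)  # the earlier analysis step: text -> tree
    for statement in tree.body:
        if not isinstance(statement, ast.Assign):
            raise NotImplementedError(type(statement).__name__)
        target = statement.targets[0]
        if not isinstance(target, ast.Name):
            raise NotImplementedError(type(target).__name__)
        variables[target.id] = evaluate(statement.value, variables)
    return variables


print(execute("x = y + z", {"y": 2, "z": 3})["x"])  # prints 5
```

Syntax errors are caught by ast.parse before anything runs, which is the sense in which part of the work has already been done before the "interpretation" starts.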

Of course there are Ruby implementations that work differently, including because they run on top of the Java platform; that is, in the end JRuby generates the same bytecode that Java does, which is then JITted by the JVM. This shows the flexibility of this JIT infrastructure.

Some people consider these languages still interpreted (or semi-interpreted), since they do not execute native machine code; there is a lighter interpretation because part of the necessary processing was done beforehand by a compiler, and something simple to manipulate was generated with the "guarantee" that it has no errors. But you still need a program that understands this code and has something run indirectly. That would be interpretation.

From what I remember, Java was like this from the beginning; I think it never interpreted source code directly. That is, it always had javac, and the JVM interpreted the bytecode.

So it is quite complicated to classify languages, or even implementations, as interpreted or compiled.

Languages are not interpreted

We cannot say that there are interpreted, compiled or even JITted languages. At most we can say that implementations have these characteristics. And they are not mutually exclusive. Although some people will say these are different implementations shipped together, it is possible for all three forms to exist in the same implementation.

Conclusion

Obviously the execution of an interpreted program is much slower than that of a compiled program whose machine code was generated in advance. JITted code has a cost to generate the machine code, but that cost is much lower than that of direct interpretation. In addition, this is done once and the machine code is then reused.

Pure interpretation today only makes sense at development time or to run very short scripts. Therefore any language used to build systems should have some form of compilation, even if optional.

 23
Author: Maniero, 2020-09-07 11:38:53

What is an interpreted language?

It is a programming language in which the high-level code written by the programmer is interpreted by another computer program and then executed by the operating system; i.e. the written code is not transformed into machine code, but rather interpreted by another program.

Is Java interpreted or compiled?

First let's understand some terms:

javac - compiler that transforms code written in Java into bytecode.

Bytecode - code in bytes, different from machine code because it is not immediately executable.

JIT - Just-In-Time compiler; compiles bytecode to machine code at runtime, performing performance optimizations.

JVM - virtual platform that loads the class file into RAM, verifies the bytecode by checking for access restriction violations in your code, and converts it to executable machine code.

Therefore the high-level code written in Java by the programmer is compiled by javac, which transforms it into bytecode. The bytecode is compiled by the JVM, through the JIT compiler, into a sequence of machine code instructions at runtime, before being run natively. Its main goal is to perform heavy performance optimizations. Given this, we can say that Java is not interpreted but compiled, since it is not directly executed by another program from the high-level code written by the programmer.

 9
Author: abfurlan, 2014-06-27 12:59:13