> Classically, dynamic languages have been interpreted, that is, evaluated at run-time by an interpreter program. In recent years, dynamic languages have also been compiled, that is, translated to another language such as native code before being executed.
Co-author of the thesis here - we wrote it almost 6 years ago, so my memory may be a little rusty :-)
The question we dealt with was how to compile a weakly, implicitly and dynamically typed language (see the definitions in the thesis, but basically a language where the variable types cannot be statically determined in the general case and will be coerced if the run-time type does not match operator requirements) in a manner that is more efficient than simply interpreting the program source code.
I do not recall seeing any related work with regard to BASIC and Lisp at the time; however, we may very well have overlooked something. Thank you for the references!
Lisp is strongly, implicitly and dynamically typed, with type coercions. For example, it provides generic arithmetic:
* (let ((a 1)             ; integer
        (b 2.0)           ; default float
        (c 1/3)           ; ratio
        (d #c(1.2 2.3)))  ; complex number
    (+ a b c d))
#C(4.5333333 2.3)
Variables A, B, C and D have no type declarations. The values usually carry the type information. The + operation will take any numeric values and create a result value of a type it chooses. For example, (+ 1/2 1/2) will result in 1: adding ratios might create a ratio or an integer.
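To make the "ratio or integer" point concrete, here is a minimal sketch in JavaScript (not how any Lisp implementation actually does it). All names (gcd, makeRational, addRational) are made up for illustration, and the sketch assumes integers are plain JS numbers and ratios are {num, den} objects kept in lowest terms:

```javascript
// Hypothetical representation (an assumption for this sketch):
// integers are plain JS numbers, ratios are {num, den} objects
// in lowest terms with den > 1.

// Greatest common divisor of two integers (Euclid's algorithm).
function gcd(a, b) {
  a = Math.abs(a);
  b = Math.abs(b);
  while (b !== 0) {
    [a, b] = [b, a % b];
  }
  return a;
}

// Normalize num/den; collapse to a plain integer when den becomes 1.
function makeRational(num, den) {
  if (den === 0) throw new RangeError("division by zero");
  if (den < 0) {
    num = -num;
    den = -den;
  }
  const g = gcd(num, den);
  num /= g;
  den /= g;
  return den === 1 ? num : { num, den };
}

// Generic addition over integers and ratios: the result type
// depends on the run-time values, not on any declaration.
function addRational(x, y) {
  const a = typeof x === "number" ? { num: x, den: 1 } : x;
  const b = typeof y === "number" ? { num: y, den: 1 } : y;
  return makeRational(a.num * b.den + b.num * a.den, a.den * b.den);
}
```

With this, adding the ratio 1/2 to itself yields the plain integer 1, while adding 1/3 and 1/6 yields the ratio 1/2, so the result type is only known once the values are.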
There is a whole bunch of literature on compiling Scheme:
Thanks for the link - a lot of interesting literature to dig into!
Generally with unknown (at compile-time) variable types, you need to box the variables (carry type information in addition to the value).
The operators may then either work on the boxed variables and choose behaviour based on the type information or the operators may be specialized in many versions to work on the unboxed variables (this requires the run-time to dispatch to the correct specialized version, if the types cannot be determined statically).
This is generally a trade-off between space and execution time - if the number of possible types is low (either because of a limited type system or because the possible types can be determined statically), then it may make sense to specialize.
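A rough sketch of the two strategies, in JavaScript for readability. This is an illustration only, not the representation used in the thesis; the tags, box(), addGeneric(), addDispatch() and ADD_TABLE are all hypothetical names:

```javascript
// Hypothetical type tags carried alongside each value.
const INT = "int";
const FLOAT = "float";
const STR = "string";

// "Boxing": pair the raw value with its run-time type tag.
function box(tag, value) {
  return { tag, value };
}

// Strategy 1: one generic operator that inspects the tags on every call.
function addGeneric(a, b) {
  if (a.tag === STR || b.tag === STR) {
    return box(STR, String(a.value) + String(b.value));
  }
  if (a.tag === INT && b.tag === INT) {
    return box(INT, a.value + b.value);
  }
  return box(FLOAT, a.value + b.value);
}

// Strategy 2: specialized operators that work on unboxed values,
// plus a run-time dispatch table with one entry per type combination.
const addIntInt = (x, y) => x + y;
const addFloatFloat = (x, y) => x + y;
const addAsString = (x, y) => String(x) + String(y);

const ADD_TABLE = {};
for (const left of [INT, FLOAT, STR]) {
  for (const right of [INT, FLOAT, STR]) {
    let entry;
    if (left === STR || right === STR) {
      entry = { tag: STR, fn: addAsString };
    } else if (left === INT && right === INT) {
      entry = { tag: INT, fn: addIntInt };
    } else {
      entry = { tag: FLOAT, fn: addFloatFloat };
    }
    ADD_TABLE[left + "," + right] = entry;
  }
}

// Only needed when the types are not known statically; otherwise a
// compiler could call the specialized function directly.
function addDispatch(a, b) {
  const entry = ADD_TABLE[a.tag + "," + b.tag];
  return box(entry.tag, entry.fn(a.value, b.value));
}
```

The table has one entry per pair of types, which is where the space cost comes from: three types already give nine entries for a single binary operator.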
In JS, in addition to mutable types and values for each variable, you also have a challenge with variable scope. It is possible to introduce new variables in the global scope from a local scope, so depending on run-time values, a variable for a given statement may or may not have been declared.
An example from the thesis:
function f(){
  a = 5;
}
function g(){
  console.log(a);
}
if(x){
  f();
}
g();
Assuming that 'a' was not also declared elsewhere, the call to 'g()' will either print the value of 'a' (i.e. "5") or result in a run-time error.
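One way a compiled program could model this, sketched in JavaScript itself (an assumption for illustration, not necessarily what the thesis compiler emits): keep the globals in a run-time table and check every read. The names globals, setGlobal, getGlobal and run are hypothetical:

```javascript
// Hypothetical run-time table of global variables.
const globals = new Map();

// An assignment to an undeclared name creates the global
// (this mirrors non-strict JavaScript semantics).
function setGlobal(name, value) {
  globals.set(name, value);
}

// A read must check at run time whether the variable exists yet.
function getGlobal(name) {
  if (!globals.has(name)) {
    throw new ReferenceError(name + " is not defined");
  }
  return globals.get(name);
}

// f and g from the example, rewritten against the table.
// g returns the value instead of logging it so it is easy to check.
function f() {
  setGlobal("a", 5);
}

function g() {
  return getGlobal("a");
}

// Runs the example with a given value of x, starting from empty globals.
function run(x) {
  globals.clear();
  if (x) {
    f();
  }
  return g();
}
```

Here run(true) returns 5 and run(false) throws a ReferenceError, so whether 'a' exists at the call to g() is a run-time property that the compiler cannot resolve statically in the general case.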
> you need to box the variables (carry type information in addition to the value).
Dealing with boxing/unboxing and trying to minimize boxing operations in computations is regularly done in Lisp compilers.
> It is possible to introduce new variables in the global scope from a local scope
Just like in Lisp. Generally, JavaScript and Lisp have a lot in common. The scoping rules are different, though, and Lisp is not object-oriented at its core - but it provides closures, and object-oriented extensions like CLOS are semi-optional additions.
* (defun f ()
    (setf a 5))
; in: DEFUN F
; (SETF A 5)
; ==>
; (SETQ A 5)
;
; caught WARNING:
; undefined variable: COMMON-LISP-USER::A
;
; compilation unit finished
; Undefined variable:
; A
; caught 1 WARNING condition
F
* (defun g ()
    (print a))
; in: DEFUN G
; (PRINT A)
;
; caught WARNING:
; undefined variable: COMMON-LISP-USER::A
;
; compilation unit finished
; Undefined variable:
; A
; caught 1 WARNING condition
As you can see, the compiler warns about undefined variables, but it still compiles the code.
* (if (> (random 1.0) 0.5) (f))
5
* (g)
5
5
If we remove the binding of A, then we get a runtime error.
* (makunbound 'a)
A
* (g)
debugger invoked on a UNBOUND-VARIABLE in thread
#<THREAD "main thread" RUNNING {10005205B3}>:
The variable A is unbound.
Type HELP for debugger help, or (SB-EXT:EXIT) to exit from SBCL.
restarts (invokable by number or by possibly-abbreviated name):
0: [CONTINUE ] Retry using A.
1: [USE-VALUE ] Use specified value.
2: [STORE-VALUE] Set specified value and use it.
3: [ABORT ] Exit debugger, returning to top level.
(G)
source: (PRINT A)
0]
As one can see, both F and G are actually compiled machine code functions. Both functions were directly AOT compiled to machine code when I entered them at the prompt.
Turbo Basic (nowadays PowerBasic), Quick Basic, VAX Basic, Amos, CBASIC, VB 6.0, PureBasic and REALbasic (nowadays Xojo) are all examples of BASIC compilers released between the late 1970s and the late 1990s.
Strange, I thought compilers for languages like BASIC had been around for a long time. The first LISP compiler is from the early 60s: http://www.bitsavers.org/pdf/mit/ai/aim/AIM-039.pdf
Scheme compilers are from the 70s. Rabbit for Scheme: https://dspace.mit.edu/handle/1721.1/6913
> We conclude that ahead-of-time compilation is a viable alternative to interpretation of dynamic languages
Especially when the AOT compiler can be used incrementally and the program can be extended/modified incrementally. See SBCL.
> The project then showed that it is possible to compile this dynamic language ahead-of-time to a native binary, using a statically typed language as an intermediate step.
This has been known for a long time.
For example Kyoto Common Lisp from 1985: http://www.softwarepreservation.org/projects/LISP/kcl/doc/kc...
Design and Implementation of KCL:
http://www.softwarepreservation.org/projects/LISP/kcl/paper/...
Over the years, KCL has evolved into newer implementations like ECL and CLASP (which targets LLVM).
Or CLICC:
http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=522...
Or Stalin, an attempt at a highly optimizing whole-program compiler for Scheme: https://github.com/barak/stalin
> Overall the project found that compiling a dynamic language ahead-of-time is a viable alternative to interpreting the language
See SBCL: http://www.sbcl.org