Does R Need a Compiler?


R is a high-level programming language that evolved from S. It is a popular tool for statistical analysis and graphics, and its design also borrows heavily from Scheme. Over the last several years, R has gained immense popularity and its adoption in industry has skyrocketed. Hence, there is growing interest in learning R among data science professionals in a variety of industries. When learning a new programming language, it helps to understand whether that language is interpreted or needs a compiler.

So, does R need a compiler? R does not need a compiler. R is an interpreted programming language: the interpreter executes your code directly, so there is no need to compile it into object code first.

In this article, we’ll define R programming language, R environment, and compilers. We’ll also go through a few examples of this programming language and its interpretation.

What Is R Programming Language?

Ross Ihaka and Robert Gentleman, two statisticians, developed R back in 1993. The name R is a tribute to the first names of the two developers. The two were looking to design a software environment for statistical computing.

The result was the now open-source programming language with a wide range of statistical and graphical methods. R is currently developed by the R Core Team and is available free of charge under the GNU General Public License.

The R programming language supports machine learning algorithms, statistical inference, linear regression, and time-series analysis, among other computing methods. R has its roots in S and Scheme. Though R shares many similarities with S, the underlying semantics and implementation are heavily influenced by Scheme.

R is a high-level language for analyzing, manipulating, and displaying data. Many major companies that work with large volumes of data use and trust it. Notable companies that rely on R include Facebook, Alphabet (Google), Uber, Ford, The New York Times, and Airbnb.

What Is the R Environment?

R consists of integrated software facilities used for data manipulation, calculation, and graphical display. The R environment has the following:

  • Data handling and storage facilities.
  • Calculation operators.
  • A wide collection of integrated data analysis tools.
  • Graphical facilities for visualization and display.
  • The R language, with loops, conditionals, recursive functions, and input/output facilities. 

The R environment itself is written mostly in C, Fortran, and R.

What Is a Compiler?

A compiler is a software program that translates source code into object code. The compiler turns a high-level script into machine language. 

A compiler first reads the entire source code, translates it, and may reorganize the instructions along the way before producing the output program.

In contrast, an interpreter is a software program that analyzes and executes code one statement at a time, without first translating the entire source code. Interpreters can start running a program immediately, whereas compilers take some time before returning an executable program. Once compiled, however, a program typically runs faster than the same program under an interpreter.

Many high-level programming languages rely on a compiler, while others are designed to be interpreted. R is an example of a high-level interpreted language.

How Does R Run Without a Compiler?

R has a command-line interpreter that evaluates the code you type. Many of R's key routines are themselves implemented in already compiled code, which R calls through interfaces such as .C or .Call.

When you enter expressions into the R console, the R interpreter executes the actual code that you wrote. There is no need to compile R programs into an object language, unlike programs written in languages such as Java, C, or C++.
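To make the point about compiled routines concrete, here is a minimal sketch that inspects two base functions. The variable names are hypothetical and chosen only for illustration; is.primitive, typeof, and body are standard base R functions.

```r
# Functions implemented directly in C are "primitives"; functions written
# in R are ordinary closures, even when their body hands the real work to
# compiled code.
sum_is_primitive    <- is.primitive(sum)     # TRUE: sum is implemented in C
lapply_is_primitive <- is.primitive(lapply)  # FALSE: lapply is an R closure

sum_type    <- typeof(sum)     # "builtin"
lapply_type <- typeof(lapply)  # "closure"

# Printing the body of lapply shows the R code that wraps the compiled routine.
print(body(lapply))
```

Either way, you call these functions from the console without compiling anything yourself.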

Simple Example

For example, if a user types 3 + 4 at the R command prompt and presses Enter, R returns the following display:

> 3 + 4
[1] 7

For a more complex example, the user can type this:

# Create a function to print squares of numbers in sequence
new.function <- function(a) {
   for (i in 1:a) {
      b <- i^2
      print(b)
   }
}

# Call the function new.function, supplying 6 as an argument
new.function(6)

After executing this code, the R program returns the following display:

[1] 1
[1] 4
[1] 9
[1] 16
[1] 25
[1] 36

Recent versions of R also include a Just-In-Time (JIT) compiler that translates R functions and loops into bytecode behind the scenes, which the interpreter then executes. From the user's point of view nothing changes: once you submit your program from an Integrated Development Environment (IDE) to R, the R interpreter evaluates your code expression by expression and displays the output in the console.
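If you are curious whether JIT compilation is active in your session, the sketch below is one way to check. It assumes the compiler package that ships with R; the function name square is hypothetical. Whether the printed function shows a bytecode marker depends on your R version and JIT level, so no particular output is claimed here.

```r
library(compiler)

# A negative argument asks for the current JIT level without changing it.
# Level 0 means JIT is off; higher levels compile more aggressively.
jit_level <- enableJIT(-1)
print(jit_level)

# A small function, called a few times so the JIT has a chance to compile it.
square <- function(x) x^2
for (i in 1:3) square(i)

# If the JIT compiled the function, the printout includes a <bytecode: ...> line.
print(square)
```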

How Does the R Interpreter Work?

Every R program is made up of a series of expressions, often in the form of function calls. When you key in your expressions, the interpreter starts by parsing each of them. Parsing translates syntactic sugar, such as operators, into functional form.

After translation, R substitutes objects for symbols where relevant. Then R evaluates the expressions and returns an object. 
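The parse, substitute, and evaluate steps can be observed from within R itself. The following is a rough sketch using the base functions quote, parse, and eval; the variable names are made up for illustration.

```r
# Parsing: the operator form 3 + 4 is stored as a call to the function `+`.
expr <- quote(3 + 4)
expr_class <- class(expr)                        # "call"
same_call <- identical(expr, quote(`+`(3, 4)))   # TRUE: both are the same call
print(as.list(expr))                             # the function name, then its arguments

# Substitution and evaluation: the symbol x is looked up, then the call runs.
x <- 10
result <- eval(quote(x * 2))                     # 20

# The same steps starting from plain text, as the console does.
parsed <- parse(text = "3 + 4")[[1]]
total <- eval(parsed)                            # 7
```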

Is a Compiler Part of the R Distribution Package?

Yes. According to the Comprehensive R Archive Network, the R distribution comes with the following packages:

  • base: base R functions
  • compiler: the R bytecode compiler
  • datasets: base R datasets
  • grDevices: graphics devices for base and grid graphics
  • graphics: R functions for base graphics
  • grid: a rewrite of the graphics layout capabilities
  • methods: formally defined methods and classes for R objects
  • parallel: support for parallel computation
  • splines: regression spline functions and classes
  • stats: R statistical functions
  • stats4: statistical functions using S4 classes
  • tcltk: interface and language bindings to Tcl/Tk
  • tools: tools for package development and administration
  • utils: R utility functions
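You can confirm this on your own installation. The sketch below lists the packages that R marks with priority "base"; the variable names are hypothetical, and the exact list may vary slightly between R versions.

```r
# installed.packages() returns a matrix with one row per installed package;
# priority = "base" keeps only the packages bundled with R itself.
base_pkgs <- rownames(installed.packages(priority = "base"))
print(sort(base_pkgs))

# The bytecode compiler is one of them.
has_compiler <- "compiler" %in% base_pkgs
print(has_compiler)
```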

What Is a Bytecode Compiler?

Bytecode is program code that has been compiled from source code into a compact, low-level instruction set suitable for a software interpreter. It resembles machine language, but it targets a virtual machine rather than a physical processor. The bytecode can then be further compiled into machine language or directly executed by a virtual machine.

Though it is possible for humans to write bytecode directly, it is far easier to write in a high-level language such as Java. Java source code can be compiled into Java bytecode and then run on a Java Virtual Machine (JVM). Java bytecode files (.class files) are usually produced from source code by a compiler such as javac. Some common examples of Java bytecode instructions include:

  • new: create a new object
  • aload_0: load a reference onto the stack from local variable 0
  • astore_2: store a reference into local variable 2
  • caload: load a char from an array
  • irem: compute the remainder of two ints

Why Does R Need a Bytecode Compiler?

In the 4.0.2 release, roughly 22% of R's lines of code are written in R itself, about 50% in C, and about 30% in Fortran. Users can also link C, C++, and Fortran code and call it at run time for more demanding tasks. Advanced users may even write C code that manipulates R objects directly.

All these make R a versatile and dynamic programming language and software environment. The C, C++, and Fortran parts are compiled ahead of time by their own compilers, so the bytecode compiler is not needed to run them. Its job is to speed up code written in R itself by translating it into bytecode that the interpreter can execute more efficiently than the original parsed expressions.

The R bytecode compiler translates source code expressions into a bytecode object (compile), which is then evaluated using eval. For whole files, the compiler parses the source code expressions, compiles them, and writes the compiled expressions to an output file (cmpfile), which can then be loaded into the R environment (loadcmp).

The compiler package also lets you enable and disable JIT compilation with enableJIT, which returns the previous JIT level so that you can restore it later. Luke Tierney of the University of Iowa offers the following example of the workings of the R bytecode compiler.
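As a rough illustration of both routes, the sketch below compiles a single expression with compile and a small file with cmpfile, then loads the result with loadcmp. The file contents and variable names are invented for the example, and temporary files are used so nothing is left behind in your working directory.

```r
library(compiler)

# Route 1: compile one expression into a bytecode object and evaluate it.
x <- 20
bc <- compile(quote(x * 2 + 1))
bc_type <- typeof(bc)   # "bytecode"
value <- eval(bc)       # 41

# Route 2: compile a whole source file, then load the compiled file.
src <- tempfile(fileext = ".R")
out <- tempfile(fileext = ".Rc")
writeLines("answer <- 6 * 7", src)  # a one-line script, for illustration only
cmpfile(src, out)

# Evaluate the compiled expressions in a fresh environment to keep things tidy.
env <- new.env()
loadcmp(out, envir = env)
print(env$answer)
```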

library(compiler)

# Disable JIT compilation and remember the previous level
oldJIT <- enableJIT(0)

# A simple example
f <- function(x) x+1
fc <- cmpfun(f)
fc(2)
disassemble(fc)

# Old R version of lapply
la1 <- function(X, FUN, ...) {
    FUN <- match.fun(FUN)
    if (!is.list(X))
        X <- as.list(X)
    rval <- vector("list", length(X))
    for(i in seq(along = X))
        rval[i] <- list(FUN(X[[i]], ...))
    names(rval) <- names(X) # keep `names' !
    return(rval)
}

# A small variation
la2 <- function(X, FUN, ...) {
    FUN <- match.fun(FUN)
    if (!is.list(X))
        X <- as.list(X)
    rval <- vector("list", length(X))
    for(i in seq(along = X)) {
        v <- FUN(X[[i]], ...)
        if (is.null(v)) rval[i] <- list(v)
        else rval[[i]] <- v
    }
    names(rval) <- names(X) # keep `names' !
    return(rval)
}

# Compiled versions
la1c <- cmpfun(la1)
la2c <- cmpfun(la2)

# Some timings
x <- 1:10
y <- 1:100
system.time(for (i in 1:10000) lapply(x, is.null))
system.time(for (i in 1:10000) la1(x, is.null))
system.time(for (i in 1:10000) la1c(x, is.null))
system.time(for (i in 1:10000) la2(x, is.null))
system.time(for (i in 1:10000) la2c(x, is.null))
system.time(for (i in 1:1000) lapply(y, is.null))
system.time(for (i in 1:1000) la1(y, is.null))
system.time(for (i in 1:1000) la1c(y, is.null))
system.time(for (i in 1:1000) la2(y, is.null))
system.time(for (i in 1:1000) la2c(y, is.null))

# Restore the previous JIT level
enableJIT(oldJIT)

Conclusion

R is one of the top-tier programs for statistical modeling and analysis. It is open-source and constantly evolving. Under the GNU General Public License, anyone can work with R without paying any licensing fees.

As an interpreted programming language, R doesn't require a separate compiler to run your code. R does ship with its own bytecode compiler, which translates R source code into a lower-level form that the interpreter can execute more efficiently.

  1. Tierney, L. (n.d.). Mathematical Sciences, College of Liberal Arts & Sciences, The University of Iowa. https://homepage.stat.uiowa.edu/~luke/R/compiler/compiler.pdf
  2. R FAQ. (n.d.). The Comprehensive R Archive Network. https://cran.r-project.org/doc/FAQ/R-FAQ.html
  3. R: The R Project for Statistical Computing. (n.d.). https://www.r-project.org/

Daisy

Daisy is the founder of DataScienceNerd.com. Passionate for the field of Data Science, she shares her learnings and experiences in this domain, with the hope to help other Data Science enthusiasts in their path down this incredible discipline.