Chapter 19 Machine Learning
In this chapter, we build a package that, via V8, wraps ml.js, a library that brings machine learning to JavaScript. It covers quite a few models; we only include one: the linear regression. This is an interesting example because it reveals some proceedings that one is likely to run into when creating packages using V8.
const x = [0.5, 1, 1.5, 2, 2.5];
const y = [0, 1, 2, 3, 4];
const regression = new ml.SimpleLinearRegression(x, y);
19.1 Dependency
We start by creating a package and add the V8 package as dependency.
Then we create the inst
directory in which we place the ml.js file downloaded from the CDN.
dir.create("inst")
download.file(
"https://www.lactame.com/lib/ml/4.0.0/ml.min.js",
"inst/ml.min.js"
)
With the dependency downloaded, one can start working on the R code. First, a new V8 context needs to be created and the ml.js file needs to be imported into it.
19.2 Simple Regression
The “simple linear regression” consists of a simple function that takes two arrays. We can thus create a function that takes two vectors, x
, and y
, and runs the regression.
#' @export
ml_simple_lm <- function(y, x){
# assign x and y
ml$assign("x", x)
ml$assign("y", y)
# run regression
ml$eval(
"var regression = new ML.SimpleLinearRegression(x, y);"
)
# return results
ml$get("regression")
}
Then we can document and load the model to the function.
ml_simple_lm(cars$speed, cars$dist)
## $name
## [1] "simpleLinearRegression"
##
## $slope
## [1] 0.1655676
##
## $intercept
## [1] 8.283906
This works but has a few issues, namely running two or more regression internally will override the variable regression
in the V8 context. Let us demonstrate by implementing a function to predict.
We then document and load the functions to run two regressions in a row then observe the issue. Unlike R, the model we created only exists in JavaScript, unlike the lm
, the function ml_simple_lm
does not return the model. Therefore, ml_simple_lm
does not distinguish between models, unlike predict
, which takes the model as the first argument and runs the prediction on it.
This implementation of ml.js is indeed very dangerous.
# first model
ml_simple_lm(cars$speed, cars$dist)
ml_predict(15)
## 25.18405
# overrides model
ml_simple_lm(1:10, runif(10))
# produces different predictions
ml_predict(15)
## 10.76742
The package ml currently under construction should emulate R in that respect; the ml_simple_lm
should return the model, which should be usable with the predict
function. In order to do so, we are going to need to track regressions internally in V8 so the ml_simple_lm
returns a proxy of the model that predict
can use to predict on the intended model.
In order to track and store regressions internally, we are going to declare an empty array when the package is loaded.
# zzz.R
ml <- NULL
.onLoad <- function(libname, pkgname){
ml <<- V8::v8()
mljs <- system.file("ml.min.js", package = "ml")
ml$source(mljs)
ml$eval("var regressions = [];")
}
Then, one can track regressions by creating an R object, which is incremented every time ml_simple_lm
runs; this can be used as a variable name in the JavaScript regressions
array declare in .onLoad
. This variable name must be stored in the object we intend to return so the predict
method we will create later on can access the model and run predictions. Finally, in order to declare a new method on the predict
function, we need to return an object bearing a unique class. Below we use mlSimpleRegression
.
counter <- new.env(parent = emptyenv())
counter$regressions <- 0
#' @export
ml_simple_lm <- function(y, x){
counter$regressions <- counter$regressions + 1
# assign variables
ml$assign("x", x)
ml$assign("y", y)
# address
address <- paste0("regressions['", counter$regressions, "']")
# create regression
code <- paste0(
address, " = new ML.SimpleLinearRegression(x, y);"
)
ml$eval(code)
# retrieve and append address
regression <- ml$get(address)
regression$address <- address
# create object of new class
structure(
regression,
class = c("mlSimpleRegression", class(regression))
)
}
Then one can implement the predict
method for mlSimpleRegression
. The function uses the address
of the model to run the JavaScript predict
method on that object.
#' @export
predict.mlSimpleRegression <- function(object, newdata, ...){
code <- paste0(object$address, ".predict")
ml$call(code, newdata)
}
We can then build and load the package to it in action.
19.3 Exercises
There are too many great JavaScript libraries that would be great additions to the R ecosystem but perhaps try and integrate one of these below. They are simple yet exciting and thus provide ideal first forays into what this part of the book explained.
- chance.js - random generator
- currency.js - helpers to work with currencies