Hello there! Today I want to share some details about the implementation of my recent project, the Scala library urlopt4s. This library provides a simple interface for removing advertising and tracking parameters from a URL. It doesn’t contain any filtering code itself; instead, it uses the AdGuard adblocker engine under the hood. So, in addition to its main function, this library can be treated as a proof of concept for running almost any kind of JavaScript code on the JVM.

What exactly does it do?

To make things clear, let’s first see what exactly this library does. To get started, all you need to do is add a dependency:

libraryDependencies += "me.seroperson" %% "urlopt4s" % "0.1.0"

And then do something like this to initialize an instance and run the filtering:

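import cats.effect.{ExitCode, IO, IOApp}
// ... (the import of UrlOptimizer from the urlopt4s package is elided here)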

object ExampleApp extends IOApp {

  override def run(args: List[String]): IO[ExitCode] = UrlOptimizer[IO]()
    .use { urlOptimizer =>
      for {
        result <- urlOptimizer.removeAdQueryParams("https://www.google.com/?utm_source=test")
        _ <- IO.println(result) // "https://www.google.com/"
      } yield ExitCode.Success
    }

}

The resulting URL is unlikely to contain any advertising or tracking parameters. Packaging this functionality in a library may seem excessive, but this section explains why simple filtering by a predefined set of known advertising and tracking parameters does not work well.
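For contrast, here is a minimal sketch of the "simple filtering" approach mentioned above. The NaiveFilter object and its parameter list are hypothetical and are not part of urlopt4s; they only illustrate what a hard-coded filter might look like.

```scala
// Hypothetical naive filter, NOT part of urlopt4s: it drops query parameters
// whose names appear in a small hard-coded set.
object NaiveFilter {

  // Illustrative list only, chosen for this example.
  private val knownParams: Set[String] =
    Set("utm_source", "utm_medium", "utm_campaign", "fbclid", "gclid")

  def removeKnownParams(url: String): String = {
    // Split the URL into "base", "?query" and "#fragment" parts.
    val (beforeFragment, fragment) = url.span(_ != '#')
    val (base, query) = beforeFragment.span(_ != '?')

    // Keep only the "name=value" pairs whose name is not on the list.
    val kept = query.drop(1).split('&').toList
      .filter(_.nonEmpty)
      .filterNot(pair => knownParams.contains(pair.takeWhile(_ != '=')))

    val newQuery = if (kept.isEmpty) "" else kept.mkString("?", "&", "")
    base + newQuery + fragment
  }
}
```

This handles the example above (NaiveFilter.removeKnownParams("https://www.google.com/?utm_source=test") returns "https://www.google.com/"), but any parameter whose name is not on the list passes through untouched, so the list has to be maintained by hand.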

How exactly does it work?

Under the hood, it runs JavaScript and loads the AdGuard adblocker engine with an optimized set of filtering rules to do the dirty work of cleaning a given URL. This is a really large JS library with many dependencies, including TypeScript and modern JavaScript APIs, and urlopt4s is a good proof of concept that even this can run on the JVM (via GraalJS) right now. Read this for more information about implementation details.

TL;DR JS on JVM

Still, there are some restrictions and a few tweaks required to make everything work. I’m not going to provide a complete guide here because it’s better to check the code, but here’s a short summary of how to run non-hello-world JavaScript code on the JVM:

  • Create a JS module, define the webpack and package configurations, put your JS code there, and export the functions you need to call from the JVM.
  • Build an optimized bundle and place it as a resource inside your JVM project.
  • Initialize a GraalJS context, load the JS bundle, and get a reference to the previously exported function.
  • Wrap everything in a neat Scala interface that just calls the JS code.
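The last two steps can be sketched roughly as follows. This is not the urlopt4s implementation: it assumes the GraalVM polyglot API and the JavaScript language are on the classpath, and the JsBridgeSketch object, the inline bundleCode string, and the stripQuery function are hypothetical stand-ins for a real webpack bundle and its exported function.

```scala
import org.graalvm.polyglot.{Context, Source, Value}

object JsBridgeSketch {

  // Stand-in for the webpack bundle. A real project would read the bundle
  // text from a classpath resource instead of using an inline string.
  private val bundleCode: String =
    "globalThis.stripQuery = function (url) { return url.split('?')[0]; };"

  // Evaluates the bundle, looks up the exported function, and calls it once.
  def stripQueryWithJs(url: String): String = {
    val context = Context.newBuilder("js").build()
    try {
      // Load the "bundle" into the context.
      context.eval(Source.newBuilder("js", bundleCode, "bundle.js").build())
      // Get a handle to the function that the bundle attached to the global object.
      val exported: Value = context.getBindings("js").getMember("stripQuery")
      // Call the JS function from the JVM and convert the result to a String.
      exported.execute(url).asString()
    } finally context.close()
  }
}
```

Creating a context and evaluating the bundle on every call is wasteful, so a real wrapper would do that once and reuse the context for subsequent calls.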

Additional tips on implementing JS on JVM

Here are a few things I learned along the way that may be helpful:

  • The GraalJS documentation (and this) contains a lot of useful information.
  • Usually you want to pack your JS code as a CommonJS module and set all the related GraalJS properties, such as js.commonjs-require-cwd and js.commonjs-require (docs), as this makes using modern JS libraries easier.
  • Sometimes your JS code will work on, e.g., Node.js but not on GraalJS. There are many possible reasons for this, but most likely GraalJS is missing some API. In that case you will encounter errors like ReferenceError: TextDecoder is not defined (or ReferenceError: TextEncoder is not defined, or a similar error for some other name). To deal with this, you will have to add polyfills to your JS bundle, for example with webpack's ProvidePlugin.
  • If you use webpack to build your JS code as a module and get an error like ReferenceError: global is not defined, you probably have to set the output.globalObject property to globalThis.
  • It is a good idea to restrict IO permissions for your GraalJS context (like this).
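As a rough illustration of the CommonJS and IO tips, the sketch below builds a context with the two options named above and a read-only file system. It is an assumption-laden example rather than the configuration urlopt4s actually uses: it assumes GraalVM polyglot 23.0 or later (for IOAccess), and the RestrictedContextSketch object and the modulesDir parameter are hypothetical.

```scala
import java.nio.file.Path
import org.graalvm.polyglot.Context
import org.graalvm.polyglot.io.{FileSystem, IOAccess}

object RestrictedContextSketch {

  // Builds a JS context that can resolve CommonJS modules from `modulesDir`
  // but gets only a read-only view of the host file system.
  def build(modulesDir: Path): Context = {
    // Wrap the default host file system so that write operations are rejected.
    val readOnlyFs = FileSystem.newReadOnlyFileSystem(FileSystem.newDefaultFileSystem())
    val ioAccess = IOAccess.newBuilder().fileSystem(readOnlyFs).build()

    Context
      .newBuilder("js")
      // Enable experimental options so the CommonJS settings are accepted.
      .allowExperimentalOptions(true)
      .option("js.commonjs-require", "true")
      .option("js.commonjs-require-cwd", modulesDir.toAbsolutePath.toString)
      .allowIO(ioAccess)
      .build()
  }
}
```

Reading still has to be allowed here because require needs to load module files; a bundle that is evaluated from a string and never calls require could use an even stricter setting such as IOAccess.NONE.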

Conclusion

Thank you for reading this little article; I hope it will be useful to someone. Feel free to reach out if you have something to say. Also take a look at the posts on similar topics: