Macro-assisted translations in ClojureScript

Posted at — Nov 7, 2019

At my work, we use a lot of ClojureScript, but we also have a project that is in Elixir (Phoenix). One thing I really appreciate about the Elixir community is that they value ergonomics very highly, perhaps more than any other community I’ve seen.

This orientation leads to a lot of great tooling for common problems. One such example is elixir-gettext, an internationalization (i18n) and localization (l10n) library. It is by far the best solution I’ve seen in this space for any platform, because it integrates with the compiler to expose all translatable texts in your app automatically. This means there is no need to manually maintain dictionaries, as there is with other tools.

Using this library, you just write code like this if you want a string to be translatable:

   dgettext("signup_page", "Please enter your email") 
   #       context/domain,  text to translate

To expose these texts to translators, you simply run a build task, and it will spit out “.pot” files, a format supported by a lot of translation software. The library also supports pluralization and string interpolation, but I won’t get into that here.

I figured this was a pretty obvious and developer-friendly approach to supporting translation for any language with macros, so I was a little surprised I couldn’t find something similar for ClojureScript. Fortunately, it is pretty easy to make tooling for this when macros are available.

At a high level, we only need 3 things:

1. A macro that, as a side effect, records the static arguments passed to it, and spits out a call to a runtime translation function. Let’s call this tr.
2. A runtime translation function (tr*) that, based on the static arguments and some global state (like the language code and localized texts), provides a translated string.
3. Some kind of ‘build step’ that processes the information recorded in 1 and makes the texts available for translation (for example, by spitting out a .pot file, like the aforementioned Elixir solution).

I asked about this in the #shadow-cljs channel on the Clojurians Slack, and it turned out @thheller had already sketched out a solution for this, with the key parts already figured out. I took this, tweaked the syntax, and adapted it into a working solution. (Thanks again, Thomas!)

For the macro, I went for one that would support calls like this:

  ;; Plain string. Useful if no other information is required in order to
  ;; provide a good translation.
  (tr "Ok")

  ;; A vector, where the first member is a qualified keyword. The namespace
  ;; of the keyword will be set as the domain, and the message_id will be the name,
  ;; giving some context to the translator. It will also be stable over time,
  ;; even if the suggested English text (last arg here) changes.
  (tr [:api-token/confirm-delete "Are you sure you want to delete this API token?"])

  ;; Pluralization, interpolation: (the :count keyword is special)
  (tr [:common/pony-brag-message
       "I have {count} pony"
       "I have {count} ponies"]
      :count 2)

I like this syntax, because it allows the programmer to give a lot of information to the translator very concisely (code-wise). A domain (e.g., “api tokens”), message id (e.g., “confirm-delete”), and a suggested default text can all be communicated, depending on which form you use. It seems to work pretty well in the app I work on, but if you want to use something different, it probably won’t impact the high level approach too much.

For the tr macro and the tr* function mentioned above, here is a first draft. It is not complete or optimized yet, but it does appear to work, and should be enough to illustrate the idea.
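To give a feel for the shape of these two pieces, here is a minimal sketch in plain Clojure. It is a stand-in for illustration, not the draft itself: an atom (`recorded-strings`) takes the place of the compiler environment, and the names `parse-message`, `interpolate`, `translations`, the "default" domain, and the simple "not equal to 1 means plural" rule are all assumptions made for the example.

```clojure
(ns i18n.sketch
  (:require [clojure.string :as str]))

;; Stand-in for the compiler state. In ClojureScript the macro would record
;; into cljs.env/*compiler* instead; a plain atom keeps this self-contained.
(defonce recorded-strings (atom []))

;; Hypothetical runtime state: translations for the current language,
;; keyed by [domain message-id], each value a vector [singular plural].
(defonce translations (atom {}))

(defn- parse-message
  "Normalizes the static first argument of `tr` into a map."
  [msg]
  (if (string? msg)
    {:domain "default" :msg-id msg :singular msg}
    (let [[k singular plural] msg]
      {:domain (namespace k) :msg-id (name k)
       :singular singular :plural plural})))

(defn interpolate
  "Replaces {name} placeholders in `text` with values from `params`.
  Unknown placeholders are left untouched."
  [text params]
  (str/replace text #"\{([^}]+)\}"
               (fn [[placeholder k]]
                 (str (get params (keyword k) placeholder)))))

(defn tr*
  "Runtime half: picks the translated (or default) text, then interpolates."
  [{:keys [domain msg-id singular plural]} params]
  (let [[one many] (or (get @translations [domain msg-id])
                       [singular plural])
        n          (:count params)
        ;; Simplistic English-style rule; real plural rules vary by language.
        text       (if (and many n (not= 1 n)) many one)]
    (interpolate text params)))

(defmacro tr
  "Compile-time half: records the static message data as a side effect,
  then expands to a call to `tr*`."
  [msg & {:as params}]
  (let [data (parse-message msg)]
    (swap! recorded-strings conj data)
    `(tr* ~data ~params)))

(comment
  (tr "Ok")
  (tr [:common/pony-brag-message
       "I have {count} pony"
       "I have {count} ponies"]
      :count 2))
```

Because the recording happens at macro-expansion time, every `tr` call site shows up in `recorded-strings` as soon as its namespace is compiled, whether or not the code ever runs.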

Note this key part from the macro (taken from Thomas’ code) - an interesting way to prevent the side effect we require from breaking compiler caching:

  (when env/*compiler*
    (swap! env/*compiler*
           update-in
           [::ana/namespaces current-ns ::strings]
           vec-conj
           string-data))

The reason this is needed is that for any particular namespace, analysis data may be fetched from the file system instead of being created from scratch. The snippet above ensures that the translation data is stored along with the rest of this data. (Look for my-namespace.cljs.cache.transit.json files in your directory if you want to see what I’m talking about.)
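To make the effect of that swap! concrete, here is a small sketch that uses a plain atom in place of env/*compiler*. The definition of `vec-conj` is not shown above, so `(fnil conj [])` is an assumption, as are the `record-string!` helper and the example data. It also assumes `ana` is the usual alias for cljs.analyzer, so that `::ana/namespaces` reads as `:cljs.analyzer/namespaces`.

```clojure
(ns i18n.record-sketch)

;; Assumed definition: conj that starts a vector when nothing is stored yet.
(def vec-conj (fnil conj []))

;; Stand-in for cljs.env/*compiler*, which holds the compiler state in an
;; atom while the ClojureScript compiler is running.
(def compiler
  (atom {:cljs.analyzer/namespaces {'app.core {:name 'app.core}}}))

(defn record-string!
  "Appends `string-data` to the strings recorded for namespace `current-ns`,
  next to the rest of that namespace's analysis data."
  [current-ns string-data]
  (swap! compiler
         update-in
         [:cljs.analyzer/namespaces current-ns ::strings]
         vec-conj
         string-data))

(record-string! 'app.core {:domain "common" :msg-id "ok" :singular "Ok"})
(record-string! 'app.core {:domain "common" :msg-id "cancel" :singular "Cancel"})

;; Both maps now sit in a vector under the namespace entry, in call order.
(get-in @compiler [:cljs.analyzer/namespaces 'app.core ::strings])
```

Since the strings live inside the namespace's own entry, they are saved and restored together with it whenever the analysis data is cached.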

The remaining ‘build-step’ part depends a little on what build tool you use. It is also coupled to how you store the call/arguments in the macro. I use shadow-cljs, so adding this build hook was enough:

  ;; In shadow-cljs.edn:
  :builds
  {:some-build-id
   {:build-hooks [(front.tasks.i18n-tool/hook)]
    ... }}

The hook code:
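A minimal sketch of what such a hook could look like, not the exact code from this project. It assumes the build state exposes the compiler data under `:compiler-env` and the build mode under `:shadow.build/mode`, that the macro stored its data under the hypothetical key `:front.i18n/strings`, and it writes EDN rather than JSON so that the example needs no extra dependency. The output file name is also made up.

```clojure
(ns i18n.hook-sketch)

;; Hypothetical key; it must match whatever key the macro stored under.
(def strings-key :front.i18n/strings)

(defn collect-strings
  "Returns every recorded string from all namespaces in `compiler-env`,
  tagged with the namespace it was found in."
  [compiler-env]
  (vec
   (for [[ns-sym ns-data] (:cljs.analyzer/namespaces compiler-env)
         entry            (get ns-data strings-key)]
     (assoc entry :ns (str ns-sym)))))

(defn hook
  "Build hook that writes the collected strings after a release build."
  {:shadow.build/stage :flush}
  [build-state & [out-file]]
  (when (= :release (:shadow.build/mode build-state))
    (spit (or out-file "i18n-strings.edn")
          (pr-str (collect-strings (:compiler-env build-state)))))
  ;; A build hook must always return the build state.
  build-state)
```

Keeping `collect-strings` as a pure function over a map makes it easy to try out at the REPL with hand-written data, independent of the build tool.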

As you can see, pretty simple - we’re just lightly massaging the data we collected inside the compiler, and spitting out a JSON file. The way this is hooked in via shadow-cljs makes it happen every time we build in release mode. In my case, since I want translation to be possible while the app is running, I then upsert these texts to a database table with another script. The script also stores “last_compiled_at” for each text record, so that we are able to stop translating texts that are no longer needed. If you don’t have those types of requirements, you could simply spit out .pot file(s) or something similar instead. I’m leaving out this part, as well as the code to pass the translated texts to the browser, since that tends to be more app-specific.
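For the simpler .pot route, a rough sketch could look like the following. The entry shape (`:domain`, `:msg-id`, `:singular`, `:plural`) is an assumption, as is the choice to put the message id in `msgctxt` and to write one file per domain. A real .pot file also starts with a header entry, which is omitted here.

```clojure
(ns i18n.pot-sketch
  (:require [clojure.string :as str]))

(defn- quoted
  "Wraps `s` in double quotes, escaping backslashes, quotes and newlines."
  [s]
  (str "\""
       (-> s
           (str/replace "\\" "\\\\")
           (str/replace "\"" "\\\"")
           (str/replace "\n" "\\n"))
       "\""))

(defn pot-entry
  "Renders one recorded string as a gettext template entry."
  [{:keys [msg-id singular plural]}]
  (str/join
   "\n"
   (concat
    [(str "msgctxt " (quoted msg-id))
     (str "msgid " (quoted singular))]
    (if plural
      [(str "msgid_plural " (quoted plural))
       "msgstr[0] \"\""
       "msgstr[1] \"\""]
      ["msgstr \"\""]))))

(defn pot-by-domain
  "Maps a file name such as \"common.pot\" to the rendered entries
  of that domain, separated by blank lines."
  [entries]
  (into {}
        (for [[domain es] (group-by :domain entries)]
          [(str domain ".pot")
           (str (str/join "\n\n" (map pot-entry es)) "\n")])))
```

Each value in the resulting map can then be written out with `spit`, giving translators one template file per domain.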

That is all. Hope this is useful to someone in the Clojure community.

Isak Sky's blog

Clojure, SQL, and more
