Title: Support Many Languages in R
Version: 0.1.0
Description: An object model for source text and translations. Find and extract translatable strings. Provide translations and seamlessly retrieve them at runtime.
License: MIT + file LICENSE
URL: https://transltr.ununoctium.dev
BugReports: https://github.com/jeanmathieupotvin/transltr/issues
Encoding: UTF-8
Language: en
RoxygenNote: 7.3.2
Config/testthat/edition: 3
Depends: R (≥ 4.3)
Imports: digest, R6, stringi, utils, yaml
Suggests: covr, devtools, lifecycle, microbenchmark, pkgdown, testthat (≥ 3.0.0), usethis, withr
Collate: 'aaa.R' 'assert.R' 'class-location.R' 'class-text.R' 'class-translator.R' 'find-source-in-exprs.R' 'find-source.R' 'flat.R' 'hash.R' 'language.R' 'normalize.R' 'serialize.R' 'text-io.R' 'translator-io.R' 'transltr-package.R' 'utils-format-vector.R' 'utils-map.R' 'utils-nullish-op.R' 'utils-stop.R' 'utils-strings.R' 'uuid.R' 'zzz.R'
NeedsCompilation: no
Packaged: 2025-02-14 16:05:36 UTC; jmp
Author: Jean-Mathieu Potvin [aut, cre, cph], Jérôme Lavoué [ctb, fnd, rev]
Maintainer: Jean-Mathieu Potvin <jeanmathieupotvin@ununoctium.dev>
Repository: CRAN
Date/Publication: 2025-02-14 16:40:02 UTC

Support Many Languages in R

Description

[Experimental]

An object model for source text and translations. Find and extract translatable strings. Provide translations and seamlessly retrieve them at runtime.

Introduction

R relies on GNU gettext to produce multi-lingual messages (if Native Language Support is enabled). This is well-designed software offering an extensive set of functionalities. It is ubiquitous and has withstood the test of time. It is not the objective of transltr to (fully) replace it.

Package transltr provides an alternative in-memory object model (and further functions) to easily inspect and manipulate source text and translations.

Getting Started

Write code as you normally would. Whenever a piece of text (literal character vectors) should be available in multiple languages, pass it to method Translator$translate(). You may also use your own function.

  1. Once you are ready to translate your project, call find_source(). This returns a Translator object.

  2. Export the Translator object with translator_write(). Fill in the underlying translation files.

  3. Import translations back into an R session with translator_read().

Current language and source language are respectively set with language_set() and language_source_set(). By default, the latter is set equal to "en" (English).

Bugs and Feedback

You may submit bugs, request features, and provide feedback by creating an issue on GitHub.

Acknowledgements

Warm thanks to Jérôme Lavoué, who supported and sponsored the first release of this project.

Author(s)

Maintainer: Jean-Mathieu Potvin jeanmathieupotvin@ununoctium.dev [copyright holder]

Other contributors:

  • Jérôme Lavoué [contributor, funder, reviewer]

See Also

The scattered and incomplete documentation of R's Native Language Support:

The comprehensive technical documentation of GNU gettext.


Hashing Algorithms

Description

These algorithms map a character string to another character string of hexadecimal characters highly likely to be unique. The latter is used to uniquely identify a source text (and the underlying source language).

Usage

algorithms()

Details

Secure Hash Algorithm 1

Method sha1 corresponds to SHA-1 (Secure Hash Algorithm version 1), a cryptographic hashing function. While it is now superseded by more secure variants (SHA-256, SHA-512, etc.), it is still useful for non-sensitive purposes. It is fast, makes accidental collisions extremely unlikely, and can handle very large inputs. It emits strings of 40 hexadecimal characters.

Cumulative UTF-8 Sum

[Experimental]

This method is experimental. Use with caution.

Method utf8 is a simple method derived from cumulative sums of UTF-8 code points (converted to integers). It is slightly faster than method sha1 for small inputs and emits hashes with a width proportional to the underlying input's length. It is used for testing purposes internally.
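As a rough illustration of the idea only, the sketch below builds a hash-like string from cumulative sums of UTF-8 code points with base R. The name utf8_sum_sketch() is hypothetical, and the exact procedure used by method utf8 is not described here, so this is an assumption and not the package's implementation. It merely shows why the output width grows with the input's length.

```r
# Hypothetical helper: NOT the implementation used by method utf8.
utf8_sum_sketch <- function(x = "") {
  codes <- utf8ToInt(enc2utf8(x))  # UTF-8 code points as integers
  sums  <- cumsum(codes)           # running totals of code points
  paste(sprintf("%x", sums), collapse = "")
}

utf8_sum_sketch("abc")     ## "61c3126"
utf8_sum_sketch("abcdef")  ## A longer input yields a wider string.
```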


Find Source Text

Description

Find and extract source text that must be translated.

Usage

find_source(
  path = ".",
  encoding = "UTF-8",
  verbose = getOption("transltr.verbose", TRUE),
  tr = translator(),
  interface = NULL
)

find_source_in_files(
  paths = character(),
  encoding = "UTF-8",
  verbose = getOption("transltr.verbose", TRUE),
  algorithm = algorithms(),
  interface = NULL
)

Arguments

path

A non-empty and non-NA character string. A path to a directory containing R source scripts. All subdirectories are searched. Files that do not have a .R, or .Rprofile extension are skipped.

encoding

A non-empty and non-NA character string. The source character encoding. In almost all cases, this should be UTF-8. Other encodings are internally re-encoded to UTF-8 for portability.

verbose

A non-NA logical value. Should progress information be reported?

tr

A Translator object.

interface

A name, a call object, or a NULL. A reference to an alternative (custom) function used to translate text. If a call object is passed to interface, it must be to operator ::. Calls to method Translator$translate() are ignored and calls to interface are extracted instead. See Details below.

paths

A character vector of non-empty and non-NA values. A set of paths to R source scripts that must be searched.

algorithm

A non-empty and non-NA character string equal to "sha1", or "utf8". The algorithm to use when hashing source information for identification purposes.

Details

find_source() and find_source_in_files() look for calls to method Translator$translate() in R scripts and convert them to Text objects. The former further sets these resulting objects into a Translator object. See argument tr.

find_source() and find_source_in_files() work on a purely lexical basis. The source code is parsed but never evaluated (aside from extracted literal character vectors).

Interfaces

In some cases, it may not be desirable to call method Translator$translate() directly. A custom function wrapping (interfacing) this method may always be used as long as it has the same signature as method Translator$translate(). In other words, it must minimally have two formal arguments: ... and source_lang.

Custom interfaces must be passed to find_source() and find_source_in_files() for extraction purposes. Since these functions work on a lexical basis, interfaces can be placeholders in the source code (non-existent bindings) at the time these functions are called. However, they must be bound to a function (ultimately) calling Translator$translate() at runtime.

Custom interfaces are passed to find_source() and find_source_in_files() as name or call objects in a variety of ways. The most straightforward way is to use base::quote(). See Examples below.
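The different ways of building name and call objects mentioned above can be sketched with base R alone. Functions translate() and transltr::translate() are hypothetical placeholders here. Nothing is evaluated, so they do not need to exist.

```r
# Three equivalent ways to reference a function named translate().
by_quote <- quote(translate)
by_name  <- as.name("translate")
by_str   <- str2lang("translate")

# Two equivalent ways to reference transltr::translate (a call to `::`).
ns_quote <- quote(transltr::translate)
ns_call  <- call("::", as.name("transltr"), as.name("translate"))

is.name(by_quote)             ## TRUE
is.call(ns_quote)             ## TRUE
identical(by_quote, by_name)  ## TRUE
identical(ns_quote, ns_call)  ## TRUE
```

Any of these objects should be suitable values for argument interface, since they are all name or call objects.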

Methodology

find_source() and find_source_in_files() go through these steps to extract source text from a single R script.

  1. It is read with text_read() and re-encoded to UTF-8 if necessary.

  2. It is parsed with parse() and underlying tokens are extracted from parsed expressions with utils::getParseData().

  3. Each expression (expr) token is converted to language objects with str2lang(). Parsing errors and invalid expressions are silently skipped.

  4. Valid call objects stemming from step 3 are filtered with is_source().

  5. Calls to method Translator$translate() or to interface stemming from step 4 are coerced to Text objects with as_text().

These steps are repeated for each R script. find_source() further merges all resulting Text objects into a coherent set with merge_texts() (identical source code is merged into single Text entities).

Extracted character vectors are always normalized for consistency (at step 5). See normalize() for more information.
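Steps 2 to 4 above can be approximated with base R only. This is a simplified sketch under stated assumptions: the script is held in memory instead of being read with text_read(), and is_translate_call() is a hypothetical stand-in for is_source() that only recognizes calls of the form <object>$translate(...).

```r
# A script held in memory (hypothetical contents, for illustration only).
code <- c(
  "tr$translate('Hello, world!')",
  "x <- 1 + 1"
)

# Step 2: parse the code (without evaluating it) and extract tokens.
exprs  <- parse(text = code, keep.source = TRUE)
tokens <- utils::getParseData(exprs, includeText = TRUE)

# Step 3: keep expr tokens and convert their text to language objects.
# Errors are silently converted to NULL values.
expr_text <- tokens$text[tokens$token == "expr"]
langs <- lapply(expr_text, function(txt) {
  tryCatch(str2lang(txt), error = function(e) NULL)
})

# Step 4 (simplified): keep calls of the form <object>$translate(...).
is_translate_call <- function(x) {
  is.call(x) &&
    is.call(x[[1L]]) &&
    identical(x[[1L]][[1L]], quote(`$`)) &&
    identical(x[[1L]][[3L]], quote(translate))
}

found <- Filter(is_translate_call, langs)
found[[1L]][[2L]]  ## "Hello, world!"
```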

Limitations

The current version of transltr can only handle literal character vectors. This means it cannot resolve non-trivial expressions that depend on a state. All values passed to argument ... of method Translator$translate() must yield character vectors (trivially).
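The difference between literal and non-literal arguments can be observed lexically with base R. The helper has_literal_args() below is hypothetical and is not part of transltr. It only illustrates why a call such as paste() cannot be resolved without evaluating code.

```r
# Hypothetical check: are all arguments of a call literal strings?
has_literal_args <- function(cl) {
  args <- as.list(cl)[-1L]  # drop the function being called
  length(args) > 0L && all(vapply(args, is.character, NA))
}

# Literal character vectors are stored as is in the call object.
has_literal_args(quote(tr$translate("Hello, ", "world!")))    ## TRUE

# paste() is an unevaluated call that depends on the value of name.
has_literal_args(quote(tr$translate(paste("Hello,", name))))  ## FALSE
```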

Value

find_source() returns an R6 object of class Translator. If an existing Translator object is passed to tr, it is modified in place and returned.

find_source_in_files() returns a list of Text objects. It may contain duplicated elements, depending on the extracted contents.

See Also

Translator, Text, normalize(), translator_read(), translator_write(), base::quote(), base::call(), base::as.name()

Examples

# Create a directory containing dummy R scripts for illustration purposes.
temp_dir   <- file.path(tempdir(TRUE), "find-source")
temp_files <- file.path(temp_dir, c("ex-script-1.R", "ex-script-2.R"))
dir.create(temp_dir, showWarnings = FALSE, recursive = TRUE)

cat(
  "tr$translate('Hello, world!')",
  "tr$translate('Farewell, world!')",
  sep  = "\n",
  file = temp_files[[1L]])
cat(
  "tr$translate('Hello, world!')",
  "tr$translate('Farewell, world!')",
  sep  = "\n",
  file = temp_files[[2L]])

# Extract calls to method Translator$translate().
find_source(temp_dir)
find_source_in_files(temp_files)

# Use custom functions.
# For illustration purposes, assume the package
# exports a hypothetical translate() function.
cat(
  "translate('Hello, world!')",
  "transltr::translate('Farewell, world!')",
  sep  = "\n",
  file = temp_files[[1L]])
cat(
  "translate('Hello, world!')",
  "transltr::translate('Farewell, world!')",
  sep  = "\n",
  file = temp_files[[2L]])

# Extract calls to translate() and transltr::translate().
# Since find_source() and find_source_in_files() work on
# a lexical basis, these are always considered to be two
# distinct functions. They also don't need to exist in the
# R session calling find_source() and find_source_in_files().
find_source(temp_dir, interface = quote(translate))
find_source_in_files(temp_files, interface = quote(transltr::translate))


Find Source Text in Expressions

Description

Find and extract source text that must be translated from a single file or a set of R expr tokens.

Arguments listed below are not explicitly validated for efficiency.

Usage

find_source_in_file(
  path = "",
  encoding = "UTF-8",
  verbose = getOption("transltr.verbose", TRUE),
  algorithm = algorithms(),
  interface = NULL
)

find_source_in_exprs(
  tokens = utils::getParseData(),
  path = "",
  algorithm = algorithms(),
  interface = NULL
)

find_source_exprs(path = "", encoding = "UTF-8")

is_source(x, interface = NULL)

Arguments

path

A non-empty and non-NA character string. A path to an R source script.

encoding

A non-empty and non-NA character string. The source character encoding. In almost all cases, this should be UTF-8. Other encodings are internally re-encoded to UTF-8 for portability.

verbose

A non-NA logical value. Should progress information be reported?

algorithm

A non-empty and non-NA character string equal to "sha1", or "utf8". The algorithm to use when hashing source information for identification purposes.

interface

A name, a call object, or a NULL. A reference to an alternative (custom) function used to translate text. If a call object is passed to interface, it must be to operator ::. Calls to method Translator$translate() are ignored and calls to interface are extracted instead. See Details below.

tokens

A data.frame returned by utils::getParseData(). It must always minimally contain columns line1, col1, line2, col2, and text.

x

Any R object.

Details

find_source_in_exprs() silently skips parsing errors. See find_source() for more information.

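is_source() checks if an object conceptually represents a source text. This can either be a call to method Translator$translate() or, if interface is not NULL, a call to the function referenced by interface.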

Calls to method Translator$translate() that include ... in their argument(s) are ignored. Such calls are part of the definition of a custom interface and should not be extracted.

Value

find_source_in_file() and find_source_in_exprs() return a list of Text objects. It may contain duplicated elements, depending on the extracted contents.

find_source_exprs() returns a subset of the output of utils::getParseData(). Only expr tokens are returned.

is_source() returns a logical value.

See Also

Text, find_source(), utils::getParseData()


Serialize Objects to Flat Strings

Description

Serialize R objects into textual sequences of unindented (flat) and identifiable sections. These are called FLAT (1.0) objects.

Usage

flat_serialize(x = list(), tag_sep = ": ", tag_empty = "")

flat_deserialize(string = "", tag_sep = ": ")

flat_tag(x = list(), tag_sep = ": ", tag_empty = "")

flat_format(x = list())

flat_example()

Arguments

x

A list. It can be empty.

tag_sep

A non-empty and non-NA character string. The separator to use when creating tags from names (recursively) extracted from x.

tag_empty

A non-NA character string. The value to use as a substitute for empty names. Positional indices are automatically appended to it to ensure tags are always unique.

string

A non-NA character string. It can be empty. Contents to deserialize.

Details

The Flat format (Flat List As Text, or FLAT) is a minimal textual data serialization format optimized for R list objects. Elements are converted to character strings and organized into unindented sections identified by a tag. Call flat_example() for a valid example.

flat_serialize() serializes x into a FLAT object.

flat_deserialize() is the inverse operation: it converts a FLAT object back into a list. The latter has the same shape as the original one, but

The convention is to serialize an empty list to an empty character string.

Internal mechanisms

flat_tag() and flat_format() are called internally by flat_serialize(). Aside from debugging purposes, they should not be called outside of the former.

flat_tag() creates tags from names extracted from x and formats them. Tags may not be unique, depending on x's structure and names.

flat_format() recursively formats the elements of x as part of the serialization process. It

Value

flat_serialize() returns a character string, possibly empty.

flat_deserialize() returns a named list, possibly empty. Its structure depends on the underlying tags.

flat_tag() returns a character vector, possibly empty.

flat_format() returns an unnamed list having the same shape as x. See Details.

flat_example() returns a character string (a serialized example), invisibly. It is used for its side-effect of printing an illustration of the format (with useful information).


Format Vectors

Description

Format atomic vectors, lists, and pairlists.

Usage

format_vector(
  x = vector(),
  label = NULL,
  level = 0L,
  indent = 1L,
  fill_names = FALSE,
  null = "<null>",
  empty = "<empty>",
  validate = TRUE
)

Arguments

x

A vector of any atomic mode, a list, or a pairlist. It can be empty and it can contain NA values.

label

A NULL, or a non-empty and non-NA character string. A top descriptive label for x. It is used to preserve, and output all names in recursive calls. The value passed to label is considered to be at level 0, and is not indented.

level

A non-NA integer value. The current depth, or current nesting level to use for indentation purposes.

indent

A non-NA integer value. The number of single space(s) to use for each level when indenting name/value pairs.

fill_names

A non-NA logical value. Should NULL and empty names be replaced by names created from the elements' underlying positions? Positions are relative to each level.

null

A non-empty and non-NA character string. The value to use to represent NULL and empty pairlists (they are conceptually the same thing).

empty

A non-empty and non-NA character string. The value to use to represent empty vectors, excluding NULL. See null above for the latter. The type of the underlying empty object is added to empty for convenience. See Examples below.

validate

A non-NA logical value. Should the arguments be validated before being used? This argument should be left as is.

Details

format_vector() is an alternative to utils::str() that exposes a much simpler generic formatting interface and yields terser outputs of name/value pairs. Indentation is used for nested values.

format_vector() does not attempt to cover all R objects like utils::str(). Instead, it (merely) focuses on efficiently handling the types used by transltr. It is the low-level workhorse function of format.Translator(), format.Text(), and format.Location().

Value

A character vector, possibly trimmed by str_trim().

See Also

str_trim()

Other utility functions: stops(), str_to(), vapply_1l()


Hashing

Description

Map an arbitrary character string to a shorter string of hexadecimal characters highly likely to be unique. It typically has a fixed width.

Arguments listed below are not validated for efficiency.

Usage

hash(lang = "", text = "", algorithm = "")

Arguments

lang

A non-empty and non-NA character string. The underlying language.

A language is usually a code (of two or three letters) for a native language name. While users retain full control over codes, it is best to use language codes stemming from well-known schemes such as IETF BCP 47, or ISO 639-1 to maximize portability and cross-compatibility.

text

A non-NA character string. It can be empty.

algorithm

A non-empty and non-NA character string equal to "sha1", or "utf8". The algorithm to use when hashing source information for identification purposes.

Details

Hashes generated by hash() uniquely identify the lang and text pair. Values passed to these arguments are concatenated with a colon character for hashing purposes.
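The concatenation described above can be sketched as follows. The order of the values (lang first, then text) and the exact arguments passed to digest::digest() are assumptions made for illustration. They may differ from what hash() does internally.

```r
lang <- "en"
text <- "Hello, world!"

# Concatenate the language and the text with a colon character.
key <- paste(lang, text, sep = ":")
key  ## "en:Hello, world!"

# Hash the raw string itself with SHA-1 (if package digest is available).
if (requireNamespace("digest", quietly = TRUE)) {
  sha1 <- digest::digest(key, algo = "sha1", serialize = FALSE)
  nchar(sha1)  ## 40
}
```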

Value

hash() returns a character string, or NULL if algorithm is not supported.

See Also

Translator, Text, normalize(), algorithms()


Assertions

Description

These functions are a functional implementation of defensive programming.

is_*() functions check whether their argument meets certain criteria.

assert_*() functions further throw an error message when at least one criterion is not met.

Arguments listed below are not explicitly validated for efficiency.

Usage

is_int(x, allow_empty = FALSE)

is_chr(x, allow_empty = FALSE)

is_lgl1(x)

is_int1(x)

is_chr1(x, allow_empty_string = FALSE)

is_list(x, allow_empty = FALSE)

is_between(x, min = -Inf, max = Inf)

is_named(x, allow_empty_names = FALSE, allow_na_names = FALSE)

is_match(x, choices = vector(), allow_partial = FALSE)

assert_int(
  x,
  allow_empty = FALSE,
  throw_error = TRUE,
  x_name = deparse(substitute(x))
)

assert_chr(
  x,
  allow_empty = FALSE,
  throw_error = TRUE,
  x_name = deparse(substitute(x))
)

assert_lgl1(x, throw_error = TRUE, x_name = deparse(substitute(x)))

assert_int1(x, throw_error = TRUE, x_name = deparse(substitute(x)))

assert_chr1(
  x,
  allow_empty_string = FALSE,
  throw_error = TRUE,
  x_name = deparse(substitute(x))
)

assert_list(
  x,
  allow_empty = FALSE,
  throw_error = TRUE,
  x_name = deparse(substitute(x))
)

assert_between(
  x,
  min = -Inf,
  max = Inf,
  throw_error = TRUE,
  x_name = deparse(substitute(x))
)

assert_named(
  x,
  allow_empty_names = FALSE,
  allow_na_names = FALSE,
  throw_error = TRUE,
  x_name = deparse(substitute(x))
)

assert_match(
  x,
  choices,
  allow_partial = FALSE,
  quote_values = FALSE,
  throw_error = TRUE,
  x_name = deparse(substitute(x))
)

assert_arg(x, quote_values = FALSE, throw_error = TRUE)

assert(x, ...)

## Default S3 method:
assert(x, ...)

Arguments

x

Any R object.

allow_empty

A non-NA logical value. Should vectors of length 0 be considered as valid values?

allow_empty_string

A non-NA logical value. Should empty character strings be considered as valid values?

min

A non-NA numeric lower bound. It can be infinite.

max

A non-NA numeric upper bound. It can be infinite.

allow_empty_names

A non-NA logical value. Should empty character strings be considered as valid names? This is different from having no names at all.

allow_na_names

A non-NA logical value. Should NA values be considered as valid names?

choices

A non-empty vector of valid candidates.

allow_partial

A non-NA logical value. Should x be partially matched?

throw_error

A non-NA logical value. Should an error be thrown? If so, stops() is called. Otherwise, error messages are returned as a character vector (possibly empty).

x_name

A non-empty and non-NA character string. The name of x.

quote_values

A non-NA logical value. Passed as is to str_to().

Details

Guard clauses tend to be verbose and recycled many times within a project. This makes it hard to keep error messages consistent over time. assert_*() functions encapsulate usual guard clauses into simple semantic functions. This reduces code repetition and the number of required unit tests. See Examples below.

By convention, NA values are always disallowed.

assert_arg() is a partial refactoring of base::match.arg(). It relies on assert_match() internally and does not have an equivalent is_arg() function. It must be called within another function.

assert() is an S3 generic function that covers specific data structures. Classes (and underlying objects) that do not have an assert() method are considered to be valid by default.

Value

is_int(), is_chr(), is_lgl1(), is_int1(), is_chr1(), is_list(), is_between(), is_named(), and is_match() return a logical value.

assert(), assert_int(), assert_chr(), assert_lgl1(), assert_int1(), assert_chr1(), assert_list(), assert_between(), assert_named(), assert_match(), and assert_arg() return an empty character vector if x meets the underlying criteria and throw an error otherwise. If throw_error is FALSE, the error message is returned as a character vector. Unless otherwise stated, the latter is of length 1 (a character string).

assert.default() always returns an empty character vector.
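The pattern described in this section (a predicate paired with an assertion that either throws an error or returns the error message) can be sketched with base R. Functions my_is_chr1() and my_assert_chr1() are hypothetical and simplified. They are not the package's implementations, and the error message is an invented placeholder.

```r
# Hypothetical predicate: is x a single non-NA character string?
my_is_chr1 <- function(x, allow_empty_string = FALSE) {
  is.character(x) &&
    length(x) == 1L &&
    !is.na(x) &&
    (allow_empty_string || nzchar(x))
}

# Hypothetical assertion wrapping the predicate above.
my_assert_chr1 <- function(
  x,
  allow_empty_string = FALSE,
  throw_error = TRUE,
  x_name = deparse(substitute(x)))
{
  if (my_is_chr1(x, allow_empty_string)) {
    return(character())  # valid: empty character vector
  }

  msg <- sprintf("'%s' must be a non-NA character string.", x_name)
  if (throw_error) stop(msg, call. = FALSE)
  msg
}

lang <- NA_character_
my_assert_chr1("en")                         ## character(0)
my_assert_chr1(lang, throw_error = FALSE)    ## "'lang' must be a non-NA character string."
```

A function can then validate its arguments with a single line such as my_assert_chr1(lang) instead of repeating the same guard clause.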


Get or Set Language

Description

Get or set the current and source languages.

They are registered as environment variables named TRANSLTR_LANGUAGE, and TRANSLTR_SOURCE_LANGUAGE.

Usage

language_set(lang = "en")

language_get()

language_source_set(lang = "en")

language_source_get()

Arguments

lang

A non-empty and non-NA character string. The underlying language.

A language is usually a code (of two or three letters) for a native language name. While users retain full control over codes, it is best to use language codes stemming from well-known schemes such as IETF BCP 47, or ISO 639-1 to maximize portability and cross-compatibility.

Details

The language and the source language can always be temporarily changed. See argument lang of method Translator$translate() for more information.

The underlying locale is left as is. To change an R session's locale, use Sys.setlocale() or Sys.setLanguage() instead. See below for more information.

Value

language_set(), and language_source_set() return NULL, invisibly. They are used for their side-effect of setting environment variables TRANSLTR_LANGUAGE and TRANSLTR_SOURCE_LANGUAGE, respectively.

language_get() returns a character string. It is the current value of environment variable TRANSLTR_LANGUAGE. It is empty if the latter is unset.

language_source_get() returns a character string. It is the current value of environment variable TRANSLTR_SOURCE_LANGUAGE. It returns "en" if the latter is unset.

Locales versus languages

A locale is a set of multiple low-level settings that relate to the user's language and region. The language itself is just one parameter among many others.

Modifying a locale on-the-fly can be considered risky in some situations. It may not be the optimal solution for merely changing textual representations of a program or an application at runtime, as it may introduce unintended changes and induce subtle bugs that are harder to fix.

Moreover, it makes sense for some applications and/or programs such as Shiny applications to decouple the front-end's current language (what users see) from the back-end's locale (what developers see). A UI may be displayed in a certain language while keeping logs and R internal messages, warnings, and errors as is.

Consequently, the language setting of transltr is purposely kept separate from the underlying locale and removes the complexity of having to support many of them. Users can always change both the locale and the language parameter of the package. See Examples.

Note

Environment variables are used because they can be shared among different processes. This matters when using parallel and/or concurrent R sessions. They can further be shared among direct and transitive dependencies (other packages that rely on transltr).
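The underlying mechanism can be reproduced with base R. This is only a sketch of what the getters and setters documented above are described as doing. It manipulates the same environment variables directly and assumes nothing else about their internals.

```r
# Start from a clean state.
Sys.unsetenv(c("TRANSLTR_LANGUAGE", "TRANSLTR_SOURCE_LANGUAGE"))

# Setting a language amounts to setting an environment variable.
Sys.setenv(TRANSLTR_LANGUAGE = "fr")
current <- Sys.getenv("TRANSLTR_LANGUAGE")
current  ## "fr"

# An unset source language falls back to a default value.
source_lang <- Sys.getenv("TRANSLTR_SOURCE_LANGUAGE", unset = "en")
source_lang  ## "en"

# Clean up. An unset current language yields an empty string.
Sys.unsetenv("TRANSLTR_LANGUAGE")
after_reset <- Sys.getenv("TRANSLTR_LANGUAGE", unset = "")
after_reset  ## ""
```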

Examples

# Change the language parameters (globally).
language_source_set("en")
language_set("fr")

language_source_get()  ## Outputs "en"
language_get()         ## Outputs "fr"

# Change both the language parameter and the locale.
# Note that while users control how languages are named
# for language_set(), they do not for Sys.setLanguage().
language_set("fr")
Sys.setLanguage("fr-CA")

# Reset settings.
language_source_set(NULL)
language_set(NULL)

# Source language has a default value.
language_source_get()  ## Outputs "en"


Source Locations

Description

Structure and manipulate source locations. Class Location is a lighter alternative to srcfile() and other related functionalities.

Usage

location(path = tempfile(), line1 = 1L, col1 = 1L, line2 = 1L, col2 = 1L)

is_location(x)

## S3 method for class 'Location'
format(x, ...)

## S3 method for class 'Location'
print(x, ...)

## S3 method for class 'Location'
c(...)

merge_locations(...)

Arguments

path

A non-empty and non-NA character string. The origin of the ranges.

line1, col1

A non-empty integer vector of non-NA values. The (inclusive) starting point(s) of what is being referenced.

line2, col2

A non-empty integer vector of non-NA values. The (inclusive) end(s) of what is being referenced.

x

Any R object.

...

Usage depends on the underlying function.

Details

A Location is a set of one or more line/column ranges referencing contents (like text or source code) within a common origin identified by an underlying path. The latter is generic and can be anything: a file on disk, on a network, a pointer, a binding, etc. What matters is the underlying context.

Location objects may refer to multiple distinct ranges for the same origin. This is why arguments line1, col1, line2 and col2 accept integer vectors (and not only scalar values).

Combining Location Objects

c() can only combine Location objects having the same path. In that case, the underlying ranges are combined into a set of non-duplicated range(s).

merge_locations() is a generalized version of c() that handles any number of Location objects having possibly different paths. It can be viewed as a vectorized version of c().

Value

location(), and c() return a named list of length 5 and of S3 class Location containing the values of path, line1, col1, line2, and col2.

is_location() returns a logical value.

format() returns a character vector.

print() returns argument x invisibly.

merge_locations() returns a list of (combined) Location objects.

Examples

# Create Location objects.
loc1 <- location("file-a", 1L, 2L, 3L, 4L)
loc2 <- location("file-a", 5L, 6L, 7L, 8L)
loc3 <- location("file-c", c(9L, 10L), c(11L, 12L), c(13L, 14L), c(15L, 16L))

is_location(loc1)  ## TRUE

print(loc1)
print(loc2)
print(loc3)

# Combine Location objects.
# c() throws an error if they do not have the same path.
c(loc1, loc2)

# Location objects with different paths can be merged.
# This groups Location objects according to their paths
# and calls c() on each group. It returns a list.
merge_locations(loc1, loc2, loc3)

# The path of a Location object can be whatever fits the context.
# Below is an example that references text in a character vector
# bound to variable x in the global environment.
x <- "This is a string and it is held in memory for some purpose."
location("<environment: R_GlobalEnv: x>", 1L, 11L, 1L, 16L)


Normalize Text

Description

Construct a standardized string from values passed to ...

Usage

normalize(...)

Arguments

...

Any number of vectors containing atomic elements. Each vector is normalized as a paragraph.

  • Elements are coerced to character values.

  • NA values and empty strings are discarded.

  • Multi-line strings are supported and encouraged. Blank lines (two or more newline characters) are interpreted as paragraph separators.

Details

Input text can be written in a variety of ways using single-line and multi-line strings. Values passed to ... are normalized (to ensure their consistency) and collapsed to a single character string using the standard paragraph separator. The latter is defined as two newline characters ("\n\n").

  1. NA values and empty strings are discarded before reducing ... to a character string.

  2. Whitespace characters (tabs, newlines, and repeated spaces) are replaced by a single space. Paragraph separators are preserved.

  3. Leading or trailing whitespaces are stripped.
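The steps above can be approximated with base R. Function normalize_sketch() is hypothetical and may differ from normalize() in edge cases. In particular, joining the elements of a single vector with a space is an assumption.

```r
# Hypothetical approximation of the normalization steps.
normalize_sketch <- function(...) {
  paragraphs <- lapply(list(...), function(x) {
    # Step 1: coerce to character, discard NA values and empty strings.
    x <- as.character(x)
    x <- x[!is.na(x) & nzchar(x)]
    if (!length(x)) return(character())

    # Assumption: elements of a vector are joined by a single space.
    x <- paste(x, collapse = " ")

    # Blank lines (two or more newlines) separate paragraphs.
    x <- strsplit(x, "\\s*\\n\\s*\\n\\s*", perl = TRUE)[[1L]]

    # Step 2: replace runs of whitespace characters by a single space.
    x <- gsub("\\s+", " ", x, perl = TRUE)

    # Step 3: strip leading and trailing whitespace.
    trimws(x)
  })

  paragraphs <- unlist(paragraphs)
  paste(paragraphs[nzchar(paragraphs)], collapse = "\n\n")
}

out <- normalize_sketch(
  "Hello,\n   world!",
  NA,
  "",
  "Second\tparagraph.\n\nThird one.")

out  ## "Hello, world!\n\nSecond paragraph.\n\nThird one."
```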

Value

A character string, possibly empty.


Source Ranges

Description

Create, parse, and validate source ranges.

Usage

range_format(x = location())

range_parse(ranges = character())

range_is_parseable(ranges = character())

Arguments

x

A Location object.

ranges

A character vector of non-NA and non-empty values. The ranges to extract pairs of indices (line, column) from.

Details

Ranges are Ln <int>, Col <int> @ Ln <int>, Col <int> strings created on-the-fly from Location objects for outputting purposes.
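The format above can be produced and parsed with base R. Functions range_format_sketch() and range_parse_sketch() are hypothetical. They work on plain integer vectors instead of Location objects, and the real functions may format values differently (padding, for example).

```r
# Hypothetical helper creating range strings from integer vectors.
range_format_sketch <- function(line1, col1, line2, col2) {
  sprintf("Ln %d, Col %d @ Ln %d, Col %d", line1, col1, line2, col2)
}

# Hypothetical helper extracting the 4 indices of each range string.
range_parse_sketch <- function(ranges = character()) {
  pattern <- "^Ln ([0-9]+), Col ([0-9]+) @ Ln ([0-9]+), Col ([0-9]+)$"
  matches <- regmatches(ranges, regexec(pattern, ranges))

  # Drop the full match and keep the 4 captured integers.
  # Strings that do not match yield empty integer vectors.
  lapply(matches, function(m) as.integer(m[-1L]))
}

ranges <- range_format_sketch(c(1L, 5L), c(2L, 6L), c(3L, 7L), c(4L, 8L))
ranges[[1L]]                ## "Ln 1, Col 2 @ Ln 3, Col 4"
range_parse_sketch(ranges)  ## A list of two integer vectors: 1:4 and 5:8.
```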

Value

range_format() returns a character vector. It assumes that x is valid.

range_parse() returns a list having the same length as ranges. Each element is an integer vector containing 4 non-NA values (unless the underlying range is invalid).

range_is_parseable() returns a logical vector having the same length as ranges.

See Also

Location, ExportedLocation,


Serialize Objects

Description

Convert Translator objects, Text objects, and Location objects to a YAML object, or vice-versa.

Convert translations contained by a Translator object to a custom textual representation (a FLAT object), or vice-versa.

Usage

serialize(x, ...)

serialize_translations(tr = translator(), lang = "")

deserialize(string = "")

deserialize_translations(string = "", tr = NULL)

export_translations(tr = translator(), lang = "")

export(x, ...)

## S3 method for class 'Translator'
export(x, ...)

## S3 method for class 'Text'
export(x, id = uuid(), set_translations = FALSE, ...)

## S3 method for class 'Location'
export(x, id = uuid(), ...)

## S3 method for class 'ExportedTranslator'
assert(x, throw_error = TRUE, ...)

## S3 method for class 'ExportedText'
assert(x, throw_error = TRUE, ...)

## S3 method for class 'ExportedLocation'
assert(x, throw_error = TRUE, ...)

## S3 method for class 'ExportedTranslations'
assert(x, throw_error = TRUE, ...)

import(x, ...)

## S3 method for class 'ExportedTranslator'
import(x, ...)

## S3 method for class 'ExportedText'
import(x, ...)

## S3 method for class 'ExportedLocation'
import(x, ...)

## S3 method for class 'ExportedTranslations'
import(x, tr = NULL, ...)

## Default S3 method:
import(x, ...)

format_errors(errors = character(), id = uuid(), throw_error = TRUE)

Arguments

x

Any R object.

...

Further arguments passed to, or from other methods.

tr

A Translator object.

This argument is NULL by default for deserialize_translations() and import.ExportedTranslations(). If a Translator object is passed to these functions, they will import translations and further register them (as long as they correspond to an existing source text).

lang

A non-empty and non-NA character string. The underlying language.

A language is usually a code (of two or three letters) for a native language name. While users retain full control over codes, it is best to use language codes stemming from well-known schemes such as IETF BCP 47, or ISO 639-1 to maximize portability and cross-compatibility.

string

A non-empty and non-NA character string. Contents to deserialize.

id

A non-empty and non-NA character string. A unique identifier for the underlying object. It is used for validation purposes.

set_translations

A non-NA logical value. Should translations be included in the resulting ExportedText object? If FALSE, field Translations is set equal to NULL.

throw_error

A non-NA logical value. Should an error be thrown? If so, stops() is called. Otherwise, error messages are returned as a character vector (possibly empty).

errors

A non-empty character vector of non-NA values. Error message(s) describing why object(s) are invalid.

Details

The information contained within a Translator object is split by default. Unless set_translations is TRUE, translations are serialized independently from other fields. This is useful when creating Translator files and translations files.

While serialize() and serialize_translations() are distinct, they share a common design and perform the same thing, at least conceptually. The same is true for deserialize() and deserialize_translations(). These 4 functions are those that should be used in almost all circumstances.

Serialization

The data serialization process performed by serialize() and serialize_translations() is internally broken down into 2 steps: objects are first exported before being serialized.

export() and export_translations() are preserialization mechanisms that convert objects into transient objects that ease the conversion process. These transient objects are never returned to the user: serialize() and serialize_translations() immediately transform them into character strings.

serialize() returns a YAML object.

serialize_translations() returns a FLAT object.

Deserialization

The data deserialization process performed by deserialize() and deserialize_translations() is internally broken down into 3 steps: objects are first deserialized, then validated and finally, imported.

deserialize() and deserialize_translations() are raw deserializer mechanisms: string is converted into an R named list that is presumed to be an exported object. deserialize() relies on YAML tags to infer the class of each object.

The contents of the transient objects are thoroughly checked with an assert() method (based on the underlying presumed class). Valid objects are imported back into an appropriate R object with import().

Custom fields and comments added by users to serialized objects are ignored.

Formatting errors

assert() methods accumulate error messages before returning, or throwing them. format_errors() is a helper function that eases this process. It exists to avoid repeating code in each method. There is no reason to call it outside of assert() methods.
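The accumulate-then-report pattern described above can be sketched in base R. This is a minimal illustration and not the package's implementation: check_exported() is a hypothetical function, and the two fields it checks are only assumed to be required.

```r
# Hypothetical validator: collect every problem first, report once at the end.
check_exported <- function(x = list(), throw_error = TRUE) {
  errors <- character()

  id <- x[["Identifier"]]
  if (!is.character(id) || length(id) != 1L || is.na(id) || !nzchar(id)) {
    errors <- c(errors, "'Identifier' must be a non-empty character string.")
  }

  lang <- x[["Language Code"]]
  if (!is.character(lang) || length(lang) != 1L || is.na(lang) || !nzchar(lang)) {
    errors <- c(errors, "'Language Code' must be a non-empty character string.")
  }

  # Either throw all accumulated messages at once, or return them.
  if (throw_error && length(errors)) {
    stop(paste(errors, collapse = "\n"), call. = FALSE)
  }

  errors
}

# Returns a character vector (possibly empty) when throw_error is FALSE.
check_exported(list(Identifier = "a", `Language Code` = "fr"), throw_error = FALSE)
check_exported(list(), throw_error = FALSE)
```

Accumulating messages lets a user fix all problems of a file in one pass instead of discovering them one error at a time.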

Value

See other sections for further information.

serialize(), and serialize_translations() return a character string.

export() returns a named list of S3 class ExportedTranslator, ExportedText, or ExportedLocation, depending on the class of x.

export_translations() returns an ExportedTranslations object.

deserialize() and import() return an R object whose class depends on the underlying exported object (see section Deserialization).

deserialize_translations() and import.ExportedTranslations() return an ExportedTranslations object. They further register imported translations if a Translator object is passed to tr.

import.default() is used for its side-effect of throwing an error for unsupported objects.

assert.ExportedTranslator(), assert.ExportedText(), assert.ExportedLocation(), and assert.ExportedTranslations() return a character vector, possibly empty. If throw_error is TRUE, an error is thrown if an object is invalid.

format_errors() returns a character vector, and outputs its contents as an error if throw_error is TRUE.

Exported Objects

An exported object is a named list of S3 class ExportedTranslator, ExportedText, ExportedLocation, or ExportedTranslations. It always has a tag attribute whose value is equal to the super-class of x.

There are four main differences between an object and its exported counterpart.

  1. Field names are slightly more verbose.

  2. Source text is treated independently from translations.

  3. Unset fields are set equal to NULL (a ~ in YAML).

  4. Each object has an Identifier used to locate errors.

The correspondence between objects is self-explanatory.

You may also explore provided examples below.

The ExportedTranslations Class

ExportedTranslations objects are created from a Translator object with export_translations(). Their purpose is to restructure translations by language. They are different from other exported objects because there is no corresponding Translations class.

An ExportedTranslations object is a named list of S3 class ExportedTranslations containing the following elements.

Identifier

The unique identifier of argument tr. See Translator$id for more information.

Language Code

The value of argument lang.

Language

The translation's language. See Translator$native_languages for more information.

Source Language

The source text's language. See Translator$source_langs for more information.

Translations

A named list containing further named lists. Each sublist contains two values:

Source Text

A non-empty and non-NA character string.

Translation

A non-empty and non-NA character string.

See Text$translations for more information.

Unavailable translations are automatically replaced by a placeholder that depends on whether they are exported or imported.
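The structure described above can be pictured with a hand-built list in base R. This is only a sketch: all values are illustrative, the names of the sublists ("hash-1" and "hash-2") are hypothetical keys, and the tag attribute mentioned earlier is omitted.

```r
# A hand-built list mirroring the documented elements of an
# ExportedTranslations object. Values are illustrative only.
exported <- structure(
  list(
    Identifier        = "test-translator",
    `Language Code`   = "fr",
    Language          = "Français",
    `Source Language` = "en",
    Translations      = list(
      `hash-1` = list(
        `Source Text` = "Hello, world!",
        Translation   = "Bonjour, monde!"),
      `hash-2` = list(
        `Source Text` = "Farewell, world!",
        Translation   = "Au revoir, monde!"))),
  class = "ExportedTranslations")

# Collect all translations as a named character vector.
translations <- vapply(exported$Translations, `[[`, character(1L), "Translation")
```

Grouping translations by language in this way keeps each translations file focused on a single language.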

Note

Dividing the serialization and deserialization processes into multiple steps helps keep the underlying functions short, and easier to test.

See Also

Official YAML 1.1 specification, yaml::as.yaml(), yaml::yaml.load(), flat_serialize(), flat_deserialize(), translator_read(), translator_write(), translations_read(), translations_write()


Throw Errors

Description

stops() is equivalent to stop(..., call. = FALSE). It removes calls from error messages by default. These are rarely useful and confuse users more often than they help them.

stopf() is equivalent to stops(sprintf(fmt, ...)). It wraps base::sprintf() and stops() and is used to construct flexible error messages.

Usage

stops(...)

stopf(fmt = "", ...)

Arguments

...

Further arguments respectively passed to base::stop() and base::sprintf() by stops() and stopf().

fmt

A character string (a character vector of length 1) passed as is to base::sprintf().

Value

Nothing. These functions are used for their side-effect of raising an error.
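The equivalences stated in the Description can be written out in base R. The names stops_sketch() and stopf_sketch() are hypothetical and only mirror the documented behavior. They are not the package's source code.

```r
# stop() without the call, as described for stops().
stops_sketch <- function(...) {
  stop(..., call. = FALSE)
}

# sprintf() followed by stops(), as described for stopf().
stopf_sketch <- function(fmt = "", ...) {
  stops_sketch(sprintf(fmt, ...))
}

# A hypothetical caller that raises a formatted error.
f <- function() {
  stopf_sketch("Expected %d values, got %d.", 2L, 3L)
}

# Capture the condition to inspect its message and its (absent) call.
cnd <- tryCatch(f(), error = function(e) e)
conditionMessage(cnd)
is.null(conditionCall(cnd))
```

Because the call is dropped, the error message shown to users contains only the formatted text.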

See Also

Other utility functions: format_vector(), str_to(), vapply_1l()


Character String Utilities

Description

str_to() converts an R object to a character string. It is a slightly more flexible alternative to base::toString().

str_trim() wraps base::strtrim() and further adds a ... suffix to each trimmed element.

str_wrap() wraps base::strwrap() and ensures a character string is returned.

Usage

str_to(x, ...)

## Default S3 method:
str_to(x, quote_values = FALSE, last_sep = ", or ", ...)

str_trim(x = character(), width = 80L)

str_wrap(x = character(), width = 80L)

Arguments

x

Any R object for str_to(). A character vector otherwise.

...

Further arguments passed to, or from other methods.

quote_values

A non-NA logical value. Should elements of x be quoted?

last_sep

A non-empty and non-NA character string separating the last and penultimate elements.

width

A non-NA integer value. The target width for individual elements of x. str_trim() takes 3 more characters into account for the suffix it inserts (...).

Details

str_to() concatenates all elements with ", ", except for the last one. See argument last_sep.

str_wrap() preserves existing paragraph separators ("\n\n").
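The concatenation rule of str_to() can be approximated in base R as follows. This is a rough sketch under stated assumptions: to_string() is a hypothetical name, and it ignores argument quote_values entirely.

```r
# Join all elements with ", ", except for the last one,
# which is joined with last_sep.
to_string <- function(x, last_sep = ", or ") {
  x <- as.character(x)
  n <- length(x)

  # Nothing to separate when there are fewer than two elements.
  if (n < 2L) {
    return(paste0(x, collapse = ""))
  }

  paste0(paste0(x[-n], collapse = ", "), last_sep, x[[n]])
}

to_string(c("sha1", "utf8"))
to_string(1:3, last_sep = ", and ")
```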

Value

str_to() and str_wrap() return a character string.

str_trim() returns a character vector having the same length as x.

See Also

Other utility functions: format_vector(), stops(), vapply_1l()


Source Text

Description

Structure source text and its translations.

Usage

text(..., source_lang = language_source_get(), algorithm = algorithms())

is_text(x)

## S3 method for class 'Text'
format(x, ...)

## S3 method for class 'Text'
print(x, ...)

## S3 method for class 'Text'
c(...)

merge_texts(..., algorithm = algorithms())

as_text(x, ...)

## S3 method for class 'call'
as_text(x, loc = location(), algorithm = algorithms(), ...)

Arguments

...

Usage depends on the underlying function.

source_lang

A non-empty and non-NA character string. The language of the source text.

A language is usually a code (of two or three letters) for a native language name. While users retain full control over codes, it is best to use language codes stemming from well-known schemes such as IETF BCP 47, or ISO 639-1. Doing so maximizes portability and cross-compatibility between packages.

algorithm

A non-empty and non-NA character string equal to "sha1", or "utf8". The algorithm to use when hashing source information for identification purposes.

x

Any R object.

loc

A Location object.

Details

A Text object is a piece of source text that is extracted from R source scripts.

The Text class structures this information and exposes a set of methods to manipulate it.

Combining Text Objects

c() can only combine Text objects having the same hash. This is equivalent to having the same algorithm, source_lang, and source_text. In that case, the underlying translations and Location objects are combined and a new object is returned. It throws an error if all Text objects are empty (they have no set source_lang).

merge_texts() is a generalized version of c() that handles any number of Text objects having possibly different hashes. It can be viewed as a vectorized version of c(). It silently ignores and drops all empty Text objects.
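The grouping logic behind merge_texts() can be illustrated with plain lists in base R. This is a conceptual sketch only: the records below are hypothetical stand-ins for Text objects, and combine() is a hypothetical stand-in for c().

```r
# Plain lists standing in for Text objects (hypothetical structure).
records <- list(
  list(hash = "h1", paths = "a"),
  list(hash = "h1", paths = c("a", "b")),
  list(hash = "h2", paths = "c"))

# Combine records sharing the same hash without duplicating paths.
combine <- function(group) {
  list(
    hash  = group[[1L]]$hash,
    paths = unique(unlist(lapply(group, `[[`, "paths"))))
}

# Group records by hash, then combine each group.
hashes <- vapply(records, `[[`, character(1L), "hash")
merged <- unname(lapply(split(records, hashes), combine))

length(merged)
```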

Coercion

as_text() is an S3 generic function that attempts to coerce its argument into a suitable Text object. as_text.call() is the method used by find_source() to coerce a call object to a Text object. While it can be used, it should be avoided most of the time. Users may extend it by defining their own methods.

Value

text(), c(), and as_text() return an R6 object of class Text.

is_text() returns a logical value.

format() returns a character vector.

print() returns argument x invisibly.

merge_texts() returns a list of (combined) Text objects. It can be empty if all underlying Text objects are empty.

Active bindings

hash

A non-empty and non-NA character string. A reproducible hash generated from source_lang and source_text, and by using the algorithm specified by algorithm. It is used as a unique identifier for the underlying Text object.

This is a read-only field. It is automatically updated whenever fields source_lang and/or algorithm are updated.

algorithm

A non-empty and non-NA character string equal to "sha1", or "utf8". The algorithm to use when hashing source information for identification purposes.

source_lang

A non-empty and non-NA character string. The language of the source text.

A language is usually a code (of two or three letters) for a native language name. While users retain full control over codes, it is best to use language codes stemming from well-known schemes such as IETF BCP 47, or ISO 639-1. Doing so maximizes portability and cross-compatibility between packages.

source_text

A non-empty and non-NA character string. The source text. This is a read-only field.

languages

A character vector. Registered language codes. This is a read-only field. Use methods below to update it.

translations

A named character vector. Registered translations of source_text, including the latter. Names correspond to languages. This is a read-only field. Use methods below to update it.

locations

A list of Location objects giving the location(s) of source_text in the underlying project. It can be empty. This is a read-only field. Use methods below to update it.

Methods

Public methods


Method new()

Create a Text object.

Usage
Text$new(algorithm = algorithms())
Arguments
algorithm

A non-empty and non-NA character string equal to "sha1", or "utf8". The algorithm to use when hashing source information for identification purposes.

Returns

An R6 object of class Text.

Examples
# Consider using text() instead.
txt <- Text$new()

Method get_translation()

Extract a translation, or the source text.

Usage
Text$get_translation(lang = "")
Arguments
lang

A non-empty and non-NA character string. The underlying language.

A language is usually a code (of two or three letters) for a native language name. While users retain full control over codes, it is best to use language codes stemming from well-known schemes such as IETF BCP 47, or ISO 639-1 to maximize portability and cross-compatibility.

Returns

A character string. NULL is returned if the requested translation is not available.

Examples
txt <- Text$new()
txt$set_translation("en", "Hello, world!")

txt$get_translation("en")  ## Outputs "Hello, world!"
txt$get_translation("fr")  ## Outputs NULL

Method set_translation()

Register a translation, or the source text.

Usage
Text$set_translation(lang = "", text = "")
Arguments
lang

A non-empty and non-NA character string. The underlying language.

A language is usually a code (of two or three letters) for a native language name. While users retain full control over codes, it is best to use language codes stemming from well-known schemes such as IETF BCP 47, or ISO 639-1 to maximize portability and cross-compatibility.

text

A non-empty and non-NA character string. A translation, or the source text.

Details

This method is also used to register source_lang and source_text before setting them as such. See Examples below.

Returns

A NULL, invisibly.

Examples
# Register a pair of source_lang and source_text.
txt <- Text$new()
txt$set_translation("en", "Hello, world!")
txt$source_lang <- "en"

Method set_translations()

Register one or more translations, and/or the source text.

Usage
Text$set_translations(...)
Arguments
...

Any number of named, non-empty, and non-NA character strings.

Details

This method can be viewed as a vectorized version of method set_translation().

Returns

A NULL, invisibly.

Examples
txt <- Text$new()
txt$set_translations(en = "Hello, world!", fr = "Bonjour, monde!")

Method set_locations()

Register one or more locations.

Usage
Text$set_locations(...)
Arguments
...

Any number of Location objects.

Details

This method calls merge_locations() to merge all values passed to ... together with previously registered Location objects. The underlying registered paths and/or ranges won't be duplicated.

Returns

A NULL, invisibly.

Examples
txt <- Text$new()
txt$set_locations(
  location("a", 1L, 2L, 3L, 4L),
  location("a", 1L, 2L, 3L, 4L),
  location("b", 5L, 6L, 7L, 8L))

Method rm_translation()

Remove a registered translation.

Usage
Text$rm_translation(lang = "")
Arguments
lang

A non-empty and non-NA character string identifying a translation to be removed.

Details

You cannot remove lang when it is registered as the current source_lang. You must update source_lang before doing so.

Returns

A NULL, invisibly.

Examples
txt <- Text$new()
txt$set_translations(en = "Hello, world!", fr = "Bonjour, monde!")
txt$source_lang <- "en"

# Remove source_lang and source_text.
txt$source_lang <- "fr"
txt$rm_translation("en")

Method rm_location()

Remove a registered location.

Usage
Text$rm_location(path = "")
Arguments
path

A non-empty and non-NA character string identifying a Location object to be removed.

Returns

A NULL, invisibly.

Examples
txt <- Text$new()
txt$set_locations(
  location("a", 1L, 2L, 3L, 4L),
  location("b", 5L, 6L, 7L, 8L))

txt$rm_location("a")

Examples

# Set source language.
language_source_set("en")

# Create Text objects.
txt1 <- text(
  location("a", 1L, 2L, 3L, 4L),
  location("a", 1L, 2L, 3L, 4L),
  location("b", 5L, 6L, 7L, 8L),
  location("c", c(9L, 10L), c(11L, 12L), c(13L, 14L), c(15L, 16L)),
  en = "Hello, world!",
  fr = "Bonjour, monde!",
  es = "¡Hola, mundo!")

txt2 <- text(
  location("a", 1L, 2L, 3L, 4L),
  en = "Hello, world!",
  fr = "Bonjour, monde!",
  es = "¡Hola, mundo!")

txt3 <- text(
  source_lang = "fr2",
  location("a", 5L, 6L, 7L, 8L),
  en  = "Hello, world!",
  fr2 = "Bonjour le monde!",
  es  = "¡Hola, mundo!")

is_text(txt1)

# Text objects have a specific format.
# print() calls format() internally, as expected.
print(txt1)
print(txt2)
print(txt3)

# Combine Text objects.
# c() throws an error if they do not have the same
# hash (same source_text, source_lang, and algorithm).
c(txt1, txt2)

# Text objects with different hashes can be merged.
# This groups Text objects according to their hashes
# and calls c() on each group. It returns a list.
merge_texts(txt1, txt2, txt3)

# Objects can be coerced to a Text object with as_text(). Below is an
# example for call objects. This is for illustration purposes only,
# and the latter should not be used. This method is used internally by
# find_source().
cl  <- str2lang("translate('Hello, world!')")
loc <- location("example in class-text", 2L, 32L, 2L, 68L)
as_text(cl, loc)


## ------------------------------------------------
## Method `Text$new`
## ------------------------------------------------

# Consider using text() instead.
txt <- Text$new()

## ------------------------------------------------
## Method `Text$get_translation`
## ------------------------------------------------

txt <- Text$new()
txt$set_translation("en", "Hello, world!")

txt$get_translation("en")  ## Outputs "Hello, world!"
txt$get_translation("fr")  ## Outputs NULL

## ------------------------------------------------
## Method `Text$set_translation`
## ------------------------------------------------

# Register a pair of source_lang and source_text.
txt <- Text$new()
txt$set_translation("en", "Hello, world!")
txt$source_lang <- "en"

## ------------------------------------------------
## Method `Text$set_translations`
## ------------------------------------------------

txt <- Text$new()
txt$set_translations(en = "Hello, world!", fr = "Bonjour, monde!")

## ------------------------------------------------
## Method `Text$set_locations`
## ------------------------------------------------

txt <- Text$new()
txt$set_locations(
  location("a", 1L, 2L, 3L, 4L),
  location("a", 1L, 2L, 3L, 4L),
  location("b", 5L, 6L, 7L, 8L))

## ------------------------------------------------
## Method `Text$rm_translation`
## ------------------------------------------------

txt <- Text$new()
txt$set_translations(en = "Hello, world!", fr = "Bonjour, monde!")
txt$source_lang <- "en"

# Remove source_lang and source_text.
txt$source_lang <- "fr"
txt$rm_translation("en")

## ------------------------------------------------
## Method `Text$rm_location`
## ------------------------------------------------

txt <- Text$new()
txt$set_locations(
  location("a", 1L, 2L, 3L, 4L),
  location("b", 5L, 6L, 7L, 8L))

txt$rm_location("a")

Read and Write Text

Description

text_read() and text_write() respectively wrap base::readLines() and base::writeLines(). They further validate their arguments, normalize file paths and re-encode inputs to UTF-8 before reading and writing.

Usage

text_read(path = "", encoding = "UTF-8")

text_write(x = character(), path = "", encoding = "UTF-8")

Arguments

path

A non-empty and non-NA character string. A path to a file to read text from, or write text to.

encoding

A non-empty and non-NA character string. The source character encoding. In almost all cases, this should be UTF-8. Other encodings are internally re-encoded to UTF-8 for portability.

x

A character vector. Lines of text to write. Its current encoding is given by encoding.

Value

text_read() returns a character vector.

text_write() returns NULL, invisibly.
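A round trip through base::writeLines() and base::readLines() with re-encoding to UTF-8 may look like the sketch below. The names write_utf8() and read_utf8() are hypothetical. The actual functions perform further validation that is not reproduced here.

```r
# Re-encode input to UTF-8 and write it as raw bytes.
write_utf8 <- function(x = character(), path = "", encoding = "UTF-8") {
  x   <- iconv(x, from = encoding, to = "UTF-8")
  con <- file(path, open = "wb")
  on.exit(close(con))
  writeLines(x, con, useBytes = TRUE)
  invisible(NULL)
}

# Normalize the path, read lines, and re-encode them to UTF-8.
read_utf8 <- function(path = "", encoding = "UTF-8") {
  path <- normalizePath(path, mustWork = TRUE)
  iconv(readLines(path, warn = FALSE), from = encoding, to = "UTF-8")
}

path  <- tempfile(fileext = ".txt")
lines <- c("Hello, world!", "\u00a1Hola, mundo!")

write_utf8(lines, path)
identical(read_utf8(path), lines)
```

Writing bytes as is (useBytes = TRUE) avoids a second, locale-dependent conversion when the text is already UTF-8.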

See Also

readLines(), writeLines(), iconv()


Source Text and Translations

Description

Structure and manipulate the source text of a project and its translations.

Usage

translator(..., id = uuid(), algorithm = algorithms())

is_translator(x)

## S3 method for class 'Translator'
format(x, ...)

## S3 method for class 'Translator'
print(x, ...)

Arguments

...

Usage depends on the underlying function.

  • Any number of Text objects and/or named character strings for translator() (in no preferred order).

  • Further arguments passed to or from other methods for format(), and print().

id

A non-empty and non-NA character string. A globally unique identifier for the Translator object. Beware of collisions when using user-defined values.

algorithm

A non-empty and non-NA character string equal to "sha1", or "utf8". The algorithm to use when hashing source information for identification purposes.

x

Any R object.

Details

A Translator object encapsulates the source text of a project (or any other context) and all related translations. Under the hood, Translator objects are collections of Text objects. These do most of the work. They are treated as lower-level components and in typical situations, users rarely interact with them.

Translator objects can be saved and exported with translator_write(). They can be imported back into an R session with translator_read().

Value

translator() returns an R6 object of class Translator.

is_translator() returns a logical value.

format() returns a character vector.

print() returns argument x invisibly.

Active bindings

id

A non-empty and non-NA character string. A globally unique identifier for the underlying object. Beware of plausible collisions when using user-defined values.

algorithm

A non-empty and non-NA character string equal to "sha1", or "utf8". The algorithm to use when hashing source information for identification purposes.

hashes

A character vector of non-empty and non-NA values, or NULL. The set of all hashes exposed by registered Text objects. If there is none, hashes is NULL. This is a read-only field updated whenever field algorithm is updated.

source_texts

A character vector of non-empty and non-NA values, or NULL. The set of all registered source texts. If there is none, source_texts is NULL. This is a read-only field.

source_langs

A character vector of non-empty and non-NA values, or NULL. The set of all registered source languages. This is a read-only field.

  • If there is none, source_langs is NULL.

  • If there is one unique value, source_langs is an unnamed character string.

  • Otherwise, it is a named character vector.

languages

A character vector of non-empty and non-NA values, or NULL. The set of all registered languages (codes). If there is none, languages is NULL. This is a read-only field.

native_languages

A named character vector of non-empty and non-NA values, or NULL. A map (bijection) of languages (codes) to native language names. Names are codes and values are native languages. If there is none, native_languages is NULL.

While users retain full control over native_languages, it is best to use well-known schemes such as IETF BCP 47, or ISO 639-1. Doing so maximizes portability and cross-compatibility between packages.

Update this field with method $set_native_languages(). See below for more information.

Methods

Public methods


Method new()

Create a Translator object.

Usage
Translator$new(id = uuid(), algorithm = algorithms())
Arguments
id

A non-empty and non-NA character string. A globally unique identifier for the Translator object. Beware of collisions when using user-defined values.

algorithm

A non-empty and non-NA character string equal to "sha1", or "utf8". The algorithm to use when hashing source information for identification purposes.

Returns

An R6 object of class Translator.

Examples
# Consider using translator() instead.
tr <- Translator$new()

Method translate()

Translate source text.

Usage
Translator$translate(
  ...,
  lang = language_get(),
  source_lang = language_source_get()
)
Arguments
...

Any number of vectors containing atomic elements. Each vector is normalized as a paragraph.

  • Elements are coerced to character values.

  • NA values and empty strings are discarded.

  • Multi-line strings are supported and encouraged. Blank lines (two or more newline characters) are interpreted as paragraph separators.

lang

A non-empty and non-NA character string. The underlying language.

A language is usually a code (of two or three letters) for a native language name. While users retain full control over codes, it is best to use language codes stemming from well-known schemes such as IETF BCP 47, or ISO 639-1 to maximize portability and cross-compatibility.

source_lang

A non-empty and non-NA character string. The language of the source text. See argument lang for more information.

Details

See normalize() for further details on how ... is normalized.

Returns

A character string. If there is no corresponding translation, the value passed to method $set_default_value() is returned. NULL is returned by default.

Examples
tr <- Translator$new()
tr$set_text(en = "Hello, world!", fr = "Bonjour, monde!")
tr$translate("Hello, world!", lang = "en")  ## Outputs "Hello, world!"
tr$translate("Hello, world!", lang = "fr")  ## Outputs "Bonjour, monde!"

Method get_translation()

Extract a translation or a source text.

Usage
Translator$get_translation(hash = "", lang = "")
Arguments
hash

A non-empty and non-NA character string. The unique identifier of the requested source text.

lang

A non-empty and non-NA character string. The underlying language.

A language is usually a code (of two or three letters) for a native language name. While users retain full control over codes, it is best to use language codes stemming from well-known schemes such as IETF BCP 47, or ISO 639-1 to maximize portability and cross-compatibility.

Returns

A character string. If there is no corresponding translation, the value passed to method $set_default_value() is returned. NULL is returned by default.

Examples
tr <- Translator$new()
tr$set_text(en = "Hello, world!")

# Consider using translate() instead.
tr$get_translation("256e0d7", "en")  ## Outputs "Hello, world!"

Method get_text()

Extract a source text and its translations.

Usage
Translator$get_text(hash = "")
Arguments
hash

A non-empty and non-NA character string. The unique identifier of the requested source text.

Returns

A Text object, or NULL.

Examples
tr <- Translator$new()
tr$set_text(en = "Hello, world!")

tr$get_text("256e0d7")  ## Outputs a Text object

Method set_text()

Register a source text.

Usage
Translator$set_text(..., source_lang = language_source_get())
Arguments
...

Passed as is to text().

source_lang

Passed as is to text().

Returns

A NULL, invisibly.

Examples
tr <- Translator$new()

tr$set_text(en = "Hello, world!", location())

Method set_texts()

Register one or more source texts.

Usage
Translator$set_texts(...)
Arguments
...

Any number of Text objects.

Details

This method calls merge_texts() to merge all values passed to ... together with previously registered Text objects. The underlying registered source texts, translations, and Location objects won't be duplicated.

Returns

A NULL, invisibly.

Examples
# Set source language.
language_source_set("en")

tr <- Translator$new()

# Create Text objects.
txt1 <- text(
  location("a", 1L, 2L, 3L, 4L),
  en = "Hello, world!",
  fr = "Bonjour, monde!")

txt2 <- text(
  location("b", 5L, 6L, 7L, 8L),
  en = "Farewell, world!",
  fr = "Au revoir, monde!")

tr$set_texts(txt1, txt2)

Method rm_text()

Remove a registered source text.

Usage
Translator$rm_text(hash = "")
Arguments
hash

A non-empty and non-NA character string identifying the source text to remove.

Returns

A NULL, invisibly.

Examples
tr <- Translator$new()
tr$set_text(en = "Hello, world!")

tr$rm_text("256e0d7")

Method set_native_languages()

Map a language code to a native language name.

Usage
Translator$set_native_languages(...)
Arguments
...

Any number of named, non-empty, and non-NA character strings. Names are codes and values are native languages. See field native_languages for more information.

Returns

A NULL, invisibly.

Examples
tr <- Translator$new()

tr$set_native_languages(en = "English", fr = "Français")

# Remove existing entries.
tr$set_native_languages(fr = NULL)

Method set_default_value()

Register a default value to return when there is no corresponding translation for the requested language.

Usage
Translator$set_default_value(value = NULL)
Arguments
value

A NULL or a non-NA character string. It can be empty. The former is returned by default.

Details

This modifies what methods $translate() and $get_translation() return when there is no translation for lang.

Returns

A NULL, invisibly.

Examples
tr <- Translator$new()
tr$set_default_value("<unavailable>")

See Also

find_source(), translator_read(), translator_write()

Examples

# Set source language.
language_source_set("en")

# Create a Translator object.
# This would normally be done automatically
# by find_source(), or translator_read().
tr <- translator(
  id = "test-translator",
  en = "English",
  es = "Español",
  fr = "Français",
  text(
    location("a", 1L, 2L, 3L, 4L),
    en = "Hello, world!",
    fr = "Bonjour, monde!"),
  text(
    location("b", 1L, 2L, 3L, 4L),
    en = "Farewell, world!",
    fr = "Au revoir, monde!"))

is_translator(tr)

# Translator objects have a specific format.
# print() calls format() internally, as expected.
print(tr)


## ------------------------------------------------
## Method `Translator$new`
## ------------------------------------------------

# Consider using translator() instead.
tr <- Translator$new()

## ------------------------------------------------
## Method `Translator$translate`
## ------------------------------------------------

tr <- Translator$new()
tr$set_text(en = "Hello, world!", fr = "Bonjour, monde!")
tr$translate("Hello, world!", lang = "en")  ## Outputs "Hello, world!"
tr$translate("Hello, world!", lang = "fr")  ## Outputs "Bonjour, monde!"

## ------------------------------------------------
## Method `Translator$get_translation`
## ------------------------------------------------

tr <- Translator$new()
tr$set_text(en = "Hello, world!")

# Consider using translate() instead.
tr$get_translation("256e0d7", "en")  ## Outputs "Hello, world!"

## ------------------------------------------------
## Method `Translator$get_text`
## ------------------------------------------------

tr <- Translator$new()
tr$set_text(en = "Hello, world!")

tr$get_text("256e0d7")  ## Outputs a Text object

## ------------------------------------------------
## Method `Translator$set_text`
## ------------------------------------------------

tr <- Translator$new()

tr$set_text(en = "Hello, world!", location())

## ------------------------------------------------
## Method `Translator$set_texts`
## ------------------------------------------------

# Set source language.
language_source_set("en")

tr <- Translator$new()

# Create Text objects.
txt1 <- text(
  location("a", 1L, 2L, 3L, 4L),
  en = "Hello, world!",
  fr = "Bonjour, monde!")

txt2 <- text(
  location("b", 5L, 6L, 7L, 8L),
  en = "Farewell, world!",
  fr = "Au revoir, monde!")

tr$set_texts(txt1, txt2)

## ------------------------------------------------
## Method `Translator$rm_text`
## ------------------------------------------------

tr <- Translator$new()
tr$set_text(en = "Hello, world!")

tr$rm_text("256e0d7")

## ------------------------------------------------
## Method `Translator$set_native_languages`
## ------------------------------------------------

tr <- Translator$new()

tr$set_native_languages(en = "English", fr = "Français")

# Remove existing entries.
tr$set_native_languages(fr = NULL)

## ------------------------------------------------
## Method `Translator$set_default_value`
## ------------------------------------------------

tr <- Translator$new()
tr$set_default_value("<unavailable>")

Read and Write Translations

Description

Export Translator objects to text files and import such files back into R as Translator objects.

Usage

translator_read(
  path = getOption("transltr.path"),
  encoding = "UTF-8",
  verbose = getOption("transltr.verbose", TRUE),
  translations = TRUE
)

translator_write(
  tr = translator(),
  path = getOption("transltr.path"),
  overwrite = FALSE,
  verbose = getOption("transltr.verbose", TRUE),
  translations = TRUE
)

translations_read(path = "", encoding = "UTF-8", tr = NULL)

translations_write(tr = translator(), path = "", lang = "")

translations_paths(
  tr = translator(),
  parent_dir = dirname(getOption("transltr.path"))
)

Arguments

path

A non-empty and non-NA character string. A path to a file to read from, or write to.

See Details for more information. translator_write() automatically creates the parent directories of path (recursively) if they do not exist.

encoding

A non-empty and non-NA character string. The source character encoding. In almost all cases, this should be UTF-8. Other encodings are internally re-encoded to UTF-8 for portability.

verbose

A non-NA logical value. Should progress information be reported?

translations

A non-NA logical value. Should translations files also be read, or written along with path (the Translator file)?

tr

A Translator object.

This argument is NULL by default for translations_read(). If a Translator object is passed to this function, it will read translations and further register them (as long as they correspond to an existing source text).

overwrite

A non-NA logical value. Should existing files be overwritten? If such files are detected and overwrite is set equal to FALSE, an error is thrown.

lang

A non-empty and non-NA character string. The underlying language.

A language is usually a code (of two or three letters) for a native language name. While users retain full control over codes, it is best to use language codes stemming from well-known schemes such as IETF BCP 47, or ISO 639-1 to maximize portability and cross-compatibility.

parent_dir

A non-empty and non-NA character string. A path to a parent directory.

Details

The information contained within a Translator object is split: translations are reorganized by language and exported independently from other fields.

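translator_write() creates two types of files: a single Translator file, and zero or more translations files. These are plain text files that can be inspected and modified using a wide variety of tools and systems. They target different audiences: the Translator file is meant for developers, while translations files are meant to be shared with non-technical collaborators such as translators.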

translator_read() first reads a Translator file and creates a Translator object from it. It then calls translations_paths() to list expected translations files (that should normally be stored alongside the Translator file), attempts to read them, and registers successfully imported translations.

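There are two requirements: translations files must be stored in the same directory as the Translator file (see parent_dir), and their filenames must correspond to language codes (see lang).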

The inner workings of the serialization process are thoroughly described in serialize().
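The way expected translations files are listed can be sketched in base R. This is a minimal sketch under stated assumptions, not the package's implementation: translations_paths_sketch() and its arguments are hypothetical, and the <lang>.txt naming pattern is inferred from the el.txt file that appears in the Examples of this topic.

```r
# Hypothetical sketch; translations_paths_sketch() is not part of transltr.
# Assumption: one file per non-source language, named <lang>.txt.
translations_paths_sketch <- function(langs, source_lang, parent_dir) {
  # The source language has no translations file of its own.
  targets <- setdiff(langs, source_lang)

  # Build one path per target language and name it after that language.
  paths <- file.path(parent_dir, paste0(targets, ".txt"))
  names(paths) <- targets
  paths
}

paths <- translations_paths_sketch(
  langs       = c("en", "es", "fr"),
  source_lang = "en",
  parent_dir  = "translations")

paths[["fr"]]  ## "translations/fr.txt"
```

Returning a named character vector makes it possible to select a file by language code, which mirrors how translations_files[["es"]] is used in the Examples.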

Translator file

A Translator file contains a YAML (1.1) representation of a Translator object stripped of all its translations except those that are registered as source text.

Translations files

A translations file contains a FLAT representation of a set of translations sharing the same target language. This format attempts to be as simple as possible for non-technical collaborators.

Value

translator_read() returns an R6 object of class Translator.

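translator_write() returns NULL, invisibly. It is used for its side effects of writing a Translator file to path and, if translations is TRUE, further translations files alongside it.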

translations_read() returns an S3 object of class ExportedTranslations.

translations_write() returns NULL, invisibly.

translations_paths() returns a named character vector.

See Also

Translator, serialize()

Examples

# Set source language.
language_source_set("en")

# Create a path to a temporary Translator file.
temp_path <- tempfile(pattern = "translator_", fileext = ".yml")
temp_dir  <- dirname(temp_path)  ## tempdir() could also be used

# Create a Translator object.
# This would normally be done by find_source(), or translator_read().
tr <- translator(
  id = "test-translator",
  en = "English",
  es = "Español",
  fr = "Français",
  text(
    en = "Hello, world!",
    fr = "Bonjour, monde!"),
  text(
    en = "Farewell, world!",
    fr = "Au revoir, monde!"))

# Export it. This creates 3 files: 1 Translator file, and 2 translations
# files because two non-source languages are registered. The file for
# language "es" contains placeholders and must be completed.
translator_write(tr, temp_path)
translator_read(temp_path)

# Translations can be read individually.
translations_files <- translations_paths(tr, temp_dir)
translations_read(translations_files[["es"]])
translations_read(translations_files[["fr"]])

# This is rarely useful, but translations can also be exported individually.
# You may use this to add a new language, as long as it has an entry in the
# underlying Translator object (or file).
tr$set_native_languages(el = "Greek")

translations_files <- translations_paths(tr, temp_dir)

translations_write(tr, translations_files[["el"]], "el")
translations_read(file.path(temp_dir, "el.txt"))


Universally Unique Identifiers

Description

Generate a random UUID (Universally Unique Identifier) that complies with RFC4122. Such a value is also known as a version 4 UUID.

Usage

uuid()

uuid_raw()

uuid_is(x)

Arguments

x

An R object.

Details

uuid() calls uuid_raw() and formats its output accordingly.

Pseudo-random bytes are generated with sample() whenever uuid_raw() is called. This is most likely done before runtime when Translator objects are created. uuid_raw() samples values in the [0, 255] range with replacement and converts them to raw values. The user must ensure that the underlying seed is appropriate when generating UUIDs. See set.seed() for more information.
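The process described above can be sketched in base R. This is an illustration only, not the package's code: uuid_v4_sketch() is a hypothetical name, and the bit masks follow the usual version 4 (variant 1) layout. It also shows why the seed matters, since the same seed yields the same identifier.

```r
# Hypothetical sketch; uuid_v4_sketch() is not part of transltr.
uuid_v4_sketch <- function() {
  # Sample 16 integers in [0, 255] with replacement.
  bytes <- sample.int(256L, 16L, replace = TRUE) - 1L

  # Byte 7: keep the low 4 bits (mask 15), set the version to 4 (64 = 0x40).
  bytes[7L] <- bitwOr(bitwAnd(bytes[7L], 15L), 64L)

  # Byte 9: keep the low 6 bits (mask 63), set the variant bits to 10 (128 = 0x80).
  bytes[9L] <- bitwOr(bitwAnd(bytes[9L], 63L), 128L)

  # Raw values are formatted as two lowercase hexadecimal digits each.
  hex <- as.character(as.raw(bytes))

  # Group bytes as 4-2-2-2-6 and separate groups with hyphens.
  groups <- list(1:4, 5:6, 7:8, 9:10, 11:16)
  parts  <- vapply(groups, function(i) paste(hex[i], collapse = ""), "")
  paste(parts, collapse = "-")
}

set.seed(1L)
id1 <- uuid_v4_sketch()
set.seed(1L)
id2 <- uuid_v4_sketch()

identical(id1, id2)  ## TRUE, identical seeds yield identical identifiers
```

Because sample() depends on the state of the random number generator, resetting the seed to a fixed value before creating several identifiers defeats their uniqueness.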

Value

uuid() returns a character string containing exactly 36 characters: 32 hexadecimal characters and 4 hyphens (used as separators).

uuid_raw() returns a raw vector of length 16.

uuid_is() returns a logical vector having the same length as x. It checks whether its elements are valid version 4 (variant 1) UUIDs or not. It returns FALSE for any other kind of UUID.
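A check of this kind could be written with a regular expression. The sketch below is hypothetical (uuid_is_sketch() is not part of the package) and assumes that hexadecimal digits are accepted in either case, which the documentation above does not state.

```r
# Hypothetical sketch; uuid_is_sketch() is not part of transltr.
uuid_is_sketch <- function(x) {
  # Anything that is not a character vector cannot hold formatted UUIDs.
  if (!is.character(x)) {
    return(rep(FALSE, length(x)))
  }

  pattern <- paste0(
    "^[0-9a-f]{8}-[0-9a-f]{4}-",
    "4[0-9a-f]{3}-",       # third group starts with the version (4)
    "[89ab][0-9a-f]{3}-",  # fourth group starts with the variant (binary 10xx)
    "[0-9a-f]{12}$")

  # Assumption: uppercase hexadecimal digits are also accepted.
  grepl(pattern, x, ignore.case = TRUE)
}

uuid_is_sketch(c(
  "123e4567-e89b-42d3-a456-426614174000",  ## TRUE, version 4/variant 1
  "123e4567-e89b-12d3-a456-426614174000",  ## FALSE, version 1
  "not-a-uuid"))                           ## FALSE
```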

Note

UUIDs are designed to be globally unique (collisions are extremely unlikely) and are sometimes called GUIDs (Globally Unique Identifiers). There are several UUID versions with slightly different purposes.

Package transltr uses random identifiers (version 4/variant 1, also known as DCE 1.1, ISO/IEC 11578:1996).

See Also

RFC4122, UUIDs explained

Examples

uuid()
uuid_raw()
uuid_is(uuid())      ## TRUE
uuid_is(uuid_raw())  ## FALSE, uuid_raw() does not return a string.


Apply Wrappers

Description

These functions wrap a function of the apply() family, and enforce various values for convenience. Arguments are passed as is to an apply() function.

Usage

vapply_1l(x, fun, ...)

vapply_1i(x, fun, ...)

vapply_1c(x, fun, ...)

map(fun, ..., more = list())

Arguments

x

See argument X of vapply().

fun

See argument FUN of vapply(), and .mapply().

...

Further optional arguments passed to fun.

more

See argument MoreArgs of .mapply().

Value

vapply_1l(), vapply_1i(), and vapply_1c() respectively return a logical, an integer, and a character vector having the same length as x. Names are always discarded.

map() returns a list having the same length as the longest element passed to ....
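Wrappers with this behavior could look like the following in base R. This is a sketch under stated assumptions, not the package's source: the *_sketch names are hypothetical, and the use of USE.NAMES = FALSE to discard names is an assumption about how the documented behavior is obtained.

```r
# Hypothetical sketches; these functions are not part of transltr.

# Enforce a logical, an integer, or a character value of length 1 per element.
vapply_1l_sketch <- function(x, fun, ...) {
  vapply(x, fun, NA, ..., USE.NAMES = FALSE)
}

vapply_1i_sketch <- function(x, fun, ...) {
  vapply(x, fun, NA_integer_, ..., USE.NAMES = FALSE)
}

vapply_1c_sketch <- function(x, fun, ...) {
  vapply(x, fun, NA_character_, ..., USE.NAMES = FALSE)
}

# Iterate over several vectors in parallel. Elements of more are passed
# unchanged to each call of fun.
map_sketch <- function(fun, ..., more = list()) {
  .mapply(fun, list(...), more)
}

lens <- vapply_1i_sketch(c(a = "x", b = "yy"), nchar)
sums <- map_sketch(function(x, y, z) x + y + z, 1:3, 4:6, more = list(z = 10))

lens  ## 1 2, without names
sums  ## a list of three elements: 15, 17, and 19
```

Fixing the template value in each wrapper means that a function returning the wrong type fails immediately with an error from vapply() instead of silently producing a vector of another type.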

See Also

Other utility functions: format_vector(), stops(), str_to()