Dear All,
it’s been a while since I announced any new release of the package ‘stylo’ that some of you use. The number of other involvements I have makes it a bit tricky to provide updates on a regular basis. Fortunately, thanks to the support of the COST Action “Distant Reading”, Steffen Pielström could spend a few weeks in Krakow, working on new functionalities of ‘stylo’. Steffen did a great job, and the new version is on its way to the CRAN repository. The new functionalities include:
1. GUI for setting working directory
Using the function stylo(), you don’t really have to set your working directory. It is still advisable to do so (and I myself will still keep setting the directory) but the script will not crash if you forget about this step. Should the function detect no corpus in the current directory, it will launch an additional GUI window to let you specify its location.
2. handling metadata via an external file
The default way of dealing with metadata in ‘stylo’ is by using filenames to provide essential information. There has been a request from Christof, successfully implemented by Steffen: there is now an option called "metadata". If it is not specified, everything works as before. Using the option, however, you can specify a grouping variable, e.g. a vector containing author names, genre labels, decades, or any other metadata factor to control the coloring (the number of items and their order should of course match the order of the loaded texts). Alternatively, a CSV file containing the metadata can be specified, together with the name of the column containing the grouping variable. Here are some examples:
stylo(metadata = c("Austen", "Austen", "Dickens", "Dickens", "Dickens", "EBronte", "Unknown"))
stylo(metadata = "metadata.csv", grouping.column = "gender")
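If you prefer the CSV route, the metadata file can be prepared in base R before calling stylo(). The sketch below is an illustration only: the file names, the column names "filename", "author" and "gender", and the overall layout of the table are assumptions made for the sake of the example, not a documented requirement, so please check help(stylo) for the format actually expected.

```r
# Hypothetical file names, listed in the same order in which the texts are loaded
files <- c("Austen_Emma.txt", "Austen_Pride.txt",
           "Dickens_Bleak.txt", "EBronte_Wuthering.txt")

# One row per text; the column names are illustrative assumptions
metadata <- data.frame(
  filename = files,
  author   = c("Austen", "Austen", "Dickens", "EBronte"),
  gender   = c("female", "female", "male", "female"),
  stringsAsFactors = FALSE
)

# The number of rows has to match the number of texts
stopifnot(nrow(metadata) == length(files))

# Written to a temporary directory here; in practice, save it next to your corpus
csv_path <- file.path(tempdir(), "metadata.csv")
write.csv(metadata, csv_path, row.names = FALSE)

# Then, with 'stylo' loaded and the file in the working directory:
# stylo(metadata = "metadata.csv", grouping.column = "gender")
```

Keeping the rows in the same order as the loaded texts is the important part, since the grouping variable is matched to the texts by position.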
3. UTF-8 support
From the release 0.7.0 onwards, UTF-8 will be the default encoding, even on Windows. The current CRAN version (0.6.9) introduces some functionalities to make this possible; if you want to switch right now, please install the beta version of the 0.7.0 release from GitHub:
library(devtools)
install_github("computationalstylistics/stylo")
4. encoding conversion
To facilitate preparing text files in the right encoding, Steffen designed and implemented two dedicated functions: check.encoding() and change.encoding(). The former loads the files from the corpus and provides an automatic check of the encoding currently used. The latter, as its name suggests, does the actual conversion. Refer to the manual pages of these functions:
help(check.encoding)
help(change.encoding)
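To give an idea of what such a conversion involves, here is a minimal base-R sketch. It does not use check.encoding() or change.encoding() and is not meant to show how they are implemented; it only relies on the standard functions iconv() and validUTF8(), and the sample file is created on the fly purely for illustration.

```r
# Create a small sample file containing "Kraków" encoded as Latin-1 (illustrative only)
latin1_path <- tempfile(fileext = ".txt")
writeBin(as.raw(c(0x4b, 0x72, 0x61, 0x6b, 0xf3, 0x77, 0x0a)), latin1_path)

# Read the file, declaring its (known) source encoding
latin1_lines <- readLines(latin1_path, encoding = "latin1")

# The raw bytes are not valid UTF-8 yet
is_utf8_before <- validUTF8(latin1_lines)

# Convert to UTF-8
utf8_lines <- iconv(latin1_lines, from = "latin1", to = "UTF-8")

# Write the converted text to a new file, leaving the original untouched
utf8_path <- tempfile(fileext = ".txt")
writeLines(utf8_lines, utf8_path, useBytes = TRUE)

# Check the result
is_utf8_after <- validUTF8(readLines(utf8_path, encoding = "UTF-8"))
```

The sketch assumes you already know the source encoding; guessing it automatically is the harder part, and that is what check.encoding() is there for.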
5. a fix for exporting networks to Gephi ver. 0.9.2
Some of you enjoy the Bootstrap Consensus Networks method, in particular when combined with the software Gephi. A nasty obstacle appeared with the new version of Gephi, since it does not read the CSV files produced by previous versions of ‘stylo’. The fix introduced here allows you to work with both early and recent versions of Gephi.
6. support for CJK (Chinese-Japanese-Korean) improved
I had no idea that in Japanese there are two characters that are not actual letters but are still quite important, as they mark repetition. These have been added to the tokenizer.
7. support for rmarkdown
Once you discover rmarkdown (https://rmarkdown.rstudio.com/), you get addicted to it very quickly. This is a package that allows you to write your papers with R plots generated ‘on the fly’. The same goes for presentations, which can be compliant with reveal.js or LaTeX/Beamer. The problem with invoking the function stylo() from inside rmarkdown is that stylo() produces a considerable number of diagnostic messages on screen. In the newly introduced 0.6.9 release, it is possible to switch the messages off:
suppressMessages(stylo())
The same functionality has been implemented in the following functions: stylo(), classify(), oppose(). There’s more to come!
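For those who wonder how suppressMessages() behaves, here is a small self-contained sketch. The function noisy_analysis() is a made-up stand-in, not part of ‘stylo’; it simply reports its progress via message(), which is the kind of output that suppressMessages() silences (anything printed with print() or cat() would still show up).

```r
# A hypothetical stand-in for a chatty function (not part of 'stylo')
noisy_analysis <- function() {
  message("Loading corpus...")
  message("Computing distances...")
  invisible("done")
}

# Called directly, the function shows both diagnostic messages
direct_result <- noisy_analysis()

# Wrapped in suppressMessages(), it runs silently but returns the same value
quiet_result <- suppressMessages(noisy_analysis())
```

The returned value is the same in both cases; only the diagnostic messages are dropped.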
That's all for now. Happy stylo()-ing!
Best,
Maciej