Using consistent R and LaTeX fonts in Org (or knitr, or Sweave)

I love good typography, even more so as Microsoft Word and PowerPoint have debased our standards. When I see a really fine piece of technical typesetting, it’s almost always done using TeX and friends. Beautiful LaTeX documents are easy to recognize. Beautiful R graphics are also easy to recognize. When literate programming systems like Sweave, Org mode, or knitr weave R graphics and LaTeX typesetting together, the beauty of both LaTeX and R is obvious, but documents can still look all wrong because of font clash.

Documents typeset purely in LaTeX can have a visual consistency that is hard to match. Take Kevin Lynagh’s beautifully typeset undergraduate thesis. Kevin obviously cares about typography, so much that he ended up making many of his plots with the LaTeX pgfplots package, which is based on the equally incredible PGF/TikZ. These are terrific packages, and ones that I use myself. But they are not replacements for R. To get R graphics output into LaTeX without any font clash, I need to either use something like the tikzDevice package, which has been dropped from CRAN and seems to have stalled, or generate PDFs and PNGs that complement my font choices in LaTeX. [edit: Kevin’s thesis source is available here.]

As wonderful as R is for plotting, changing the fonts in plots can be a bit cryptic. The base graphics package has its own methods, but lattice and ggplot2 are built on top of the grid package, which is another beast entirely. The extrafont package described by Winston Chang is a terrific option for individual plots, but at least for me it wasn’t clear how to change the fonts for an entire literate document in a single line.

An alternative is the Cairo package, which provides the ability to change fonts in any device it supports. Cairo also provides drop-in replacements for the standard png(), pdf(), etc. commands, which can be swapped in for a whole literate programming session. I’d be interested to know what limitations others have found in these replacements.

Most of the time I use the LaTeX mathdesign package with Charter BT fonts. But I’m fickle, and sometimes use urw-garamond. When preparing Beamer presentations at NUS I tend to use Verdana, because that’s the university’s standard. Since I’m almost always using Org, with R blocks evaluated in the Babel literate programming framework, I want a solution in which all the graphics generated by R will match the LaTeX main text font as closely as possible. When I move an R code block from a Beamer presentation to a manuscript draft, I don’t want to have to do anything special. It should just work.

The solution using Cairo appears to be pretty simple. At the beginning of an Org mode document in which the LaTeX will be typeset in Garamond, I can put the following:

#+begin_src R :exports none :results silent :session
  library(Cairo)
  mainfont <- "Garamond"
  CairoFonts(regular = paste(mainfont,"style=Regular",sep=":"),
             bold = paste(mainfont,"style=Bold",sep=":"),
             italic = paste(mainfont,"style=Italic",sep=":"),
             bolditalic = paste(mainfont,"style=Bold Italic,BoldItalic",sep=":"))
  pdf <- CairoPDF
  png <- CairoPNG
#+end_src

With that in place, the fonts in PDF or PNG graphics exported from R will all use Garamond, largely in keeping with the LaTeX font. Strictly speaking, the urw-garamond in the LaTeX mathdesign package is not the same as the system font on Mac OS X that R will be using, but it’s pretty close. Note that this has to be done in each R session if an Org mode file is running multiple sessions.
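
Because the setup has to be repeated per session, one option is to wrap it in a small function. This is only a minimal sketch: the names font_patterns and set_plot_font are made up for illustration and are not part of Cairo, and the requireNamespace() guard simply skips the Cairo call when the package is not installed.

#+begin_src R :exports none :results silent :session
  ## Hypothetical helper: build the four fontconfig patterns for one family.
  font_patterns <- function(family) {
    list(regular    = paste(family, "style=Regular", sep = ":"),
         bold       = paste(family, "style=Bold", sep = ":"),
         italic     = paste(family, "style=Italic", sep = ":"),
         bolditalic = paste(family, "style=Bold Italic,BoldItalic", sep = ":"))
  }

  ## Hypothetical helper: hand the patterns to CairoFonts() if Cairo is available.
  set_plot_font <- function(family) {
    patterns <- font_patterns(family)
    if (requireNamespace("Cairo", quietly = TRUE)) {
      do.call(Cairo::CairoFonts, patterns)
    }
    invisible(patterns)
  }

  set_plot_font("Garamond")
#+end_src

Masking pdf() and png() with the Cairo versions still has to happen at the top level of each session, since an assignment made inside a function would stay local to that function.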

So for example, a code block like

#+begin_src R :exports results :results graphics :session :file histogram.png 
x <- rnorm(100)
hist(x,main="This is a histogram using Garamond")
#+end_src

will result in a histogram like

http://tucker-kellogg.com/blog/wp-content/uploads/2012/10/wpid-histogram.png

The Cairo package is well documented, though I have to admit I found the other ways of handling graphics fonts confusing. Within the limits of the four faces used here (regular, bold, italic, and bold italic), you can mix and match system fonts. If your locale supports UTF-8, you can do some crazy things. For example, you can redefine the italic face to use a font that supports Chinese characters, and create something completely nonsensical, such as a ggplot histogram with text in a mix of xkcd and Chinese, e.g.

#+begin_src R :exports results  :results graphics output :session :file chinese.png
  Sys.setlocale("LC_CTYPE","en_US.UTF-8")
  library(Cairo)
  library(ggplot2)  # provides qplot(), theme_bw(), and ggtitle()
  mainfont <- "xkcd"
  CairoFonts(regular = paste(mainfont,"style=Regular",sep=":"),
             bold = paste(mainfont,"style=Bold",sep=":"),
             italic = paste("SimSun","style=Regular",sep=":"),
             bolditalic = paste(mainfont,"style=Bold Italic,BoldItalic",sep=":"))
  pdf <- CairoPDF
  png <- CairoPNG
  qplot(x) + theme_bw() + ggtitle(expression(paste(italic("这是一个用"),"xkcd",italic("的直方图"))))
#+end_src

http://tucker-kellogg.com/blog/wp-content/uploads/2012/10/wpid-chinese.png

Alas, the xkcd font has no glyphs for Chinese characters, so there is no Chinese xkcd text.

In this case, what works for Org mode should work equally well for Sweave and knitr.
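
For knitr, masking pdf() and png() may not be enough on its own, because knitr chooses its graphics device through the dev chunk option. Here is a hedged sketch of what could go in a setup chunk. It assumes the Cairo and knitr packages are installed, and that the CairoFonts() settings carry over to the devices knitr opens, which is worth checking on your own setup.

#+begin_src R :exports none :results silent :session
  mainfont <- "Garamond"
  ## The guard is only here so the block is harmless when a package is missing.
  if (requireNamespace("Cairo", quietly = TRUE) &&
      requireNamespace("knitr", quietly = TRUE)) {
    Cairo::CairoFonts(regular = paste(mainfont, "style=Regular", sep = ":"),
                      bold = paste(mainfont, "style=Bold", sep = ":"),
                      italic = paste(mainfont, "style=Italic", sep = ":"),
                      bolditalic = paste(mainfont, "style=Bold Italic,BoldItalic", sep = ":"))
    ## "CairoPDF" is one of the device names knitr accepts for the dev option.
    knitr::opts_chunk$set(dev = "CairoPDF")
  }
#+end_src

Setting dev = "CairoPNG" instead would be the analogous choice for bitmap output.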

Using Org mode for course development and presentations

For the last two semesters I’ve been teaching part of LSM2241, Introductory Bioinformatics at NUS. This is the first serious exposure to bioinformatics for students in the Life Sciences at NUS, so it’s a great opportunity to help ~160 students appreciate the increasingly central role bioinformatics has in the practice of biology.

This coming term I’m giving all the lectures. I don’t consider myself a great lecturer – on the contrary, I consider this an opportunity to practice – but I think second year undergraduates don’t benefit much from team-taught lecture courses, so one lecturer of my quality is better than four lecturers of varying quality.

The workhorse of my course planning and preparation is Emacs Org mode. I use it for planning my own work, for making presentations and handouts (via Beamer and LaTeX export), for preparing exams, and for tracking my own development as a teacher.

Last term I found some huge benefits of Beamer over PowerPoint or Keynote. For example, when we were discussing dynamic programming algorithms for sequence alignment, it was pretty straightforward to write an alignment program that emitted TikZ diagrams animating the steps of filling in an alignment matrix. I ended up using this twice in the lecture: first as a worked example in the slide copies the students received, and second during the lecture itself, so the whole class could walk through an alignment together via student participation.

The animation would have been unthinkable in PowerPoint, since it added the equivalent of ~80 slides to the deck. With Beamer I could (i) make the animation, (ii) distribute a slide deck with the filled matrix from the animation, thus satisfying the demands of today’s students to have slides ahead of time, while still minimizing excessive printing, and (iii) generate a second animation from an unseen alignment problem for the class to work through together during the lecture. Doing it this year will be as simple as changing the input to the program and regenerating the slides.

This is all done via Org-babel, the miraculous multi-lingual literate programming environment supported by Org mode.

But before getting into that level of detail, I’ll mention one tweak I use in Org-beamer export. When exporting to LaTeX or HTML, Org mode knows to do the right thing for figures and tables. So, for example,

#+CAPTION: This is a caption
[[file:/media/foo.png]]

will create a figure with a caption when exported to any supported format, including LaTeX and HTML.

However, the LaTeX export uses \caption{}, which automatically adds a Figure 1 label to the caption of the first figure in the document. Likewise for tables. In normal LaTeX documents that’s the right thing to do, but in Beamer, numbered figures and tables aren’t needed.

But I still want captions! To fix this, I make sure to include the caption package in the header with

#+LATEX_HEADER: \usepackage[justification=centering]{caption}

Then I add a hook to convert all my \caption to \caption* as the last step in my Org Beamer exporter.

(defun latex-buffer-caption-to-caption* ()
  (when org-beamer-export-is-beamer-p
    (replace-regexp "\\(\\\\caption\\)\\([[{]\\)" "\\1*\\2" nil
                    (point-min) (point-max))))

(add-hook 'org-export-latex-final-hook
          'latex-buffer-caption-to-caption* 'append)

The functions on org-export-latex-final-hook run right before the generated LaTeX buffer is saved, and org-beamer-export-is-beamer-p restricts the behavior to Beamer export.