"Literate programming"

1 Introduction

Several tools to make code human-readable are available:

(Note that tools for development and presentation may not be one and the same.)

R Markdown example:

Jupyter example (for maintaining persistent sessions among cells in GNU Octave):

While the browser was presciently suggested as the editor of the future (from 2008):

IDEs are draining users away, but it's not the classic fat-client IDEs that are ultimately going to kill Emacs. It's the browsers.

many features of advanced editors are still not available for editing Jupyter notebooks (perhaps improved with JupyterLab). With R Markdown, it is possible to do major editing in an external editor (Emacs, Vim), since RStudio updates its script buffer immediately, and then to run code piecewise and debug in RStudio (also possible in Emacs, but possibly simpler in RStudio).

Citations for R Markdown and Jupyter are handled by pandoc. For org-babel, citations can be handled with pandoc (through ox-pandoc) or org-ref. Pandoc trails in implementing the more advanced features of org-mode, but may be used with the most common markup tags. The documentation for org-ref is a bit sparse, and the path to customization (though possible) is not fully apparent at the moment. citeproc-org adds features on top of org-ref; ox-bibtex and company-bibtex also exist, but I am not as familiar with them.

The rest of the document will cover org-babel.

2 Org-babel

2.1 General org-mode settings

Easy templates were once standard in org-mode; newer Org versions (9.2 and later) require them to be enabled with (require 'org-tempo). Easy templates generate #+begin_/#+end_ tags with intuitive keyboard shortcuts, such as:

  • <s TAB: "src"
  • <q TAB: "quote"
  • <e TAB: "example"

Lower case is now accepted for keywords.
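
The setup above can be sketched as follows. This is a minimal sketch assuming Org 9.2 or later, where org-structure-template-alist holds (KEY . BLOCK-TYPE) pairs; the "r" shortcut is a hypothetical addition, not one of the defaults.

```elisp
;; Load org-tempo so that <s TAB style expansion is available (Org 9.2+).
(require 'org-tempo)

;; Hypothetical extra template: <r TAB expands to an R source block.
(add-to-list 'org-structure-template-alist '("r" . "src R"))
```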

There is one issue with org-tempo.el in org-plus-contrib-20190415: <s TAB incorrectly expands to #+begin_end#+begin_src if point is at the end of the buffer. You can work around this with the following advice in your .emacs file.

(defadvice org-tempo-add-block (before org-tempo-add-block-ensure-newline)
  (progn
    (if (forward-line 1) (newline))
    (forward-line -1)))
(ad-activate 'org-tempo-add-block)
;; (ad-unadvise 'org-tempo-add-block)

Indentation options - two that I found useful:

  • (setq org-startup-indented t): indent according to outline level
  • (setq org-adapt-indentation nil): flush left (start at column 0)

Some additional customizations -

#+options: html-postamble:nil

\n:t is also possible but can lead to undesirable newlines in code.

In principle, org-babel files can be exported to html, pdf, etc. Preamble for LaTeX can be added as below:

#+latex_class_options: [10pt, a4paper]
#+latex_header: \usepackage[margin=2.5cm]{geometry}
#+latex_header: \usepackage{parskip}
#+latex_header: \usepackage{natbib}
#+latex_header: \usepackage{enumitem}
#+latex_header: \setlist[1]{itemsep=-2pt}
#+latex_header: \usepackage{amsmath}
#+latex_header: \usepackage{xspace}
#+latex_header: \newcommand{\wav}{\ensuremath{\tilde\nu}\xspace}

2.2 Specific org-babel settings

Custom variables:

(custom-set-variables
 '(org-babel-load-languages (quote ((emacs-lisp . t) (R . t) (python . t))))
 '(org-confirm-babel-evaluate nil))
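
For those who prefer not to go through the Customize interface, roughly the same settings can be written as plain Lisp. This is a sketch rather than the author's configuration; it relies on org-babel-do-load-languages, which loads the Babel backend for each language marked t.

```elisp
(with-eval-after-load 'org
  ;; Load Babel support for each language marked t.
  (org-babel-do-load-languages
   'org-babel-load-languages
   '((emacs-lisp . t) (R . t) (python . t)))
  ;; Do not ask for confirmation before evaluating a block;
  ;; only sensible for files you trust.
  (setq org-confirm-babel-evaluate nil))
```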

For each document, the following property can be declared at the top:

#+property: header-args :session *R* :exports both :tangle yes :results output
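  • :session is the name of the persistent session used by multiple code blocks
  • :exports controls what is exported; "both" exports the code and its results
  • :tangle controls whether the code is extracted to a source file by org-babel-tangle; "yes" enables this
  • :results controls how results are collected; "output" captures what the code prints rather than the value of the last expression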
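
The same header arguments can also be set once for all R blocks instead of per document. This is a sketch assuming the language-specific variable org-babel-default-header-args:R provided by ob-R; a document-level #+property line still takes precedence over it.

```elisp
(with-eval-after-load 'ob-R
  ;; Defaults applied to every R source block unless overridden
  ;; by a #+property line or by the block's own header arguments.
  (setq org-babel-default-header-args:R
        '((:session . "*R*")
          (:exports . "both")
          (:tangle  . "yes")
          (:results . "output"))))
```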

To evaluate a code block, press C-c C-c with point inside the block. For interactive work with figures, toggle inline image display with M-x org-toggle-inline-images.
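
When working with figures, it can be convenient to have displayed images refresh after each evaluation. This is a small sketch using org-babel-after-execute-hook; it only redraws images when inline display is already switched on.

```elisp
;; After any Babel block finishes, redraw inline images so that a
;; regenerated figure file is shown without toggling the display by hand.
(add-hook 'org-babel-after-execute-hook #'org-redisplay-inline-images)
```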

3 Citing references

3.1 ox-pandoc

ox-pandoc is an appealing option because of pandoc's general compatibility across many formats, and consistent citation style with R Markdown and Jupyter (with possibility for .csl specification).

Enable with (require 'ox-pandoc) and include such statements in the header:

#+pandoc_options: bibliography:references.bib
#+pandoc_options: filter:pandoc-citeproc
#+pandoc_options: csl:copernicus-publications.csl
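
If the same bibliography and style are used across documents, the options might instead be set globally. This is a sketch assuming the ox-pandoc variable org-pandoc-options, an alist whose keys mirror pandoc's long option names; the file names are simply those from the header lines above, and how relative paths are resolved is not verified here.

```elisp
(require 'ox-pandoc)

;; Assumed global equivalent of the per-document #+pandoc_options lines.
(setq org-pandoc-options
      '((standalone . t)
        (bibliography . "references.bib")
        (csl . "copernicus-publications.csl")))
```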

Here is a working example (only using R at the moment) based on a modification of this example using org-ref.

org-babel-ox-pandoc.org

This can be exported to HTML with C-c C-e p 4 (org-pandoc-export-to-html5). The output can be found below.

org-babel-ox-pandoc.html

However, pandoc is currently incompatible with org-babel: execution output is lost because #+results blocks are not processed.

3.2 org-ref

org-ref is the default citation tool. Pandoc does permit some org-ref syntax for citations, which could argue for adopting org-ref.

Enable with (require 'org-ref) and in each document include a bibliography statement. For exporting to LaTeX, #+latex_header: \usepackage{natbib} or equivalent in the preamble and a bibliographystyle statement at the end are also necessary.

For large bib files, it is necessary to use (setq org-ref-show-broken-links nil) to keep org-mode from choking.

Here is a minimal working example (only using R at the moment) based on a modification of the same example using org-ref. This exports to both HTML and LaTeX.

org-babel-org-ref.org

This can be exported to HTML with C-c C-e h o (org-html-export-to-html). Support for citations in HTML export is limited but functional. Unfortunately, cite, citep, and citet all behave identically: each renders the citation surrounded by parentheses (I believe the common use case is numbered citations). The default link type can be changed with (setq org-ref-default-citation-link "citet"). The output can be found below.

org-babel-org-ref.html

3.3 citeproc-org

citeproc-org is based on org-ref but allows CSL file definitions. Enable with

(require 'citeproc-org)
(citeproc-org-setup)

In addition to the statements required for org-ref, the preamble in each document can define a CSL file:

#+csl_style: american-geophysical-union.csl

citeproc-org accepts the same citation syntax as org-ref, with the same quirk of parentheses or "affixes". A kludge is to set the custom variable citeproc-org-suppress-affixes-cite-link-types (optionally making it a buffer-local variable) to include "citet", which removes the parentheses altogether (originally reserved for "citealp") but is perhaps a reasonable compromise.

(setq citeproc-org-suppress-affixes-cite-link-types '("citet" "citealt"))

Example for using citeproc-org is shown below.

org-babel-citeproc-org.org

This can be exported to HTML with C-c C-e h o (org-html-export-to-html). The output can be found below.

org-babel-citeproc-org.html

Note that for installation, M-x package-install-file should be invoked on the downloaded tar file (the directories locals/ and styles/ may need to be manually copied over into the .emacs.d/elpa/citeproc-org-XXX/ directory afterward).

4 LaTeX snippets in HTML export

  • LaTeX packages implemented in MathJax can be specified with the variable org-html-mathjax-template. The downside is that macros must be called within a math environment (which is not necessarily the case with \ce or \si in pure LaTeX, for instance).
  • LaTeX macros can be defined by setting up a fake Babel language to be evaluated.

See the following emacs lisp file for examples:

org-settings.el
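
For orientation, the "fake Babel language" idea can be sketched as below. This is not the content of org-settings.el; it is a reconstruction of a commonly circulated recipe. The language name latex-macros and the helper my/prefix-lines are hypothetical. Whether MathJax picks up the definitions in HTML depends on the browser and MathJax setup, and this has not been tested here.

```elisp
(require 'org)
(require 'org-src)

;; Edit blocks of the made-up language "latex-macros" in a LaTeX mode.
(add-to-list 'org-src-lang-modes '("latex-macros" . latex))

;; Insert results raw and export only the results, not the macro source.
(defvar org-babel-default-header-args:latex-macros
  '((:results . "raw") (:exports . "results")))

(defun my/prefix-lines (prefix text)
  "Return TEXT with PREFIX prepended to each non-empty line."
  (mapconcat (lambda (line) (concat prefix line))
             (split-string text "\n" t)
             "\n"))

(defun org-babel-execute:latex-macros (body _params)
  "Turn the macro definitions in BODY into export keywords."
  (concat
   ;; LaTeX export: definitions go into the preamble.
   (my/prefix-lines "#+latex_header: " body) "\n"
   ;; HTML export: definitions sit in a hidden div inside \( \) delimiters.
   "#+html_head_extra: <div style=\"display: none\">\\(\n"
   (my/prefix-lines "#+html_head_extra: " body) "\n"
   "#+html_head_extra: \\)</div>\n"))
```

With this in place, a source block of language latex-macros containing a line such as \newcommand{\wav}{\tilde\nu} would be evaluated at export time and replaced by the corresponding keyword lines.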

5 Example

Example with citeproc-org.

#+title: Example export for HTML and LaTeX
#+options: html-postamble:nil
#+html_head: <link rel="stylesheet" type="text/css" href="https://aprl.gitlab.io/markdown-roboto.css"/>
#+csl_style: american-geophysical-union.csl
#+property: header-args :session *R* :exports both :tangle yes :results output
#+latex_class_options: [10pt, a4paper]
#+latex_header: \usepackage[margin=2.5cm]{geometry}
#+latex_header: \usepackage{natbib}

#+begin_src R
x <- rnorm(10)
summary(x)
#+end_src

\begin{equation}
y = \exp\left[\frac{(x-\mu)^{2}}{2\sigma^{2}}\right]
\end{equation}

#+begin_src R :results graphics :file a.png
  y <- rnorm(10)
  plot(x, y)
#+end_src

citet:Lowry1951

bibliographystyle:agu
bibliography:references.bib