"Literate programming"
1 Introduction
Several tools to make code human-readable are available:
- R Markdown (mostly R, now also Python through reticulate)
- Jupyter (language agnostic but not plain text file)
- org-babel (through emacs)
(Note that tools for development and presentation may not be one and the same.)
R Markdown example:
Jupyter example (for maintaining persistent sessions among cells in GNU Octave):
While the browser was presciently suggested as the editor of the future (from 2008):
IDEs are draining users away, but it's not the classic fat-client IDEs that are ultimately going to kill Emacs. It's the browsers.
many features of advanced editors are still not available for editing Jupyter notebooks (though JupyterLab may have improved this). With R Markdown, it is possible to do major editing in an external editor (emacs, vim), since RStudio updates its script buffer immediately, and then run and debug the code piecewise in RStudio (also possible in emacs, but possibly simpler in RStudio).
Citations for R Markdown and Jupyter are handled by pandoc. For org-babel, citations can be handled with pandoc (through ox-pandoc) or org-ref. Pandoc trails in implementations of more advanced features of org-mode, but may be used with most common markup tags. The documentation for org-ref is a bit sparse, and the pathway to customization (though possible) is not fully apparent at the moment. citeproc-org provides additional features to org-ref; ox-bibtex and company-bibtex also exist, but I am not as familiar with them.
The rest of the document will cover org-babel.
2 Org-babel
2.1 General org-mode settings
Easy templates were once standard in org-mode, but newer org versions require them to be enabled with (require 'org-tempo). Easy templates generate #+begin_ / #+end_ tags with intuitive keyboard shortcuts, such as:
- <s TAB: "src"
- <q TAB: "quote"
- <e TAB: "example"
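As a small illustration of extending these shortcuts, the sketch below registers one extra template. It assumes Org 9.2 or later, where org-structure-template-alist holds (KEY . BLOCK) string pairs. The key "r" is an arbitrary choice made here for illustration and is not part of the setup described in this post.

```emacs-lisp
;; Sketch only: assumes Org >= 9.2, where the alist entries are
;; (KEY . BLOCK-TYPE) string pairs.
(require 'org-tempo)

(with-eval-after-load 'org
  ;; Hypothetical shortcut: "<r TAB" expands to a "#+begin_src R" block.
  (add-to-list 'org-structure-template-alist '("r" . "src R")))
```

With this in place, the built-in shortcuts such as <s TAB keep working unchanged; the new entry only adds one more key.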
Lower case is now accepted for keywords.
There is one issue with org-tempo.el in org-plus-contrib-20190415: <s TAB incorrectly expands to #+begin_end#+begin_src if you are at the end of the buffer. You can remedy this with the following correction in your .emacs file.
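;; Advice run before org-tempo-add-block: make sure a line exists below point.
(defadvice org-tempo-add-block (before org-tempo-add-block-ensure-newline)
  (progn
    ;; forward-line returns an integer, which is always non-nil in Emacs Lisp,
    ;; so (newline) runs on every call, not only at the end of the buffer.
    (if (forward-line 1) (newline))
    (forward-line -1)))
(ad-activate 'org-tempo-add-block)
;; To undo the advice: (ad-unadvise 'org-tempo-add-block)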
Indentation options - two that I found useful:
- (setq org-startup-indented t): indent according to outline level
- (setq org-adapt-indentation nil): flush left (start at column 0)
Some additional customizations:
#+options: html-postamble:nil
Adding \n:t to the #+options line (to preserve line breaks) is also possible but can lead to undesirable newlines in code.
In principle, org-babel files can be exported to html, pdf, etc. Preamble for LaTeX can be added as below:
#+latex_class_options: [10pt, a4paper]
#+latex_header: \usepackage[margin=2.5cm]{geometry}
#+latex_header: \usepackage{parskip}
#+latex_header: \usepackage{natbib}
#+latex_header: \usepackage{enumitem}
#+latex_header: \setlist[1]{itemsep=-2pt}
#+latex_header: \usepackage{amsmath}
#+latex_header: \usepackage{xspace}
#+latex_header: \newcommand{\wav}{\ensuremath{\tilde\nu}\xspace}
2.2 Specific org-babel settings
Custom variables:
(custom-set-variables
'(org-babel-load-languages (quote ((emacs-lisp . t) (R . t) (python . t))))
'(org-confirm-babel-evaluate nil))
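The same settings can also be written directly in the init file without going through the Customize interface. The following is a minimal sketch, assuming the same three languages as above and that the corresponding Babel backends are installed:

```emacs-lisp
(with-eval-after-load 'org
  ;; Load Babel support for the same three languages as above.
  (org-babel-do-load-languages
   'org-babel-load-languages
   '((emacs-lisp . t) (R . t) (python . t)))
  ;; Do not ask for confirmation before evaluating a code block.
  (setq org-confirm-babel-evaluate nil))
```

Note that turning off confirmation means code blocks in any org file are evaluated without a prompt, so it is best suited to documents you trust.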
For each document, the following property can be declared at the top:
#+property: header-args :session *R* :exports both :tangle yes :results output
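- :session is the name of the persistent session used by multiple code blocks
- :exports selects what appears in the exported document (code, results, both, or none)
- :tangle controls whether the code blocks are extracted to a source file
- :results controls how results are collected and handled (output captures what the code prints)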
To evaluate a code block, use C-c C-c at the beginning of the block. For interactive work with figures, toggle inline image display with M-x org-toggle-inline-images.
3 Citing references
3.1 ox-pandoc
ox-pandoc is an appealing option because of pandoc's general compatibility across many formats, and consistent citation style with R Markdown and Jupyter (with possibility for .csl specification).
Enable with (require 'ox-pandoc) and include statements such as these in the header:
#+pandoc_options: bibliography:references.bib
#+pandoc_options: filter:pandoc-citeproc
#+pandoc_options: csl:copernicus-publications.csl
Here is a working example (only using R at the moment) based on a modification of this example using org-ref.
This can be exported to HTML with C-c C-e p 4 (org-pandoc-export-to-html5). The output can be found below.
However, pandoc is currently incompatible with org-babel: execution output is lost because #+results blocks aren't processed:
3.2 org-ref
org-ref is the default citation tool. Pandoc does permit some org-ref syntax for citations, which could argue for adopting org-ref.
Enable with (require 'org-ref) and include a bibliography statement in each document. For exporting to LaTeX, #+latex_header: \usepackage{natbib} or equivalent in the preamble and a bibliographystyle statement at the end are also necessary.
For large bib files, it is necessary to use (setq org-ref-show-broken-links nil) to keep org-mode from choking.
Here is a minimal working example (only using R at the moment) based on a modification of the same example using org-ref. This exports to both HTML and LaTeX.
This can be exported to HTML with C-c C-e h o (org-html-export-to-html). Support for citations in HTML export is limited but functional. The behavior of cite, citep, and citet is unfortunately identical: each renders the citation surrounded by parentheses (I believe the common use case is to use numbered citations). It is possible to change the default citation link with (setq org-ref-default-citation-link "citet"). The output can be found below.
3.3 citeproc-org
citeproc-org is based on org-ref but allows CSL file definitions. Enable with
(require 'citeproc-org)
(citeproc-org-setup)
In addition to the statements required for org-ref, the preamble in each document can define a CSL file:
#+csl_style: american-geophysical-union.csl
citeproc-org accepts the same citation syntax as org-ref, with the same quirk of parentheses or "affixes". A kludge is to set the custom variable citeproc-org-suppress-affixes-cite-link-types (optionally, making it a buffer-local variable) to include "citet", which removes the parentheses altogether (behavior originally reserved for "citealp") but is perhaps a reasonable compromise.
(setq citeproc-org-suppress-affixes-cite-link-types '("citet" "citealt"))
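For the buffer-local variant mentioned above, one possible approach is a file-local variables block at the end of the org file. This is a sketch that relies on the standard Emacs file-local variables mechanism, not on anything specific to citeproc-org; Emacs will typically ask for confirmation the first time, since the variable is not marked as safe.

```org
# Local Variables:
# citeproc-org-suppress-affixes-cite-link-types: ("citet" "citealt")
# End:
```

The block has to sit near the end of the file to be picked up, and it takes effect when the file is next visited (or after M-x normal-mode).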
An example using citeproc-org is shown below.
This can be exported to HTML with C-c C-e h o (org-html-export-to-html). The output can be found below.
Note that for installation, M-x package-install-file should be invoked on the downloaded tar file (the directories locals/ and styles/ may need to be manually copied over into the .emacs.d/elpa/citeproc-org-XXX/ directory afterward).
4 LaTeX snippets in HTML export
- LaTeX packages implemented in MathJax can be specified with the variable org-html-mathjax-template. The downside is that macros must be called within a math environment (which is not necessarily the case with \ce or \si in pure LaTeX, for instance).
- LaTeX macros can be defined by setting up a fake Babel language to be evaluated.
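To make the second option more concrete, here is a rough sketch of what a fake Babel language could look like. It follows a commonly circulated pattern and is not necessarily the setup used for this post. The language name latex-macros and the helper my/prefix-lines are hypothetical names chosen for illustration. The idea is that evaluating the block emits each macro definition twice: once as a #+latex_header line and once inside a hidden math element for MathJax.

```emacs-lisp
;; Sketch only: "latex-macros" and `my/prefix-lines' are made-up names.
(with-eval-after-load 'org-src
  ;; Edit latex-macros blocks with latex-mode.
  (add-to-list 'org-src-lang-modes '("latex-macros" . latex)))

;; Insert results as raw org text and export only the results.
(defvar org-babel-default-header-args:latex-macros
  '((:results . "raw") (:exports . "results")))

(defun my/prefix-lines (prefix body)
  "Return BODY with PREFIX prepended to every non-empty line."
  (mapconcat (lambda (line) (concat prefix line))
             (split-string body "\n" t)
             "\n"))

(defun org-babel-execute:latex-macros (body _params)
  "Turn the macro definitions in BODY into LaTeX and HTML header keywords."
  (concat
   (my/prefix-lines "#+latex_header: " body)
   ;; Hidden inline-math wrapper so MathJax reads the definitions.
   "\n#+html_head_extra: <div style=\"display: none\"> \\(\n"
   (my/prefix-lines "#+html_head_extra: " body)
   "\n#+html_head_extra: \\)</div>\n"))
```

Under these assumptions, a block opened with #+begin_src latex-macros and containing a line such as \newcommand{\wav}{\tilde\nu} would make the macro available to both exporters, still subject to the MathJax limitation noted in the first bullet.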
See the following emacs lisp file for examples:
5 Example
Example with citeproc-org.
#+title: Example export for HTML and LaTeX
#+options: html-postamble:nil
#+html_head: <link rel="stylesheet" type="text/css" href="https://aprl.gitlab.io/markdown-roboto.css"/>
#+csl_style: american-geophysical-union.csl
#+property: header-args :session *R* :exports both :tangle yes :results output
#+latex_class_options: [10pt, a4paper]
#+latex_header: \usepackage[margin=2.5cm]{geometry}
#+latex_header: \usepackage{natbib}

#+begin_src R
x <- rnorm(10)
summary(x)
#+end_src

\begin{equation}
y = \exp\left[\frac{(x-\mu)^{2}}{2\sigma^{2}}\right]
\end{equation}

#+begin_src R :results graphics :file a.png
y <- rnorm(10)
plot(x, y)
#+end_src

citet:Lowry1951

bibliographystyle:agu
bibliography:references.bib