Academic Writing in Markdown with Pandoc and LaTeX

Date: 3 Mar 2022

#post

This post originally appeared on Blog 2.0


Markdown is a simple yet powerful way to write. It’s plain text, so you don’t need to deal with all the messy styling. It’s also highly compatible, used almost everywhere.

There are a few things I need when it comes to academic writing:

  • Professional look (i.e. not the random sans font exported by Obsidian)
  • Citations
  • Quick to use
  • High compatibility (i.e. works with citation managers, LaTeX, etc.)

Pandoc is a versatile document converter. It ships with a citation processor and has the features I need. A pandoc citation looks like this: [@bailey-serresWaterproofingCrops2012]. I want an automated process that renders it in my chosen style, e.g. (Bailey-Serres et al., 2012), and also adds the entry to the reference list.
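For reference, here is a small sketch of the citation forms that pandoc's markdown reader accepts, written into a sample file. The directory name `demo_dir` and the surrounding sentences are placeholders of mine; only the cite key comes from the example above.

```shell
# Hypothetical scratch directory for the demo; adjust to taste.
demo_dir="${TMPDIR:-/tmp}/pandoc-cite-syntax-demo"
mkdir -p "$demo_dir"

# The quoted delimiter ('EOF') stops the shell from expanding anything inside.
cat > "$demo_dir/input.md" <<'EOF'
A normal citation in brackets [@bailey-serresWaterproofingCrops2012].

A page locator goes after a comma [@bailey-serresWaterproofingCrops2012, p. 3].

@bailey-serresWaterproofingCrops2012 used without brackets becomes an in-text citation.

A minus sign suppresses the author name [-@bailey-serresWaterproofingCrops2012].
EOF
```

How each form is rendered depends on the citation style you choose later.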

To do so, we first need a .bib file. Simply use the Better BibTeX plugin for Zotero to export a BibTeX citation library. Enable auto export to keep the .bib file up to date automatically.

Then, we need to get the right .csl (Citation Style Language) file, which describes the citation style. The Zotero Style Repository has thousands of styles available.

Then, we can use pandoc commands to convert markdown documents.

A basic command looks like this. Notice that pandoc infers the input and output formats from the file extensions! The -o option sets the output file. The pandoc manual is gigantic if you want the details.

pandoc input.md -o output.html

The --standalone option produces a complete document with its own header and footer, rather than a fragment:

pandoc input.md -o output.html --standalone

To include citations, we use pandoc’s --citeproc option. Before that, we point to the citation style file and the .bib file in the markdown’s YAML front matter:

---
bibliography: ['Path/to/Bib/Name.bib']
csl: 'Path/to/Bib/Name.csl'
---

Then run this. We will get an HTML file with beautiful citations.

pandoc input.md -o output.html --standalone --citeproc
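To try the whole pipeline without touching a real library, here is a minimal sketch under a few assumptions: the .bib entry, the cite key `doe2020example`, and the directory name `cite_dir` are all placeholders I made up. No `csl` key is set, so pandoc falls back to its default style. Note that --citeproc needs pandoc 2.11 or later.

```shell
# Hypothetical scratch directory for the demo.
cite_dir="${TMPDIR:-/tmp}/pandoc-citeproc-demo"
mkdir -p "$cite_dir"

# A placeholder bibliography entry (not a real publication).
cat > "$cite_dir/refs.bib" <<'EOF'
@article{doe2020example,
  author  = {Doe, Jane and Roe, Richard},
  title   = {A Placeholder Article},
  journal = {Journal of Examples},
  year    = {2020},
  volume  = {1},
  pages   = {1--10}
}
EOF

# Unquoted EOF here, so $cite_dir expands into an absolute bibliography path.
cat > "$cite_dir/input.md" <<EOF
---
title: 'Citation demo'
bibliography: ['$cite_dir/refs.bib']
---

A placeholder claim [@doe2020example].

# References
EOF

# Only convert when pandoc is actually installed.
if command -v pandoc >/dev/null 2>&1; then
  pandoc "$cite_dir/input.md" -o "$cite_dir/output.html" --standalone --citeproc ||
    echo "conversion failed (--citeproc needs pandoc 2.11+)" >&2
else
  echo "pandoc not found; skipping conversion" >&2
fi
```

Pandoc places the generated reference list at the end of the document, so ending the file with a "References" heading gives the list a title.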

Personally, I like single line breaks to be kept as line breaks, so I add this to the command:

--from markdown+hard_line_breaks

A proper markdown header for HTML output should also include the title key:

---
title:  'This is the title: it contains a colon'
---

In the YAML, use link-citations: true to make citations clickable.

There are so many options that the CLI gets a bit overwhelming, so it’s worth keeping a cheatsheet (ummm can’t find one, maybe this post counts?).
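One way to avoid retyping the options is a small shell function that bundles them. This is only a sketch: `md2html` is a name I made up, and the `PANDOC` variable is a hypothetical hook for swapping in another command (for example `echo`, to preview the arguments without running anything).

```shell
# md2html SRC [DEST]: hypothetical helper, not part of pandoc.
# DEST defaults to SRC with its .md suffix replaced by .html.
md2html() {
  src="$1"
  dest="${2:-${src%.md}.html}"
  # ${PANDOC:-pandoc} runs pandoc unless PANDOC names another command.
  "${PANDOC:-pandoc}" "$src" -o "$dest" \
    --from markdown+hard_line_breaks \
    --standalone \
    --citeproc \
    --metadata link-citations=true
}
```

Calling `md2html input.md` then does the same as the long command above, and `PANDOC=echo md2html input.md` just prints the arguments. Setting link-citations through --metadata is an alternative to putting it in the YAML.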

We can do the same thing when converting markdown to PDF.

Pandoc builds PDFs with pdflatex by default. For Unicode text and system fonts, specify a different PDF engine by adding:

--pdf-engine=xelatex

This works because pandoc essentially creates the PDF via LaTeX, so we can also export the LaTeX source first. Some helpful things to put in the YAML:

# change paper size
papersize: a4
fontsize: 12pt
# page margins, passed to the LaTeX geometry package
geometry: margin=25mm
# add packages to latex header
header-includes: 
- |
  ```{=latex}
  \thispagestyle{empty}
  ...
  ```
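Since the PDF goes through LaTeX anyway, both outputs can come from the same source file. The sketch below assumes a helper name of my own, `export_pdf`, and the same hypothetical `PANDOC` override for dry runs; the pandoc options themselves are the ones used in this post.

```shell
# export_pdf SRC: hypothetical helper that writes SRC's .tex and .pdf next to it.
export_pdf() {
  src="$1"
  base="${src%.md}"
  pandoc_cmd="${PANDOC:-pandoc}"
  # Step 1: intermediate LaTeX, handy for inspecting what pandoc generated.
  # Step 2: the final PDF, built with xelatex; runs only if step 1 succeeded.
  "$pandoc_cmd" "$src" -o "$base.tex" --standalone --citeproc &&
    "$pandoc_cmd" "$src" -o "$base.pdf" --citeproc --pdf-engine=xelatex
}
```

Run it as `export_pdf input.md`. The .tex file is optional, but it is the first place to look when the PDF does not come out as expected.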

Basic YAML metadata that pandoc passes on to LaTeX:

author: Kathy Reid
title: blog post on pandoc
date: December 2020
abstract: |
  This is the abstract.

There are extra variables for styling the LaTeX output, but there are simply too many options out there. Some examples:

  • --variable documentclass=acmart
  • --variable classname=acmlarge
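To see which variables a template actually reacts to, one option is to print pandoc's default LaTeX template with `pandoc -D latex` and pull out the names it tests. This is a rough sketch: `list_template_vars` is a hypothetical helper of mine, and it only catches variables that appear in `$if(...)$` conditionals.

```shell
# list_template_vars: hypothetical filter; reads a pandoc template on stdin
# and prints the variable names used in $if(...)$ conditionals, sorted and unique.
list_template_vars() {
  grep -o '\$if([A-Za-z0-9_.-]*)\$' |
    sed 's/^\$if(//; s/)\$$//' |
    sort -u
}

# Only query pandoc when it is installed.
if command -v pandoc >/dev/null 2>&1; then
  pandoc -D latex | list_template_vars
fi
```

Any name it prints can be set with --variable on the command line or as a key in the YAML.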

That’s it: the basics of writing markdown with citations. There are a few other tools that I personally use in my workflow:

  • Use the Citations plugin in Obsidian to insert [@citeKey]s that pandoc understands. This can be integrated with Zotero.
  • Use VSCode’s Pandoc Citer extension for inserting cite keys in markdown.
  • Use VSCode’s Markdown Preview Enhanced plugin to preview markdown with citations (you need to enable the pandoc parser in settings).
  • Take a look at the Eisvogel LaTeX template if you want some fancy PDF exports through pandoc.