Latex generator using Jinja
Wed 11 November 2020
Kawasukune-jinja (Photo credit: Wikipedia)
The goal is to generate a PDF file using Python. I decided to do so by generating \(\LaTeX\).
Pipeline
I decided to use jinja, as its documentation mentions \(\LaTeX\) generation.
The hard part is generating the \(\LaTeX\) from Python. This requires a few modifications to jinja templates.
Jinja modification
Jinja is primarily made to generate HTML files. Its directives (statements and expressions) use delimiters such as {% ... %} and {{ ... }}, which clash with the braces and percent signs that appear throughout \(\LaTeX\) files.
There is a widely used way to overcome this problem, written by Brad Erikson: http://eosrei.net/articles/2015/11/latex-templates-python-and-jinja2-generate-pdfs
A basic minimal working example is the following:
- The \(\LaTeX\) template:
templates/document.tex:
\documentclass{article}
\BLOCK{ for pkg in packages }
\usepackage{\VAR{pkg}}
\BLOCK{ endfor }
\begin{document}
\BLOCK{ block content } \BLOCK{ endblock }
\end{document}
- The code to use it
import os.path

import jinja2

# Raw strings keep the backslashes literal ("\B", "\V" and "\#" are not
# valid Python escape sequences).
latex_jinja_env = jinja2.Environment(
    block_start_string=r"\BLOCK{",
    block_end_string="}",
    variable_start_string=r"\VAR{",
    variable_end_string="}",
    comment_start_string=r"\#{",
    comment_end_string="}",
    line_statement_prefix="%%",
    line_comment_prefix="%#",
    trim_blocks=True,
    autoescape=False,
    loader=jinja2.FileSystemLoader(os.path.abspath("templates")),
)

# Render the template with two packages and write the resulting LaTeX source.
doc = latex_jinja_env.get_template("document.tex")
simpledoc = doc.render(packages=["amsmath", "makeidx"])
with open("/tmp/doc.tex", "w") as fd:
    fd.write(simpledoc)
- The produced file
\documentclass{article}
\usepackage{amsmath}
\usepackage{makeidx}
\begin{document}
\end{document}
Yes, I know, there is no content, but it uses several jinja mechanisms:
- a for loop to specify packages used
- jinja block
- jinja variable
I don't fully understand the difference between a block and a statement: in both cases, they consist of executable code.
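On the block versus statement question: the Jinja documentation describes line statements (lines starting with the configured prefix, here %%) as an equivalent way of writing a block tag. The sketch below is one way to check this, assuming the same environment settings as above. The names block_form and line_form are made up for the illustration.

```python
import jinja2

# Same delimiters as the environment above; raw strings keep the backslashes.
env = jinja2.Environment(
    block_start_string=r"\BLOCK{",
    block_end_string="}",
    variable_start_string=r"\VAR{",
    variable_end_string="}",
    comment_start_string=r"\#{",
    comment_end_string="}",
    line_statement_prefix="%%",
    line_comment_prefix="%#",
    trim_blocks=True,
    autoescape=False,
)

# The same loop written with block delimiters...
block_form = r"""\BLOCK{ for pkg in packages }
\usepackage{\VAR{pkg}}
\BLOCK{ endfor }
"""

# ...and with the line statement prefix.
line_form = r"""%% for pkg in packages
\usepackage{\VAR{pkg}}
%% endfor
"""

packages = ["amsmath", "makeidx"]
from_blocks = env.from_string(block_form).render(packages=packages)
from_lines = env.from_string(line_form).render(packages=packages)

# Compare the two renderings.
print(from_blocks == from_lines)
```

With trim_blocks enabled, both forms drop the line break after the tag, which is why they can be mixed freely in the same template.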
Template specificity
Inheritance
Let's now play with the inheritance feature of jinja templates.
We have to:
- extend the document
- modify the packages variable to ensure one specific package is used
- fill the content block
I decided to implement a template that writes a table from a list of dicts (an API comparable to csv.DictWriter).
The directives to extend the first template and to include a package (let's say the array package) are the following:
%% extends "document.tex"
%% set packages = packages + ["array"] if packages else ["array",]
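The sketch below shows these two directives in action. It relies on a behaviour described in the Jinja documentation: assignments made outside of blocks in a child template are executed before the parent layout is evaluated. The templates are inlined through a DictLoader to keep the example self-contained, and child.tex is a hypothetical template name.

```python
import jinja2

DOCUMENT = r"""\documentclass{article}
\BLOCK{ for pkg in packages }
\usepackage{\VAR{pkg}}
\BLOCK{ endfor }
\begin{document}
\BLOCK{ block content } \BLOCK{ endblock }
\end{document}
"""

# Hypothetical child template containing only the two directives above.
CHILD = r"""%% extends "document.tex"
%% set packages = packages + ["array"] if packages else ["array",]
"""

env = jinja2.Environment(
    block_start_string=r"\BLOCK{",
    block_end_string="}",
    variable_start_string=r"\VAR{",
    variable_end_string="}",
    comment_start_string=r"\#{",
    comment_end_string="}",
    line_statement_prefix="%%",
    line_comment_prefix="%#",
    trim_blocks=True,
    autoescape=False,
    loader=jinja2.DictLoader({"document.tex": DOCUMENT, "child.tex": CHILD}),
)

child = env.get_template("child.tex")
# The caller provides packages: "array" is appended to the list.
with_caller_packages = child.render(packages=["amsmath"])
# The caller provides nothing: "array" is the only package.
without_packages = child.render()
print(with_caller_packages)
```

The conditional expression matters: without it, rendering with no packages argument would try to add a list to an undefined value.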
Then, we have to retrieve the table headers and the number of columns
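%% if not headers
%% set headers = table[0].keys()
%% endif
%% set ncol = headers|length
I decided to use the keys from the first line if no headers were provided. The number of columns is computed after the test so that it is also defined when headers are passed in.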
Finally, we have to draw the table
\BLOCK{ block content }
\begin{tabular}{ l|*{\VAR{ ncol }}{l}|}
\cline{2-\VAR{ ncol + 1 }}
\BLOCK{ for h in headers }& \VAR{ h } \BLOCK{ endfor }\\
\cline{2-\VAR{ ncol + 1 }}
\end{tabular}
\BLOCK{ endblock }
I offset the columns by one because I first wanted to begin each line with its number, but I later dropped this idea.
dict keys
One difficulty is accessing dict elements by key when the keys are themselves variables. I didn't find any documentation, although this step is not that difficult.
Given the dict row and the keys in the iterable headers:
\BLOCK{ for col in headers }& \VAR{ row[col] } \BLOCK{ endfor }
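A minimal sketch of this lookup, assuming the same delimiters as above. The row content and the name row_template are made up for the illustration.

```python
import jinja2

env = jinja2.Environment(
    block_start_string=r"\BLOCK{",
    block_end_string="}",
    variable_start_string=r"\VAR{",
    variable_end_string="}",
    comment_start_string=r"\#{",
    comment_end_string="}",
    line_statement_prefix="%%",
    line_comment_prefix="%#",
    trim_blocks=True,
    autoescape=False,
)

# row[col] is a subscript expression, so col is evaluated as a variable
# and its value is used as the key.
row_template = env.from_string(
    r"\BLOCK{ for col in headers }& \VAR{ row[col] } \BLOCK{ endfor }\\"
)

line = row_template.render(
    headers=["name", "age"],
    row={"name": "Ada", "age": 36},
)
print(line)
```

Writing row.col instead would look up the literal name "col" rather than the value of the loop variable, which is why the subscript form is needed here.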
The table template
Putting it all together, the template is the following:
templates/dict_writer.tex:
%% extends "document.tex"
%% set packages = packages + ["array"] if packages else ["array",]
%% if not headers
%% set headers = table[0].keys()
%% endif
%% set ncol = headers|length
\BLOCK{ block content }
\begin{tabular}{ l|*{\VAR{ ncol }}{l}|}
\cline{2-\VAR{ ncol + 1 }}
\BLOCK{ for h in headers }& \VAR{ h } \BLOCK{ endfor }\\
\cline{2-\VAR{ ncol + 1 }}
%% for row in table
\BLOCK{ for col in headers }& \VAR{ row[col] } \BLOCK{ endfor }\\
%% endfor
\cline{2-\VAR{ ncol + 1 }}
\end{tabular}
\BLOCK{ endblock }
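To try the table template, something like the following sketch could be used. Both templates are inlined through a DictLoader instead of being read from the templates directory, the sample table is made up, and ncol is computed after the endif so that passing explicit headers also works.

```python
import jinja2

DOCUMENT = r"""\documentclass{article}
\BLOCK{ for pkg in packages }
\usepackage{\VAR{pkg}}
\BLOCK{ endfor }
\begin{document}
\BLOCK{ block content } \BLOCK{ endblock }
\end{document}
"""

DICT_WRITER = r"""%% extends "document.tex"
%% set packages = packages + ["array"] if packages else ["array",]
%% if not headers
%% set headers = table[0].keys()
%% endif
%% set ncol = headers|length
\BLOCK{ block content }
\begin{tabular}{ l|*{\VAR{ ncol }}{l}|}
\cline{2-\VAR{ ncol + 1 }}
\BLOCK{ for h in headers }& \VAR{ h } \BLOCK{ endfor }\\
\cline{2-\VAR{ ncol + 1 }}
%% for row in table
\BLOCK{ for col in headers }& \VAR{ row[col] } \BLOCK{ endfor }\\
%% endfor
\cline{2-\VAR{ ncol + 1 }}
\end{tabular}
\BLOCK{ endblock }
"""

env = jinja2.Environment(
    block_start_string=r"\BLOCK{",
    block_end_string="}",
    variable_start_string=r"\VAR{",
    variable_end_string="}",
    comment_start_string=r"\#{",
    comment_end_string="}",
    line_statement_prefix="%%",
    line_comment_prefix="%#",
    trim_blocks=True,
    autoescape=False,
    loader=jinja2.DictLoader(
        {"document.tex": DOCUMENT, "dict_writer.tex": DICT_WRITER}
    ),
)

# Made-up sample data: one dict per table row.
table = [
    {"name": "Ada", "lang": "Python"},
    {"name": "Linus", "lang": "C"},
]

template = env.get_template("dict_writer.tex")
# Headers default to the keys of the first row.
all_columns = template.render(table=table)
# Explicit headers select (and order) the columns.
one_column = template.render(table=table, headers=["lang"])
print(all_columns)
```

Note that the cell values are inserted as-is: characters that are special in \(\LaTeX\) (such as & or %) would need to be escaped before rendering.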
Alternatives
This method works, but I think alternatives are worth considering, especially if they avoid mixing too many languages:
- generate an html
There are several ways to transform HTML into a PDF; wkhtmltopdf is only one of them.
This method takes advantage of all the power of jinja to generate HTML.
- python library generating latex
The pandas library can export a DataFrame to \(\LaTeX\). Other libraries exist to generate \(\LaTeX\), but I haven't tested them (yet).
- directly generate the latex in python
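As an illustration of the last alternative, here is a minimal sketch that builds a tabular directly in Python, with no template at all. The function dicts_to_tabular is a hypothetical helper, not part of any library, and it does not escape \(\LaTeX\) special characters in the cells.

```python
def dicts_to_tabular(table, headers=None):
    """Return a LaTeX tabular for a list of dicts (hypothetical helper)."""
    if not headers:
        # Same convention as the template: use the keys of the first row.
        headers = list(table[0].keys())
    # One left-aligned column per header, separated by vertical rules.
    lines = [r"\begin{tabular}{" + "|".join("l" * len(headers)) + "}"]
    lines.append(" & ".join(str(h) for h in headers) + r" \\")
    lines.append(r"\hline")
    for row in table:
        lines.append(" & ".join(str(row[col]) for col in headers) + r" \\")
    lines.append(r"\end{tabular}")
    return "\n".join(lines)


tabular = dicts_to_tabular([{"name": "Ada", "lang": "Python"}])
print(tabular)
```

This stays in a single language, at the cost of mixing the document layout with the Python code.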