Preprocessors

See also

Configuration options
Configurable options for the nbconvert application
class nbconvert.preprocessors.Preprocessor(**kw)

A configurable preprocessor

Inherit from this class if you wish to have configurability for your preprocessor.

Any configurable traitlets this class exposes will be configurable in profiles using c.SubClassName.attribute = value

You can override preprocess_cell() to apply a transformation independently to each cell, or preprocess() if you prefer your own logic. See the corresponding docstrings for details.

Preprocessors are disabled by default and can be enabled in the config with
c.YourPreprocessorName.enabled = True
__init__(**kw)

Public constructor

Parameters:
  • config (Config) – Configuration file structure
  • **kw – Additional keyword arguments passed to parent
preprocess(nb, resources)

Preprocessing to apply on each notebook.

Must return modified nb, resources.

If you wish to apply your preprocessing to each cell, you might want to override the preprocess_cell() method instead.

Parameters:
  • nb (NotebookNode) – Notebook being converted
  • resources (dictionary) – Additional resources used in the conversion process. Allows preprocessors to pass variables into the Jinja engine.
preprocess_cell(cell, resources, index)

Override if you want to apply some preprocessing to each cell. Must return modified cell and resource dictionary.

Parameters:
  • cell (NotebookNode cell) – Notebook cell being processed
  • resources (dictionary) – Additional resources used in the conversion process. Allows preprocessors to pass variables into the Jinja engine.
  • index (int) – Index of the cell being processed
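
A minimal sketch of a custom preprocessor follows, assuming nbconvert and nbformat are installed. The class name StripTrailingWhitespacePreprocessor and its behavior are hypothetical and chosen only for illustration; the sketch overrides preprocess_cell() and calls preprocess() directly on an in-memory notebook.

```python
from nbconvert.preprocessors import Preprocessor
from nbformat.v4 import new_code_cell, new_markdown_cell, new_notebook


class StripTrailingWhitespacePreprocessor(Preprocessor):
    """Hypothetical example: strip trailing whitespace from code cells."""

    def preprocess_cell(self, cell, resources, index):
        # Only code cells are touched; other cell types pass through unchanged.
        if cell.cell_type == "code":
            lines = cell.source.split("\n")
            cell.source = "\n".join(line.rstrip() for line in lines)
        # Must return the (possibly modified) cell and the resources dictionary.
        return cell, resources


# Build a small notebook in memory and run the preprocessor on it.
nb = new_notebook(
    cells=[
        new_code_cell("x = 1   \ny = 2\t"),
        new_markdown_cell("Title  "),
    ]
)
nb, resources = StripTrailingWhitespacePreprocessor().preprocess(nb, {})
```

Calling preprocess() directly, as here, bypasses the enabled flag; when the preprocessor is registered with an exporter, it would also need c.StripTrailingWhitespacePreprocessor.enabled = True in the config.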

Specialized preprocessors

class nbconvert.preprocessors.ConvertFiguresPreprocessor(**kw)

Converts all of the outputs in a notebook from one format to another.

class nbconvert.preprocessors.SVG2PDFPreprocessor(**kw)

Converts all of the outputs in a notebook from SVG to PDF.

class nbconvert.preprocessors.ExtractOutputPreprocessor(**kw)

Extracts all of the outputs from the notebook file. The extracted outputs are returned in the ‘resources’ dictionary.

class nbconvert.preprocessors.LatexPreprocessor(**kw)

Preprocessor for LaTeX-destined documents.

Mainly populates the latex key in the resources dict, adding definitions for pygments highlight styles.

class nbconvert.preprocessors.CSSHTMLHeaderPreprocessor(*pargs, **kwargs)

Preprocessor used to pre-process notebook for HTML output. Adds IPython notebook front-end CSS and Pygments CSS to HTML output.

class nbconvert.preprocessors.HighlightMagicsPreprocessor(config=None, **kw)

Detects and tags code cells that use a language other than Python.

class nbconvert.preprocessors.ClearOutputPreprocessor(**kw)

Removes the output from all code cells in a notebook.

class nbconvert.preprocessors.RegexRemovePreprocessor(**kw)

Removes cells from a notebook that match one or more regular expressions.

For each cell, the preprocessor checks whether its contents match the regular expressions in the patterns traitlet, which is a list of unicode strings. If the contents match any of the patterns, the cell is removed from the notebook.

To change which cells are removed, modify the patterns traitlet. For example, execute the following command to convert a notebook to HTML and remove cells containing only whitespace:

jupyter nbconvert --RegexRemovePreprocessor.patterns="['\s*\Z']" mynotebook.ipynb

The command line argument sets the list of patterns to '\s*\Z' which matches an arbitrary number of whitespace characters followed by the end of the string.
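
The behavior of this pattern can be checked with Python's re module alone. The sketch below is an illustration rather than the preprocessor's own code: it assumes the pattern is applied from the start of the cell source, as re.match does, and the sample cell sources are made up.

```python
import re

# The pattern from the command above: any amount of whitespace, then end of string.
pattern = re.compile(r"\s*\Z")

# Hypothetical cell sources used only for this demonstration.
sources = ["", "   \n\t", "x = 1", "  # comment"]

# Cells whose source matches would be removed; the rest would be kept.
removed = [s for s in sources if pattern.match(s)]
kept = [s for s in sources if not pattern.match(s)]
```

Here the empty and whitespace-only sources end up in removed, while the two sources with visible content end up in kept.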

See https://regex101.com/ for an interactive guide to regular expressions (make sure to select the Python flavor). See https://docs.python.org/library/re.html for the official regular expression documentation in Python.

class nbconvert.preprocessors.ExecutePreprocessor(**kw)

Executes all the cells in a notebook.

preprocess(nb, resources, km=None)

Preprocess notebook executing each code cell.

The input argument nb is modified in-place.

Parameters:
  • nb (NotebookNode) – Notebook being executed.
  • resources (dictionary) – Additional resources used in the conversion process. For example, passing {'metadata': {'path': run_path}} sets the execution path to run_path.
  • km (KernelManager (optional)) – Optional kernel manager. If none is provided, a kernel manager will be created.
Returns:

  • nb (NotebookNode) – The executed notebook.
  • resources (dictionary) – Additional resources used in the conversion process.
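
A minimal sketch of calling preprocess() from Python, assuming nbconvert, nbformat, and a Python kernel are installed. The helper name run_notebook, the timeout value, and the python3 kernel name are assumptions made for illustration; the function is only defined here, since running it requires a notebook file and a working kernel.

```python
import nbformat
from nbconvert.preprocessors import ExecutePreprocessor


def run_notebook(path, run_path="."):
    """Hypothetical helper: execute the notebook at `path` and return it."""
    # Read the notebook as a version 4 NotebookNode.
    nb = nbformat.read(path, as_version=4)

    # timeout and kernel_name are assumed values; adjust for your setup.
    ep = ExecutePreprocessor(timeout=600, kernel_name="python3")

    # nb is modified in place; the metadata path sets the execution directory.
    ep.preprocess(nb, {"metadata": {"path": run_path}})
    return nb
```

A call such as run_notebook("mynotebook.ipynb") would then return the executed notebook, which could be saved with nbformat.write().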

preprocess_cell(cell, resources, cell_index)

Executes a single code cell. See base.py for details.

To execute all cells see preprocess().

setup_preprocessor(nb, resources, km=None)

Context manager for setting up the class to execute a notebook.

This assigns nb to self.nb, where it will be modified in-place. It also creates and assigns the Kernel Manager (self.km) and Kernel Client (self.kc).

It is intended to yield to a block that will execute code.

When control returns from the yield it stops the client’s zmq channels, shuts down the kernel, and removes the now unused attributes.

Parameters:
  • nb (NotebookNode) – Notebook being executed.
  • resources (dictionary) – Additional resources used in the conversion process. For example, passing {'metadata': {'path': run_path}} sets the execution path to run_path.
  • km (KernelManager (optional)) – Optional kernel manager. If none is provided, a kernel manager will be created.
Returns:

  • nb (NotebookNode) – The executed notebook.
  • resources (dictionary) – Additional resources used in the conversion process.

start_new_kernel(**kwargs)

Creates a new kernel manager and kernel client.

Parameters: kwargs – Any options for self.kernel_manager_class.start_kernel(). Because that defaults to KernelManager, this will likely include options accepted by KernelManager.start_kernel(), which includes cwd.
Returns:
  • km (KernelManager) – A kernel manager as created by self.kernel_manager_class.
  • kc (KernelClient) – Kernel client as created by the kernel manager km.
nbconvert.preprocessors.coalesce_streams(cell, resources, index)

Merge consecutive sequences of stream output into a single stream, to prevent extra newlines from being inserted at flush calls.

Parameters:
  • cell (NotebookNode cell) – Notebook cell being processed
  • resources (dictionary) – Additional resources used in the conversion process. Allows preprocessors to pass variables into the Jinja engine.
  • index (int) – Index of the cell being processed
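
The merging idea can be sketched in plain Python on a list of output dictionaries. This is an illustration of the technique under stated assumptions, not nbconvert's implementation: the function name merge_consecutive_streams is hypothetical, and adjacent stream outputs are assumed to be merged only when they share the same stream name.

```python
def merge_consecutive_streams(outputs):
    """Merge adjacent stream outputs with the same name into one output."""
    merged = []
    for output in outputs:
        previous = merged[-1] if merged else None
        if (
            previous is not None
            and output.get("output_type") == "stream"
            and previous.get("output_type") == "stream"
            and previous.get("name") == output.get("name")
        ):
            # Same stream as the previous output: append the text to it.
            previous["text"] += output["text"]
        else:
            # Copy so the caller's dictionaries are left untouched.
            merged.append(dict(output))
    return merged


# Made-up outputs: two stdout chunks, one stderr chunk, then stdout again.
outputs = [
    {"output_type": "stream", "name": "stdout", "text": "a\n"},
    {"output_type": "stream", "name": "stdout", "text": "b\n"},
    {"output_type": "stream", "name": "stderr", "text": "warn\n"},
    {"output_type": "stream", "name": "stdout", "text": "c\n"},
]
merged = merge_consecutive_streams(outputs)
```

The first two stdout chunks collapse into one output, while the stderr chunk and the final stdout chunk stay separate because they are not adjacent to an output of the same stream.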