Comparing and Merging Jupyter Notebooks: A Guide to Using nbdime

Jupyter
Version Control
Collaboration
Author

Mehul Parakh, Palaksh Shah, Taddi Krishna Vamsi

Published

February 26, 2025

Introduction

Jupyter Notebooks are now ubiquitous among data scientists, researchers, and developers, serving as an interactive computing platform, a data visualization tool, and a collaborative environment. Yet version control for .ipynb files can be especially difficult. Unlike plain source files, Jupyter Notebooks are stored in a JSON-based format that contains not just code and markdown but also metadata, execution outputs, and cell structure. This complexity makes it hard for conventional line-based tools, such as git diff, to track and present changes effectively.

When several collaborators edit the same notebook, discrepancies frequently occur in the form of concurrent edits, differing execution outputs, or internal structure changes. Resolving them manually is frustrating, time-consuming, and error-prone. Enter nbdime (Notebook Diff and Merge), a specialized tool that addresses these issues with diffing and merging designed specifically for Jupyter Notebooks. In this blog, we’ll dive into how nbdime simplifies the process of diffing (identifying changes) and merging (combining changes) in Jupyter Notebooks. We’ll explore:

  • How nbdime provides clear, structured comparisons of .ipynb files, highlighting differences in code, markdown, outputs, and metadata.

  • Strategies for effectively merging conflicting changes, even in complex scenarios.

  • A step-by-step guide to using nbdime, both via the command line and its user-friendly web interface.

By the end of this blog, you’ll have a solid understanding of how to leverage nbdime to streamline collaboration and version control for Jupyter Notebooks, ensuring smoother workflows and more efficient teamwork.

Why Use nbdime?

Traditional diffing and merging tools handle Jupyter notebooks as plain text, which makes it hard to understand differences in code, outputs, and metadata. This usually leads to incomprehensible diffs, particularly for visual components such as plots and rich outputs. nbdime is built specifically for Jupyter notebooks, offering structured diffs that point out significant changes in code cells, outputs, and metadata. It also supports three-way merging, which helps resolve conflicts efficiently without compromising the notebook structure. Traditional diffing tools often produce output like the following:

Fig: Showing the output generated by diffing two Jupyter notebooks using conventional diffing tools.

If we look closely at the image, we can observe that traditional diffing tools simply compare the raw text and report the differing lines as they appear in the file, with no understanding of cells or their outputs. nbdime, by contrast, presents the differences in code, outputs, and metadata in a structured, interactive manner. It also offers strong merging functionality, enabling users to resolve conflicts without compromising the integrity of the notebook. Later sections show how nbdime handles merging.
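To make the problem concrete, here is a minimal sketch using only the Python standard library. The `make_notebook` helper is hypothetical and builds a stripped-down notebook dictionary in the style of nbformat 4; it is not how Jupyter itself writes files. The sketch then runs a plain line-based diff over the JSON text, roughly what a text diff tool has to work with.

```python
import difflib
import json


def make_notebook(source, output_text, execution_count):
    """Build a minimal nbformat-4 style notebook dict with one code cell (illustrative helper)."""
    return {
        "cells": [
            {
                "cell_type": "code",
                "execution_count": execution_count,
                "metadata": {},
                "outputs": [
                    {"name": "stdout", "output_type": "stream", "text": [output_text]}
                ],
                "source": [source],
            }
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


# The same cell before and after a one-character edit followed by a re-run.
before = make_notebook("print(10 + 11)", "21\n", execution_count=1)
after = make_notebook("print(10 + 12)", "22\n", execution_count=2)

before_lines = json.dumps(before, indent=1).splitlines()
after_lines = json.dumps(after, indent=1).splitlines()

# Plain line-based diff over the serialized JSON.
diff = list(
    difflib.unified_diff(before_lines, after_lines, "before.ipynb", "after.ipynb", lineterm="")
)
changed = [
    line for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]

print("\n".join(diff))
print(f"{len(changed)} changed lines for a one-character code edit")
```

The code edit, the new output, and the bumped execution counter all show up as unrelated JSON line changes, and a real notebook with plots would add long encoded image strings on top of that.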

Installation

To merge and diff Jupyter notebooks efficiently, we first need to install the nbdime package. nbdime is a Python package, and its installation is straightforward. Below is a step-by-step guide to get you started.

Step 1:- Ensure Python and pip are Installed

nbdime requires Python and pip (Python’s package installer). If you don’t already have them, download and install Python from python.org. pip is included by default with Python 3.4 and later.

Step 2:- To use nbdime, install it via pip:

pip install nbdime

This command installs nbdime and its dependencies. If you prefer to install it in a specific environment (e.g., using conda), activate your environment first.

Step 3:- Verify installation:

After installation, verify that nbdime is installed correctly by running:

nbdime --version

This should display the installed version of nbdime, confirming that the installation was successful.
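The same check can be done from inside Python, for example at the top of a notebook or a setup script. This is a small sketch using the standard library's `importlib.metadata` (Python 3.8 and later); the `installed_version` helper is a hypothetical name introduced here for illustration.

```python
from importlib import metadata


def installed_version(package="nbdime"):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("nbdime is not installed in this environment; run: pip install nbdime")
else:
    print(f"nbdime {version} is installed")
```

This only inspects package metadata in the current environment, so it is a quick way to confirm that the environment you activated is the one where nbdime was installed.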

Step 4:- Integrate nbdime with Git (Optional)

To integrate nbdime with Git for seamless version control, run the following command:

nbdime config-git --enable

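This registers nbdime as the diff and merge tool for .ipynb files in your Git configuration. Run it inside the repository you want to configure, or add the --global flag to apply it to all your repositories.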

Step 5:- Start Using nbdime

Now you can use nbdime to compare and merge notebooks directly from the command line or through its web interface. For example, to compare two notebooks, use:

python -m nbdime diff diffing_1.ipynb diffing_2.ipynb

Or, to launch the web-based diff viewer, run:

python -m nbdime diff-web diffing_1.ipynb diffing_2.ipynb

You can similarly merge files with the command:

python -m nbdime merge merging_1.ipynb merging_2.ipynb --out merged_output1.ipynb

Or, to launch the web-based merge viewer, run:

python -m nbdime merge-web base.ipynb merging_1.ipynb merging_2.ipynb

Key Features and Command-Line Commands Explained

1. Notebook-Specific Diffing

  • Intelligent Comparison: Unlike traditional diff tools that treat notebooks as plain JSON, nbdime understands the structure of Jupyter Notebooks. It can intelligently compare:

○ Code cells

○ Markdown cells

○ Outputs (e.g., plots, tables, and text)

○ Metadata (e.g., cell execution order, kernel information)

  • Context-Awareness: It preserves the notebook’s structure and readability, making it easier to track changes.
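To illustrate what comparing a notebook by its structure means, the sketch below diffs two notebooks cell by cell instead of line by line. It is a conceptual illustration built on the standard library's `difflib`, not nbdime's actual algorithm, and the helper names (`code_cell`, `cell_sources`, `diff_cells`) are hypothetical.

```python
import difflib


def code_cell(source, outputs=()):
    """Build a minimal code-cell dict with only the fields this sketch reads."""
    return {"cell_type": "code", "source": source, "outputs": list(outputs)}


def cell_sources(notebook):
    """Return (cell_type, source) pairs, deliberately ignoring outputs and metadata."""
    pairs = []
    for cell in notebook["cells"]:
        source = cell["source"]
        if isinstance(source, list):  # the notebook format also allows a list of lines
            source = "".join(source)
        pairs.append((cell["cell_type"], source))
    return pairs


def diff_cells(old_nb, new_nb):
    """Return (operation, old_cells, new_cells) for every region that is not equal."""
    old, new = cell_sources(old_nb), cell_sources(new_nb)
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    return [
        (tag, old[i1:i2], new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


shared = code_cell("a=10\nb=11\nprint(a+b)", outputs=["21"])
old_nb = {"cells": [shared, code_cell('label="Equation of line: y=x"')]}
new_nb = {"cells": [shared, code_cell('label="Equation of line: y=-x"')]}

changes = diff_cells(old_nb, new_nb)
for tag, old_cells, new_cells in changes:
    print(tag, "| old:", old_cells, "| new:", new_cells)
```

Because whole cells are the unit of comparison, the unchanged first cell is skipped entirely and only the edited cell is reported, which is the kind of readability the bullets above describe.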

2. Command-Line Interface (CLI)

nbdiff (nbdime diff): Compare two notebooks and display differences in the terminal.

python -m nbdime diff diffing_1.ipynb diffing_2.ipynb

Let’s break down the command:

python -m nbdime

  • This part of the command tells Python to run the nbdime module as a script.

  • -m stands for “module” and allows you to run a Python module directly from the command line.

  • nbdime is the module that provides the diffing and merging functionality for Jupyter Notebooks.

diff

  • This is the subcommand provided by nbdime to compare two notebooks.

  • It tells nbdime to perform a diff operation (i.e., show the differences between the two files).

diffing_1.ipynb diffing_2.ipynb

  • These are the two Jupyter Notebook files you want to compare.

  • diffing_1.ipynb is the first notebook file.

  • diffing_2.ipynb is the second notebook file.

  • nbdime will compare these two files and display the differences.

To compare notebooks and see a web-based diff, we use the command:

python -m nbdime diff-web diffing_1.ipynb diffing_2.ipynb

This command breaks down the same way as the terminal-based one; the only difference is the subcommand:

diff-web

  • This is the subcommand provided by nbdime to launch a web-based diff viewer.

  • Unlike the diff command, which outputs differences in the terminal, diff-web opens a visual, interactive web interface for comparing notebooks.

nbmerge (nbdime merge): Merge changes from one notebook into another.

python -m nbdime merge merging_1.ipynb merging_2.ipynb --out merged_output1.ipynb

The python -m nbdime part means the same in all cases. What changes here is:

merge

  • This is the subcommand provided by nbdime to merge two notebooks.

  • It tells nbdime to perform a merge operation (i.e., combine changes from both notebooks into a single notebook).

merging_1.ipynb merging_2.ipynb

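  • These are the two Jupyter Notebook files you want to merge.

  • merging_1.ipynb is the first notebook file (treated here as the “base” notebook).

  • merging_2.ipynb is the second notebook file (containing the changes to be merged into the base notebook).

  • Note: nbdime’s documentation describes merge as a three-way operation that takes the base, local, and remote notebooks as positional arguments. If your installed version rejects the two-file form, pass the common ancestor as the first argument, as in the merge-web command.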

--out merged_output1.ipynb

  • This is an optional flag that specifies the output file where the merged notebook will be saved.

  • merged_output1.ipynb is the name of the output file. If this flag is not provided, the merged notebook will be printed to the terminal.

To perform web-based merging:

python -m nbdime merge-web base.ipynb merging_1.ipynb merging_2.ipynb

merge-web

  • This is the subcommand provided by nbdime to launch a web-based merge viewer.

  • Unlike the merge command, which performs the merge directly in the terminal, merge-web opens a visual, interactive web interface for merging notebooks.

base.ipynb merging_1.ipynb merging_2.ipynb

  • These are the three Jupyter Notebook files involved in the merge operation:

○ base.ipynb: The common ancestor or base version of the notebook.

○ merging_1.ipynb: The first modified version of the notebook.

○ merging_2.ipynb: The second modified version of the notebook.

  • nbdime will compare the changes in merging_1.ipynb and merging_2.ipynb relative to base.ipynb and merge them.

  • Web-based merging takes three files because it performs a three-way merge: the base notebook is shown alongside the two modified versions, making it easy to understand changes and resolve conflicts.

The rest of the command means the same as in terminal-based merging.
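The idea behind a three-way merge can be sketched in a few lines of Python. This is a simplified illustration that assumes all three notebooks have the same number of cells and compares only cell sources; nbdime's real merge also handles inserted and deleted cells, outputs, and metadata. The `merge_cells` function is a hypothetical helper, not part of nbdime.

```python
def merge_cells(base, local, remote):
    """Three-way merge of equally long lists of cell sources.

    Returns (merged, conflicts), where conflicts holds the indices of cells
    that both sides changed in different ways.
    """
    if not (len(base) == len(local) == len(remote)):
        raise ValueError("this sketch only handles notebooks with the same number of cells")

    merged, conflicts = [], []
    for index, (b, l, r) in enumerate(zip(base, local, remote)):
        if l == r:        # both sides agree (or neither side changed the cell)
            merged.append(l)
        elif l == b:      # only the remote side changed the cell
            merged.append(r)
        elif r == b:      # only the local side changed the cell
            merged.append(l)
        else:             # both sides changed the cell differently
            merged.append(l)  # keep the local version as a placeholder
            conflicts.append(index)
    return merged, conflicts


base = ["a=5\nb=6\nprint(a+b)", "x=10\ny=11\nprint(y-x)"]
local = list(base)                                   # unchanged copy of the base
remote = [base[0], "x=10\ny=11\nprint(x+y)"]         # second cell edited

merged, conflicts = merge_cells(base, local, remote)
print("conflicts:", conflicts)

# With a hypothetical older ancestor, both sides count as having edited the second cell.
older_base = [base[0], "x=10\ny=11"]
_, older_conflicts = merge_cells(older_base, local, remote)
print("conflicts against older base:", older_conflicts)
```

The base version is what lets the merge tell "only one side changed this cell" apart from "both sides changed it", which is why a two-way comparison alone cannot resolve edits automatically.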

3. Web-Based Interface

  • Interactive Visualization: Launch a web-based diff viewer for a more intuitive and user-friendly experience.

  • Side-by-Side Comparison: View differences between notebooks in a clean, side-by-side layout.

  • Conflict Resolution: Easily resolve merge conflicts in a visual interface.

4. Git Integration

  • Seamless Version Control: nbdime can be configured as the default diff and merge tool for Jupyter Notebooks in Git.

nbdime config-git --enable

  • Git Diff and Merge: Automatically use nbdime for git diff and git merge operations on .ipynb files.

5. Output Comparison

  • Output-Aware Diffing: nbdime can compare notebook outputs (e.g., plots, tables, and text) in addition to code and markdown.

  • Output Filtering: Optionally ignore outputs during diffing to focus on code and markdown changes.

6. Customizable Diffing

  • Configurable Settings: Customize how nbdime handles diffs, such as ignoring metadata or specific cell types.

  • Filtering Options: Exclude certain cells or outputs from the diff process.
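As a sketch of how these filters might be used from a script, the helper below assembles an nbdime diff command with the `--ignore-outputs` and `--ignore-metadata` flags, which nbdime documents for its diff command at the time of writing; run `python -m nbdime diff --help` to confirm what your installed version supports. The function names `nbdiff_command` and `run_nbdiff` are hypothetical wrappers introduced here.

```python
import subprocess
import sys


def nbdiff_command(old, new, ignore_outputs=False, ignore_metadata=False):
    """Build the argument list for a terminal diff of two notebooks."""
    command = [sys.executable, "-m", "nbdime", "diff"]
    if ignore_outputs:
        command.append("--ignore-outputs")   # focus on code and markdown changes
    if ignore_metadata:
        command.append("--ignore-metadata")  # skip notebook and cell metadata
    return command + [old, new]


def run_nbdiff(old, new, **options):
    """Run the diff and return its text output (requires nbdime to be installed)."""
    result = subprocess.run(
        nbdiff_command(old, new, **options), capture_output=True, text=True
    )
    return result.stdout


# Only build and show the command here; call run_nbdiff(...) to actually execute it.
command = nbdiff_command("diffing_1.ipynb", "diffing_2.ipynb", ignore_outputs=True)
print(" ".join(command))
```

Passing the arguments as a list avoids shell quoting problems with notebook names that contain spaces.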

7. Cross-Platform Support

  • Works Everywhere: nbdime is compatible with Linux, macOS, and Windows.

  • Easy Installation: Install via pip (pip install nbdime) or conda (conda install -c conda-forge nbdime).

8. Lightweight and Fast

  • Efficient Performance: Designed to handle large notebooks efficiently, even with complex outputs.

  • Minimal Dependencies: Lightweight and easy to integrate into existing workflows.

9. Open Source and Actively Maintained

  • Community-Driven: nbdime is an open-source project with active development and community support.

  • Extensible: Developers can extend its functionality to suit specific needs.

10. Conflict Resolution

  • Merge Conflicts: nbdime provides tools to resolve conflicts during notebook merges, ensuring smooth collaboration.

  • Interactive Resolution: Use the web interface to resolve conflicts interactively.

Code Examples

Let’s study some examples.

First, let’s take an example of diffing. Create a Jupyter file named diffing_1.ipynb with the cells below:

import matplotlib.pyplot as plt
import pandas as pd
a=10
b=11
print(a+b)
21
x=25
y=13
print(x+y)
38
data=pd.DataFrame({
    "X" : [1,2,3,4,5],
    "Y" : [1,2,3,4,5]
})
plt.plot(data['X'],data['Y'],label="Equation of line: y=x")
plt.grid()
plt.legend()
plt.show()

Output

Let’s take another Jupyter file for diffing, named diffing_2.ipynb:

import matplotlib.pyplot as plt
import pandas as pd
a=10
b=11
print(a+b)
21
x=25
y=13
print(x+y)
38
data=pd.DataFrame({
    "X" : [1,2,3,4,5],
    "Y" : [5,4,3,2,1]
})
plt.plot(data['X'],data['Y'],label="Equation of line: y=-x")
plt.grid()
plt.legend()
plt.show()

Output
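If you would rather generate the two example notebooks than type them by hand, the following sketch writes them as minimal nbformat 4 JSON using only the standard library. The `write_notebook` helper and the temporary output directory are assumptions made for illustration, and the cells are written unexecuted, so the files contain no outputs until you run them in Jupyter.

```python
import json
import tempfile
from pathlib import Path

# Cells that are identical in both example notebooks.
SHARED_CELLS = [
    "import matplotlib.pyplot as plt\nimport pandas as pd",
    "a=10\nb=11\nprint(a+b)",
    "x=25\ny=13\nprint(x+y)",
]

# Template for the plotting cell; {y} and {label} differ between the notebooks.
PLOT_CELL = '''data=pd.DataFrame({{
    "X" : [1,2,3,4,5],
    "Y" : {y}
}})
plt.plot(data['X'],data['Y'],label="Equation of line: {label}")
plt.grid()
plt.legend()
plt.show()'''


def write_notebook(path, sources):
    """Write unexecuted code cells to `path` as a minimal nbformat 4.4 notebook."""
    notebook = {
        "cells": [
            {
                "cell_type": "code",
                "execution_count": None,
                "metadata": {},
                "outputs": [],
                "source": source,
            }
            for source in sources
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }
    Path(path).write_text(json.dumps(notebook, indent=1), encoding="utf-8")


# Temporary directory chosen for this sketch; use Path(".") to write into the current folder.
out_dir = Path(tempfile.mkdtemp())
write_notebook(
    out_dir / "diffing_1.ipynb",
    SHARED_CELLS + [PLOT_CELL.format(y="[1,2,3,4,5]", label="y=x")],
)
write_notebook(
    out_dir / "diffing_2.ipynb",
    SHARED_CELLS + [PLOT_CELL.format(y="[5,4,3,2,1]", label="y=-x")],
)
print("wrote notebooks to", out_dir)
```

Open and run both files in Jupyter before diffing if you also want the output differences, such as the two plots, to appear in the comparison.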

Now we can diff the files using:

python -m nbdime diff diffing_1.ipynb diffing_2.ipynb

Output: Fig: This is the output we get after diffing in the terminal

Now let’s get the web-based diffing output:

python -m nbdime diff-web diffing_1.ipynb diffing_2.ipynb

Output: Fig: Web-based output for diffing of Jupyter notebooks

Here we can clearly see the result of diffing the two notebooks; nbdime gives us a more interactive and user-friendly output.

Now let’s try merging files.

First, we will create a file named merging_1.ipynb:

import pandas as pd
import matplotlib.pyplot as plt
a=5
b=6
print(a+b)
11
x=10
y=11
print(y-x)
1
data=pd.DataFrame({
    "x": [1,9],
    "y": [2,3]
})
plt.plot(data["x"],data["y"])
plt.grid()
plt.xlim(1,9)
plt.ylim(2,3)
plt.show()

Output

Now we will create another file named merging_2.ipynb:

import pandas as pd
import matplotlib.pyplot as plt
a=5
b=6
print(a+b)
11
x=10
y=11
print(x+y)
21
data=pd.DataFrame({
    "x": [1,9],
    "y": [6,2]
})
plt.plot(data["x"],data["y"])
plt.grid()
plt.xlim(1,9)
plt.ylim(2,6)
plt.show()

Output

We can run the terminal-based merge, which stores the result directly in a new Jupyter file, with the following command:

python -m nbdime merge merging_1.ipynb merging_2.ipynb --out merged_output1.ipynb

Output:

Fig: Showing the output of merging using nbdime in the terminal window

Here we can clearly see that a file named merged_output1.ipynb has been created and the merged notebook is stored in it. The locations where conflicts occurred are also clearly shown, so they can be resolved manually.

Now let’s try the web-based merge.

Since web-based merging is a three-way merge, it needs a base file. Here we simply copy merging_1.ipynb to base.ipynb, so that merging_1.ipynb acts as both the base and the first modified version.

This can be done with the following command in a Windows terminal (on macOS or Linux, use cp instead of copy):

copy merging_1.ipynb base.ipynb

Now let’s run the web-based merge command:

python -m nbdime merge-web base.ipynb merging_1.ipynb merging_2.ipynb

Output:

Fig: Web-based output for merging of Jupyter notebooks

nbdime presents the merge of the two Jupyter notebooks in a way that lets you compare the conflicting changes in a clearer interface.

For more examples read more….

Use Cases

1. Merge Conflict Resolution in Team Workflows

  • If several team members collaborate on the same Jupyter notebook, manual merging of changes can be tricky because of JSON-based conflicts.

  • nbdime’s three-way merge intelligently merges edits without risking lost work.

  • It indicates code changes, markdown edits, and output changes, making it easier to resolve conflicts.

2. Readable and Clean Diffs for Code Review

  • In collaborative projects, comparing changes in Jupyter notebooks is challenging with traditional diff tools that show raw JSON.

  • nbdime offers a cell-by-cell structured diff, making it easy for teams to comprehend changes in code, text, and outputs.

  • This enhances peer reviews and accelerates the approval process.

3. Effective Collaboration with Git Integration

  • Teams working with Git version control can set nbdime as the default diff and merge tool for notebooks.

  • This makes it possible for developers, data scientists, and researchers to follow changes easily without having to work with unstructured JSON diffs.

  • Assists with feature branch management, parallel development, and reproducibility.

4. Experimentation and Model Iterations Tracking

  • Various members in data science and machine learning teams might make adjustments to models, datasets, or parameters within the same notebook.

  • nbdime makes comparisons across various versions easy, such that no critical changes are missed.

  • This is especially helpful for monitoring progress across several rounds of analysis.

5. Avoiding Accidental Overwrites and Lost Work

  • In collaborative projects, two authors may edit the same notebook at the same time.

  • Without nbdime, combining changes could result in lost changes because of JSON conflicts.

  • nbdime’s structured merging preserves all contributions, avoiding unnecessary rework.

Conclusion

nbdime is a game-changer for managing Jupyter Notebooks, offering notebook-aware diffing and merging that traditional tools can’t match. Whether you’re collaborating, resolving conflicts, or automating workflows, nbdime supports smooth version control and efficient workflows. Its command-line tools and web-based interface make it easy to track changes and merge notebooks intelligently. For anyone working with Jupyter Notebooks, nbdime is an essential tool to enhance productivity and collaboration. For a better understanding of the library, visit the nbdime documentation.