The sizes of saved vector outputs (PDF, EPS) are too large when using mplcairo as the Matplotlib backend

As the title says, I'm using mplcairo as the Matplotlib backend. However, the saved vector files are much larger than those produced by the default Matplotlib backend. I want to reduce the file sizes.

This question is also posted on GitHub: https://github.com/matplotlib/mplcairo/issues/37

Version Information

>>> import mplcairo
>>> mplcairo.get_versions()
{'python': '3.9.5 (default, Jun  4 2021, 12:28:51) \n[GCC 7.5.0]', 'mplcairo': '0.4', 'matplotlib': '3.5.1', 'cairo': '1.16.0', 'freetype': '2.10.1', 'pybind11': '2.6.2', 'raqm': None, 'hb': None}

Problem Specification

mplcairo Backend

Code:

from pathlib import Path

import numpy as np
import matplotlib
print("matplotlib.__version__:", matplotlib.__version__)
print('Default backend:', matplotlib.get_backend())
matplotlib.use("module://mplcairo.base")
# matplotlib.use("cairo")
print('Backend is now:', matplotlib.get_backend())
import matplotlib.pyplot as plt
matplotlib.rcParams['pdf.fonttype'] = 42
matplotlib.rcParams['ps.fonttype'] = 42


def format_size(num, suffix="B"):
    """Reference: https://stackoverflow.com/a/1094933
    """
    for unit in ["", "K", "M", "G", "T", "P", "E", "Z"]:
        if abs(num) < 1024.0:
            return f"{num:3.1f}{unit}{suffix}"
        num /= 1024.0
    return f"{num:.1f}Y{suffix}"


# Plot and save figures
fig, ax = plt.subplots(figsize=(8,6), dpi=300)
for i in range(5):
    ax.plot(range(100000), np.random.rand(100000), linewidth=2.0)
fig.savefig('./mplcairo_file_size_test.pdf', format='pdf', bbox_inches='tight')
fig.savefig('./mplcairo_file_size_test.eps', format='eps', bbox_inches='tight')
fig.savefig('./mplcairo_file_size_test.png', format='png', bbox_inches='tight')
print("Figures saved!")


# Display the sizes
pathlist = [ Path("./mplcairo_file_size_test.pdf"), Path("./mplcairo_file_size_test.eps"), Path("./mplcairo_file_size_test.png") ]
for path in sorted(pathlist):
    print("{:s}: {:s}".format(path.name, format_size(path.stat().st_size)))

Output:

matplotlib.__version__: 3.5.1
Default backend: module://matplotlib_inline.backend_inline
Backend is now: module://mplcairo.base
Figures saved!
mplcairo_file_size_test.eps: 8.4MB
mplcairo_file_size_test.pdf: 8.4MB
mplcairo_file_size_test.png: 63.2KB

Default Backend

Code is the same as the mplcairo code except that the matplotlib.use line is commented out so that the default backend is used.

Output:

matplotlib.__version__: 3.5.1
Default backend: module://matplotlib_inline.backend_inline
Backend is now: module://matplotlib_inline.backend_inline
Figures saved!
mplcairo_file_size_test.eps: 1.2MB
mplcairo_file_size_test.pdf: 535.4KB
mplcairo_file_size_test.png: 76.8KB
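
The backend switch described above can also be written as an explicit flag, with the sizes measured in memory rather than on disk. The following is only a minimal sketch: `USE_MPLCAIRO` and `saved_sizes` are hypothetical names that do not appear in the original script, and the number of points per line is reduced so that it runs quickly, so the sizes it prints will not match the figures quoted in this question.

```python
import io

import numpy as np
import matplotlib

# Hypothetical switch (not in the original script): set to True to select
# mplcairo, which must be installed. When False, Matplotlib keeps its default.
USE_MPLCAIRO = False
if USE_MPLCAIRO:
    matplotlib.use("module://mplcairo.base")
import matplotlib.pyplot as plt


def saved_sizes(fig, formats=("pdf", "eps", "png")):
    """Return {format: size in bytes} by saving the figure to in-memory buffers."""
    sizes = {}
    for fmt in formats:
        buf = io.BytesIO()
        fig.savefig(buf, format=fmt, bbox_inches="tight")
        sizes[fmt] = buf.getbuffer().nbytes
    return sizes


fig, ax = plt.subplots(figsize=(8, 6), dpi=300)
rng = np.random.default_rng(0)
for _ in range(5):
    # Assumption: 1000 points per line instead of 100000, for speed only.
    ax.plot(range(1000), rng.random(1000), linewidth=2.0)

sizes = saved_sizes(fig)
plt.close(fig)
for fmt, nbytes in sorted(sizes.items()):
    print(f"{fmt}: {nbytes} bytes")
```

Running it once with the flag off and once with it on gives two size tables that can be compared directly, without leaving test files behind.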

Observation

The vector files (PDF, EPS) produced by mplcairo are much larger than those produced by the default backend: 8.4MB vs. 1.2MB for EPS, and 8.4MB vs. 535.4KB for PDF. The gap widens as more lines (Artists) are added to the figure.
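
To put numbers on the comparison, the sizes printed in the two outputs above can be turned into ratios. This is a small sketch using only those reported values; `to_bytes` is a hypothetical helper that reverses the 1024-based formatting of `format_size`, and the results are limited by the one-decimal rounding of the printed sizes.

```python
# Multipliers matching the 1024-based units used by format_size.
UNITS = {"B": 1, "KB": 1024, "MB": 1024 ** 2}


def to_bytes(text):
    """Convert a string such as '535.4KB' to a byte count (hypothetical helper)."""
    # Check the longer suffixes first, because 'MB' and 'KB' also end in 'B'.
    for unit in ("MB", "KB", "B"):
        if text.endswith(unit):
            return float(text[: -len(unit)]) * UNITS[unit]
    raise ValueError(f"unrecognized size: {text!r}")


# (mplcairo, default backend) sizes as reported in the outputs above.
reported = {
    "eps": ("8.4MB", "1.2MB"),
    "pdf": ("8.4MB", "535.4KB"),
    "png": ("63.2KB", "76.8KB"),
}

ratios = {
    fmt: to_bytes(cairo) / to_bytes(default)
    for fmt, (cairo, default) in reported.items()
}
for fmt, ratio in ratios.items():
    print(f"{fmt}: mplcairo / default = {ratio:.1f}x")
# eps: mplcairo / default = 7.0x
# pdf: mplcairo / default = 16.1x
# png: mplcairo / default = 0.8x
```

So the vector outputs are roughly 7 to 16 times larger with mplcairo, while the PNG is slightly smaller.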



Solution 1:[1]

The mplcairo author has fixed this in the library (see the GitHub issue linked in the question). This question can therefore be closed.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 WinDerek