<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="../assets/xml/rss.xsl" media="all"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Blosc Home Page  (Posts by Francesc Alted, Luke Shaw)</title><link>https://blosc.org/</link><description></description><atom:link href="https://blosc.org/authors/francesc-alted-luke-shaw.xml" rel="self" type="application/rss+xml"></atom:link><language>en</language><copyright>Contents © 2026 &lt;a href="mailto:blosc@blosc.org"&gt;The Blosc Developers&lt;/a&gt; </copyright><lastBuildDate>Wed, 04 Mar 2026 11:43:34 GMT</lastBuildDate><generator>Nikola (getnikola.com)</generator><docs>http://blogs.law.harvard.edu/tech/rss</docs><item><title>Blosc2: A Universal Lazy Engine for Array Operations</title><link>https://blosc.org/posts/tensordot-pure-persistent/</link><dc:creator>Francesc Alted, Luke Shaw</dc:creator><description>&lt;p&gt;While compression is often seen merely as a way to save storage, the Blosc development team has long viewed it as a foundational element for high-performance computing. This philosophy is at the heart of Blosc2, which is not just a compression library but a powerful framework for handling large datasets. This post will highlight one of Python-Blosc2's most exciting capabilities: its lazy evaluation engine for array operations.&lt;/p&gt;
&lt;p&gt;Libraries optimised for computation on large datasets that don't fit in memory - such as Dask or Spark - often use lazy evaluation of computation expressions. This typically speeds up evaluation since one can build the full chain of computations and only execute them when the final result is needed. Consequently, Python-Blosc2's compute engine also uses the lazy imperative paradigm, which proves to be both &lt;a class="reference external" href="https://ironarray.io/blog/compute-bigger"&gt;powerful and efficient&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;An additional benefit of the engine is its ability to act as a universal backend. Python-Blosc2 has a native &lt;code class="docutils literal"&gt;blosc2.NDArray&lt;/code&gt; format, but it can also easily execute lazy operations on arrays from other popular libraries like NumPy, HDF5, Zarr, Xarray or TileDB - basically any array object which complies with a minimal protocol.&lt;/p&gt;
&lt;p&gt;In the recent &lt;a class="reference external" href="https://github.com/Blosc/python-blosc2/releases"&gt;Python-Blosc2 3.10.x series&lt;/a&gt;, we added support for lazy evaluation of eager functions, expanding the capabilities of the compute engine, and making interaction with other formats easier. Let's explore how this works using an out-of-core &lt;a class="reference external" href="https://www.blosc.org/python-blosc2/reference/linalg.html#blosc2.linalg.tensordot"&gt;tensordot&lt;/a&gt; operation as an example.&lt;/p&gt;
&lt;section id="from-eager-to-lazy-with-blosc2-lazyexpr"&gt;
&lt;h2&gt;From Eager to Lazy with &lt;code class="docutils literal"&gt;blosc2.lazyexpr&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;Functions which return a result with a different shape to the input operands - such as reductions or linear algebra operations - must be evaluated eagerly (computed and the result returned immediately). For example, &lt;code class="docutils literal"&gt;blosc2.tensordot()&lt;/code&gt; executes eagerly.&lt;/p&gt;
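&lt;p&gt;To make the shape change concrete, here is a small in-memory sketch that uses NumPy's &lt;code class="docutils literal"&gt;numpy.tensordot&lt;/code&gt; rather than Blosc2; the array sizes are arbitrary and chosen only for illustration. Contracting axes 0 and 1 of both operands leaves only the trailing axis of each, so the result has a different shape from either input.&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code python"&gt;import numpy as np

# Small stand-ins for the large on-disk operands (sizes are illustrative)
a = np.arange(3 * 4 * 5, dtype=np.float64).reshape(3, 4, 5)
b = np.ones((3, 4, 6), dtype=np.float64)

axis = (0, 1)
# Contract axes 0 and 1 of a with axes 0 and 1 of b
res = np.tensordot(a, b, axes=(axis, axis))

print(a.shape, b.shape, res.shape)  # (3, 4, 5) (3, 4, 6) (5, 6)
&lt;/pre&gt;&lt;/div&gt;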
&lt;p&gt;Nevertheless, we can defer this computation by wrapping the call in a string and passing it to &lt;code class="docutils literal"&gt;blosc2.lazyexpr&lt;/code&gt;. This creates a &lt;code class="docutils literal"&gt;LazyExpr&lt;/code&gt; object that represents the operation without executing it.&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code python"&gt;&lt;a id="rest_code_e8aec2e5ad384155a19b11cabe84224a-1" name="rest_code_e8aec2e5ad384155a19b11cabe84224a-1" href="https://blosc.org/posts/tensordot-pure-persistent/#rest_code_e8aec2e5ad384155a19b11cabe84224a-1"&gt;&lt;/a&gt;&lt;span class="c1"&gt;# Assume a and b are large, on-disk blosc2 arrays&lt;/span&gt;
&lt;a id="rest_code_e8aec2e5ad384155a19b11cabe84224a-2" name="rest_code_e8aec2e5ad384155a19b11cabe84224a-2" href="https://blosc.org/posts/tensordot-pure-persistent/#rest_code_e8aec2e5ad384155a19b11cabe84224a-2"&gt;&lt;/a&gt;&lt;span class="n"&gt;axis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;a id="rest_code_e8aec2e5ad384155a19b11cabe84224a-3" name="rest_code_e8aec2e5ad384155a19b11cabe84224a-3" href="https://blosc.org/posts/tensordot-pure-persistent/#rest_code_e8aec2e5ad384155a19b11cabe84224a-3"&gt;&lt;/a&gt;
&lt;a id="rest_code_e8aec2e5ad384155a19b11cabe84224a-4" name="rest_code_e8aec2e5ad384155a19b11cabe84224a-4" href="https://blosc.org/posts/tensordot-pure-persistent/#rest_code_e8aec2e5ad384155a19b11cabe84224a-4"&gt;&lt;/a&gt;&lt;span class="c1"&gt;# Create a lazy expression object&lt;/span&gt;
&lt;a id="rest_code_e8aec2e5ad384155a19b11cabe84224a-5" name="rest_code_e8aec2e5ad384155a19b11cabe84224a-5" href="https://blosc.org/posts/tensordot-pure-persistent/#rest_code_e8aec2e5ad384155a19b11cabe84224a-5"&gt;&lt;/a&gt;&lt;span class="n"&gt;lexpr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;blosc2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lazyexpr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"tensordot(a, b, axes=(axis, axis))"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;a id="rest_code_e8aec2e5ad384155a19b11cabe84224a-6" name="rest_code_e8aec2e5ad384155a19b11cabe84224a-6" href="https://blosc.org/posts/tensordot-pure-persistent/#rest_code_e8aec2e5ad384155a19b11cabe84224a-6"&gt;&lt;/a&gt;
&lt;a id="rest_code_e8aec2e5ad384155a19b11cabe84224a-7" name="rest_code_e8aec2e5ad384155a19b11cabe84224a-7" href="https://blosc.org/posts/tensordot-pure-persistent/#rest_code_e8aec2e5ad384155a19b11cabe84224a-7"&gt;&lt;/a&gt;&lt;span class="c1"&gt;# The computation has not run yet.&lt;/span&gt;
&lt;a id="rest_code_e8aec2e5ad384155a19b11cabe84224a-8" name="rest_code_e8aec2e5ad384155a19b11cabe84224a-8" href="https://blosc.org/posts/tensordot-pure-persistent/#rest_code_e8aec2e5ad384155a19b11cabe84224a-8"&gt;&lt;/a&gt;&lt;span class="c1"&gt;# To execute it and save the result to a new persistent array:&lt;/span&gt;
&lt;a id="rest_code_e8aec2e5ad384155a19b11cabe84224a-9" name="rest_code_e8aec2e5ad384155a19b11cabe84224a-9" href="https://blosc.org/posts/tensordot-pure-persistent/#rest_code_e8aec2e5ad384155a19b11cabe84224a-9"&gt;&lt;/a&gt;&lt;span class="n"&gt;out_blosc2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;lexpr&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;compute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;urlpath&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"out.b2nd"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"w"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This is useful, and highly efficient both in terms of computation time and memory usage, as we'll see later. But the real magic happens when we use this computation engine with other array formats.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="one-engine-many-backends"&gt;
&lt;h2&gt;One Engine, Many Backends&lt;/h2&gt;
&lt;p&gt;The &lt;code class="docutils literal"&gt;blosc2.evaluate()&lt;/code&gt; function takes the same string expression but can operate on any array-like objects that follow the &lt;code class="docutils literal"&gt;blosc2.Array&lt;/code&gt; protocol. This protocol simply requires the object to have &lt;code class="docutils literal"&gt;shape&lt;/code&gt;, &lt;code class="docutils literal"&gt;dtype&lt;/code&gt;, &lt;code class="docutils literal"&gt;__getitem__&lt;/code&gt;, and &lt;code class="docutils literal"&gt;__setitem__&lt;/code&gt; attributes, which are standard in &lt;code class="docutils literal"&gt;h5py&lt;/code&gt;, &lt;code class="docutils literal"&gt;zarr&lt;/code&gt;, &lt;code class="docutils literal"&gt;tiledb&lt;/code&gt;, &lt;code class="docutils literal"&gt;xarray&lt;/code&gt; and &lt;code class="docutils literal"&gt;numpy&lt;/code&gt; arrays.&lt;/p&gt;
&lt;p&gt;This means you can use Blosc2's efficient evaluation engine to perform out-of-core computations directly on your existing (HDF5, Zarr, etc.) datasets.&lt;/p&gt;
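&lt;p&gt;As a rough illustration of how small that protocol is, the sketch below checks for the four members by duck typing. Both &lt;code class="docutils literal"&gt;looks_like_array&lt;/code&gt; and &lt;code class="docutils literal"&gt;InMemoryArray&lt;/code&gt; are hypothetical names invented for this example; they are not part of Blosc2, and the check only approximates whatever validation the library performs internally.&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code python"&gt;import numpy as np

# The four members named in the text
REQUIRED = ("shape", "dtype", "__getitem__", "__setitem__")


def looks_like_array(obj):
    """Return True if obj exposes all four required members (hypothetical helper)."""
    return all(hasattr(obj, name) for name in REQUIRED)


class InMemoryArray:
    """Hypothetical minimal container backed by a NumPy buffer."""

    def __init__(self, shape, dtype):
        self._data = np.zeros(shape, dtype=dtype)

    @property
    def shape(self):
        return self._data.shape

    @property
    def dtype(self):
        return self._data.dtype

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value


arr = InMemoryArray((4, 3), "float64")
arr[1:3] = 7.0  # slice assignment goes through __setitem__

# A plain list supports indexing but lacks shape and dtype
print(looks_like_array(arr), looks_like_array(np.ones(2)), looks_like_array([1, 2]))
# True True False
&lt;/pre&gt;&lt;/div&gt;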
&lt;section id="example-with-hdf5"&gt;
&lt;h3&gt;Example with HDF5&lt;/h3&gt;
&lt;p&gt;Here, we instruct &lt;code class="docutils literal"&gt;blosc2.evaluate&lt;/code&gt; to run the &lt;code class="docutils literal"&gt;tensordot&lt;/code&gt; operation on two &lt;code class="docutils literal"&gt;h5py&lt;/code&gt; datasets and store the result in a third one.&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code python"&gt;&lt;a id="rest_code_9725b4c2abdb467c8396e9df47f0c4ab-1" name="rest_code_9725b4c2abdb467c8396e9df47f0c4ab-1" href="https://blosc.org/posts/tensordot-pure-persistent/#rest_code_9725b4c2abdb467c8396e9df47f0c4ab-1"&gt;&lt;/a&gt;&lt;span class="c1"&gt;# Open HDF5 datasets&lt;/span&gt;
&lt;a id="rest_code_9725b4c2abdb467c8396e9df47f0c4ab-2" name="rest_code_9725b4c2abdb467c8396e9df47f0c4ab-2" href="https://blosc.org/posts/tensordot-pure-persistent/#rest_code_9725b4c2abdb467c8396e9df47f0c4ab-2"&gt;&lt;/a&gt;&lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;h5py&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;File&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"a_b_out.h5"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"a"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;a id="rest_code_9725b4c2abdb467c8396e9df47f0c4ab-3" name="rest_code_9725b4c2abdb467c8396e9df47f0c4ab-3" href="https://blosc.org/posts/tensordot-pure-persistent/#rest_code_9725b4c2abdb467c8396e9df47f0c4ab-3"&gt;&lt;/a&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"a"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;a id="rest_code_9725b4c2abdb467c8396e9df47f0c4ab-4" name="rest_code_9725b4c2abdb467c8396e9df47f0c4ab-4" href="https://blosc.org/posts/tensordot-pure-persistent/#rest_code_9725b4c2abdb467c8396e9df47f0c4ab-4"&gt;&lt;/a&gt;&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"b"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;a id="rest_code_9725b4c2abdb467c8396e9df47f0c4ab-5" name="rest_code_9725b4c2abdb467c8396e9df47f0c4ab-5" href="https://blosc.org/posts/tensordot-pure-persistent/#rest_code_9725b4c2abdb467c8396e9df47f0c4ab-5"&gt;&lt;/a&gt;&lt;span class="n"&gt;out_hdf5&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"out"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;a id="rest_code_9725b4c2abdb467c8396e9df47f0c4ab-6" name="rest_code_9725b4c2abdb467c8396e9df47f0c4ab-6" href="https://blosc.org/posts/tensordot-pure-persistent/#rest_code_9725b4c2abdb467c8396e9df47f0c4ab-6"&gt;&lt;/a&gt;
&lt;a id="rest_code_9725b4c2abdb467c8396e9df47f0c4ab-7" name="rest_code_9725b4c2abdb467c8396e9df47f0c4ab-7" href="https://blosc.org/posts/tensordot-pure-persistent/#rest_code_9725b4c2abdb467c8396e9df47f0c4ab-7"&gt;&lt;/a&gt;&lt;span class="c1"&gt;# Use blosc2.evaluate() with HDF5 arrays&lt;/span&gt;
&lt;a id="rest_code_9725b4c2abdb467c8396e9df47f0c4ab-8" name="rest_code_9725b4c2abdb467c8396e9df47f0c4ab-8" href="https://blosc.org/posts/tensordot-pure-persistent/#rest_code_9725b4c2abdb467c8396e9df47f0c4ab-8"&gt;&lt;/a&gt;&lt;span class="n"&gt;blosc2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"tensordot(a, b, axes=(axis, axis))"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;out_hdf5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Notice that the expression string is identical to the one we used before. &lt;code class="docutils literal"&gt;blosc2&lt;/code&gt; inspects the objects in the expression's namespace and computes with them, regardless of their underlying format.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="example-with-zarr"&gt;
&lt;h3&gt;Example with Zarr&lt;/h3&gt;
&lt;p&gt;The same principle applies to Zarr arrays.&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code python"&gt;&lt;a id="rest_code_57a303deff314380b2b76a8b748fca29-1" name="rest_code_57a303deff314380b2b76a8b748fca29-1" href="https://blosc.org/posts/tensordot-pure-persistent/#rest_code_57a303deff314380b2b76a8b748fca29-1"&gt;&lt;/a&gt;&lt;span class="c1"&gt;# Open Zarr arrays&lt;/span&gt;
&lt;a id="rest_code_57a303deff314380b2b76a8b748fca29-2" name="rest_code_57a303deff314380b2b76a8b748fca29-2" href="https://blosc.org/posts/tensordot-pure-persistent/#rest_code_57a303deff314380b2b76a8b748fca29-2"&gt;&lt;/a&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;zarr&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"a.zarr"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"r"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;a id="rest_code_57a303deff314380b2b76a8b748fca29-3" name="rest_code_57a303deff314380b2b76a8b748fca29-3" href="https://blosc.org/posts/tensordot-pure-persistent/#rest_code_57a303deff314380b2b76a8b748fca29-3"&gt;&lt;/a&gt;&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;zarr&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"b.zarr"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"r"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;a id="rest_code_57a303deff314380b2b76a8b748fca29-4" name="rest_code_57a303deff314380b2b76a8b748fca29-4" href="https://blosc.org/posts/tensordot-pure-persistent/#rest_code_57a303deff314380b2b76a8b748fca29-4"&gt;&lt;/a&gt;&lt;span class="n"&gt;zout&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;zarr&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;open_array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"out.zarr"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"w"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;a id="rest_code_57a303deff314380b2b76a8b748fca29-5" name="rest_code_57a303deff314380b2b76a8b748fca29-5" href="https://blosc.org/posts/tensordot-pure-persistent/#rest_code_57a303deff314380b2b76a8b748fca29-5"&gt;&lt;/a&gt;
&lt;a id="rest_code_57a303deff314380b2b76a8b748fca29-6" name="rest_code_57a303deff314380b2b76a8b748fca29-6" href="https://blosc.org/posts/tensordot-pure-persistent/#rest_code_57a303deff314380b2b76a8b748fca29-6"&gt;&lt;/a&gt;&lt;span class="c1"&gt;# Use blosc2.evaluate() with Zarr arrays&lt;/span&gt;
&lt;a id="rest_code_57a303deff314380b2b76a8b748fca29-7" name="rest_code_57a303deff314380b2b76a8b748fca29-7" href="https://blosc.org/posts/tensordot-pure-persistent/#rest_code_57a303deff314380b2b76a8b748fca29-7"&gt;&lt;/a&gt;&lt;span class="n"&gt;blosc2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"tensordot(a, b, axes=(axis, axis))"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;zout&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This makes &lt;code class="docutils literal"&gt;blosc2.evaluate&lt;/code&gt; a powerful, backend-agnostic tool for out-of-core array computations.&lt;/p&gt;
&lt;/section&gt;
&lt;/section&gt;
&lt;section id="performance-comparison"&gt;
&lt;h2&gt;Performance Comparison&lt;/h2&gt;
&lt;p&gt;As well as offering smooth integration, &lt;code class="docutils literal"&gt;blosc2.evaluate&lt;/code&gt; is highly performant. Python-Blosc2 uses a lazy evaluation engine that integrates tightly with the Blosc2 format. This means that the computation is performed on-the-fly, without any intermediate copies. This is a huge advantage for large datasets, as it allows us to perform computations on arrays that don't fit in memory.  In addition, it actively tries to leverage the hierarchical memory layout in modern CPUs, so that it can use both private and shared caches in the best way possible.&lt;/p&gt;
&lt;p&gt;We ran a &lt;a class="reference external" href="https://github.com/Blosc/python-blosc2/blob/main/bench/ndarray/tensordot_pure_persistent.ipynb"&gt;benchmark&lt;/a&gt; performing a &lt;code class="docutils literal"&gt;tensordot&lt;/code&gt; operation (run over three different axis combinations) on two 3D arrays stored on disk; we then write the output to disk as well.
We consider five approaches:&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Blosc2 Native&lt;/strong&gt;: Using &lt;code class="docutils literal"&gt;blosc2.lazyexpr&lt;/code&gt; with &lt;code class="docutils literal"&gt;blosc2.NDArray&lt;/code&gt; containers.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Blosc2+HDF5&lt;/strong&gt;: Using &lt;code class="docutils literal"&gt;blosc2.evaluate&lt;/code&gt; with HDF5 for storage.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Blosc2+Zarr&lt;/strong&gt;: Using &lt;code class="docutils literal"&gt;blosc2.evaluate&lt;/code&gt; with Zarr for storage.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Dask+HDF5&lt;/strong&gt;: The combination of Dask for computation and HDF5 for storage.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Dask+Zarr&lt;/strong&gt;: The combination of Dask for computation and Zarr for storage.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;For each approach we plot the memory consumption vs. time for arrays of increasing size.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Results on two (600, 600, 600) float64 arrays (3 GB working set):&lt;/strong&gt;&lt;/p&gt;
&lt;img alt="/images/tensordot_pure_persistent/tensordot-600c-amd.png" src="https://blosc.org/images/tensordot_pure_persistent/tensordot-600c-amd.png" style="width: 100%;"&gt;
&lt;p&gt;&lt;strong&gt;Results on two (1200, 1200, 1200) float64 arrays (26 GB working set):&lt;/strong&gt;&lt;/p&gt;
&lt;img alt="/images/tensordot_pure_persistent/tensordot-1200c-amd.png" src="https://blosc.org/images/tensordot_pure_persistent/tensordot-1200c-amd.png" style="width: 100%;"&gt;
&lt;p&gt;&lt;strong&gt;Results on two (1500, 1500, 1500) float64 arrays (50 GB working set):&lt;/strong&gt;&lt;/p&gt;
&lt;img alt="/images/tensordot_pure_persistent/tensordot-1500c-amd.png" src="https://blosc.org/images/tensordot_pure_persistent/tensordot-1500c-amd.png" style="width: 100%;"&gt;
&lt;p&gt;As can be seen, memory consumption varies considerably between approaches, although none requires more than a small fraction of the total working set (3, 26 and 50 GB, respectively). This is because all approaches are out-of-core, and only load small chunks of data into memory at any given time.&lt;/p&gt;
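&lt;p&gt;The idea behind chunk-wise evaluation can be sketched generically with NumPy: a contraction over axis 0 can be split into slabs whose partial results are summed, so only one slab of each operand needs to be in memory at a time. This is an illustration of the principle only; it is not how Blosc2 or Dask actually schedule their work, and the slab size &lt;code class="docutils literal"&gt;step&lt;/code&gt; is an arbitrary value chosen for the example.&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code python"&gt;import numpy as np

rng = np.random.default_rng(0)
a = rng.random((8, 5, 4))
b = rng.random((8, 5, 3))
axis = (0, 1)

# Reference: contract everything in one go
full = np.tensordot(a, b, axes=(axis, axis))

# Blocked version: only one slab of each operand is touched per step.
# With on-disk containers, each slice would be a partial read from storage.
step = 2  # hypothetical slab size along axis 0
out = np.zeros((a.shape[2], b.shape[2]))
for start in range(0, a.shape[0], step):
    stop = start + step
    out += np.tensordot(a[start:stop], b[start:stop], axes=(axis, axis))

print(np.allclose(out, full))  # True
&lt;/pre&gt;&lt;/div&gt;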
&lt;p&gt;The benchmarks were executed on an AMD Ryzen 9800X3D CPU, with 16 logical cores and 64 GB of RAM, using Ubuntu Linux 25.04. We used the following versions of the libraries: python-blosc2 3.10.1, h5py 3.14.0, zarr 3.1.3, dask 2025.9.1, and numpy 2.3.3. All backends use Blosc or Blosc2 as the compression backend, with the same codecs and filters, and the same number of threads for compression and decompression.&lt;/p&gt;
&lt;section id="analysis"&gt;
&lt;h3&gt;Analysis&lt;/h3&gt;
&lt;p&gt;The results are revealing:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Blosc2 native is fastest&lt;/strong&gt;: The tight integration between the Blosc2 compute engine and its native array format yields the best performance, making it the fastest solution by a significant margin.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Rapid computation time&lt;/strong&gt;: &lt;code class="docutils literal"&gt;blosc2.evaluate&lt;/code&gt; delivers impressive speed when operating directly on HDF5 and Zarr files, outperforming the more complex Dask+HDF5 and Dask+Zarr stacks. This is great news for anyone with existing HDF5/Zarr datasets.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Low memory usage&lt;/strong&gt;: While the memory consumption for the Blosc2+HDF5 combination is a bit high (we are still analyzing why), the memory usage for the Blosc2 native approach is pretty low, making it suitable for systems with limited RAM and/or operands not fitting in memory.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This is not to say that Dask (or Spark) is an inferior choice for out-of-core computations. It's a great tool for large-scale data processing, especially when using clusters, is very flexible, and offers a wide range of functions; it's certainly a first-class citizen in the PyData ecosystem. However, if your needs are more modest and you want a simple, efficient way to run computations on existing datasets, using a core of common functions, and leveraging the full capabilities of modern multi-core systems, all without the overhead of a full Dask setup, &lt;code class="docutils literal"&gt;blosc2.evaluate()&lt;/code&gt; is a fantastic alternative.&lt;/p&gt;
&lt;/section&gt;
&lt;/section&gt;
&lt;section id="conclusion"&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Python-Blosc2 is more than just a compression library for storing data in &lt;code class="docutils literal"&gt;blosc2.NDArray&lt;/code&gt; objects; it's a high-performance computing tool as well. Its lazy evaluation engine provides a simple yet powerful way to handle out-of-core operations. The computation engine is completely decoupled from the compression backend, and thus can easily work with many different array formats; however, the compute engine meshes most tightly with the Blosc2 native array format, achieving maximal performance (in terms of both computation time and memory usage).&lt;/p&gt;
&lt;p&gt;By adhering to the &lt;a class="reference external" href="https://data-apis.org/array-api/"&gt;Array API standard&lt;/a&gt;, it acts as a universal engine that can work with different storage backends; we already implement &lt;a class="reference external" href="https://ironarray.io/blog/array-api"&gt;more than 100 functions that are required by that standard&lt;/a&gt;, and the number will only grow in the future. If you have existing datasets in HDF5, Zarr or TileDB (and we are always looking forward to supporting even more formats), and need a lightweight, efficient way to run computations on them, &lt;code class="docutils literal"&gt;blosc2.evaluate()&lt;/code&gt; is a fantastic tool to have in your arsenal. Of course, for maximum performance, the native Blosc2 format is a clear winner.&lt;/p&gt;
&lt;p&gt;Our work continues. We are committed to enhancing Python-Blosc2 by expanding its supported operations, improving performance across backends, and adding new ones. Stay tuned for more updates! If you found this post useful, please share it. For questions or comments, reach out to us on &lt;a class="reference external" href="https://github.com/Blosc/python-blosc2/discussions"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;
&lt;/section&gt;</description><category>blosc2 hdf5 zarr tiledb dask numpy</category><guid>https://blosc.org/posts/tensordot-pure-persistent/</guid><pubDate>Wed, 15 Oct 2025 10:32:20 GMT</pubDate></item></channel></rss>