Working with VLArray
A VLArray is a list-like container for variable-length Python values backed by a single SChunk. Each entry is stored in its own compressed chunk, and values are serialized with msgpack before reaching storage.
This makes VLArray a good fit for heterogeneous, variable-length payloads such as small dictionaries, strings, tuples, byte blobs, or nested list/dict structures.
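The storage model described above (serialize each value, then compress it as its own chunk) can be sketched with the standard library alone. This is a toy illustration, not how blosc2 implements VLArray: the class name ToyVarLenStore is hypothetical, pickle stands in for msgpack, and zlib stands in for Blosc2 compression.

```python
import pickle
import zlib


class ToyVarLenStore:
    """Hypothetical stand-in: one independently compressed chunk per entry."""

    def __init__(self):
        # Each element holds the compressed bytes of exactly one entry.
        self._chunks = []

    def append(self, value):
        payload = pickle.dumps(value)                # serialize (assumption: pickle replaces msgpack)
        self._chunks.append(zlib.compress(payload))  # compress (assumption: zlib replaces Blosc2)

    def __getitem__(self, index):
        # Only the requested chunk is decompressed and deserialized.
        return pickle.loads(zlib.decompress(self._chunks[index]))

    def __len__(self):
        return len(self._chunks)


store = ToyVarLenStore()
for value in ({"name": "alpha", "count": 1}, b"bytes", ("a", 2), None):
    store.append(value)

print(len(store))
print(store[0])
print(store[2])
```

Because every entry lives in its own chunk, entries of very different sizes and types can sit side by side, and reading one entry does not require decoding the others.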
[1]:
import blosc2

def show(label, value):
    print(f"{label}: {value}")

urlpath = "vlarray_tutorial.b2frame"
copy_path = "vlarray_tutorial_copy.b2frame"
# Remove leftovers from a previous run so the tutorial starts from a clean state.
blosc2.remove_urlpath(urlpath)
blosc2.remove_urlpath(copy_path)
Creating and populating a VLArray
Entries can be appended one by one or in batches with extend(). The container accepts the msgpack-safe Python types supported by the implementation: bytes, str, int, float, bool, None, list, tuple, and dict.
[2]:
vla = blosc2.VLArray(urlpath=urlpath, mode="w")
vla.append({"name": "alpha", "count": 1})
vla.extend([b"bytes", ("a", 2), ["x", "y"], 42, None])
vla.insert(1, "between")
show("Initial entries", list(vla))
show("Length", len(vla))
Initial entries: [{'name': 'alpha', 'count': 1}, 'between', b'bytes', ('a', 2), ['x', 'y'], 42, None]
Length: 7
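To make the list of accepted types concrete, here is a minimal sketch of a pre-flight check written in plain Python. The helper is_supported is hypothetical and not part of blosc2; it only mirrors the types named above, and the rule that dictionary keys must be scalars is an assumption made for this example.

```python
# Scalar types taken from the list in the text above.
SCALARS = (bytes, str, int, float, bool, type(None))


def is_supported(value):
    """Hypothetical helper: True if value is built only from the listed types."""
    if isinstance(value, SCALARS):
        return True
    if isinstance(value, (list, tuple)):
        return all(is_supported(item) for item in value)
    if isinstance(value, dict):
        # Assumption for this sketch: keys must be scalars, values may nest.
        return all(
            isinstance(key, SCALARS) and is_supported(item)
            for key, item in value.items()
        )
    return False


nested_ok = is_supported({"name": "alpha", "tags": ["x", ("a", 2)], "blob": b"\x00"})
has_set = is_supported({"when": {1, 2, 3}})
print(nested_ok)
print(has_set)
```

A check like this can be run on a batch before calling extend(), so that an unsupported value (here, a set) is caught before anything is written.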
Indexing and slicing
Indexing behaves like a Python list. Negative indexes are supported, and slice reads return a plain Python list.
[3]:
show("Last entry", vla[-1])
show("Slice [1:6:2]", vla[1:6:2])
show("Reverse slice", vla[::-2])
Last entry: None
Slice [1:6:2]: ['between', ('a', 2), 42]
Reverse slice: [None, ['x', 'y'], b'bytes', {'name': 'alpha', 'count': 1}]
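Since reads are described as list-like, the results above can be cross-checked against a plain Python list holding the same values. The sketch below uses only a built-in list, not a VLArray, so it illustrates the indexing rules rather than the container itself.

```python
# The same seven values the tutorial container holds at this point.
entries = [
    {"name": "alpha", "count": 1},
    "between",
    b"bytes",
    ("a", 2),
    ["x", "y"],
    42,
    None,
]

last = entries[-1]             # negative index counts from the end
odd_positions = entries[1:6:2]  # indexes 1, 3, 5
reverse_step = entries[::-2]    # indexes 6, 4, 2, 0

print(last)
print(odd_positions)
print(reverse_step)
```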
Updating, inserting, and deleting
Single entries can be overwritten by index. Slice assignment follows Python list rules: slices with step == 1 may resize the container, while extended slices require matching lengths.
[4]:
vla[2:5] = ["replaced", {"nested": True}]
show("After slice replacement", list(vla))
vla[::2] = ["even-0", "even-1", "even-2"]
show("After extended-slice update", list(vla))
del vla[1::3]
show("After slice deletion", list(vla))
removed = vla.pop()
show("Popped entry", removed)
show("After pop", list(vla))
After slice replacement: [{'name': 'alpha', 'count': 1}, 'between', 'replaced', {'nested': True}, 42, None]
After extended-slice update: ['even-0', 'between', 'even-1', {'nested': True}, 'even-2', None]
After slice deletion: ['even-0', 'even-1', {'nested': True}, None]
Popped entry: None
After pop: ['even-0', 'even-1', {'nested': True}]
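The two slice-assignment rules are the ones built-in lists follow, so they can be tried out without any storage involved. The sketch below uses a plain list rather than a VLArray; the exact exception a VLArray raises on a length mismatch is not stated in this tutorial, so only the list behavior is shown.

```python
items = ["a", "b", "c", "d", "e", "f"]

# Step == 1: the replacement may be shorter or longer than the slice,
# so the list shrinks from six to five items here.
items[2:5] = ["X", "Y"]
print(items)  # ['a', 'b', 'X', 'Y', 'f']

# Extended slice: [::2] selects indexes 0, 2, 4, so exactly three
# replacement values are required.
items[::2] = ["p", "q", "r"]
print(items)  # ['p', 'b', 'q', 'Y', 'r']

# A length mismatch on an extended slice is rejected and leaves the list unchanged.
mismatch_error = None
try:
    items[::2] = ["too-short"]
except ValueError as exc:
    mismatch_error = str(exc)
    print("ValueError:", mismatch_error)

# Deleting an extended slice removes indexes 1 and 4.
del items[1::3]
print(items)  # ['p', 'q', 'Y']
```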
Copying with new storage or compression parameters
The copy() method can duplicate the container into a different storage layout or with different compression settings.
[5]:
vla_copy = vla.copy(
    urlpath=copy_path,
    contiguous=False,
    cparams={"codec": blosc2.Codec.LZ4, "clevel": 5},
)
show("Copied entries", list(vla_copy))
show("Copy storage is contiguous", vla_copy.schunk.contiguous)
show("Copy codec", vla_copy.cparams.codec)
Copied entries: ['even-0', 'even-1', {'nested': True}]
Copy storage is contiguous: False
Copy codec: Codec.LZ4
Round-tripping through cframes and reopening from disk
Persistent stores are tagged, so blosc2.open() reopens them as a VLArray automatically. A serialized cframe buffer restored with blosc2.from_cframe() comes back as a VLArray as well.
[6]:
cframe = vla.to_cframe()
restored = blosc2.from_cframe(cframe)
show("from_cframe type", type(restored).__name__)
show("from_cframe entries", list(restored))
reopened = blosc2.open(urlpath, mode="r", mmap_mode="r")
show("Reopened type", type(reopened).__name__)
show("Reopened entries", list(reopened))
from_cframe type: VLArray
from_cframe entries: ['even-0', 'even-1', {'nested': True}]
Reopened type: VLArray
Reopened entries: ['even-0', 'even-1', {'nested': True}]
Clearing and reusing a container
Calling clear() resets the backing storage, leaving the container empty and ready for new variable-length entries. The example below works on an in-memory copy, so the on-disk container is not modified.
[7]:
scratch = vla.copy()
scratch.clear()
scratch.extend(["fresh", 123, {"done": True}])
show("After clear + extend on in-memory copy", list(scratch))
blosc2.remove_urlpath(urlpath)
blosc2.remove_urlpath(copy_path)
After clear + extend on in-memory copy: ['fresh', 123, {'done': True}]