ListArray

Overview

ListArray is a row-oriented container for variable-length list cells. It is the natural public container for list-valued blosc2.CTable columns, but it is also useful on its own whenever you want typed, row-addressable list data.

Internally, ListArray uses one of two lower-level backends: blosc2.ObjectArray or blosc2.BatchArray.

Quick example

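import blosc2

# Create a persistent, nullable array whose cells are lists of strings.
arr = blosc2.ListArray(
    item_spec=blosc2.string(max_length=16),
    nullable=True,
    storage="batch",
    urlpath="ingredients.b2b",
    mode="w",
)
arr.append(["salt", "sugar"])  # a regular list cell
arr.append([])  # an empty list is a valid cell
arr.append(None)  # a null cell, accepted because nullable=True

print(arr[0])  # read a single cell
print(arr[1:])  # read a slice of cells

# Reopen the persisted container read-only and check which type comes back.
reopened = blosc2.open("ingredients.b2b", mode="r")
print(type(reopened).__name__)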

Note

Returned Python lists are detached values. Mutating them locally does not write back to the container; reassign the whole cell instead.
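
The note above implies a read, modify, reassign pattern. The following is a minimal sketch of that pattern. It uses a hypothetical in-memory stand-in (DetachedCells) instead of blosc2 itself so that it runs anywhere; only append, __getitem__ and __setitem__ mirror the documented interface, and update_cell is an illustrative helper, not part of blosc2.

```python
import copy


class DetachedCells:
    """Hypothetical stand-in that returns copies of its cells."""

    def __init__(self):
        self._rows = []

    def append(self, value):
        self._rows.append(copy.deepcopy(value))
        return len(self._rows)  # new number of rows

    def __getitem__(self, index):
        # Hand back a detached copy, never the stored object itself.
        return copy.deepcopy(self._rows[index])

    def __setitem__(self, index, value):
        self._rows[index] = copy.deepcopy(value)


def update_cell(arr, index, change):
    """Read one cell, apply `change` to the local copy, then reassign the cell."""
    cell = arr[index]
    change(cell)
    arr[index] = cell


arr = DetachedCells()
arr.append(["salt", "sugar"])

arr[0].append("pepper")  # mutates a detached copy only
unchanged = arr[0]       # the stored cell is still ["salt", "sugar"]

update_cell(arr, 0, lambda cell: cell.append("pepper"))
updated = arr[0]         # the reassigned cell now includes "pepper"
```

Since update_cell relies only on item access and item assignment, the same helper should work with any container that exposes the row interface documented below.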

class blosc2.ListArray(spec: ListSpec | None = None, *, item_spec: SchemaSpec | None = None, nullable: bool = False, storage: str = 'batch', serializer: str = 'msgpack', batch_rows: int | None = None, items_per_block: int | None = None, _from_schunk=None, **kwargs: Any)

A row-oriented container for list-valued data.

Backed internally by either blosc2.ObjectArray or blosc2.BatchArray.

Attributes:
batch_rows

Target number of rows per persisted batch, if configured.

cbytes

Compressed byte size reported by the backend.

contiguous

Whether the backing store is contiguous on disk.

cparams

Compression parameters of the underlying container.

cratio

Compression ratio reported by the backend.

dparams

Decompression parameters of the underlying container.

info

Human-readable information reporter for this array.

info_items

Items used by info to render this array’s summary.

items_per_block

Maximum number of list cells per internal compressed block.

meta

Fixed-length metadata mapping for the underlying container.

nbytes

Uncompressed byte size reported by the backend.

schunk

Underlying blosc2.SChunk used by the backend.

urlpath

Path of the persistent backing store, or None for memory-only arrays.

vlmeta

Variable-length metadata mapping for the underlying container.
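
The three size attributes above are related by simple arithmetic. As a rough illustration, assuming the conventional definition of a compression ratio (uncompressed size divided by compressed size), the sketch below uses the standard-library zlib as a stand-in for Blosc2's codecs; the variable names merely echo the attribute names and do not call into blosc2.

```python
import zlib

payload = b"salt,sugar;" * 1000

nbytes = len(payload)                 # uncompressed size in bytes
cbytes = len(zlib.compress(payload))  # compressed size in bytes
cratio = nbytes / cbytes              # greater than 1 means the data shrank
```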

Methods

append(value)

Append one list cell and return the new number of rows.

close()

Flush pending rows and close the logical container.

copy(**kwargs)

Return a copy, optionally with different storage arguments.

extend(values, *[, validate])

Append multiple list cells.

extend_arrow(arrow_array)

Append a PyArrow list array without materializing Python cells.

flush()

Persist any pending rows when using the batch backend.

from_arrow(arrow_array, *[, item_spec, ...])

Build a ListArray from a PyArrow list or chunked list array.

to_arrow()

Return the data as a PyArrow list array.

to_cframe()

Serialize the underlying container to a contiguous C-frame.

Constructors

__init__(spec: ListSpec | None = None, *, item_spec: SchemaSpec | None = None, nullable: bool = False, storage: str = 'batch', serializer: str = 'msgpack', batch_rows: int | None = None, items_per_block: int | None = None, _from_schunk=None, **kwargs: Any) -> None

Create a list-valued container.

Parameters may be supplied either as a complete spec or as an item_spec plus list/storage options. Storage-related keyword arguments are passed to blosc2.Storage.

classmethod from_arrow(arrow_array, *, item_spec: SchemaSpec | None = None, nullable: bool = True, storage: str = 'batch', serializer: str = 'msgpack', batch_rows: int | None = None, items_per_block: int | None = None, **kwargs: Any) -> ListArray

Build a ListArray from a PyArrow list or chunked list array.

Row Interface

__getitem__(index: int | slice | list[int] | tuple[int, ...] | ndarray) -> Any

Return one cell for an integer index, or a list of cells selected by a slice, a sequence of indices, or a mask.
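
To make the accepted selector kinds concrete, here is a minimal sketch over a plain Python list. The select function is hypothetical and is not how ListArray is implemented. In particular, the source does not say how a mask is told apart from a list of integer positions, so the all-booleans rule below is an assumption, and NumPy arrays are left out to keep the snippet standard-library only.

```python
def select(cells, index):
    """Return one cell for an int, or a list of cells for any other selector."""
    if isinstance(index, bool):
        raise TypeError("a single bool is not a valid selector")
    if isinstance(index, int):
        return cells[index]
    if isinstance(index, slice):
        return cells[index]
    items = list(index)
    if items and all(isinstance(i, bool) for i in items):
        # Boolean mask (assumed rule): one flag per row, keep rows flagged True.
        if len(items) != len(cells):
            raise IndexError("mask length must match the number of rows")
        return [cell for cell, keep in zip(cells, items) if keep]
    # Otherwise treat the selector as a sequence of integer positions.
    return [cells[i] for i in items]


rows = [["salt", "sugar"], [], None, ["flour"]]

one = select(rows, 0)                              # a single cell
tail = select(rows, slice(1, None))                # cells from row 1 onward
picked = select(rows, [3, 0])                      # cells in the requested order
masked = select(rows, [True, False, False, True])  # cells where the mask is True
```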

__setitem__(index: int, value: Any) -> None

Replace one list cell.

__len__() -> int

Return the number of rows.

__iter__() -> Iterator[Any]

Iterate over list cells.

Mutation

append(value: Any) -> int

Append one list cell and return the new number of rows.

extend(values: Iterable[Any], *, validate: bool = True) -> None

Append multiple list cells.

Set validate=False only for trusted values that already match this array’s schema.
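
What counts as "already matching the schema" depends on the array's item spec. As a rough sketch for the quick example's schema (nullable lists of strings with max_length=16), a caller-side pre-check might look like the following. The check_cells helper is hypothetical, it does not reproduce blosc2's own validation, and treating max_length as a character count is an assumption.

```python
def check_cells(values, max_length=16, nullable=True):
    """Hypothetical pre-check: each cell is a list of short strings, or None."""
    for row, cell in enumerate(values):
        if cell is None:
            if not nullable:
                raise ValueError(f"row {row}: None is not allowed")
            continue
        if not isinstance(cell, list):
            raise TypeError(f"row {row}: expected a list, got {type(cell).__name__}")
        for item in cell:
            if not isinstance(item, str):
                raise TypeError(f"row {row}: expected str items, got {type(item).__name__}")
            if len(item) > max_length:
                raise ValueError(f"row {row}: item longer than {max_length} characters")
    return values


good = [["salt", "sugar"], [], None]
checked = check_cells(good)  # passes and returns the same values

try:
    check_cells([["salt"], "not-a-list"])
    error = None
except TypeError as exc:
    error = str(exc)  # the second cell is a str, not a list
```

Values that have passed a check like this once upstream are the kind of trusted input for which validate=False is intended.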

flush() -> None

Persist any pending rows when using the batch backend.

copy(**kwargs: Any) -> ListArray

Return a copy, optionally with different storage arguments.

close() -> None

Flush pending rows and close the logical container.

Context Manager

__enter__() -> ListArray

Enter a context manager and return this array.

__exit__(exc_type, exc_val, exc_tb) -> bool

Exit a context manager, flushing pending rows.
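
The interplay between pending rows, flush() and the context manager can be pictured with a small model. BufferedRows below is a hypothetical stand-in written for illustration, not blosc2's batch backend: it assumes rows are buffered until batch_rows of them accumulate, and that __exit__ returns False so exceptions propagate. Neither detail is confirmed by the reference above.

```python
class BufferedRows:
    """Hypothetical stand-in: buffers rows and persists them in batches."""

    def __init__(self, batch_rows):
        self.batch_rows = batch_rows
        self.pending = []    # rows appended but not yet persisted
        self.persisted = []  # one list per flushed batch

    def append(self, value):
        self.pending.append(value)
        if len(self.pending) >= self.batch_rows:
            self.flush()
        return len(self)  # new number of rows

    def flush(self):
        # Persist whatever is pending as one batch; do nothing if empty.
        if self.pending:
            self.persisted.append(self.pending)
            self.pending = []

    def __len__(self):
        return sum(len(batch) for batch in self.persisted) + len(self.pending)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.flush()
        return False  # assumed: do not swallow exceptions


with BufferedRows(batch_rows=2) as rows:
    rows.append(["salt", "sugar"])
    rows.append([])
    rows.append(None)
    pending_inside = list(rows.pending)  # the third row is still buffered here

batches = rows.persisted  # leaving the block flushed the remaining row
```

In this model, a with block guarantees that a partially filled batch is not lost when the block ends, which is the behavior the __exit__ description suggests.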

Public Members

to_arrow()

Return the data as a PyArrow list array.

to_cframe() -> bytes

Serialize the underlying container to a contiguous C-frame.