Miscellaneous

This page documents the miscellaneous members of the blosc2 module that do not fit into other categories.

class blosc2.Batch(parent: BatchStore, nbatch: int, lazybatch: bytes)[source]

A lazy sequence representing one batch in a BatchStore.

Batch provides sequence-style access to the items stored in a single batch. Integer indexing can use block-local reads when possible, while slicing materializes the full batch into Python items.

Batch instances are normally obtained via BatchStore indexing or iteration rather than constructed directly.

Attributes:
cbytes
cratio
lazybatch
nbytes

Methods

count(value)

index(value, [start, [stop]])

Raises ValueError if the value is not present.

class blosc2.BatchStore(max_blocksize: int | None = None, serializer: str = 'msgpack', _from_schunk: SChunk | None = None, **kwargs: Any)[source]

A batched container for variable-length Python items.

BatchStore stores data as a sequence of batches, where each batch contains one or more Python items. Each batch is stored in one compressed chunk, and each chunk is internally split into one or more variable-length blocks for efficient item access.

The main abstraction is batch-oriented:

  • indexing the store returns batches

  • iterating the store yields batches

  • iter_items() provides flat item-wise traversal

BatchStore is a good fit when:

  • data arrives naturally in batches

  • batch-level append/update operations are important

  • occasional item-level reads are needed inside a batch

Parameters:
  • max_blocksize (int, optional) – Maximum number of items stored in each internal variable-length block. If not provided, a value is inferred from the first batch.

  • serializer ({"msgpack", "arrow"}, optional) – Serializer used for batch payloads. "msgpack" is the default and is the general-purpose choice for Python items. "arrow" is optional and requires pyarrow.

  • _from_schunk (blosc2.SChunk, optional) – Internal hook used when reopening an already-tagged BatchStore.

  • **kwargs – Storage, compression, and decompression arguments accepted by the constructor.
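The batch-oriented semantics described above can be illustrated with a small pure-Python stand-in. This is only a sketch: ToyBatchStore is a hypothetical class, not part of blosc2, and it does no compression, serialization, or block splitting. It only mirrors the documented behaviour that indexing and iteration work on batches, iter_items() is flat, append() returns the new number of batches, and pop() returns a batch as a list.

```python
from collections.abc import Iterator
from typing import Any


class ToyBatchStore:
    """Hypothetical in-memory stand-in; no compression or serialization."""

    def __init__(self) -> None:
        self._batches: list[list[Any]] = []

    def append(self, batch: list[Any]) -> int:
        self._batches.append(list(batch))
        return len(self._batches)  # new number of batches

    def __len__(self) -> int:
        return len(self._batches)

    def __getitem__(self, index: int) -> list[Any]:
        return self._batches[index]  # indexing returns a batch

    def __iter__(self) -> Iterator[list[Any]]:
        return iter(self._batches)  # iteration yields batches

    def iter_items(self) -> Iterator[Any]:
        for batch in self._batches:
            yield from batch  # flat, item-wise traversal

    def pop(self, index: int = -1) -> list[Any]:
        return self._batches.pop(index)  # the batch comes back as a list


store = ToyBatchStore()
store.append(["a", "b"])
count = store.append([{"id": 1}, {"id": 2}, {"id": 3}])
first = store[0]
flat = list(store.iter_items())
last = store.pop()
print(count, first, len(flat), last)
```

With a real BatchStore the same calls would operate on compressed chunks, so the toy is useful only for reasoning about return values and traversal order.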

Attributes:
cbytes
contiguous
cparams
cratio
dparams
info

Return an info reporter with a compact summary of the store.

info_items

Return summary information as (name, value) pairs.

items
max_blocksize
meta
nbytes
serializer

Serializer name used for batch payloads.

typesize
urlpath
vlmeta

Methods

append(value)

Append one batch and return the new number of batches.

clear()

Remove all entries from the container.

delete(index)

Delete the batch at index and return the new number of batches.

extend(values)

Append all batches from an iterable of batches.

insert(index, value)

Insert one batch at index and return the new number of batches.

iter_items()

Iterate over all items across all batches in order.

pop([index])

Remove and return the batch at index as a Python list.

to_cframe()

Serialize the full store to a Blosc2 cframe buffer.

append(value: object) int[source]

Append one batch and return the new number of batches.

clear() None[source]

Remove all entries from the container.

delete(index: int | slice) int[source]

Delete the batch at index and return the new number of batches.

extend(values: object) None[source]

Append all batches from an iterable of batches.

property info: InfoReporter

Return an info reporter with a compact summary of the store.

property info_items: list

Return summary information as (name, value) pairs.

insert(index: int, value: object) int[source]

Insert one batch at index and return the new number of batches.

iter_items() Iterator[Any][source]

Iterate over all items across all batches in order.

pop(index: int = -1) list[Any][source]

Remove and return the batch at index as a Python list.

property serializer: str

Serializer name used for batch payloads.

to_cframe() bytes[source]

Serialize the full store to a Blosc2 cframe buffer.

blosc2.DEFAULT_COMPLEX

Default complex floating dtype.

Attributes:
T

Scalar attribute identical to ndarray.T.

base

Scalar attribute identical to ndarray.base.

data

Pointer to start of data.

device
dtype

Get array data-descriptor.

flags

The integer value of flags.

flat

A 1-D view of the scalar.

itemsize

The length of one element in bytes.

nbytes
ndim

The number of array dimensions.

shape

Tuple of array dimensions.

size

The number of elements in the gentype.

strides

Tuple of bytes steps in each dimension.

Methods

trace

Scalar method identical to ndarray.trace.

blosc2.DEFAULT_FLOAT

Default real floating dtype.

Attributes:
T

Scalar attribute identical to ndarray.T.

base

Scalar attribute identical to ndarray.base.

data

Pointer to start of data.

device
dtype

Get array data-descriptor.

flags

The integer value of flags.

flat

A 1-D view of the scalar.

itemsize

The length of one element in bytes.

nbytes
ndim

The number of array dimensions.

shape

Tuple of array dimensions.

size

The number of elements in the gentype.

strides

Tuple of bytes steps in each dimension.

Methods

trace

Scalar method identical to ndarray.trace.

blosc2.DEFAULT_INDEX

Default indexing dtype.

Attributes:
T

Scalar attribute identical to ndarray.T.

base

Scalar attribute identical to ndarray.base.

data

Pointer to start of data.

denominator

denominator of value (1)

device
dtype

Get array data-descriptor.

flags

The integer value of flags.

flat

A 1-D view of the scalar.

itemsize

The length of one element in bytes.

nbytes
ndim

The number of array dimensions.

numerator

numerator of value (the value itself)

shape

Tuple of array dimensions.

size

The number of elements in the gentype.

strides

Tuple of bytes steps in each dimension.

Methods

trace

Scalar method identical to ndarray.trace.

blosc2.DEFAULT_INT

Default integer dtype.

Attributes:
T

Scalar attribute identical to ndarray.T.

base

Scalar attribute identical to ndarray.base.

data

Pointer to start of data.

denominator

denominator of value (1)

device
dtype

Get array data-descriptor.

flags

The integer value of flags.

flat

A 1-D view of the scalar.

itemsize

The length of one element in bytes.

nbytes
ndim

The number of array dimensions.

numerator

numerator of value (the value itself)

shape

Tuple of array dimensions.

size

The number of elements in the gentype.

strides

Tuple of bytes steps in each dimension.

Methods

trace

Scalar method identical to ndarray.trace.

class blosc2.DSLKernel(func)[source]

Wrap a Python function and optionally extract a miniexpr DSL kernel from it.

Methods

__call__(inputs_tuple, output[, offset])

Call self as a function.

exception blosc2.DSLSyntaxError[source]

Raised when a @dsl_kernel function uses unsupported DSL syntax.

class blosc2.Operand[source]

Base class for all operands in expressions.

Attributes:
device

Hardware device the array data resides on.

dtype

Get the data type of the Operand.

info

Get information about the Operand.

ndim

Get the number of dimensions of the Operand.

shape

Get the shape of the Operand.

Methods

item()

Copy an element of an array to a standard Python scalar and return it.

to_device(device)

Copy the array from the device on which it currently resides to the specified device.

property device

Hardware device the array data resides on. Always equal to ‘cpu’.

abstract property dtype: dtype

Get the data type of the Operand.

Returns:

out – The data type of the Operand.

Return type:

np.dtype

abstract property info: InfoReporter

Get information about the Operand.

Returns:

out – A printable class with information about the Operand.

Return type:

InfoReporter

item() float | bool | complex | int[source]

Copy an element of an array to a standard Python scalar and return it.

abstract property ndim: int

Get the number of dimensions of the Operand.

Returns:

out – The number of dimensions of the Operand.

Return type:

int

abstract property shape: tuple[int]

Get the shape of the Operand.

Returns:

out – The shape of the Operand.

Return type:

tuple

to_device(device: str)[source]

Copy the array from the device on which it currently resides to the specified device.

Parameters:
  • self (NDArray) – Array instance.

  • device (str) – Device to move the array to. Raises an error unless device == ‘cpu’.

Returns:

out – If device == ‘cpu’, the same array; otherwise an error is raised.

Return type:

NDArray

class blosc2.ProxyNDField(proxy: Proxy, field: str)[source]

Attributes:
device

Hardware device the array data resides on.

dtype

Get the data type of the ProxyNDField.

info

Get information about the Operand.

ndim

Get the number of dimensions of the Operand.

shape

Get the shape of the ProxyNDField.

Methods

item()

Copy an element of an array to a standard Python scalar and return it.

to_device(device)

Copy the array from the device on which it currently resides to the specified device.

property dtype: dtype

Get the data type of the ProxyNDField.

Returns:

out – The data type of the ProxyNDField.

Return type:

np.dtype

property shape: tuple[int]

Get the shape of the ProxyNDField.

Returns:

out – The shape of the ProxyNDField.

Return type:

tuple

blosc2.array_from_ffi_ptr(array_ptr) NDArray[source]

Create an NDArray from a raw FFI pointer.

This function is useful for passing arrays across FFI boundaries. It moves ownership of the underlying b2nd_array_t* object to the new NDArray, which frees it when the NDArray is destroyed.

blosc2.as_simpleproxy(*arrs: Sequence[Array]) tuple[SimpleProxy | Operand][source]

Convert objects that fulfill the Array protocol into SimpleProxy instances. Any object that is already a blosc2.Operand is returned unchanged.

Parameters:

arrs (Sequence[blosc2.Array]) – Objects fulfilling Array protocol.

Returns:

out – Objects with minimal interface for blosc2 LazyExpr computations.

Return type:

tuple[blosc2.SimpleProxy | blosc2.Operand]

blosc2.dsl_kernel(func)[source]

Decorator to wrap a function in a DSLKernel.

class blosc2.finfo(dtype)

Machine limits for floating point types.

bits

The number of bits occupied by the type.

Type:

int

dtype

Returns the dtype for which finfo returns information. For complex input, the returned dtype is the associated float* dtype for its real and imaginary components.

Type:

dtype

eps

The difference between 1.0 and the next smallest representable float larger than 1.0. For example, for 64-bit binary floats in the IEEE-754 standard, eps = 2**-52, approximately 2.22e-16.

Type:

float

epsneg

The difference between 1.0 and the next smallest representable float less than 1.0. For example, for 64-bit binary floats in the IEEE-754 standard, epsneg = 2**-53, approximately 1.11e-16.

Type:

float

iexp

The number of bits in the exponent portion of the floating point representation.

Type:

int

machep

The exponent that yields eps.

Type:

int

max

The largest representable number.

Type:

floating point number of the appropriate type

maxexp

The smallest positive power of the base (2) that causes overflow. Corresponds to the C standard MAX_EXP.

Type:

int

min

The smallest representable number, typically -max.

Type:

floating point number of the appropriate type

minexp

The most negative power of the base (2) consistent with there being no leading 0’s in the mantissa. Corresponds to the C standard MIN_EXP - 1.

Type:

int

negep

The exponent that yields epsneg.

Type:

int

nexp

The number of bits in the exponent including its sign and bias.

Type:

int

nmant

The number of explicit bits in the mantissa (excluding the implicit leading bit for normalized numbers).

Type:

int

precision

The approximate number of decimal digits to which this kind of float is precise.

Type:

int

resolution

The approximate decimal resolution of this type, i.e., 10**-precision.

Type:

floating point number of the appropriate type

tiny

An alias for smallest_normal, kept for backwards compatibility.

Type:

float

smallest_normal

The smallest positive floating point number with 1 as leading bit in the mantissa following IEEE-754 (see Notes).

Type:

float

smallest_subnormal

The smallest positive floating point number with 0 as leading bit in the mantissa following IEEE-754.

Type:

float

Parameters:

dtype (float, dtype, or instance) – Kind of floating point or complex floating point data-type about which to get information.

See also

iinfo

The equivalent for integer data types.

spacing

The distance between a value and the nearest adjacent number

nextafter

The next floating point value after x1 towards x2

Notes

For developers of NumPy: do not instantiate this at the module level. The initial calculation of these parameters is expensive and negatively impacts import times. These objects are cached, so calling finfo() repeatedly inside your functions is not a problem.

Note that smallest_normal is not actually the smallest positive representable value in a NumPy floating point type. As in the IEEE-754 standard [1], NumPy floating point types make use of subnormal numbers to fill the gap between 0 and smallest_normal. However, subnormal numbers may have significantly reduced precision [2].

For longdouble, the representation varies across platforms. On most platforms it is IEEE 754 binary128 (quad precision) or binary64-extended (80-bit extended precision). On PowerPC systems, it may use the IBM double-double format (a pair of float64 values), which has special characteristics for precision and range.

This function can also be used for complex data types. In that case, the output is the same as for the corresponding real float type (e.g. numpy.finfo(numpy.csingle) is the same as numpy.finfo(numpy.single)), and it applies to both the real and imaginary components.
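The relationship between smallest_normal and smallest_subnormal can be checked independently with the Python standard library. The sketch below assumes that the built-in float is IEEE-754 binary64 (true on common platforms) and does not call finfo itself; the variable names are chosen only to match the attributes above.

```python
import math
import sys

# For binary64, the smallest normal value is 2**-1022.
smallest_normal = sys.float_info.min
# For binary64, the smallest subnormal value is 2**-1074.
smallest_subnormal = math.ldexp(1.0, -1074)

print(smallest_normal == math.ldexp(1.0, -1022))
print(0.0 < smallest_subnormal < smallest_normal)
# Halving the smallest subnormal leaves no representable value above zero.
print(smallest_subnormal / 2 == 0.0)
```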

References

[1]

IEEE Standard for Floating-Point Arithmetic, IEEE Std 754-2008, pp.1-70, 2008, https://doi.org/10.1109/IEEESTD.2008.4610935

[2]

Wikipedia, “Denormal Numbers”, https://en.wikipedia.org/wiki/Denormal_number

Examples

>>> import numpy as np
>>> np.finfo(np.float64).dtype
dtype('float64')
>>> np.finfo(np.complex64).dtype
dtype('float32')
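The definitions of eps and epsneg can be reproduced with the standard library alone. This is a minimal sketch, assuming the built-in float is IEEE-754 binary64 and Python 3.9 or later (for math.nextafter); it checks the 2**-52 and 2**-53 values quoted above without going through finfo.

```python
import math
import sys

# Gap between 1.0 and the next representable float above it.
eps = math.nextafter(1.0, 2.0) - 1.0
# Gap between 1.0 and the next representable float below it.
epsneg = 1.0 - math.nextafter(1.0, 0.0)

print(eps == 2.0**-52, eps == sys.float_info.epsilon)
print(epsneg == 2.0**-53)
```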
Attributes:
epsneg
iexp
machep
negep
nexp
resolution
tiny

Return the value for tiny, alias of smallest_normal.

property tiny

Return the value for tiny, alias of smallest_normal.

Returns:

tiny – Value for the smallest normal, alias of smallest_normal.

Return type:

float

Warns:

UserWarning – If the calculated value for the smallest normal is requested for double-double.

blosc2.get_cpu_info()

Construct the result of cpuinfo.get_cpu_info() without actually calling cpuinfo.get_cpu_info(), since that function takes 1 s to run and this function is run at import time.

class blosc2.iinfo(type)

Machine limits for integer types.

bits

The number of bits occupied by the type.

Type:

int

dtype

Returns the dtype for which iinfo returns information.

Type:

dtype

min

The smallest integer expressible by the type.

Type:

int

max

The largest integer expressible by the type.

Type:

int
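For fixed-width integer types, min and max follow directly from bits. The sketch below assumes two's-complement signed integers and plain unsigned integers; int_limits is a hypothetical helper written for illustration, not a blosc2 or NumPy function.

```python
def int_limits(bits: int, signed: bool = True) -> tuple[int, int]:
    """Return (min, max) for an integer type of the given bit width."""
    if signed:
        # Two's complement: one more negative value than positive values.
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


limits16 = int_limits(16)
limits32 = int_limits(32)
limits_u8 = int_limits(8, signed=False)
print(limits16, limits32, limits_u8)
```

The 16-bit and 32-bit results agree with the int16 and int32 values shown in the Examples for this class.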

Parameters:

int_type (integer type, dtype, or instance) – The kind of integer data type to get information about.

See also

finfo

The equivalent for floating point data types.

Examples

With types:

>>> import numpy as np
>>> ii16 = np.iinfo(np.int16)
>>> ii16.min
-32768
>>> ii16.max
32767
>>> ii32 = np.iinfo(np.int32)
>>> ii32.min
-2147483648
>>> ii32.max
2147483647

With instances:

>>> ii32 = np.iinfo(np.int32(10))
>>> ii32.min
-2147483648
>>> ii32.max
2147483647

blosc2.validate_dsl(func)[source]

Validate a DSL kernel function without executing it.

Parameters:

func – A Python callable or DSLKernel.

Returns:

A dictionary with:

  • valid (bool): whether the DSL is valid

  • dsl_source (str | None): extracted DSL source when valid

  • input_names (list[str] | None): input signature names when valid

  • error (str | None): user-facing error message when invalid

Return type:

dict
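One way to consume the returned dictionary is sketched below. summarize_validation is a hypothetical helper, and the two sample dictionaries are hand-written placeholders that only follow the documented keys; they are not real validate_dsl output, and the actual contents of dsl_source and error may look different.

```python
def summarize_validation(result: dict) -> str:
    """Turn a validate_dsl-style result dictionary into a one-line summary."""
    if result["valid"]:
        names = ", ".join(result["input_names"])
        return f"valid DSL kernel with inputs: {names}"
    return f"invalid DSL kernel: {result['error']}"


# Hand-written samples that mimic the documented shape (placeholders only).
ok = {
    "valid": True,
    "dsl_source": "<extracted source>",
    "input_names": ["x", "y"],
    "error": None,
}
bad = {
    "valid": False,
    "dsl_source": None,
    "input_names": None,
    "error": "<error message>",
}

ok_summary = summarize_validation(ok)
bad_summary = summarize_validation(bad)
print(ok_summary)
print(bad_summary)
```

Branching on valid first avoids touching input_names or dsl_source, which are documented as None when validation fails.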