Super-chunk#

This API describes the new Blosc 2 container, the super-chunk (or schunk for short).

struct blosc2_storage#

This struct holds the storage parameters for a blosc2 container. It lets you specify, for example, how to interpret the contents included in the schunk.

Public Members

bool contiguous#

Whether the chunks are contiguous or sparse.

char *urlpath#

The path for persistent storage. If NULL, the data is kept in memory.

blosc2_cparams *cparams#

The compression params when creating a schunk.

If NULL, sensible defaults are used depending on the context.

blosc2_dparams *dparams#

The decompression params when creating a schunk.

If NULL, sensible defaults are used depending on the context.

blosc2_io *io#

Input/output backend.

struct blosc2_schunk#

This struct is the standard container for Blosc 2 compressed data.

This is essentially a container for Blosc 1 chunks of compressed data, and it overcomes the 32-bit limitation in Blosc 1. Optionally, a blosc2_frame can be attached so as to store the compressed chunks contiguously.

Public Members

uint8_t compcode#

The default compressor. Each chunk can override this.

uint8_t compcode_meta#

The default compressor metadata. Each chunk can override this.

uint8_t clevel#

The compression level and other compress params.

uint8_t splitmode#

The split mode.

int32_t typesize#

The type size.

int32_t blocksize#

The requested size of the compressed blocks (0 means automatic).

int32_t chunksize#

Size of each chunk. 0 if not a fixed chunksize.

uint8_t filters[BLOSC2_MAX_FILTERS]#

The sequence of filters, 8 bits per filter.

uint8_t filters_meta[BLOSC2_MAX_FILTERS]#

Metadata for filters, 8 bits per meta-slot.

int64_t nchunks#

Number of chunks in super-chunk.

int64_t current_nchunk#

The current chunk that is being accessed.

int64_t nbytes#

The data size (uncompressed).

int64_t cbytes#

The data size + chunks header size (compressed).

uint8_t **data#

Pointer to chunk data pointers buffer.

size_t data_len#

Length of the chunk data pointers buffer.

blosc2_storage *storage#

Pointer to storage info.

blosc2_frame *frame#

Pointer to frame used as store for chunks.

blosc2_context *cctx#

Context for compression.

blosc2_context *dctx#

Context for decompression.

struct blosc2_metalayer *metalayers[16]#

The array of metalayers.

uint16_t nmetalayers#

The number of metalayers in the super-chunk.

int16_t nvlmetalayers#

The number of variable-length metalayers.

void *tuner_params#

Tune configuration.

blosc2_schunk *blosc2_schunk_new(blosc2_storage *storage)#

Create a new super-chunk.

Remark

If storage.urlpath is not NULL, the data is stored on-disk. If the data file(s) exist, they are overwritten.

Parameters:
  • storage – The storage properties.

Returns:

The new super-chunk.

int blosc2_schunk_free(blosc2_schunk *schunk)#

Release resources from a super-chunk.

Remark

All the memory resources attached to the super-chunk are freed. If the super-chunk is on-disk, the data remains there for a later re-opening.

Parameters:
  • schunk – The super-chunk to be freed.

Returns:

0 if success.

blosc2_schunk *blosc2_schunk_open(const char *urlpath)#

Open an existing super-chunk that is on-disk (frame).

No in-memory copy is made.

Parameters:
  • urlpath – The file name.

Returns:

The new super-chunk. NULL if not found or not in frame format.

blosc2_schunk *blosc2_schunk_open_offset(const char *urlpath, int64_t offset)#

Open an existing super-chunk that is on-disk (frame).

No in-memory copy is made.

Parameters:
  • urlpath – The file name.

  • offset – The frame offset.

Returns:

The new super-chunk. NULL if not found or not in frame format.

blosc2_schunk *blosc2_schunk_open_udio(const char *urlpath, const blosc2_io *udio)#

Open an existing super-chunk (no copy is made) using a user-defined I/O interface.

Parameters:
  • urlpath – The file name.

  • udio – The user-defined I/O interface.

Returns:

The new super-chunk.

blosc2_schunk *blosc2_schunk_copy(blosc2_schunk *schunk, blosc2_storage *storage)#

Create a copy of a super-chunk.

Parameters:
  • schunk – The super-chunk to be copied.

  • storage – The storage properties.

Returns:

The new super-chunk.

blosc2_schunk *blosc2_schunk_from_buffer(uint8_t *cframe, int64_t len, bool copy)#

Create a super-chunk out of a contiguous frame buffer.

Remark

If copy is false, the cframe buffer passed will be owned by the super-chunk and will be automatically freed when blosc2_schunk_free() is called. If the user frees it after the opening, the super-chunk is left pointing at memory that is no longer valid. Either leave the buffer alone or set copy to true.

Parameters:
  • cframe – The buffer of the in-memory frame.

  • len – The length of the buffer (in bytes).

  • copy – Whether the super-chunk should make a copy of the cframe data or not. The copy will be made to an internal sparse frame.

Returns:

The new super-chunk.

int64_t blosc2_schunk_to_buffer(blosc2_schunk *schunk, uint8_t **cframe, bool *needs_free)#
int64_t blosc2_schunk_to_file(blosc2_schunk *schunk, const char *urlpath)#
int64_t blosc2_schunk_append_file(blosc2_schunk *schunk, const char *urlpath)#
int blosc2_schunk_get_cparams(blosc2_schunk *schunk, blosc2_cparams **cparams)#

Return the cparams associated with a super-chunk.

Parameters:
  • schunk – The super-chunk from where to extract the compression parameters.

  • cparams – The pointer where the compression params will be returned.

Returns:

0 on success. Otherwise, a negative code is returned.

Warning

A new struct is allocated, and the user should free it after use.

int blosc2_schunk_get_dparams(blosc2_schunk *schunk, blosc2_dparams **dparams)#

Return the dparams struct associated with a super-chunk.

Parameters:
  • schunk – The super-chunk from where to extract the decompression parameters.

  • dparams – The pointer where the decompression params will be returned.

Returns:

0 on success. Otherwise, a negative code is returned.

Warning

A new struct is allocated, and the user should free it after use.

int blosc2_schunk_reorder_offsets(blosc2_schunk *schunk, int64_t *offsets_order)#

Reorder the chunk offsets of an existing super-chunk.

Parameters:
  • schunk – The super-chunk whose chunk offsets are to be reordered.

  • offsets_order – The new order of the chunk offsets.

Returns:

0 on success. Otherwise, a negative code is returned.
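The text above does not spell out what a valid offsets_order looks like. A minimal sketch, assuming the array must list every chunk index in [0, nchunks) exactly once, is a check that can be run before calling blosc2_schunk_reorder_offsets(). The helper name is_valid_offsets_order is hypothetical and is not part of the library.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not a Blosc2 function.
 * Returns true if order[0..nchunks) contains every index in [0, nchunks)
 * exactly once (assumed requirement for offsets_order). */
bool is_valid_offsets_order(const int64_t *order, int64_t nchunks) {
    if (nchunks < 0 || (nchunks > 0 && order == NULL)) return false;
    if (nchunks == 0) return true;

    bool *seen = calloc((size_t)nchunks, sizeof *seen);
    if (seen == NULL) return false;

    bool ok = true;
    for (int64_t i = 0; i < nchunks && ok; i++) {
        int64_t idx = order[i];
        if (idx < 0 || idx >= nchunks || seen[idx]) {
            ok = false;          /* out of range or repeated index */
        } else {
            seen[idx] = true;
        }
    }
    free(seen);
    return ok;
}
```

For a super-chunk with three chunks, {2, 0, 1} passes this check, while {0, 0, 1} (repeated index) and {0, 1, 3} (index out of range) do not.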

int64_t blosc2_schunk_frame_len(blosc2_schunk *schunk)#

Get the length (in bytes) of the internal frame of the super-chunk.

Parameters:
  • schunk – The super-chunk.

Returns:

The length (in bytes) of the internal frame. If there is no internal frame, an estimate of the length is provided.

int64_t blosc2_schunk_fill_special(blosc2_schunk *schunk, int64_t nitems, int special_value, int32_t chunksize)#

Quickly fill an empty frame with special values (zeros, NaNs, uninit).

Parameters:
  • schunk – The super-chunk to be filled. This must be empty initially.

  • nitems – The number of items to fill.

  • special_value – The special value to use for filling. The only values supported for now are BLOSC2_SPECIAL_ZERO, BLOSC2_SPECIAL_NAN and BLOSC2_SPECIAL_UNINIT.

  • chunksize – The chunksize for the chunks that are to be added to the super-chunk.

Returns:

The total number of chunks that have been added to the super-chunk. If there is an error, a negative value is returned.
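To sanity-check the return value, the expected number of chunks can be estimated up front. The sketch below assumes that chunksize is given in bytes, that each full chunk holds chunksize / typesize items, and that a trailing partial chunk counts as one more chunk. The function expected_special_chunks is a hypothetical helper, not a library call.

```c
#include <stdint.h>

/* Hypothetical helper, not a Blosc2 function.
 * Number of chunks needed for nitems items of typesize bytes when a full
 * chunk holds chunksize bytes (assumed semantics).
 * Returns -1 for arguments this sketch does not handle. */
int64_t expected_special_chunks(int64_t nitems, int32_t typesize, int32_t chunksize) {
    if (nitems < 0 || typesize <= 0 || chunksize < typesize) return -1;
    int64_t chunkitems = chunksize / typesize;   /* items per full chunk */
    return nitems / chunkitems + (nitems % chunkitems != 0);
}
```

For example, 1000 items of 4 bytes with a chunksize of 1024 bytes give 256 items per chunk, so three full chunks plus one partial chunk, four in total.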

int64_t blosc2_schunk_append_buffer(blosc2_schunk *schunk, const void *src, int32_t nbytes)#

Append a src data buffer to a super-chunk.

Parameters:
  • schunk – The super-chunk where data will be appended.

  • src – The buffer of data to compress.

  • nbytes – The size of the src buffer.

Returns:

The number of chunks in super-chunk. If some problem is detected, this number will be negative.

int blosc2_schunk_get_slice_buffer(blosc2_schunk *schunk, int64_t start, int64_t stop, void *buffer)#

Fill buffer with a schunk slice.

Parameters:
  • schunk – The super-chunk from where to extract a slice.

  • start – Index (0-based) where the slice begins.

  • stop – The first index (0-based) that is not in the selected slice.

  • buffer – The buffer where the data will be stored.

Returns:

An error code.

Warning

You must make sure that you have enough space in buffer to store the uncompressed data.
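The required buffer size follows from the slice bounds. A minimal sketch, assuming start and stop count items (not bytes) and that every item occupies the super-chunk's typesize, is shown below. The helper name slice_nbytes is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper, not a Blosc2 function.
 * Bytes needed to hold the half-open item range [start, stop).
 * Returns -1 if the range or the typesize is invalid. */
int64_t slice_nbytes(int64_t start, int64_t stop, int32_t typesize) {
    if (start < 0 || stop < start || typesize <= 0) return -1;
    return (stop - start) * (int64_t)typesize;
}
```

A caller would typically allocate slice_nbytes(start, stop, schunk->typesize) bytes before passing the buffer to blosc2_schunk_get_slice_buffer().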

int blosc2_schunk_set_slice_buffer(blosc2_schunk *schunk, int64_t start, int64_t stop, void *buffer)#

Update a schunk slice from buffer.

Parameters:
  • schunk – The super-chunk where to set the slice.

  • start – Index (0-based) where the slice begins.

  • stop – The first index (0-based) that is not in the selected slice.

  • buffer – The buffer containing the data to set.

Returns:

An error code.

void blosc2_schunk_avoid_cframe_free(blosc2_schunk *schunk, bool avoid_cframe_free)#

Set the private avoid_cframe_free field in a frame.

Parameters:
  • schunk – The super-chunk referencing the frame.

  • avoid_cframe_free – The value to set in the blosc2_frame_s structure.

Warning

If you set it to true, you are responsible for freeing the cframe yourself.

Dealing with chunks#

int blosc2_schunk_get_chunk(blosc2_schunk *schunk, int64_t nchunk, uint8_t **chunk, bool *needs_free)#

Return a compressed chunk that is part of a super-chunk in the chunk parameter.

Parameters:
  • schunk – The super-chunk from where to extract a chunk.

  • nchunk – The chunk to be extracted (0 indexed).

  • chunk – The pointer to the chunk of compressed data.

  • needs_free – The pointer to a boolean indicating if it is the user’s responsibility to free the chunk returned or not.

Returns:

The size of the (compressed) chunk or 0 if it is non-initialized. If some problem is detected, a negative code is returned instead.

Warning

If the super-chunk is backed by a frame that is disk-based, a buffer is allocated for the (compressed) chunk, and hence a free is needed. You can check whether the chunk requires a free with the needs_free parameter. If the chunk does not need a free, it means that a pointer to the location in the super-chunk (or the backing in-memory frame) is returned in the chunk parameter.

int blosc2_schunk_get_lazychunk(blosc2_schunk *schunk, int64_t nchunk, uint8_t **chunk, bool *needs_free)#

Return a (lazy) compressed chunk that is part of a super-chunk in the chunk parameter.

Parameters:
  • schunk – The super-chunk from where to extract a chunk.

  • nchunk – The chunk to be extracted (0 indexed).

  • chunk – The pointer to the (lazy) chunk of compressed data.

  • needs_free – The pointer to a boolean indicating if it is the user’s responsibility to free the chunk returned or not.

Returns:

The size of the (compressed) chunk or 0 if it is non-initialized. If some problem is detected, a negative code is returned instead. Note that a lazy chunk is somewhat larger than a regular chunk because of the trailer section (for details see README_CHUNK_FORMAT.rst).

Note

For disk-based frames, a lazy chunk is always returned.

Warning

Currently, a lazy chunk can only be used by blosc2_decompress_ctx and blosc2_getitem_ctx.

Warning

If the super-chunk is backed by a frame that is disk-based, a buffer is allocated for the (compressed) chunk, and hence a free is needed. You can check whether the chunk requires a free with the needs_free parameter. If the chunk does not need a free, it means that a pointer to the location in the super-chunk (or the backing in-memory frame) is returned in the chunk parameter. In this case the returned chunk is not lazy.

int blosc2_schunk_decompress_chunk(blosc2_schunk *schunk, int64_t nchunk, void *dest, int32_t nbytes)#

Decompress and return the nchunk chunk of a super-chunk.

If the chunk is decompressed successfully, it is put in the dest buffer.

Parameters:
  • schunk – The super-chunk from where the chunk will be decompressed.

  • nchunk – The chunk to be decompressed (0 indexed).

  • dest – The buffer where the decompressed data will be put.

  • nbytes – The size of the area pointed to by dest.

Returns:

The size of the decompressed chunk or 0 if it is non-initialized. If some problem is detected, a negative code is returned instead.

Warning

You must make sure that you have enough space to store the uncompressed data.

int64_t blosc2_schunk_append_chunk(blosc2_schunk *schunk, uint8_t *chunk, bool copy)#

Append an existing chunk to a super-chunk.

Parameters:
  • schunk – The super-chunk where the chunk will be appended.

  • chunk – The chunk to append. If an internal copy is made, the chunk can be reused or freed if desired.

  • copy – Whether the chunk should be copied internally or can be used as-is.

Returns:

The number of chunks in super-chunk. If some problem is detected, this number will be negative.

int64_t blosc2_schunk_insert_chunk(blosc2_schunk *schunk, int64_t nchunk, uint8_t *chunk, bool copy)#

Insert a chunk at a specific position in a super-chunk.

Parameters:
  • schunk – The super-chunk where the chunk will be inserted.

  • nchunk – The position where the chunk will be inserted.

  • chunk – The chunk to insert. If an internal copy is made, the chunk can be reused or freed if desired.

  • copy – Whether the chunk should be copied internally or can be used as-is.

Returns:

The number of chunks in super-chunk. If some problem is detected, this number will be negative.

int64_t blosc2_schunk_update_chunk(blosc2_schunk *schunk, int64_t nchunk, uint8_t *chunk, bool copy)#

Update a chunk at a specific position in a super-chunk.

Parameters:
  • schunk – The super-chunk where the chunk will be updated.

  • nchunk – The position where the chunk will be updated.

  • chunk – The new chunk. If an internal copy is made, the chunk can be reused or freed if desired.

  • copy – Whether the chunk should be copied internally or can be used as-is.

Returns:

The number of chunks in super-chunk. If some problem is detected, this number will be negative.

int64_t blosc2_schunk_delete_chunk(blosc2_schunk *schunk, int64_t nchunk)#

Delete a chunk at a specific position in a super-chunk.

Parameters:
  • schunk – The super-chunk where the chunk will be deleted.

  • nchunk – The position where the chunk will be deleted.

Returns:

The number of chunks in super-chunk. If some problem is detected, this number will be negative.

Creating chunks#

int blosc2_chunk_zeros(blosc2_cparams cparams, int32_t nbytes, void *dest, int32_t destsize)#

Create a chunk made of zeros.

Parameters:
  • cparams – The compression parameters.

  • nbytes – The size (in bytes) of the chunk.

  • dest – The buffer where the data chunk will be put.

  • destsize – The size (in bytes) of the dest buffer; must be BLOSC_EXTENDED_HEADER_LENGTH at least.

Returns:

The number of bytes compressed (BLOSC_EXTENDED_HEADER_LENGTH). If negative, there has been an error and dest is unusable.

int blosc2_chunk_nans(blosc2_cparams cparams, int32_t nbytes, void *dest, int32_t destsize)#

Create a chunk made of NaNs.

Parameters:
  • cparams – The compression parameters; only a typesize of 4 bytes (float) or 8 bytes (double) is supported.

  • nbytes – The size (in bytes) of the chunk.

  • dest – The buffer where the data chunk will be put.

  • destsize – The size (in bytes) of the dest buffer; must be BLOSC_EXTENDED_HEADER_LENGTH at least.

Returns:

The number of bytes compressed (BLOSC_EXTENDED_HEADER_LENGTH). If negative, there has been an error and dest is unusable.

Note

Whether the NaNs are floats or doubles will be given by the typesize.

int blosc2_chunk_repeatval(blosc2_cparams cparams, int32_t nbytes, void *dest, int32_t destsize, const void *repeatval)#

Create a chunk made of repeated values.

Parameters:
  • cparams – The compression parameters.

  • nbytes – The size (in bytes) of the chunk.

  • dest – The buffer where the data chunk will be put.

  • destsize – The size (in bytes) of the dest buffer.

  • repeatval – A pointer to the repeated value (little endian). The size of the value is given by cparams.typesize param.

Returns:

The number of bytes compressed (BLOSC_EXTENDED_HEADER_LENGTH + typesize). If negative, there has been an error and dest is unusable.
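The return value suggests a lower bound for destsize, which the parameter list does not state. The sketch below assumes that BLOSC_EXTENDED_HEADER_LENGTH is 32 bytes; real code should use the macro from blosc2.h rather than the local constant. The helper name repeatval_min_destsize is hypothetical.

```c
#include <stdint.h>

/* Assumed value of BLOSC_EXTENDED_HEADER_LENGTH; prefer the macro from blosc2.h. */
enum { SKETCH_EXTENDED_HEADER_LENGTH = 32 };

/* Hypothetical helper, not a Blosc2 function.
 * Smallest dest size for a repeated-value chunk: the extended header plus
 * one value of typesize bytes. Returns -1 for a non-positive typesize. */
int32_t repeatval_min_destsize(int32_t typesize) {
    if (typesize <= 0) return -1;
    return SKETCH_EXTENDED_HEADER_LENGTH + typesize;
}
```

Under that assumption, a chunk of repeated 8-byte doubles needs a dest buffer of at least 40 bytes, regardless of nbytes.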

int blosc2_chunk_uninit(blosc2_cparams cparams, int32_t nbytes, void *dest, int32_t destsize)#

Create a chunk made of uninitialized values.

Parameters:
  • cparams – The compression parameters.

  • nbytes – The size (in bytes) of the chunk.

  • dest – The buffer where the data chunk will be put.

  • destsize – The size (in bytes) of the dest buffer; must be BLOSC_EXTENDED_HEADER_LENGTH at least.

Returns:

The number of bytes compressed (BLOSC_EXTENDED_HEADER_LENGTH). If negative, there has been an error and dest is unusable.

Frame specific functions#

int64_t *blosc2_frame_get_offsets(blosc2_schunk *schunk)#

Get the offsets of a frame in a super-chunk.

Parameters:
  • schunk – The super-chunk containing the frame.

Returns:

If successful, return a pointer to a buffer of the decompressed offsets. The number of offsets is equal to schunk->nchunks; the user is responsible for freeing this buffer. Otherwise, return NULL.