blosc2.Storage

class blosc2.Storage(contiguous: bool = None, urlpath: str = None, mode: str = 'a', mmap_mode: str = None, initial_mapping_size: int = None, meta: dict = None)

Dataclass for hosting the different storage parameters.

Parameters:
  • contiguous (bool) – Whether the chunks are stored contiguously. Default is True when urlpath is not None; False otherwise.

  • urlpath (str or pathlib.Path, optional) – If the storage is persistent, the name of the file (when contiguous = True) or the directory (if contiguous = False). If the storage is in-memory, then this field is None.

  • mode (str, optional) – Persistence mode: ‘r’ means read only (must exist); ‘a’ means read/write (create if it doesn’t exist); ‘w’ means create (overwrite if it exists). Default is ‘a’.

  • mmap_mode (str, optional) –

    If set, the file will be memory-mapped instead of using the default I/O functions, and the mode argument will be ignored. The memory-mapping modes are similar to those used by the numpy.memmap function, except that the file can be extended:

    mode    description
    ‘r’     Open an existing file for reading only.
    ‘r+’    Open an existing file for reading and writing. Use this mode if you want to append data to an existing schunk file.
    ‘w+’    Create or overwrite an existing file for reading and writing. Use this mode if you want to create a new schunk.
    ‘c’     Open an existing file in copy-on-write mode: all changes affect the data in memory but changes are not saved to disk. The file on disk is read-only. On Windows, the size of the mapping cannot change.

    Only contiguous storage can be memory-mapped. Hence, urlpath must point to a file (and not a directory).

    Note

    Memory-mapped files are opened once and the file contents remain in (virtual) memory for the lifetime of the schunk. Using memory-mapped I/O can be faster than using the default I/O functions, depending on the use case. Reading performance is generally better, but writing performance may be slower in some cases on certain systems. In any case, memory-mapped files can be especially beneficial when operating with network file systems (like NFS).

    This is currently a beta feature (especially write operations) and we recommend trying it out and reporting any issues you may encounter.

  • initial_mapping_size (int, optional) –

    The initial size of the mapping for the memory-mapped file when writes are allowed (r+, w+, or c mode). Once a file is memory-mapped and extended beyond the initial mapping size, the file must be remapped, which may be expensive. This parameter decouples the mapping size from the actual file size so that memory can be reserved early for future writes, avoiding remappings. The memory is only reserved virtually and does not occupy physical memory unless actual writes happen. Since the virtual address space is large enough, it is OK to be generous with this parameter (with special consideration on Windows, see note below). For best performance, set this to the maximum expected size of the compressed data (see example in SChunk.__init__). The size is in bytes.

    Default: 1 GiB.

    Note

    On Windows, the size of the mapping is directly coupled to the file size. When the schunk gets destroyed, the file size will be truncated to the actual size of the schunk.

  • meta (dict or None) –

    A dictionary with different metalayers. One entry per metalayer:

    key: bytes or str

    The name of the metalayer.

    value: object

    The metalayer object that will be serialized using msgpack.

__init__(contiguous: bool = None, urlpath: str = None, mode: str = 'a', mmap_mode: str = None, initial_mapping_size: int = None, meta: dict = None) -> None

Methods

__init__([contiguous, urlpath, mode, ...])

Attributes

contiguous

initial_mapping_size

meta

mmap_mode

mode

urlpath