BitMagic-C++
Bit-vector serialization class. More...
#include <bmserial.h>
Public Types | |
typedef BV | bvector_type |
typedef bvector_type::allocator_type | allocator_type |
typedef bvector_type::blocks_manager_type | blocks_manager_type |
typedef bvector_type::statistics | statistics_type |
typedef bvector_type::block_idx_type | block_idx_type |
typedef bvector_type::size_type | size_type |
typedef byte_buffer< allocator_type > | buffer |
Public Member Functions | |
serializer (const allocator_type &alloc=allocator_type(), bm::word_t *temp_block=0) | |
Constructor. More... | |
serializer (bm::word_t *temp_block) | |
~serializer () | |
void | set_compression_level (unsigned clevel) |
Set compression level. More... | |
unsigned | get_compression_level () const |
Get compression level (0-5); default is 5 (recommended). 0: take as is; 1, 2: apply lightweight RLE/GAP encodings, limited-depth hierarchical compression, and interval encoding; 3: variant of 2 with different cut-offs; 4: delta transforms plus Elias Gamma encoding where possible (legacy); 5: binary interpolated encoding (Moffat, et al). More... | |
Serialization Methods | |
size_type | serialize (const BV &bv, unsigned char *buf, size_t buf_size) |
Bitvector serialization into memory block. More... | |
void | serialize (const BV &bv, typename serializer< BV >::buffer &buf, const statistics_type *bv_stat=0) |
Bitvector serialization into buffer object (resized automatically) More... | |
void | optimize_serialize_destroy (BV &bv, typename serializer< BV >::buffer &buf) |
Bitvector serialization into buffer object (resized automatically) Input bit-vector gets optimized and then destroyed, content is NOT guaranteed after this operation. More... | |
const size_type * | get_compression_stat () const |
Return serialization counter vector. More... | |
void | gap_length_serialization (bool value) |
Set GAP length serialization (serializes GAP levels of the original vector) More... | |
void | byte_order_serialization (bool value) |
Set byte-order serialization (for cross platform compatibility) More... | |
void | encode_header (const BV &bv, bm::encoder &enc) |
Encode serialization header information. More... | |
void | encode_gap_block (const bm::gap_word_t *gap_block, bm::encoder &enc) |
void | gamma_gap_block (const bm::gap_word_t *gap_block, bm::encoder &enc) |
void | gamma_gap_array (const bm::gap_word_t *gap_block, unsigned arr_len, bm::encoder &enc, bool inverted=false) |
Encode GAP block as delta-array with Elias Gamma coder. More... | |
void | encode_bit_array (const bm::word_t *block, bm::encoder &enc, bool inverted) |
Encode bit-block as an array of bits. More... | |
void | gamma_gap_bit_block (const bm::word_t *block, bm::encoder &enc) |
void | gamma_arr_bit_block (const bm::word_t *block, bm::encoder &enc, bool inverted) |
void | bienc_arr_bit_block (const bm::word_t *block, bm::encoder &enc, bool inverted) |
void | bienc_gap_bit_block (const bm::word_t *block, bm::encoder &enc) |
encode bit-block as interpolated bit block of gaps More... | |
void | interpolated_arr_bit_block (const bm::word_t *block, bm::encoder &enc, bool inverted) |
void | interpolated_gap_bit_block (const bm::word_t *block, bm::encoder &enc) |
encode bit-block as interpolated gap block More... | |
void | interpolated_gap_array (const bm::gap_word_t *gap_block, unsigned arr_len, bm::encoder &enc, bool inverted) |
Encode GAP block as an array with binary interpolated coder. More... | |
void | interpolated_encode_gap_block (const bm::gap_word_t *gap_block, bm::encoder &enc) |
void | encode_bit_interval (const bm::word_t *blk, bm::encoder &enc, unsigned size_control) |
Encode BIT block with repeatable runs of zeroes. More... | |
void | encode_bit_digest (const bm::word_t *blk, bm::encoder &enc, bm::id64_t d0) |
Encode bit-block using digest (hierarchical compression) More... | |
unsigned char | find_gap_best_encoding (const bm::gap_word_t *gap_block) |
Determine best representation for GAP block based on current set compression level. More... | |
unsigned char | find_bit_best_encoding (const bm::word_t *block) |
Determine best representation for a bit-block. More... | |
void | reset_compression_stats () |
Reset all accumulated compression statistics. More... | |
void | reset_models () |
void | add_model (unsigned char mod, unsigned score) |
Bit-vector serialization class.
Class designed to convert sparse bit-vectors into a single block of memory ready for file or database storage or network transfer.
An instance of this class can be reused for multiple serializations (but must not be shared across threads). Reuse offers some performance advantage because it avoids reallocating temporary memory.
Definition at line 77 of file bmserial.h.
typedef bvector_type::allocator_type bm::serializer< BV >::allocator_type |
Definition at line 81 of file bmserial.h.
typedef bvector_type::block_idx_type bm::serializer< BV >::block_idx_type |
Definition at line 84 of file bmserial.h.
typedef bvector_type::blocks_manager_type bm::serializer< BV >::blocks_manager_type |
Definition at line 82 of file bmserial.h.
typedef byte_buffer<allocator_type> bm::serializer< BV >::buffer |
Definition at line 87 of file bmserial.h.
typedef BV bm::serializer< BV >::bvector_type |
Definition at line 80 of file bmserial.h.
typedef bvector_type::size_type bm::serializer< BV >::size_type |
Definition at line 85 of file bmserial.h.
typedef bvector_type::statistics bm::serializer< BV >::statistics_type |
Definition at line 83 of file bmserial.h.
bm::serializer< BV >::serializer (const allocator_type &alloc=allocator_type(), bm::word_t *temp_block=0)
Constructor.
alloc | - memory allocator |
temp_block | - temporary block for various operations (if NULL, it is allocated and managed by the serializer class). The temp block is used as scratch memory during serialization; supplying an external temp block avoids unnecessary re-allocations. |
Temp block attached is not owned by the class and NOT deallocated on destruction.
Definition at line 725 of file bmserial.h.
bm::serializer< BV >::serializer | ( | bm::word_t * | temp_block | ) |
Definition at line 749 of file bmserial.h.
bm::serializer< BV >::~serializer |
Definition at line 772 of file bmserial.h.
void bm::serializer< BV >::add_model (unsigned char mod, unsigned score) [protected]
Definition at line 1025 of file bmserial.h.
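void bm::serializer< BV >::bienc_arr_bit_block (const bm::word_t *block, bm::encoder &enc, bool inverted) [protected]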
Definition at line 1449 of file bmserial.h.
void bm::serializer< BV >::bienc_gap_bit_block (const bm::word_t *block, bm::encoder &enc) [protected]
Encode bit-block as interpolated bit block of gaps.
Definition at line 1476 of file bmserial.h.
void bm::serializer< BV >::byte_order_serialization | ( | bool | value | ) |
Set byte-order serialization (for cross platform compatibility)
value | - TRUE serialization format includes byte-order marker |
Definition at line 803 of file bmserial.h.
Referenced by convert_bv2bvs(), main(), and bm::serialize().
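void bm::serializer< BV >::encode_bit_array (const bm::word_t *block, bm::encoder &enc, bool inverted) [protected]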
Encode bit-block as an array of bits.
Definition at line 1397 of file bmserial.h.
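The array-of-bits idea can be pictured with a small standalone sketch. It is not BitMagic's implementation: the helper name `bit_positions`, the 2048-word block limit, and the reading of `inverted` as "store the positions of cleared bits instead of set bits" are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of BitMagic): list the indices of the bits
// that are set in a block. With inverted == true the indices of the cleared
// bits are listed instead, which is shorter for a nearly full block.
// Assumes the block has at most 2048 32-bit words, so indices fit in 16 bits.
std::vector<std::uint16_t> bit_positions(const std::vector<std::uint32_t>& block,
                                         bool inverted)
{
    assert(block.size() <= 2048);
    std::vector<std::uint16_t> out;
    for (std::size_t w = 0; w < block.size(); ++w) {
        const std::uint32_t word = inverted ? ~block[w] : block[w];
        for (unsigned b = 0; b < 32; ++b) {
            if ((word >> b) & 1u)
                out.push_back(static_cast<std::uint16_t>(w * 32 + b));
        }
    }
    return out;
}
```

For a sparse block, calling `bit_positions(block, false)` yields a short sorted index list. For a block with only a few zero bits, `bit_positions(block, true)` is the compact form.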
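void bm::serializer< BV >::encode_bit_digest (const bm::word_t *blk, bm::encoder &enc, bm::id64_t d0) [protected]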
Encode bit-block using digest (hierarchical compression)
Definition at line 1302 of file bmserial.h.
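As a rough illustration of digest-based (hierarchical) compression, the sketch below summarizes a block with a 64-bit mask and stores only the non-empty parts. The block size of 2048 32-bit words, the 32-word stripe per digest bit, and the helper names `make_digest`, `pack_by_digest`, and `unpack_by_digest` are assumptions for this example, not the library's actual layout or API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kBlockWords  = 2048;             // assumed: 65536 bits per block
constexpr std::size_t kStripeWords = kBlockWords / 64; // 32 words per digest bit

// Bit s of the digest is set when stripe s contains at least one non-zero word.
std::uint64_t make_digest(const std::uint32_t* block)
{
    std::uint64_t d = 0;
    for (std::size_t s = 0; s < 64; ++s) {
        std::uint32_t acc = 0;
        for (std::size_t w = 0; w < kStripeWords; ++w)
            acc |= block[s * kStripeWords + w];
        if (acc != 0)
            d |= std::uint64_t{1} << s;
    }
    return d;
}

// Keep only the stripes flagged in the digest (hypothetical payload layout).
std::vector<std::uint32_t> pack_by_digest(const std::uint32_t* block, std::uint64_t d)
{
    std::vector<std::uint32_t> out;
    for (std::size_t s = 0; s < 64; ++s) {
        if ((d >> s) & 1u) {
            const std::uint32_t* src = block + s * kStripeWords;
            out.insert(out.end(), src, src + kStripeWords);
        }
    }
    return out;
}

// Rebuild the full block: flagged stripes are copied back, the rest zero-filled.
void unpack_by_digest(const std::vector<std::uint32_t>& packed, std::uint64_t d,
                      std::uint32_t* block)
{
    const std::uint32_t* src = packed.data();
    for (std::size_t s = 0; s < 64; ++s) {
        std::uint32_t* dst = block + s * kStripeWords;
        if ((d >> s) & 1u) {
            std::copy(src, src + kStripeWords, dst);
            src += kStripeWords;
        } else {
            std::fill(dst, dst + kStripeWords, 0u);
        }
    }
}
```

Under these assumptions, a block with bits set in only two stripes shrinks from 2048 words to one 64-bit digest plus 64 words of payload.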
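void bm::serializer< BV >::encode_bit_interval (const bm::word_t *blk, bm::encoder &enc, unsigned size_control) [protected]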
Encode BIT block with repeatable runs of zeroes.
Definition at line 1250 of file bmserial.h.
void bm::serializer< BV >::encode_gap_block (const bm::gap_word_t *gap_block, bm::encoder &enc) [protected]
Encode GAP block
Definition at line 1194 of file bmserial.h.
void bm::serializer< BV >::encode_header (const BV &bv, bm::encoder &enc) [protected]
Encode serialization header information.
Definition at line 809 of file bmserial.h.
unsigned char bm::serializer< BV >::find_bit_best_encoding (const bm::word_t *block) [protected]
Determine best representation for a bit-block.
Definition at line 1033 of file bmserial.h.
unsigned char bm::serializer< BV >::find_gap_best_encoding (const bm::gap_word_t *gap_block) [protected]
Determine best representation for GAP block based on current set compression level.
Definition at line 1152 of file bmserial.h.
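void bm::serializer< BV >::gamma_arr_bit_block (const bm::word_t *block, bm::encoder &enc, bool inverted) [protected]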
Definition at line 1430 of file bmserial.h.
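void bm::serializer< BV >::gamma_gap_array (const bm::gap_word_t *gap_block, unsigned arr_len, bm::encoder &enc, bool inverted=false) [protected]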
Encode GAP block as delta-array with Elias Gamma coder.
Definition at line 935 of file bmserial.h.
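To make the "delta-array with Elias Gamma coder" idea concrete, here is a minimal standalone sketch. It is not the library's bit layout: the `BitStream` type, the helper names, and the choice to store the first element as value + 1 are assumptions for illustration. A sorted array is turned into successive differences, and each difference is written as an Elias gamma code (N zero bits followed by the N+1 significant bits of the value).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy bit sink/source used only for this illustration.
struct BitStream {
    std::vector<bool> bits;
    std::size_t pos = 0;
    void put(bool b) { bits.push_back(b); }
    bool get() { return bits[pos++]; }
};

// Elias gamma code of v >= 1: floor(log2 v) zeros, then v in binary (MSB first).
void gamma_put(BitStream& bs, std::uint32_t v)
{
    assert(v >= 1 && v <= 65536);
    int n = 0;
    while ((v >> (n + 1)) != 0) ++n;
    for (int i = 0; i < n; ++i) bs.put(false);
    for (int i = n; i >= 0; --i) bs.put(((v >> i) & 1u) != 0);
}

std::uint32_t gamma_get(BitStream& bs)
{
    int n = 0;
    while (!bs.get()) ++n;  // the terminating 1 is the top bit of the value
    std::uint32_t v = 1;
    for (int i = 0; i < n; ++i) v = (v << 1) | (bs.get() ? 1u : 0u);
    return v;
}

// Strictly increasing 16-bit values: the first is stored as value + 1 (gamma
// cannot encode 0), the rest as differences from the previous element.
BitStream encode_deltas(const std::vector<std::uint16_t>& sorted)
{
    BitStream bs;
    std::uint32_t prev = 0;
    for (std::size_t i = 0; i < sorted.size(); ++i) {
        const std::uint32_t v = sorted[i];
        gamma_put(bs, i == 0 ? v + 1 : v - prev);
        prev = v;
    }
    return bs;
}

std::vector<std::uint16_t> decode_deltas(BitStream bs, std::size_t count)
{
    std::vector<std::uint16_t> out;
    std::uint32_t prev = 0;
    for (std::size_t i = 0; i < count; ++i) {
        const std::uint32_t d = gamma_get(bs);
        prev = (i == 0) ? d - 1 : prev + d;
        out.push_back(static_cast<std::uint16_t>(prev));
    }
    return out;
}
```

Small differences cost few bits (a difference of 1 takes a single bit), which is why this scheme suits arrays of closely spaced positions.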
void bm::serializer< BV >::gamma_gap_bit_block (const bm::word_t *block, bm::encoder &enc) [protected]
Definition at line 1421 of file bmserial.h.
void bm::serializer< BV >::gamma_gap_block (const bm::gap_word_t *gap_block, bm::encoder &enc) [protected]
Encode GAP block with Elias Gamma coder
Definition at line 897 of file bmserial.h.
void bm::serializer< BV >::gap_length_serialization | ( | bool | value | ) |
Set GAP length serialization (serializes GAP levels of the original vector)
value | - when TRUE serialized vector includes GAP levels parameters |
Definition at line 797 of file bmserial.h.
Referenced by convert_bv2bvs(), main(), bm::compressed_collection_serializer< CBC >::serialize(), and bm::serialize().
unsigned bm::serializer< BV >::get_compression_level () const [inline]
Get compression level (0-5). Default is 5 (recommended).
0 - take as is
1, 2 - apply lightweight RLE/GAP encodings, limited-depth hierarchical compression, interval encoding
3 - variant of 2 with different cut-offs
4 - delta transforms plus Elias Gamma encoding where possible (legacy)
5 - binary interpolated encoding (Moffat, et al)
Recommended: use 3 or 5.
Definition at line 125 of file bmserial.h.
const size_type * bm::serializer< BV >::get_compression_stat () const [inline]
Return serialization counter vector.
Definition at line 183 of file bmserial.h.
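void bm::serializer< BV >::interpolated_arr_bit_block (const bm::word_t *block, bm::encoder &enc, bool inverted) [protected]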
Definition at line 1519 of file bmserial.h.
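void bm::serializer< BV >::interpolated_encode_gap_block (const bm::gap_word_t *gap_block, bm::encoder &enc) [protected]
Encode GAP block using the binary interpolated encoder.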
Definition at line 855 of file bmserial.h.
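void bm::serializer< BV >::interpolated_gap_array (const bm::gap_word_t *gap_block, unsigned arr_len, bm::encoder &enc, bool inverted) [protected]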
Encode GAP block as an array with binary interpolated coder.
Definition at line 979 of file bmserial.h.
void bm::serializer< BV >::interpolated_gap_bit_block (const bm::word_t *block, bm::encoder &enc) [protected]
Encode bit-block as interpolated gap block.
Definition at line 1467 of file bmserial.h.
void bm::serializer< BV >::optimize_serialize_destroy (BV &bv, typename serializer< BV >::buffer &buf)
Bitvector serialization into buffer object (resized automatically) Input bit-vector gets optimized and then destroyed, content is NOT guaranteed after this operation.
Effectively it moves data into the buffer.
This operation exists because it is faster to do all three steps (optimize, serialize, destroy) in a single pass. This is a destructive serialization!
bv | - input/output bitvector |
buf | - output buffer object |
Definition at line 1381 of file bmserial.h.
Referenced by main().
void bm::serializer< BV >::reset_compression_stats () [protected]
Reset all accumulated compression statistics.
Definition at line 782 of file bmserial.h.
void bm::serializer< BV >::reset_models () [inline, protected]
Definition at line 288 of file bmserial.h.
void bm::serializer< BV >::serialize (const BV &bv, typename serializer< BV >::buffer &buf, const statistics_type *bv_stat=0)
Bitvector serialization into buffer object (resized automatically)
bv | - input bitvector |
buf | - output buffer object |
bv_stat | - input (optional) bit-vector statistics object; if NULL, serialize computes the statistics itself |
Definition at line 1359 of file bmserial.h.
serializer< BV >::size_type bm::serializer< BV >::serialize (const BV &bv, unsigned char *buf, size_t buf_size)
Bitvector serialization into memory block.
bv | - input bitvector |
buf | - output buffer (pre-allocated). No range checking is done in this method; it is the caller's responsibility to allocate a sufficient amount of memory using information from the calc_stat() function. |
buf_size | - size of the output buffer |
Definition at line 1608 of file bmserial.h.
Referenced by convert_bv2bvs(), main(), make_BLOB(), bm::compressed_collection_serializer< CBC >::serialize(), and bm::serialize().
void bm::serializer< BV >::set_compression_level | ( | unsigned | clevel | ) |
Set compression level.
Higher compression takes more time to process.
clevel | - compression level (0-5) |
Definition at line 790 of file bmserial.h.
Referenced by convert_bv2bvs(), main(), and make_BLOB().