TradeSkillMaster/External/EmbeddedLibs/LibSerialize/README.md

273 lines
11 KiB
Markdown
Raw Normal View History

2020-11-13 14:13:12 -05:00
# LibSerialize
LibSerialize is a Lua library for efficiently serializing/deserializing arbitrary values.
It supports serializing nils, numbers, booleans, strings, and tables containing these types.
It is best paired with [LibDeflate](https://github.com/safeteeWow/LibDeflate), to compress
the serialized output and optionally encode it for World of Warcraft addon or chat channels.
IMPORTANT: if you decide not to compress the output and plan on transmitting over an addon
channel, it still needs to be encoded, but encoding via `LibDeflate:EncodeForWoWAddonChannel()`
or `LibCompress:GetAddonEncodeTable()` will likely inflate the size of the serialization
by a considerable amount. See the usage below for an alternative.
Note that serialization and compression are sensitive to the specifics of your data set.
You should experiment with the available libraries (LibSerialize, AceSerializer, LibDeflate,
LibCompress, etc.) to determine which combination works best for you.
## Usage:
```lua
-- Dependencies: AceAddon-3.0, AceComm-3.0, LibSerialize, LibDeflate
MyAddon = LibStub("AceAddon-3.0"):NewAddon("MyAddon", "AceComm-3.0")
local LibSerialize = LibStub("LibSerialize")
local LibDeflate = LibStub("LibDeflate")
function MyAddon:OnEnable()
self:RegisterComm("MyPrefix")
end
-- With compression (recommended):
function MyAddon:Transmit(data)
local serialized = LibSerialize:Serialize(data)
local compressed = LibDeflate:CompressDeflate(serialized)
local encoded = LibDeflate:EncodeForWoWAddonChannel(compressed)
self:SendCommMessage("MyPrefix", encoded, "WHISPER", UnitName("player"))
end
function MyAddon:OnCommReceived(prefix, payload, distribution, sender)
local decoded = LibDeflate:DecodeForWoWAddonChannel(payload)
if not decoded then return end
local decompressed = LibDeflate:DecompressDeflate(decoded)
if not decompressed then return end
local success, data = LibSerialize:Deserialize(decompressed)
if not success then return end
-- Handle `data`
end
-- Without compression (custom codec):
MyAddon._codec = LibDeflate:CreateCodec("\000", "\255", "")
function MyAddon:Transmit(data)
local serialized = LibSerialize:Serialize(data)
local encoded = self._codec:Encode(serialized)
self:SendCommMessage("MyPrefix", encoded, "WHISPER", UnitName("player"))
end
function MyAddon:OnCommReceived(prefix, payload, distribution, sender)
local decoded = self._codec:Decode(payload)
if not decoded then return end
local success, data = LibSerialize:Deserialize(decoded)
if not success then return end
-- Handle `data`
end
```
## API:
* **`LibSerialize:SerializeEx(opts, ...)`**
Arguments:
* `opts`: options (see below)
* `...`: a variable number of serializable values
Returns:
* result: `...` serialized as a string
* **`LibSerialize:Serialize(...)`**
Arguments:
* `...`: a variable number of serializable values
Returns:
* `result`: `...` serialized as a string
Calls `SerializeEx(opts, ...)` with the default options (see below)
* **`LibSerialize:Deserialize(input)`**
Arguments:
* `input`: a string previously returned from `LibSerialize:Serialize()`
Returns:
* `success`: a boolean indicating if deserialization was successful
* `...`: the deserialized value(s), or a string containing the encountered Lua error
* **`LibSerialize:DeserializeValue(input)`**
Arguments:
* `input`: a string previously returned from `LibSerialize:Serialize()`
Returns:
* `...`: the deserialized value(s)
* **`LibSerialize:IsSerializableType(...)`**
Arguments:
* `...`: a variable number of values
Returns:
* `result`: true if all of the values' types are serializable.
Note that if you pass a table, it will be considered serializable
even if it contains unserializable keys or values. Only the types
of the arguments are checked.
`Serialize()` will raise a Lua error if the input cannot be serialized.
This will occur if any of the following exceed 16777215: any string length,
any table key count, number of unique strings, number of unique tables.
It will also occur by default if any unserializable types are encountered,
though that behavior may be disabled (see options).
`Deserialize()` and `DeserializeValue()` are equivalent, except the latter
returns the deserialization result directly and will not catch any Lua
errors that may occur when deserializing invalid input.
Note that none of the serialization/deseriazation methods support reentrancy,
and modifying tables during the serialization process is unspecified and
should be avoided. Table serialization is multi-phased and assumes a consistent
state for the key/value pairs across the phases.
## Options:
The following serialization options are supported:
* `errorOnUnserializableType`: `boolean` (default true)
* `true`: unserializable types will raise a Lua error
* `false`: unserializable types will be ignored. If it's a table key or value,
the key/value pair will be skipped. If it's one of the arguments to the
call to SerializeEx(), it will be replaced with `nil`.
* `stable`: `boolean` (default false)
* `true`: the resulting string will be stable, even if the input includes
maps. This option comes with an extra memory usage and CPU time cost.
* `false`: the resulting string will be unstable and will potentially differ
between invocations if the input includes maps
* `filter`: `function(t, k, v) => boolean` (default nil)
* If specified, the function will be called on every key/value pair in every
table encountered during serialization. The function must return true for
the pair to be serialized. It may be called multiple times on a table for
the same key/value pair. See notes on reeentrancy and table modification.
If an option is unspecified in the table, then its default will be used.
This means that if an option `foo` defaults to true, then:
* `myOpts.foo = false`: option `foo` is false
* `myOpts.foo = nil`: option `foo` is true
## Customizing table serialization:
For any serialized table, LibSerialize will check for the presence of a
metatable key `__LibSerialize`. It will be interpreted as a table with
the following possible keys:
* `filter`: `function(t, k, v) => boolean`
* If specified, the function will be called on every key/value pair in that
table. The function must return true for the pair to be serialized. It may
be called multiple times on a table for the same key/value pair. See notes
on reeentrancy and table modification. If combined with the `filter` option,
both functions must return true.
## Examples:
1. `LibSerialize:Serialize()` supports variadic arguments and arbitrary key types,
maintaining a consistent internal table identity.
```lua
local t = { "test", [false] = {} }
t[ t[false] ] = "hello"
local serialized = LibSerialize:Serialize(t, "extra")
local success, tab, str = LibSerialize:Deserialize(serialized)
assert(success)
assert(tab[1] == "test")
assert(tab[ tab[false] ] == "hello")
assert(str == "extra")
```
2. Normally, unserializable types raise an error when encountered during serialization,
but that behavior can be disabled in order to silently ignore them instead.
```lua
local serialized = LibSerialize:SerializeEx(
{ errorOnUnserializableType = false },
print, { a = 1, b = print })
local success, fn, tab = LibSerialize:Deserialize(serialized)
assert(success)
assert(fn == nil)
assert(tab.a == 1)
assert(tab.b == nil)
```
3. Tables may reference themselves recursively and will still be serialized properly.
```lua
local t = { a = 1 }
t.t = t
t[t] = "test"
local serialized = LibSerialize:Serialize(t)
local success, tab = LibSerialize:Deserialize(serialized)
assert(success)
assert(tab.t.t.t.t.t.t.a == 1)
assert(tab[tab.t] == "test")
```
4. You may specify a global filter that applies to all tables encountered during
serialization, and to individual tables via their metatable.
```lua
local t = { a = 1, b = print, c = 3 }
local nested = { a = 1, b = print, c = 3 }
t.nested = nested
setmetatable(nested, { __LibSerialize = {
filter = function(t, k, v) return k ~= "c" end
}})
local opts = {
filter = function(t, k, v) return LibSerialize:IsSerializableType(k, v) end
}
local serialized = LibSerialize:SerializeEx(opts, t)
local success, tab = LibSerialize:Deserialize(serialized)
assert(success)
assert(tab.a == 1)
assert(tab.b == nil)
assert(tab.c == 3)
assert(tab.nested.a == 1)
assert(tab.nested.b == nil)
assert(tab.nested.c == nil)
```
## Encoding format:
Every object is encoded as a type byte followed by type-dependent payload.
For numbers, the payload is the number itself, using a number of bytes
appropriate for the number. Small numbers can be embedded directly into
the type byte, optionally with an additional byte following for more
possible values. Negative numbers are encoded as their absolute value,
with the type byte indicating that it is negative. Floats are decomposed
into their eight bytes, unless serializing as a string is shorter.
For strings and tables, the length/count is also encoded so that the
payload doesn't need a special terminator. Small counts can be embedded
directly into the type byte, whereas larger counts are encoded directly
following the type byte, before the payload.
Strings are stored directly, with no transformations. Tables are stored
in one of three ways, depending on their layout:
* Array-like: all keys are numbers starting from 1 and increasing by 1.
Only the table's values are encoded.
* Map-like: the table has no array-like keys.
The table is encoded as key-value pairs.
* Mixed: the table has both map-like and array-like keys.
The table is encoded first with the values of the array-like keys,
followed by key-value pairs for the map-like keys. For this version,
two counts are encoded, one each for the two different portions.
Strings and tables are also tracked as they are encountered, to detect reuse.
If a string or table is reused, it is encoded instead as an index into the
tracking table for that type. Strings must be >2 bytes in length to be tracked.
Tables may reference themselves recursively.
#### Type byte:
The type byte uses the following formats to implement the above:
* `NNNN NNN1`: a 7 bit non-negative int
* `CCCC TT10`: a 2 bit type index and 4 bit count (strlen, #tab, etc.)
* Followed by the type-dependent payload
* `NNNN S100`: the lower four bits of a 12 bit int and 1 bit for its sign
* Followed by a byte for the upper bits
* `TTTT T000`: a 5 bit type index
* Followed by the type-dependent payload, including count(s) if needed