0doc.go

// Copyright (c) 2012-2018 Ugorji Nwoke. All rights reserved.
// Use of this source code is governed by a MIT license found in the LICENSE file.

/*
Package codec provides a
High Performance, Feature-Rich Idiomatic Go 1.4+ codec/encoding library
for binc, msgpack, cbor, json.

Supported Serialization formats are:

  - msgpack: https://github.com/msgpack/msgpack
  - binc:    http://github.com/ugorji/binc
  - cbor:    http://cbor.io http://tools.ietf.org/html/rfc7049
  - json:    http://json.org http://tools.ietf.org/html/rfc7159
  - simple:

To install:

    go get github.com/ugorji/go/codec

This package carefully uses 'unsafe' in specific places for performance reasons.
You can build without unsafe by passing the safe or appengine tag,
e.g. 'go install -tags=safe ...'. Note that unsafe is only supported for the last 3
go sdk versions, e.g. if the current go release is go 1.9, we support unsafe use only from
go 1.7+. This is because supporting unsafe requires knowledge of implementation details.

For detailed usage information, read the primer at http://ugorji.net/blog/go-codec-primer .

The idiomatic Go support is as seen in other encoding packages in
the standard library (e.g. json, xml, gob, etc).

Rich Feature Set includes:

  - Simple but extremely powerful and feature-rich API
  - Support for go1.4 and above, while selectively using newer APIs for later releases
  - Excellent code coverage (> 90%)
  - Very High Performance.
    Our extensive benchmarks show us outperforming Gob, Json, Bson, etc by 2-4X.
  - Carefully selected use of 'unsafe' for targeted performance gains.
    A 100% safe mode exists where 'unsafe' is not used at all.
  - Lock-free (sans mutex) concurrency for scaling to 100s of cores
  - Coerce types where appropriate
    e.g. decode an int in the stream into a float, decode numbers from formatted strings, etc
  - Corner Cases:
    Overflows, nil maps/slices, nil values in streams are handled correctly
  - Standard field renaming via tags
  - Support for omitting empty fields during an encoding
  - Encoding from any value and decoding into pointer to any value
    (struct, slice, map, primitives, pointers, interface{}, etc)
  - Extensions to support efficient encoding/decoding of any named types
  - Support encoding.(Binary|Text)(M|Unm)arshaler interfaces
  - Support IsZero() bool to determine if a value is a zero value.
    Analogous to time.Time.IsZero() bool.
  - Decoding without a schema (into an interface{}).
    Includes Options to configure what specific map or slice type to use
    when decoding an encoded list or map into a nil interface{}
  - Mapping a non-interface type to an interface, so we can decode appropriately
    into any interface type with a correctly configured non-interface value.
  - Encode a struct as an array, and decode struct from an array in the data stream
  - Option to encode struct keys as numbers (instead of strings)
    (to support structured streams with fields encoded as numeric codes)
  - Comprehensive support for anonymous fields
  - Fast (no-reflection) encoding/decoding of common maps and slices
  - Code-generation for faster performance.
  - Support binary (e.g. messagepack, cbor) and text (e.g. json) formats
  - Support indefinite-length formats to enable true streaming
    (for formats which support it e.g. json, cbor)
  - Support canonical encoding, where a value is ALWAYS encoded as the same sequence of bytes.
    This mostly applies to maps, where iteration order is non-deterministic.
  - NIL in data stream decoded as zero value
  - Never silently skip data when decoding.
    User decides whether to return an error or silently skip data when keys or indexes
    in the data stream do not map to fields in the struct.
  - Detect and error when encoding a cyclic reference (instead of stack overflow shutdown)
  - Encode/Decode from/to chan types (for iterative streaming support)
  - Drop-in replacement for encoding/json. `json:` key in struct tag supported.
  - Provides an RPC Server and Client Codec for net/rpc communication protocol.
  - Handle unique idiosyncrasies of codecs e.g.
    - For messagepack, configure how ambiguities in handling raw bytes are resolved
    - For messagepack, provide rpc server/client codec to support
      msgpack-rpc protocol defined at:
      https://github.com/msgpack-rpc/msgpack-rpc/blob/master/spec.md
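
To illustrate the canonical encoding item above: Go map iteration order is
unspecified, so a deterministic encoder has to fix a key order before writing.
The sketch below is not this package's implementation. It uses only the standard
library, and canonicalPairs is a hypothetical helper introduced for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// canonicalPairs is a hypothetical helper (not part of this package).
// It renders a map as "k=v" pairs in sorted key order, so equal maps
// always produce the same output regardless of map iteration order.
func canonicalPairs(m map[string]int) string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, fmt.Sprintf("%s=%d", k, m[k]))
	}
	return strings.Join(parts, ",")
}

func main() {
	m := map[string]int{"b": 2, "a": 1, "c": 3}
	fmt.Println(canonicalPairs(m)) // a=1,b=2,c=3
}
```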

Extension Support

Users can register a function to handle the encoding or decoding of
their custom types.

There are no restrictions on what the custom type can be. Some examples:

    type BitSet   []int
    type BitSet64 uint64
    type UUID     string
    type MyStructWithUnexportedFields struct { a int; b bool; c []int; }
    type GifImage struct { ... }

As an illustration, MyStructWithUnexportedFields would normally be
encoded as an empty map because it has no exported fields, while UUID
would be encoded as a string. However, with extension support, you can
encode any of these however you like.
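
As a rough sketch of what such an extension could look like for BitSet64, the
example below converts the value to and from 8 big-endian bytes. It is
self-contained and uses only the standard library. The type bitSet64Ext and the
method names WriteExt and ReadExt are assumptions made for illustration; check
the extension interfaces in this package for the exact signatures, and note
that registering the extension on a Handle is not shown.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// BitSet64 is one of the custom types listed above.
type BitSet64 uint64

// bitSet64Ext is a hypothetical extension for BitSet64.
// The method names are assumed; verify them against the package.
type bitSet64Ext struct{}

// WriteExt converts a BitSet64 (or *BitSet64) into 8 big-endian bytes.
func (bitSet64Ext) WriteExt(v interface{}) []byte {
	var x BitSet64
	switch t := v.(type) {
	case BitSet64:
		x = t
	case *BitSet64:
		x = *t
	default:
		panic(fmt.Sprintf("unsupported type %T", v))
	}
	b := make([]byte, 8)
	binary.BigEndian.PutUint64(b, uint64(x))
	return b
}

// ReadExt fills dst (which must be a *BitSet64) from 8 big-endian bytes.
func (bitSet64Ext) ReadExt(dst interface{}, src []byte) {
	p, ok := dst.(*BitSet64)
	if !ok || len(src) != 8 {
		panic("ReadExt: want *BitSet64 and 8 bytes")
	}
	*p = BitSet64(binary.BigEndian.Uint64(src))
}

func main() {
	var ext bitSet64Ext
	in := BitSet64(0x0102030405060708)
	raw := ext.WriteExt(in)

	var out BitSet64
	ext.ReadExt(&out, raw)
	fmt.Println(len(raw), out == in) // 8 true
}
```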

Custom Encoding and Decoding

This package maintains symmetry in the encoding and decoding halves.
We determine how to encode or decode by walking this decision tree:

  - is type a codec.Selfer?
  - is there an extension registered for the type?
  - is format binary, and is type an encoding.BinaryMarshaler and BinaryUnmarshaler?
  - is format specifically json, and is type an encoding/json.Marshaler and Unmarshaler?
  - is format text-based, and is type an encoding.TextMarshaler and TextUnmarshaler?
  - else we use a pair of functions based on the "kind" of the type e.g. map, slice, int64, etc

This symmetry is important to reduce chances of issues happening because the
encoding and decoding sides are out of sync e.g. decoded via very specific
encoding.TextUnmarshaler but encoded via kind-specific generalized mode.

Consequently, if a type only defines one-half of the symmetry
(e.g. it implements UnmarshalJSON() but not MarshalJSON() ),
then that type doesn't satisfy the check and we will continue walking down the
decision tree.

RPC

RPC Client and Server Codecs are implemented, so the codecs can be used
with the standard net/rpc package.

Usage

The Handle is SAFE for concurrent READ, but NOT SAFE for concurrent modification.

The Encoder and Decoder are NOT safe for concurrent use.

Consequently, the usage model is basically:

    - Create and initialize the Handle before any use.
      Once created, DO NOT modify it.
    - Multiple Encoders or Decoders can now use the Handle concurrently.
      They only read information off the Handle (never write).
    - However, each Encoder or Decoder MUST NOT be used concurrently
    - To re-use an Encoder/Decoder, call Reset(...) on it first.
      This allows you to reuse state maintained on the Encoder/Decoder.
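
The usage model above can be sketched without this package at all. In the
example below, handle and encoder are hypothetical stand-ins (not the real
Handle and Encoder types): the handle is configured once and then only read,
each goroutine owns its own encoder, and reset is called before reuse.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// handle stands in for a configured Handle: set up once, then only read.
type handle struct{ sep string }

// encoder stands in for an Encoder: it keeps mutable state (buf),
// so it must not be shared between goroutines.
type encoder struct {
	h   *handle
	buf bytes.Buffer
}

// reset clears per-use state so the encoder can be reused.
func (e *encoder) reset() { e.buf.Reset() }

// encode joins vals using the separator read from the shared handle.
func (e *encoder) encode(vals ...string) string {
	for i, v := range vals {
		if i > 0 {
			e.buf.WriteString(e.h.sep)
		}
		e.buf.WriteString(v)
	}
	return e.buf.String()
}

func main() {
	h := &handle{sep: ","} // configure fully before any use

	results := make([]string, 4)
	var wg sync.WaitGroup
	for i := range results {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			e := &encoder{h: h} // one encoder per goroutine
			e.encode("warm", "up")
			e.reset() // reuse the same encoder for the next value
			results[i] = e.encode("id", fmt.Sprint(i))
		}(i)
	}
	wg.Wait()
	fmt.Println(results) // [id,0 id,1 id,2 id,3]
}
```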

Sample usage model:

    // create and configure Handle
    var (
      bh codec.BincHandle
      mh codec.MsgpackHandle
      ch codec.CborHandle
    )

    mh.MapType = reflect.TypeOf(map[string]interface{}(nil))

    // configure extensions
    // e.g. for msgpack, define functions and enable Time support for tag 1
    // mh.SetExt(reflect.TypeOf(time.Time{}), 1, myExt)

    // create and use decoder/encoder
    var (
      r io.Reader
      w io.Writer
      b []byte
      v interface{}
      h = &bh // or &mh to use msgpack
    )

    dec := codec.NewDecoder(r, h)
    dec = codec.NewDecoderBytes(b, h)
    err := dec.Decode(&v)

    enc := codec.NewEncoder(w, h)
    enc = codec.NewEncoderBytes(&b, h)
    err = enc.Encode(v)

    //RPC Server
    go func() {
        for {
            conn, err := listener.Accept()
            if err != nil {
                return // stop serving if the listener fails or is closed
            }
            rpcCodec := codec.GoRpc.ServerCodec(conn, h)
            //OR rpcCodec := codec.MsgpackSpecRpc.ServerCodec(conn, h)
            rpc.ServeCodec(rpcCodec)
        }
    }()

    //RPC Communication (client side)
    conn, err := net.Dial("tcp", "localhost:5555")
    rpcCodec := codec.GoRpc.ClientCodec(conn, h)
    //OR rpcCodec := codec.MsgpackSpecRpc.ClientCodec(conn, h)
    client := rpc.NewClientWithCodec(rpcCodec)

Running Tests

To run tests, use the following:

    go test

To run the full suite of tests, use the following:

    go test -tags alltests -run Suite

You can use the tag 'safe' to run tests or build in safe mode, e.g.

    go test -tags safe -run Json
    go test -tags "alltests safe" -run Suite

Running Benchmarks

Please see http://github.com/ugorji/go-codec-bench .

Caveats

Struct fields matching any of the following are ignored during encoding and decoding:
    - struct tag value set to -
    - func, complex numbers, unsafe pointers
    - unexported and not embedded
    - unexported and embedded and not struct kind
    - unexported and embedded pointers (from go1.10)

Every other field in a struct will be encoded/decoded.

Embedded fields are encoded as if they exist in the top-level struct,
with some caveats. See Encode documentation.

*/
package codec

// TODO:
//   - For Go 1.11, when mid-stack inlining is enabled,
//     we should use committed functions for writeXXX and readXXX calls.
//     This involves uncommenting the methods for decReaderSwitch and encWriterSwitch
//     and using those (decReaderSwitch and encWriterSwitch) in all handles
//     instead of encWriter and decReader.
//     The benefit is that, for the (En|De)coder over []byte, the encWriter/decReader
//     will be inlined, giving a performance bump for that typical case.
//     However, it will only be inlined if mid-stack inlining is enabled,
//     as we call panic to raise errors, and panic currently prevents inlining.
//
// PUNTED:
//   - To make Handle comparable, make extHandle in BasicHandle a non-embedded pointer,
//     and use overlay methods on *BasicHandle to call through to extHandle after initializing
//     the "xh *extHandle" to point to a real slice.
//
// BEFORE EACH RELEASE:
//   - Look through and fix padding for each type, to eliminate false sharing
//     - critical shared objects that are read many times
//       TypeInfos
//     - pooled objects:
//       decNaked, decNakedContainers, codecFner, typeInfoLoadArray
//     - small objects allocated independently, that we read/use much across threads:
//       codecFn, typeInfo
//     - Objects allocated independently and used a lot
//       Decoder, Encoder,
//       xxxHandle, xxxEncDriver, xxxDecDriver (xxx = json, msgpack, cbor, binc, simple)
//     - In all above, arrange values modified together to be close to each other.
//
//     For all of these, either ensure that they occupy full cache lines,
//     or ensure that the things just past the cache line boundary are hardly read/written
//     e.g. JsonHandle.RawBytesExt - which is copied into json(En|De)cDriver at init
//
//     Occupying full cache lines means they occupy 8*N words (where N is an integer).
//     Check this out by running: ./run.sh -z
//     - look at those tagged ****, meaning they are not occupying full cache lines
//     - look at those tagged <<<<, meaning they are larger than 32 words (something to watch)
//   - Run "golint -min_confidence 0.81"