This EnScript plugin is designed to parse Protocol Buffer data of the type documented at the following website:
Quoting from the website:
Protocol Buffers are language-neutral, platform-neutral extensible mechanisms for serializing structured data.
The structure of Protocol Buffer data is documented at the following URL:
A protocol buffer message is a series of key-value pairs. The binary version of a message just uses the field’s number as the key – the name and declared type for each field can only be determined on the decoding end by referencing the message type’s definition, which is defined by a *.proto file.
When a message is encoded, each key-value pair is turned into a record consisting of the field number, a wire-type, and a payload. The wire type tells the parser how big the payload after it is.
There are six wire types, each having an ID: 0 (VARINT), 1 (I64), 2 (LEN), 3 (SGROUP), 4 (EGROUP) and 5 (I32).
Variable-length integers (VARINTs) play a big part in protocol buffer encoding. Each one uses up to 10 bytes to store an integer value. The most significant bit of each byte indicates whether another byte follows in the VARINT sequence; the remaining 7 bits of each byte in the sequence are combined to form the value.
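The VARINT rule above can be sketched in a few lines of Python. This is only an illustration, not the plugin's EnScript code: the function name `decode_varint` is hypothetical, and the sketch relies on the protobuf wire-format convention that the 7-bit groups are stored least-significant group first.

```python
def decode_varint(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode one VARINT starting at data[pos]; return (value, next_pos)."""
    value = 0
    shift = 0
    for i in range(10):  # a VARINT occupies at most 10 bytes
        if pos + i >= len(data):
            raise ValueError("truncated VARINT")
        byte = data[pos + i]
        value |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        if not byte & 0x80:              # top bit clear: this is the last byte
            return value, pos + i + 1
        shift += 7
    raise ValueError("VARINT longer than 10 bytes")


# 0x96 0x01 -> 0x16 | (0x01 << 7) = 150
print(decode_varint(bytes([0x96, 0x01])))
```

The second element of the returned tuple is the offset of the byte after the VARINT, which is convenient when several values follow one another in a buffer.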
The record comprising a key-value pair will start with a VARINT tag whose low 3 bits represent the wire-type; the remaining bits represent the field number.
The LEN wire-type has a variable length, which will be specified by a VARINT following the tag.
Following the tag (and if the wire-type is LEN, the length) will be the payload.
The LEN wire-type allows protocol buffer key-value pairs to be nested, i.e., the payload of one key-value pair can itself contain child key-value pairs.
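To make the record layout and nesting concrete, here is a minimal Python sketch of a schema-less parser. It is not the plugin's implementation. The names `read_varint`, `parse_message` and `WIRE_TYPE_NAMES` are hypothetical, group wire types are deliberately not handled, and the decision to treat a LEN payload as a nested message whenever it happens to parse cleanly is a guess made for illustration, not something the wire format guarantees.

```python
WIRE_TYPE_NAMES = {0: "VARINT", 1: "I64", 2: "LEN",
                   3: "SGROUP", 4: "EGROUP", 5: "I32"}


def read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Return (value, next_pos) for the VARINT starting at buf[pos]."""
    value, shift = 0, 0
    while True:
        if pos >= len(buf) or shift >= 70:
            raise ValueError("truncated or over-long VARINT")
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte < 0x80:
            return value, pos
        shift += 7


def parse_message(buf: bytes) -> list:
    """Split buf into (field_number, wire_type_name, payload) records."""
    records = []
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if field_number == 0:
            raise ValueError("field number 0 is not valid")
        if wire_type == 0:                      # VARINT payload
            payload, pos = read_varint(buf, pos)
        elif wire_type in (1, 5):               # fixed 8-byte or 4-byte payload
            size = 8 if wire_type == 1 else 4
            payload, pos = buf[pos:pos + size], pos + size
        elif wire_type == 2:                    # LEN: length VARINT, then payload
            length, pos = read_varint(buf, pos)
            payload, pos = buf[pos:pos + length], pos + length
            if pos <= len(buf) and payload:
                try:
                    # Assumption: if the bytes parse cleanly, show them as children.
                    payload = parse_message(payload)
                except ValueError:
                    pass                        # otherwise keep the raw bytes
        else:
            raise ValueError("group wire types are not handled in this sketch")
        if pos > len(buf):
            raise ValueError("record runs past the end of the buffer")
        records.append((field_number, WIRE_TYPE_NAMES[wire_type], payload))
    return records


# Field 3 (LEN) wrapping a child record: field 1 (VARINT) = 150
print(parse_message(bytes.fromhex("1a03089601")))
```

Because a string or byte field can occasionally parse as a valid message by coincidence, the nested interpretation is only a best guess when the *.proto definition is not available.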
The script will parse the protocol buffer highlighted in the GUI. Depending on the option chosen, the protocol buffer must be highlighted from beginning to end.
Given that protocol buffers are often found in Base64-encoded format, e.g., in Google Search URLs, the script also provides an option to decode Base64-encoded data, which can then be parsed as a protocol buffer (if that's what it is).
Note that the Base64-encoded protocol buffer in the ved parameter of a Google Search URL will always be preceded by 0 (zero). This is not part of the Base64-encoded data.
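As an illustration of the note above, the following Python sketch removes the leading zero and decodes the rest. It is not taken from the plugin. The function name `decode_ved` is hypothetical, and two things are assumed rather than stated in this description: that the value uses the URL-safe Base64 alphabet, and that any trailing `=` padding has been stripped. The sample value is synthetic, not a real ved.

```python
import base64


def decode_ved(ved: str) -> bytes:
    """Drop the leading '0' and Base64-decode the remainder of a ved value."""
    if not ved.startswith("0"):
        raise ValueError("expected a leading '0' before the Base64 data")
    encoded = ved[1:]
    # Assumption: padding was stripped, so restore it to a multiple of 4.
    encoded += "=" * (-len(encoded) % 4)
    # Assumption: URL-safe alphabet ('-' and '_' in place of '+' and '/').
    return base64.urlsafe_b64decode(encoded)


# Synthetic example: "CJYB" is the Base64 form of the bytes 08 96 01
print(decode_ved("0CJYB").hex())
```

The decoded bytes can then be handed to a protocol buffer parser in the same way as data that was never Base64-encoded.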
When decoding protocol buffers, it's important to note the following comment from the protobuf.dev website:
Protocol buffer messages don’t inherently self-describe their data, but they have a fully reflective schema that you can use to implement self-description. That is, you cannot fully interpret one without access to its corresponding .proto file.
Accordingly, the output of this script may not always interpret the data correctly. The same applies to any other tool that attempts to parse protocol buffers without the associated *.proto file.
It's also important to note that lengthy protocol-buffer data written to the comment field of bookmarks created by the script may not be visible in the table pane; it may also be truncated.
The script also provides the option to decode tags and VARINTs, which can prove useful when investigating how protocol buffers are encoded.
This script was developed for use in EnCase training. For more details, please click the following link:
This release adds the ability to parse protocol buffers from beginning to end, including those encoded with Base64.
Tested with OpenText Forensic (EnCase) 25.3.0.85.