Source code and complete Google Protocol Buffer specifications for NMSG are located at http://rsfcode.isc.org/git/nmsg/.

ISC types (source, current listing)

SIE types (http://rsfcode.isc.org/git/sie-nmsg/tree/)

Other types

Organizations outside of SIE maintain their own types; see http://rsfcode.isc.org/git/nmsg/tree/nmsg/vendors.h

The text below is a description of the ISC (vendor_id=1) and SIE (vendor_id=2) NMSG data types for data transported into and within SIE.

SIE Message Types
-----------------

I. Wire formats: NCAP and NMSG

    NCAP is the original, non-extensible wire format developed to
    encapsulate DNS payload data for SIE. It allows for two 32-bit "user1"
    and "user2" values. The NCAP format and the libncap / ncaptool
    implementation are deprecated and should not be used.

    NMSG is an extensible container format that supports dynamic message
    types. NMSG containers may be streamed to a file or transmitted as UDP
    datagrams. An NMSG container can hold multiple NMSG messages or a
    fragment of a message too large to fit in a single container. The
    contents of an NMSG container may be compressed.

    Each NMSG message is identified by a numeric <vendor ID, message type>
    tuple which determines which message type plugin to use when decoding the
    message's payload. If the <vendor ID, message type> tuple is unknown, the
    message payload will be treated opaquely. Each message has a
    nanosecond-precision timestamp. Optionally, an NMSG message may set its
    "source", "operator", or "group" fields, which are 32-bit values. The
    "source" field is an opaque integer that uniquely identifies the
    organization submitting data to SIE; note that an organization may have
    multiple sensors deployed, all of which will use the same
    per-organization identifier when broadcast by SIE. The "operator" and
    "group" fields can be used for further differentiation, and their values
    may be aliased to a case-insensitive string when being printed.
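
    The lookup described above can be sketched as follows. This is only an
    illustration in Python, not the libnmsg API: the Message class, the
    DECODERS registry, and the message type number 99 are all hypothetical
    names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Message:
    """Hypothetical stand-in for an NMSG message header plus payload."""
    vendor_id: int                   # numeric vendor ID, e.g. 1 for ISC
    msgtype: int                     # numeric message type for that vendor
    time_sec: int                    # timestamp, whole seconds
    time_nsec: int                   # timestamp, nanosecond remainder
    payload: bytes                   # encoded message body
    source: Optional[int] = None     # optional 32-bit values
    operator: Optional[int] = None
    group: Optional[int] = None


# Hypothetical registry: (vendor_id, msgtype) -> decoder function.
DECODERS: Dict[Tuple[int, int], Callable[[bytes], object]] = {}


def decode(msg: Message) -> object:
    """Decode the payload if the tuple is known, else return it untouched."""
    decoder = DECODERS.get((msg.vendor_id, msg.msgtype))
    if decoder is None:
        return msg.payload           # unknown tuple: payload stays opaque
    return decoder(msg.payload)


# Illustrative registration only; 99 is not a real message type number.
DECODERS[(1, 99)] = lambda raw: raw.decode("ascii")
```

    A message whose tuple has no registered decoder simply comes back as
    raw bytes, which mirrors the "treated opaquely" behaviour.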

    The current operator alias values being used by SIE are:

        1 ISC

    The current group alias values being used by SIE are:

        1 ConfickerAB (ch80)
        2 ConfickerC (ch80)
        3 trafficconverter (ch80)
        129 dns_parse_failure (ch206)
        130 dns_dedupe_blacklist (ch206)
        131 dns_chaff (ch206)
        132 dns_udp_truncated (ch206)
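
    As a rough sketch of how such alias tables might be consulted, the
    Python below maps numeric values to names and back, matching names
    without regard to case. The function names are hypothetical, and the
    channel annotations in parentheses above are left out of the tables.

```python
from typing import Dict, Optional

# Alias tables copied from the lists above (channel notes omitted).
OPERATOR_ALIASES: Dict[int, str] = {1: "ISC"}
GROUP_ALIASES: Dict[int, str] = {
    1: "ConfickerAB",
    2: "ConfickerC",
    3: "trafficconverter",
    129: "dns_parse_failure",
    130: "dns_dedupe_blacklist",
    131: "dns_chaff",
    132: "dns_udp_truncated",
}


def alias_for(value: int, table: Dict[int, str]) -> str:
    """Return the alias for a value, or the number itself as a string."""
    return table.get(value, str(value))


def value_for(alias: str, table: Dict[int, str]) -> Optional[int]:
    """Case-insensitive reverse lookup; None if the alias is unknown."""
    wanted = alias.lower()
    for value, name in table.items():
        if name.lower() == wanted:
            return value
    return None
```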

II. NMSG message types

    The <nmsg/vendors.h> C header file is the authoritative source of NMSG
    vendor IDs. The following seven vendor IDs are currently assigned:

        1   ISC
        2   SIE
        3   GTISC
        4   Defintel
        5   Univie (University of Vienna)
        6   Unveillance (Mandiant)
        7   Univ. of Wisconsin

    See http://rsfcode.isc.org/git/nmsg/tree/nmsg/vendors.h for the current
    and complete list.

    All the ISC message types are publicly distributed in the nmsg tarball and
    may be of general interest outside of SIE. The SIE message types are
    intended to only be used internally to SIE.

    NMSG message types are typically implemented using Google Protocol Buffers
    as the underlying wire encapsulation format. libnmsg provides some higher
    level data types on top of the base data types provided by protobuf.
    libnmsg also supports "aliasing" of message fields, allowing a message
    module plugin to provide custom code for accessing a field, so NMSG
    message fields do not necessarily have a one-to-one correspondence with
    the underlying protobuf message fields.

    The NMSG message field types are:

        nmsg_msgmod_ft_enum
        nmsg_msgmod_ft_bytes
        nmsg_msgmod_ft_string
        nmsg_msgmod_ft_mlstring
        nmsg_msgmod_ft_ip
        nmsg_msgmod_ft_uint16
        nmsg_msgmod_ft_uint32
        nmsg_msgmod_ft_uint64
        nmsg_msgmod_ft_int16
        nmsg_msgmod_ft_int32
        nmsg_msgmod_ft_int64

    NMSG message fields may have zero or more of the following flags enabled:

        NMSG_MSGMOD_FIELD_REPEATED
        NMSG_MSGMOD_FIELD_REQUIRED
        NMSG_MSGMOD_FIELD_HIDDEN
        NMSG_MSGMOD_FIELD_NOPRINT

    HIDDEN fields are hidden from the message API. NOPRINT fields are not
    hidden from the message API, but are not printed when the message is
    converted to presentation format.
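
    The difference between HIDDEN and NOPRINT can be illustrated with a
    small Python model. The FieldFlag and Field classes are hypothetical,
    the numeric flag values are not those of libnmsg, and the sketch
    assumes that a HIDDEN field is also left out of presentation output.

```python
from dataclasses import dataclass
from enum import Flag, auto
from typing import List


class FieldFlag(Flag):
    """Model of the four field flags; numeric values are illustrative."""
    REPEATED = auto()
    REQUIRED = auto()
    HIDDEN = auto()
    NOPRINT = auto()


@dataclass
class Field:
    name: str
    flags: FieldFlag = FieldFlag(0)   # no flags set by default


def api_fields(fields: List[Field]) -> List[str]:
    """Names visible through the message API: everything except HIDDEN."""
    return [f.name for f in fields if not (f.flags & FieldFlag.HIDDEN)]


def printed_fields(fields: List[Field]) -> List[str]:
    """Names shown in presentation format (assumes HIDDEN is not printed)."""
    skip = FieldFlag.HIDDEN | FieldFlag.NOPRINT
    return [f.name for f in fields if not (f.flags & skip)]
```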

    The ISC message module supports the following message types:

    * "dns", which encodes DNS RRs, RRsets, and question RRs. It has the
      following fields, all of which are optional.

        * qname (bytes)
            The wire-format DNS question name.

        * qclass (uint16)
            The DNS question class.

        * qtype (uint16)
            The DNS question type.

        * section (uint16)
            The DNS section that the RR or RRset appeared in.

        * rrname (bytes)
            The wire-format DNS RR or RRset owner name.

        * rrclass (uint16)
            The DNS RR class.

        * rrtype (uint16)
            The DNS RR type.

        * rrttl (uint32)
            The DNS RR time-to-live.

        * rdata (bytes) (repeated)
            The DNS RR RDATA.
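
    Because qname and rrname carry wire-format names, a consumer usually
    has to convert them for display. The Python sketch below shows one
    simplified way to do that. The function name is hypothetical, special
    characters are not escaped, and compression pointers are rejected
    rather than followed.

```python
def wire_name_to_text(wire: bytes) -> str:
    """Convert an uncompressed wire-format DNS name to dotted text."""
    labels = []
    pos = 0
    while True:
        if pos >= len(wire):
            raise ValueError("name is missing its terminating root label")
        length = wire[pos]
        pos += 1
        if length == 0:               # zero-length root label ends the name
            break
        if length > 63:               # top bits set: pointer or extension
            raise ValueError("compressed or extended labels not handled")
        label = wire[pos:pos + length]
        if len(label) != length:
            raise ValueError("truncated label")
        # Simplification: non-ASCII bytes are replaced, nothing is escaped.
        labels.append(label.decode("ascii", errors="replace"))
        pos += length
    return ".".join(labels) + "."
```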

    * "dnsqr", which is a message type for capturing DNS query/response state.
      It has the following fields:

        * type (enum)
            One of the values UDP_INVALID, UDP_QUERY_RESPONSE,
            UDP_UNANSWERED_QUERY, UDP_UNSOLICITED_RESPONSE, TCP, or ICMP.

            UDP_QUERY_RESPONSE are pairs of query/response messages where the
            full 9-tuple of <query_ip, response_ip, IP protocol, query_port,
            response_port, DNS ID, qname, qtype, qclass> matches between query
            and response.

            UDP_UNANSWERED_QUERY are queries which were sent but never
            responded to within the state table window.

            UDP_UNSOLICITED_RESPONSE are responses which were received but for
            which no corresponding query could be found in the state table.

        A 9-tuple of fields associated with the transaction's state:

        * query_ip (ip)
        * response_ip (ip)
        * proto (uint16)
        * query_port (uint16)
        * response_port (uint16)
        * id (uint16)
        * qname (bytes)
        * qclass (uint16)
        * qtype (uint16)
            When the DNS QR flag is unset (i.e., the message is a query), the
            query IP and query port are the IP source address and source port
            and the response IP and response port are the IP destination
            address and destination port.

            When the DNS QR flag is set (i.e., the message is a response), the
            query IP and query port are the IP destination address and
            destination port and the response IP and response port are the IP
            source address and source port.

        * rcode (uint16)
            The DNS RCODE of the response.

        * query (bytes)
        * response (bytes)
            These are virtual fields which return the DNS query message and
            the DNS response message, which may have undergone IP reassembly.
            The original packets as seen on the wire are recorded in the
            following fields and may be accessed via the message API. (The
            "response" field is also accessible as the "dns" field for
            compatibility with the ISC/ncap message type.)

        * query_packet (bytes)
        * query_time_sec (int64)
        * query_time_nsec (int32)
        * response_packet (bytes)
        * response_time_sec (int64)
        * response_time_nsec (int32)
            The examples/nmsg-dnsqr2pcap program in the nmsg distribution
            reads these fields and can convert ISC/dnsqr NMSG files into
            standard DLT_RAW pcap savefiles.
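
    The QR-flag rule for the dnsqr address fields can be sketched in
    Python as below. The helper names and the Endpoints type are
    hypothetical. The only protocol fact relied on is that QR is the top
    bit of the third byte of the DNS header.

```python
from typing import NamedTuple


class Endpoints(NamedTuple):
    query_ip: str
    query_port: int
    response_ip: str
    response_port: int


def qr_flag(dns_message: bytes) -> bool:
    """True if the DNS message is a response (QR bit set)."""
    if len(dns_message) < 12:
        raise ValueError("DNS header is 12 bytes; message is too short")
    return bool(dns_message[2] & 0x80)


def endpoints(src_ip: str, src_port: int,
              dst_ip: str, dst_port: int,
              is_response: bool) -> Endpoints:
    """Map packet source/destination onto query/response fields."""
    if is_response:
        # Response: the querier is the destination of the packet.
        return Endpoints(dst_ip, dst_port, src_ip, src_port)
    # Query: the querier is the source of the packet.
    return Endpoints(src_ip, src_port, dst_ip, dst_port)
```

    With this mapping a query and its response yield the same four
    values, which is what allows the two packets to be matched on the
    address and port part of the 9-tuple.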

    * "email", which is a message type primarily intended for representing
      SMTP message headers and body URLs received at spamtraps.

        * "type" (enum)
            One of unknown, spamtrap, rej_network (rejected by network test,
            e.g. DNSBL), rej_content (rejected by content filter), rej_user
            (rejected by manual user classification).

        * "headers" (mlstring)
            The SMTP headers.

        * "srcip" (ip)
            The address of the SMTP peer.

        * "srchost" (string)
            The hostname of the SMTP peer, if resolved.

        * "helo" (string)
            The parameter to the SMTP HELO or EHLO verb.

        * "from" (string)
            The parameter to the SMTP MAIL FROM verb.

        * "rcpt" (string) (repeated)
            The parameter(s) to the SMTP RCPT TO verb.

        * "bodyurl" (string) (repeated)
            Any URLs found in the message body.

    * "http", which is a message type primarily intended for representing hits
      to HTTP sinkholes.

        * "type" (enum)
            One of unknown, sinkhole.

        * "srcip" (ip)
        * "srchost" (string)
        * "srcport" (uint16)
        * "dstip" (ip)
        * "dstport" (uint16)

        * "request" (mlstring)
            The HTTP request, including request headers.

    * "ipconn", a basic message type representing an "IP connection".

        * "proto" (uint16)
        * "srcip" (ip)
        * "srcport" (uint16)
        * "dstip" (ip)
        * "dstport" (uint16)

    * "linkpair", a message type representing links between web pages.

        * "type" (enum)
            One of anchor, redirect.

        * "src" (string)
        * "dst" (string)
        * "headers" (mlstring)

    * "logline", a message type representing a log line.

        * "category" (string)
        * "message" (string)

    * "ncap", a transitional message type for the conversion of legacy NCAP
      data.

        * "type" (enum)
            One of IPV4, IPV6, Legacy.

        * "payload" (bytes)
            The network datagram.

        * "srcip" (ip)
        * "dstip" (ip)
        * "srcport" (uint16)
        * "dstport" (uint16)
        * "proto" (uint16)

        * "payload" (bytes)
            The application layer payload.

        * "dns" (bytes)
            If the application layer payload is a DNS message, this field will
            be present and is the DNS message.


    * "encode", a message type for encapsulating data in other generic
      formats for transport across SIE.

        * "type" (enum)
            One of Text, JSON, YAML, MSGPACK, or XML.

        * "payload" (bytes)
            A payload of the described type.
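
    A consumer of "encode" messages would typically branch on the type
    before interpreting the payload. The Python sketch below handles only
    the two types covered by the standard library in a few lines. The
    function name is hypothetical, and UTF-8 is an assumption, since the
    text above does not state an encoding.

```python
import json


def decode_payload(type_name: str, payload: bytes):
    """Interpret an "encode" payload according to its type name."""
    kind = type_name.upper()
    if kind == "TEXT":
        return payload.decode("utf-8")            # assumed encoding
    if kind == "JSON":
        return json.loads(payload.decode("utf-8"))
    # YAML, MSGPACK, and XML need their own parsers; the bytes are
    # returned unchanged here.
    return payload
```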

    The SIE message module is under development and contains experimental
    message types.

    * "dnsdedupe", a message type for representing deduplicated passive DNS
      replication RRset data.

        * "type" (enum)
            One of Insertion, Expiration, Chaff, Authoritative, Merged,
            Merged_Authoritative, or Merged_Insertion.

        * "count" (uint32)
            How many times the RRset was seen since the last broadcast
            message.

        * "time_first" (uint32)
        * "time_last" (uint32)
        * "zone_time_first" (uint32)
        * "zone_time_last" (uint32)
            The period over which the data was seen in passive DNS
            replication or zone files.

        * "response_ip" (bytes)
            The IP address of the responding nameserver.

        * "rrname" (bytes)
        * "rrtype" (uint32)
        * "rrclass" (uint32)
        * "rrttl" (uint32)
        * "rdata" (bytes) (repeated)
        * "response" (bytes)
            The RRset data.

        * "bailiwick" (bytes)
            The domain under which the RRset answer was given. For example,
            a GTLD nameserver for ORG might provide different answers for
            the same query than the delegated authoritative nameservers for
            ISC.ORG.
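
    One way to read the bailiwick field is as a zone that must contain
    the RRset owner name. The Python sketch below checks that relation on
    presentation-format names (the fields themselves are bytes, so a
    conversion step is assumed). The function name is hypothetical.

```python
from typing import List


def _labels(name: str) -> List[str]:
    """Split a dotted name into lowercase labels; the root gives []."""
    name = name.rstrip(".").lower()
    return name.split(".") if name else []


def in_bailiwick(rrname: str, bailiwick: str) -> bool:
    """True if rrname is equal to or below the bailiwick domain."""
    name_labels = _labels(rrname)
    zone_labels = _labels(bailiwick)
    if not zone_labels:               # the root contains every name
        return True
    if len(zone_labels) > len(name_labels):
        return False
    # Compare whole labels from the right, so "notisc.org" is not
    # mistaken for a name under "isc.org".
    return name_labels[-len(zone_labels):] == zone_labels
```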

    * "reputation", a message type for representing simple associations
      between identifiers and tagged meanings.

        * "type" (enum)
            One of Address, Host, Domain, Nameserver, or URI.

            * An Address type may have the following objects specified:
                * "address" (bytes)
                * "netmask" (bytes)
                * "port" (uint32)
            * A Host type would have the following object specified:
                * "host" (bytes)
            * A Domain type would have the following object specified:
                * "domain" (bytes)

        * "tag" (bytes)
            A required link to the meaning assigned to the object, as
            defined by the operator supplying it. It is typically a URL to
            a textual description.

        * "value" (bytes)
            Optional additional meaning or data, further defined by the
            description referenced by the tag.
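
    For the Address type, the "address" and "netmask" objects together
    describe a network. The Python sketch below combines them. It assumes
    that both are packed binary values of the same length (4 bytes for
    IPv4, 16 for IPv6) and that the netmask is a contiguous bit mask; the
    text above does not confirm either point. The function name is
    hypothetical.

```python
import ipaddress


def address_object(address: bytes, netmask: bytes = b""):
    """Build a network object from packed address and netmask bytes."""
    if not netmask:
        # No netmask given: treat the address as a single host.
        return ipaddress.ip_network(address)
    if len(netmask) != len(address):
        raise ValueError("address and netmask lengths differ")
    bits = 8 * len(netmask)
    mask = int.from_bytes(netmask, "big")
    prefixlen = bin(mask).count("1")
    # A valid mask is prefixlen one-bits followed only by zero-bits.
    if mask != ((1 << prefixlen) - 1) << (bits - prefixlen):
        raise ValueError("netmask is not contiguous")
    # strict=False lets host bits in the address be masked away.
    return ipaddress.ip_network((address, prefixlen), strict=False)
```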