Source code and complete Google Protocol Buffer specifications for NMSG are located at http://rsfcode.isc.org/git/nmsg/.
ISC types (source, current listing)
SIE types (http://rsfcode.isc.org/git/sie-nmsg/tree/)
Organizations outside of SIE maintain their own types; see http://rsfcode.isc.org/git/nmsg/tree/nmsg/vendors.h
The text below is a description of the ISC (vendor_id=1) and SIE (vendor_id=2) NMSG data types for data transported into and within SIE.
SIE Message Types
-----------------

I. Wire formats: NCAP and NMSG

NCAP is the original, non-extensible wire format developed to encapsulate DNS payload data for SIE. It allows for two 32-bit values, "user1" and "user2". The NCAP format and the libncap / ncaptool implementation are deprecated and should not be used.

NMSG is an extensible container format that supports dynamic message types. NMSG containers may be streamed to a file or transmitted as UDP datagrams. An NMSG container can hold multiple NMSG messages, or a fragment of a message too large to fit in a single container. The contents of an NMSG container may be compressed.

Each NMSG message is identified by a numeric <vendor ID, message type> tuple, which determines which message type plugin to use when decoding the message's payload. If the <vendor ID, message type> tuple is unknown, the message payload is treated as opaque.

Each message has a nanosecond-precision timestamp.

Optionally, an NMSG message may set its "source", "operator", or "group" fields, which are 32-bit values. The "source" field is an opaque integer that uniquely identifies the organization submitting data to SIE; note that an organization may have multiple sensors deployed, all of which will use the same per-organization identifier when broadcast by SIE. The "operator" and "group" fields can be used for further differentiation, and their values may be aliased to case-insensitive strings when printed.

The current operator alias values being used by SIE are:

    1       ISC

The current group alias values being used by SIE are:

    1       ConfickerAB (ch80)
    2       ConfickerC (ch80)
    3       trafficconverter (ch80)
    129     dns_parse_failure (ch206)
    130     dns_dedupe_blacklist (ch206)
    131     dns_chaff (ch206)
    132     dns_udp_truncated (ch206)

II. NMSG message types

The <nmsg/vendors.h> C header file is the authoritative source of NMSG vendor IDs.
The following seven vendor IDs are currently assigned:

    1       ISC
    2       SIE
    3       GTISC
    4       Defintel
    5       Univie (University of Vienna)
    6       Unveillance (Mandiant)
    7       Univ. of Wisconsin

See http://rsfcode.isc.org/git/nmsg/tree/nmsg/vendors.h for the current and complete list.

All the ISC message types are publicly distributed in the nmsg tarball and may be of general interest outside of SIE. The SIE message types are intended to be used only internally to SIE.

NMSG message types are typically implemented using Google Protocol Buffers as the underlying wire encapsulation format. libnmsg provides some higher-level data types on top of the base data types provided by protobuf. libnmsg also supports "aliasing" of message fields, allowing a message module plugin to provide custom code for accessing a field, so NMSG message fields do not necessarily have a one-to-one correspondence with the underlying protobuf message fields.

The NMSG message field types are:

    nmsg_msgmod_ft_enum
    nmsg_msgmod_ft_bytes
    nmsg_msgmod_ft_string
    nmsg_msgmod_ft_mlstring
    nmsg_msgmod_ft_ip
    nmsg_msgmod_ft_uint16
    nmsg_msgmod_ft_uint32
    nmsg_msgmod_ft_uint64
    nmsg_msgmod_ft_int16
    nmsg_msgmod_ft_int32
    nmsg_msgmod_ft_int64

NMSG message fields may have zero or more of the following flags enabled:

    NMSG_MSGMOD_FIELD_REPEATED
    NMSG_MSGMOD_FIELD_REQUIRED
    NMSG_MSGMOD_FIELD_HIDDEN
    NMSG_MSGMOD_FIELD_NOPRINT

HIDDEN fields are hidden from the message API. NOPRINT fields are not hidden from the message API, but are not printed when the message is converted to presentation format.

The ISC message module supports the following message types:

* "dns", which encodes DNS RRs, RRsets, and question RRs. It has the following fields, all of which are optional.

    * qname (bytes)
        The wire-format DNS question name.
    * qclass (uint16)
        The DNS question class.
    * qtype (uint16)
        The DNS question type.
    * section (uint16)
        The DNS section that the RR or RRset appeared in.
    * rrname (bytes)
        The wire-format DNS RR or RRset owner name.
    * rrclass (uint16)
        The DNS RR class.
    * rrtype (uint16)
        The DNS RR type.
    * rrttl (uint32)
        The DNS RR time-to-live.
    * rdata (bytes) (repeated)
        The DNS RR RDATA.

* "dnsqr", which is a message type for capturing DNS query/response state. It has the following fields:

    * type (enum)
        One of the values UDP_INVALID, UDP_QUERY_RESPONSE, UDP_UNANSWERED_QUERY, UDP_UNSOLICITED_RESPONSE, TCP, or ICMP.

        UDP_QUERY_RESPONSE are pairs of query/response messages where the full 9-tuple of <query_ip, response_ip, IP protocol, query_port, response_port, DNS ID, qname, qtype, qclass> matches between query and response.

        UDP_UNANSWERED_QUERY are queries which were sent but never responded to within the state table window.

        UDP_UNSOLICITED_RESPONSE are responses which were received but for which no corresponding query could be found in the state table.

    A 9-tuple of fields associated with the transaction's state:

    * query_ip (ip)
    * response_ip (ip)
    * proto (uint16)
    * query_port (uint16)
    * response_port (uint16)
    * id (uint16)
    * qname (bytes)
    * qclass (bytes)
    * qtype (bytes)

        When the DNS QR flag is unset (i.e., the message is a query), the query IP and query port are the IP source address and source port and the response IP and response port are the IP destination address and destination port.

        When the DNS QR flag is set (i.e., the message is a response), the query IP and query port are the IP destination address and destination port and the response IP and response port are the IP source address and source port.

    * rcode (uint16)
        The DNS RCODE of the response.

    * query (bytes)
    * response (bytes)
        These are virtual fields which return the DNS query message and the DNS response message, which may have undergone IP reassembly. The original packets as seen on the wire are recorded in the following fields and may be accessed via the message API. (The "response" field is also accessible as the "dns" field for compatibility with the ISC/ncap message type.)
    * query_packet (bytes)
    * query_time_sec (int64)
    * query_time_nsec (int32)
    * response_packet (bytes)
    * response_time_sec (int64)
    * response_time_nsec (int32)

        The examples/nmsg-dnsqr2pcap program in the nmsg distribution reads these fields and can convert ISC/dnsqr NMSG files into standard DLT_RAW pcap savefiles.

* "email", which is a message type primarily intended for representing SMTP message headers and body URLs received at spamtraps.

    * "type" (enum)
        One of unknown, spamtrap, rej_network (rejected by network test, e.g. DNSBL), rej_content (rejected by content filter), rej_user (rejected by manual user classification).
    * "headers" (mlstring)
        The SMTP headers.
    * "srcip" (ip)
        The address of the SMTP peer.
    * "srchost" (string)
        The hostname of the SMTP peer, if resolved.
    * "helo" (string)
        The parameter to the SMTP HELO or EHLO verb.
    * "from" (string)
        The parameter to the SMTP MAIL FROM verb.
    * "rcpt" (string) (repeated)
        The parameter(s) to the SMTP RCPT TO verb.
    * "bodyurl" (string) (repeated)
        Any URLs found in the message body.

* "http", which is a message type primarily intended for representing hits to HTTP sinkholes.

    * "type" (enum)
        One of unknown, sinkhole.
    * "srcip" (ip)
    * "srchost" (string)
    * "srcport" (uint16)
    * "dstip" (ip)
    * "dstport" (uint16)
    * "request" (mlstring)
        The HTTP request, including request headers.

* "ipconn", a basic message type representing an "IP connection".

    * "proto" (uint16)
    * "srcip" (ip)
    * "srcport" (uint16)
    * "dstip" (ip)
    * "dstport" (uint16)

* "linkpair", a message type representing links between web pages.

    * "type" (enum)
        One of anchor, redirect.
    * "src" (string)
    * "dst" (string)
    * "headers" (mlstring)

* "logline", a message type representing a log line.

    * "category" (string)
    * "message" (string)

* "ncap", a transitional message type for the conversion of legacy NCAP data.

    * "type" (enum)
        One of IPV4, IPV6, Legacy.
    * "payload" (bytes)
        The network datagram.
    * "srcip" (ip)
    * "dstip" (ip)
    * "srcport" (uint16)
    * "dstport" (uint16)
    * "proto" (uint16)
    * "payload" (bytes)
        The application layer payload.
    * "dns" (bytes)
        If the application layer payload is a DNS message, this field will be present and is the DNS message.

* "encode", a message type for encapsulating data in other generic formats for transport across SIE.

    * "type" (enum)
        One of Text, JSON, YAML, MSGPACK, or XML.
    * "payload" (bytes)
        A payload of the described type.

The SIE message module is under development and contains experimental message types.

* "dnsdedupe", a message type for representing deduplicated passive DNS replication RRSET data.

    * "type" (enum)
        One of Insertion, Expiration, Chaff, Authoritative, Merged, Merged_Authoritative, or Merged_Insertion.
    * "count" (uint32)
        How many times the RRSET was seen since the last broadcast message.
    * "time_first" (uint32)
    * "time_last" (uint32)
    * "zone_time_first" (uint32)
    * "zone_time_last" (uint32)
        The period over which the data was seen in passive DNS replication or zone files.
    * "response_ip" (bytes)
        The IP address of the responding nameserver.
    * "rrname" (bytes)
    * "rrtype" (uint32)
    * "rrclass" (uint32)
    * "rrttl" (uint32)
    * "rdata" (uint32)
    * "response" (bytes)
        The RRSET data.
    * "bailiwick" (bytes)
        The domain under which the RRSET answer was given. For example, a GTLD nameserver for ORG might provide different answers for the same query than the delegated authoritative nameservers for ISC.ORG.
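
To make the bailiwick relationship concrete, the following is a minimal sketch in Python. It assumes that "rrname" and "bailiwick" hold uncompressed wire-format DNS names (the text states this only for the ISC "dns" type, not for "dnsdedupe"). The function names wire_labels, decode_wire_name, and in_bailiwick are hypothetical helpers for illustration and are not part of libnmsg.

```python
from typing import List


def wire_labels(wire: bytes) -> List[str]:
    """Split an uncompressed wire-format DNS name into lowercase labels."""
    labels = []
    pos = 0
    while True:
        if pos >= len(wire):
            raise ValueError("name is missing its terminating root label")
        length = wire[pos]
        pos += 1
        if length == 0:
            # The zero-length root label ends the name.
            return labels
        if length > 63:
            # Compression pointers are not handled in this sketch.
            raise ValueError("unsupported label type")
        label = wire[pos:pos + length]
        if len(label) != length:
            raise ValueError("truncated label")
        # DNS names compare case-insensitively, so normalize to lowercase.
        labels.append(label.decode("ascii", errors="replace").lower())
        pos += length


def decode_wire_name(wire: bytes) -> str:
    """Render a wire-format name in presentation format with a trailing dot."""
    return ".".join(wire_labels(wire)) + "."


def in_bailiwick(rrname: bytes, bailiwick: bytes) -> bool:
    """True if rrname equals the bailiwick or is a subdomain of it."""
    name = wire_labels(rrname)
    zone = wire_labels(bailiwick)
    return len(zone) <= len(name) and name[len(name) - len(zone):] == zone
```

For example, in_bailiwick(b"\x03www\x03isc\x03org\x00", b"\x03org\x00") is true because the name ends with the ORG labels, while a name under NET would not be in the ORG bailiwick.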

* "reputation", a message type for representing simple associations between identifiers and tagged meanings.

    * "type" (enum)
        One of Address, Host, Domain, Nameserver, or URI.

        * an Address type may have the following objects specified:
            * "address" (bytes)
            * "netmask" (bytes)
            * "port" (uint32)
        * a Host type would have the following object specified:
            * "host" (bytes)
        * a Domain type would have the following object specified:
            * "domain" (bytes)

    * "tag" (bytes)
        A required link to the meaning that the supplying operator assigns to the object. It is typically a URL to a textual description.
    * "value" (bytes)
        Optional additional meaning or data, further defined by the description referenced by the tag.
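
The per-type object fields of the "reputation" type can be sketched as a small consistency check. This is only an illustration, not libnmsg code: it assumes a message is represented as a plain Python dict keyed by field name, that "tag" is a top-level required field, and that Nameserver and URI carry no checks because the description above does not list their object fields. The names OBJECT_FIELDS and check_reputation are hypothetical.

```python
# Object fields listed in the description for each reputation type.
# Nameserver and URI are omitted because their fields are not described.
OBJECT_FIELDS = {
    "Address": frozenset({"address", "netmask", "port"}),
    "Host": frozenset({"host"}),
    "Domain": frozenset({"domain"}),
}

# Every object field name known to this sketch, across all types.
ALL_OBJECT_FIELDS = frozenset().union(*OBJECT_FIELDS.values())


def check_reputation(msg):
    """Return a list of problems found in a reputation-like dict."""
    problems = []
    # The description marks "tag" as required.
    if not msg.get("tag"):
        problems.append("missing required field: tag")
    allowed = OBJECT_FIELDS.get(msg.get("type"))
    if allowed is not None:
        # Flag object fields that belong to a different reputation type.
        present = ALL_OBJECT_FIELDS.intersection(msg)
        for name in sorted(present - allowed):
            problems.append(
                "field %r is not listed for type %s" % (name, msg["type"])
            )
    return problems
```

A Host message carrying "host" and "tag" passes with an empty list, while a Domain message that also sets "port" is reported, since "port" is listed only for the Address type.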