Loganalysis Mailing List                                     R. Gerhards
Request for Comments: DRAFT                                 January 2003

                  The Simple Event Log Protocol (SELP)

Status of this Memo

This draft is being distributed to the members of the Loganalysis
mailing list to detail an evolving transport protocol for event
logging.

Abstract

This document describes the evolving simple event log protocol, which
was discussed on the loganalysis mailing list in December 2002 and
January 2003. Please note that the discussion focused on a "syslog
over TCP" protocol. I have renamed the protocol to "simple event log
protocol" - or SELP for short - to avoid any confusion with the syslog
standardization efforts going on at the IETF.

SELP will likely never become an RFC. I am documenting it here because
it looks like at least some implementors would like to have such a
simple, TCP-based, syslog-like protocol and will not move to [RFC3195]
for the time being.

1. Introduction

The syslog protocol described in RFC 3164 [1] presents a spectrum of
service options for provisioning an event-based logging service over a
network. The reliable syslog protocol described in [2] provides
reliable - and optionally encrypted - delivery of network events via
the syslog protocol. It uses BEEP [3] to achieve this goal.

While BEEP is acknowledged as a well-engineered protocol, there is a
group of implementors who believe a simpler, TCP-based protocol is
needed for syslog-like event notification.

This memo describes some slight additions to [1] that will enable
syslog-like messages to travel over TCP connections. The group of
implementors made an effort to keep the protocol as consistent with
[1] as possible.

To avoid confusion, the new syslog-like protocol over TCP has been
named "simple event log protocol" (SELP). This makes clear that it is
not the actual syslog protocol.

It is beyond the scope of this memo to argue for, or against, the use
of SELP versus RFC3195 [2] for reliable delivery of syslog-like
messages.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [4].

I DO NOT INTEND TO DUPLICATE [RFC3164] HERE. As such, this memo lists
only the differences from [RFC3164]. I am using the original chapter
numbers to list them.

2. Transport Layer Protocol

SELP is a simplex protocol built on top of TCP. In any SELP session,
there is exactly one sender and one receiver. The receiver listens to
what the SELP sender sends, but it MUST NOT send any data back. All
session management is done by TCP. SELP itself does NOT have any
acknowledgment capabilities. As such, there are some reliability
issues, which are described in detail in section 2.4 below.

The receiver or sender role is not hard-coded, but once a SELP session
has been initiated, the roles cannot be changed without closing the
session. A SELP server can handle multiple SELP connections and can
also initiate SELP sessions in which it is the sender. Once a session
is established, however, the roles are fixed for that session.

2.1 Port Assignment

No well-known port has been assigned to SELP and we do not expect this
to happen. It is recommended that a server listens on port 1514. The
actual port should be configurable in both the SELP client and server.

The client can use any random port to connect to the server. This is
different from RFC 3164, where a client SHOULD use the source port of
514 (the same port the server listens on).

2.2 Framing

Framing defines how the beginning and end of a message are delimited.
For a general discussion of framing options, see section 3.1 of
RFC 3117 [5].

Unlike UDP-based RFC 3164 syslog, SELP uses TCP, which is stream
oriented. As such, there is no concept of a "packet" on the wire.
Consequently, a proper frame format must be established so that the
receiver knows which octets belong to a given frame.
SELP uses octet-stuffing as its framing method. The SELP TRAILER (see
4.1.4 below) delimits each SELP frame. There is no explicit "begin of
frame". The next frame begins with the octet immediately following the
previous TRAILER.

One notable exception to this framing format is end of stream (EOS).
If EOS is seen by a receiver, the last frame is deemed to be a full
frame, even though the TRAILER might be missing. The receiver SHOULD
flag the fact that the TRAILER was missing to its upper layers, but we
recommend NOT dropping the frame as it contains potentially important
data.

2.3 Messages

There is no explicit concept of "messages" inside a frame. Each frame
is a complete message and each message will be transmitted within a
single frame. This is a one-to-one relation.

2.4 Reliability

SELP is a simple protocol. It relies on TCP exclusively for
reliability. There is no SELP layer acknowledgment. As such, SELP is
only reliable as long as the TCP stream is not broken.

If the TCP stream breaks, the sender does not know exactly which data
has actually been transmitted to the receiver and which has not. This
is due to the fact that TCP uses a sliding window of variable size. In
short, the sender can send packets and receive an acknowledgment from
the OS while the data is still on the wire. The OS acknowledgment does
NOT mean the data has actually been received. While the sender
continues to send data, the data already acknowledged by the OS is
eventually delivered to the receiver. If that succeeds, everything is
fine. If not, however, the TCP stack will detect the problem over time
and notify the sender. The sender then does not know exactly where the
problem occurred and so does not know what to re-send. It does know,
at least, that something went wrong, and it SHOULD log an event to a
local system event repository.

This behaviour is not completely satisfactory, but the author believes
this is still a major advantage over UDP-based syslog, where message
loss is never discovered.
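The framing rules from sections 2.2 and 2.3 (frames delimited by the TRAILER, a final frame without TRAILER accepted at end of stream but flagged) can be illustrated with a small receiver-side sketch. This is not part of the protocol text; the function name split_frames and the idea of feeding it an iterable of TCP read chunks are assumptions made for illustration, and only the CRLF trailer is handled (the optional single-LF trailer is ignored here).

```python
from typing import Iterable, Iterator, Tuple

TRAILER = b"\r\n"  # CRLF, the TRAILER named in section 4.1.4


def split_frames(chunks: Iterable[bytes]) -> Iterator[Tuple[bytes, bool]]:
    """Yield (frame, complete) pairs from an iterable of TCP read chunks.

    `complete` is False only for a final frame that reached end of stream
    without a TRAILER, so the caller can flag it instead of dropping it.
    Hypothetical helper, not defined by the memo.
    """
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        while True:
            frame, sep, rest = buffer.partition(TRAILER)
            if not sep:
                break  # no full TRAILER yet; wait for more octets
            yield frame, True
            buffer = rest
    if buffer:
        # End of stream with leftover octets: keep the frame, but mark it.
        yield buffer, False


# A TRAILER split across two reads, and a last frame cut off by EOS.
reads = [b"<13>a\r", b"\n<13>b\r\n<13", b">c"]
frames = list(split_frames(reads))
```

Because the buffer is carried across reads, a TRAILER that straddles two TCP segments is still recognised, which matters since TCP gives no guarantee about where read boundaries fall.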
There is also at least one other scenario where the missing
acknowledgments bring uncertainty. If a connection breaks, it is not
clear whether this was a network error or an error at the receiver
side. If a temporary network problem breaks a SELP session, the sender
should re-try to connect to the receiver so that it can continue to
send data once the temporary problem is solved. This is fine as long
as it is a problem in a transit system. Now let us assume the receiver
stores data on a local file system and the file system runs out of
disk space. In this case, we assume the receiver is configured to shut
down in such a situation. As such, it terminates all SELP sessions and
then shuts down. The sender, too, now detects the broken session. It
again may think that the broken session is the result of a temporary
condition and tries to reconnect immediately. However, the receiver
will not be able to accept the incoming TCP connection as it is no
longer running. As such, the sender must use multiple retries to
confirm that the receiver is unreachable.

Again, the situation is not perfect. A protocol like RFC3195 resolves
all of this uncertainty through protocol-layer acknowledgments or
status messages. Nevertheless, the author believes that this
uncertainty is better than what syslog/UDP provides. Those who are
interested in truly reliable delivery of log messages are strongly
advised to look at RFC3195.

2.5 SELP Session Initiation and Termination

A sender can initiate a SELP session at any time to any receiver. It
SHOULD keep only a single session to the same receiver, but it MAY
open more than one session to the same receiver if there is good
reason to do so.

A SELP session is typically terminated by the sender. Graceful
shutdown happens when the sender sends the last full SELP frame it has
to send and then closes the socket.

The SELP session can also be closed by the receiver.
This is important as the receiver might otherwise be forced to keep
open a potentially large number of TCP sessions, making it an easy
target for denial of service attacks. As such, the receiver MAY close
SELP sessions if there is good reason to do so. A receiver SHOULD do
so only after waiting a configurable amount of time, as this will
reduce the probability that the SELP sender sends further data. In any
case, the sender has no idea why the receiver closed the session. It
will just be notified that the TCP session is no longer alive. If that
happens and the sender intends to send further data, the sender MUST
try at least once to re-establish the session. A SELP receiver
actively closing down a SELP session SHOULD log this to an appropriate
place. A SELP sender experiencing a shut down session SHOULD also log
this incident.

In the third scenario, the TCP session that SELP is running on is
terminated in the middle of the stream, most likely due to an error in
the transport between sender and receiver. In this case, the SELP
receiver should accept the last octets received as a complete frame,
SHOULD flag it as being in doubt when there is no valid TRAILER and
MUST log an event in this case. The SELP sender SHOULD also log an
event and SHOULD re-try to connect to the receiver at least once. It
is left to the SELP sender's decision (or configuration) whether any
messages are retransmitted in this case and, if so, how many. Without
a retransmit, messages can potentially be lost; with a retransmit,
they may be duplicated. Due to the simplex nature of SELP, there is no
way to know this for certain (see 2.4 - Reliability).

4.1 SELP Message Parts

The full format of a SELP message seen on the wire has four
discernible parts. The first part is called the PRI, the second part
is the HEADER, the third part is the MSG and the fourth part is the
TRAILER. The total length of the message MUST be 65530 bytes or less.
There is no minimum length of the SELP message, although a SELP
message with no contents is worthless and SHOULD NOT be transmitted.

4.1.2 HEADER Part of a SELP Packet

The HEADER part contains a timestamp and an indication of the hostname
or IP address of the device. The HEADER part of the SELP message MUST
contain visible (printing) characters. The code set used MUST also be
seven-bit ASCII in an eight-bit field like that used in the PRI part.
In this code set, the only allowable characters are the ABNF VCHAR
values (%d33-126) and spaces (SP value %d32).

The HEADER contains two fields called the TIMESTAMP and the HOSTNAME.
The TIMESTAMP will immediately follow the trailing ">" from the PRI
part and single space characters MUST follow each of the TIMESTAMP and
HOSTNAME fields. HOSTNAME will contain the hostname of the device, as
it knows itself. If it does not have a hostname, then it will contain
its own IP address. If a device has multiple IP addresses, it has
usually been seen to use the IP address from which the message is
transmitted. An alternative to this behavior has also been seen. In
that case, a device may be configured to send all messages using a
single source IP address regardless of the interface from which the
message is sent. This will provide a single consistent HOSTNAME for
all messages sent from a device.

The TIMESTAMP field is an RFC3339-TIMESTAMP. A sender MUST format the
timestamp as an RFC3339-TIMESTAMP. A receiver MAY accept additional
formats, for example an RFC3164-TIMESTAMP. When accepting additional
timestamp formats, the receiver MUST ensure the most correct
interpretation. It can use external configuration files to obtain a
correct interpretation (e.g. a config file that stores the senders'
time zones).

The RFC3339-TIMESTAMP is as specified in [RFC3339]:

The following syntax MUST be used when using an RFC3339-TIMESTAMP.
This is specified using the syntax description notation defined in
[ABNF].
   date-fullyear   = 4DIGIT
   date-month      = 2DIGIT  ; 01-12
   date-mday       = 2DIGIT  ; 01-28, 01-29, 01-30, 01-31 based on
                             ; month/year
   time-hour       = 2DIGIT  ; 00-23
   time-minute     = 2DIGIT  ; 00-59
   time-second     = 2DIGIT  ; 00-58, 00-59, 00-60 based on leap second
                             ; rules
   time-secfrac    = "." 1*DIGIT
   time-numoffset  = ("+" / "-") time-hour ":" time-minute
   time-offset     = "Z" / time-numoffset

   partial-time    = time-hour ":" time-minute ":" time-second
                     [time-secfrac]
   full-date       = date-fullyear "-" date-month "-" date-mday
   full-time       = partial-time time-offset

   date-time       = full-date "T" full-time

Unlike in RFC3339:

- the 'T' and 'Z' characters in this syntax MUST be upper case.
- usage of the 'T' character is mandatory. It MUST NOT be replaced by
  any other character (like a space character).
- the sender SHOULD include time-secfrac (fractional seconds) if its
  clock accuracy permits it.

A sample of this format is:

   1985-04-12T23:20:50.52Z

This represents 20 minutes and 50.52 seconds after the 23rd hour of
April 12th, 1985 in UTC. For complete details and samples see RFC3339.

A single space character MUST follow the TIMESTAMP field.

The HOSTNAME field will contain only the hostname, the IPv4 address,
or the IPv6 address of the originator of the message. The preferred
value is the hostname. If the hostname is used, the HOSTNAME field
MUST contain the hostname of the device as specified in STD 13. It
should be noted that this MUST NOT contain any embedded spaces. The
Domain Name MAY be included in the HOSTNAME field. If the IPv4 address
is used, it MUST be shown in the dotted decimal notation as used in
STD 13. If an IPv6 address is used, any valid representation used in
RFC 2373 MAY be used.

A single space character MUST also follow the HOSTNAME field.

4.1.3 MSG Part of a SELP Packet

This probably needs some work regarding the character set...

4.1.4 TRAILER Part of a SELP Packet

The TRAILER MUST be CRLF. Each SELP message MUST be terminated by a
valid trailer.
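As a rough sender-side illustration of how the four parts fit together, the following sketch assembles PRI, TIMESTAMP, HOSTNAME, MSG and TRAILER into one frame. The function names rfc3339_timestamp and build_frame are hypothetical. The PRI is written as "<number>" following RFC 3164, the MSG is assumed to be plain ASCII (the memo leaves the MSG character set open), and rejecting CR or LF inside MSG is an assumption made here so that the message cannot collide with the TRAILER.

```python
from datetime import datetime, timezone

MAX_MESSAGE_LEN = 65530  # upper bound from section 4.1, TRAILER included
TRAILER = "\r\n"         # CRLF


def rfc3339_timestamp(moment: datetime) -> str:
    """Format an aware datetime in UTC with upper-case 'T' and 'Z'."""
    if moment.tzinfo is None:
        raise ValueError("an aware datetime is required")
    utc = moment.astimezone(timezone.utc)
    text = utc.strftime("%Y-%m-%dT%H:%M:%S")
    if utc.microsecond:
        # time-secfrac is optional; emit it only when it carries information
        text += "." + f"{utc.microsecond:06d}".rstrip("0")
    return text + "Z"


def build_frame(pri: int, moment: datetime, hostname: str, msg: str) -> bytes:
    """Assemble PRI, HEADER, MSG and TRAILER into one SELP frame (sketch)."""
    if not hostname or any(ch.isspace() for ch in hostname):
        raise ValueError("HOSTNAME must be non-empty and contain no spaces")
    if "\r" in msg or "\n" in msg:
        # Assumption: keep CR/LF out of MSG so it cannot mimic the TRAILER.
        raise ValueError("MSG must not contain CR or LF")
    line = f"<{pri}>{rfc3339_timestamp(moment)} {hostname} {msg}{TRAILER}"
    frame = line.encode("ascii")  # raises UnicodeEncodeError on non-ASCII
    if len(frame) > MAX_MESSAGE_LEN:
        raise ValueError("message exceeds 65530 bytes")
    return frame


sample_time = datetime(1985, 4, 12, 23, 20, 50, 520000, tzinfo=timezone.utc)
frame = build_frame(13, sample_time, "host1", "su: test")
# frame == b"<13>1985-04-12T23:20:50.52Z host1 su: test\r\n"
```

The timestamp is always normalised to UTC here, which is only one valid choice; the syntax equally permits a numeric offset such as "+01:00".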
Receivers MAY accept a single LF as a valid trailer. Receivers MAY
accept a received message without a trailer as valid if the TCP stream
is closed without receiving a TRAILER.

Author's Note:

This is a very rough first draft intended to get discussion started.
This draft is incomplete and most probably inconsistent. It should not
serve as a reference for implementation.

Security Considerations

Security considerations are not discussed in this memo.

Acknowledgements

The following people provided content feedback during the writing of
this document:

   Balazs Scheidler
   Darren Reed
   Andrew Ross
   Mikael Olsson (mikael.olsson@clavister.com)
   others to follow... Need to look up the mailing list ;)

Eric Allman is the original inventor and author of the syslog daemon
and protocol. The author of this memo and the community at large would
like to express their appreciation for this work and for the
usefulness that it has provided over the years.

References

[1] Lonvick, C., "The BSD Syslog Protocol", RFC 3164, August 2001.

[2] New, D. and Rose, M., "Reliable Delivery for syslog", RFC 3195,
    November 2001.

[3] Rose, M., "The Blocks Extensible Exchange Protocol Core",
    RFC 3080, March 2001.

[4] Bradner, S., "Key words for use in RFCs to Indicate Requirement
    Levels", RFC 2119, March 1997.

[5] Rose, M., "On the Design of Application Protocols", RFC 3117,
    November 2001.

Author's Address:

   Rainer Gerhards
   Adiscon GmbH
   Mozartstrasse 21
   97950 Grossrinderfeld
   Germany

   email: rgerhards@hq.adiscon.com
   phone: +49-9349-92880