NetFef

NetFef is a three-layered network protocol that can be adapted to various network solutions.

The core of the protocol is intended to have a small footprint, so that it can run on low-memory smart devices.

The default implementation is based on serial communication, with RS485 (Four Eight Five – FEF) in mind. This is a centralized solution with one master device coordinating the other devices, although the devices themselves function autonomously.

NetFef is currently available for C++ (Arduino) and Java.

Source code is at: https://github.com/prampec/NetFef

Architecture

The NetFef network is designed in layers. The layers are mostly independent of each other, and each can be replaced by another variant. E.g. an RS485 physical layer is implemented, but it can be replaced by other physical network layers. Likewise, the communication protocol “Obsidian” (specified below) implements a centralized network, and can be replaced by another protocol.

The network layers are (in order of logical depth):

  1. Data layer – This layer specifies the data that travels in the channel. The data frame must contain a message subject, a command, and the ability to carry further parameters.
  2. Network protocol – This layer specifies the handshake and the modes used to prevent (or at least handle) data collision.
  3. Physical layer – The two layers above are logical layers. The Network protocol uses the Physical layer to perform the actual communication in the physical channel.

Data Frame

Frames travel on the network bus. Each frame has a small header and message parameters, as described below.

The idea is that the payload may contain named variables, structures and lists.

Frame: <frame_length><target_address_length><target_address><sender_address_length><sender_address><parameter_count><parameter:subject><parameter:command>[<parameter>*]<sum>

  • frame_length – A 2×8 bit (16 bit) unsigned integer representing the length of the frame. The size is sent up front because small-memory devices will have a maximum frame length they can handle. If a frame arrives that is longer than a device can process, the device may simply ignore it. The length includes these 2 bytes. As a 16 bit value, the maximal frame size is 65535 bytes.
  • target_address_length – An 8 bit unsigned integer storing the length of the address that follows in the next field. Addresses are normally 2 bytes long. A network may contain routers to connect more devices; this is supported by allowing a dynamic length for the address field. If a standard device sees a frame with an address longer than two bytes, it will simply ignore that frame.
  • target_address – A field of <target_address_length> bytes, representing the address. The address is usually a 2 byte field, except when routers are forwarding the traffic. There are some special addresses.
    • 00 – Broadcast address – This message should be processed by anyone who is interested.
    • 01 – Master server address
  • sender_address_length – Similar to the target_address_length field.
  • sender_address – Similar to target_address, except that there is no Broadcast sender.
  • parameter_count – An 8 bit unsigned integer: the number of parameters provided, including the subject and command parameters.
  • parameter:subject – The subject parameter (parameters are described below). The name of the subject parameter is always ‘s’; the parameter name ‘s’ is therefore reserved and must not be used for other parameters. The subject is recommended to be a ‘c’ type parameter, although some applications might use other types.
  • parameter:command – Like the “subject” parameter. The name of the command parameter is ‘c’. The ‘c’ name is also reserved.
  • parameter – A message can contain further parameters. A parameter has a name, a type, possibly a length, and a value. A parameter is defined as: Parameter: <parameter_name><parameter_type>[<length>]<value>
    • parameter_name – A one-byte “name” for the parameter. If a command has optional parameters, the name shows whether a given parameter was provided. Parameter names may be reused, see lists.
    • parameter_type – Although both sides should know the type belonging to a name, including it helps when monitoring the traffic. Parameter type can be: B|b|i|I|l|L|c|s|S|t|T, where:
      • B – boolean value
      • b – 8 bit unsigned integer (aka. byte) value
      • i – 16 bit unsigned integer
      • I – 16 bit signed integer
      • l – 32 bit unsigned integer
      • L – 32 bit signed integer
      • c – One character
      • s – String value. The type is followed by one byte of length.
      • S – String value. The type is followed by two bytes of length.
      • t – Struct – See below. The type is followed by one byte of length.
      • T – Struct – See below. The type is followed by two bytes of length.
    • length – Some parameter types, such as the String types, are followed by a length value. The length of a string also counts the trailing ‘\0’ character.
    • value – The actual value of the parameter. String values must always end with ‘\0’.
  • sum – Checksum: the sum of all the preceding bytes (modulo 256), stored in one byte.
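
The frame layout above can be sketched in C++ as follows. This is only an illustration, not the code of the NetFef library: the helper names (checksum, charParam, appendBytes, buildFrame) are hypothetical, the byte order of frame_length is assumed to be most significant byte first because the text does not specify it, and frame_length is assumed to count the checksum byte as well.

```cpp
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Sum of all bytes modulo 256, as described for the <sum> field.
std::uint8_t checksum(const Bytes& data) {
    std::uint8_t sum = 0;
    for (std::uint8_t b : data) {
        sum = static_cast<std::uint8_t>(sum + b);  // wraps around at 256
    }
    return sum;
}

// Encodes a 'c' (one character) parameter: <name><type><value>.
Bytes charParam(char name, char value) {
    return {static_cast<std::uint8_t>(name), static_cast<std::uint8_t>('c'),
            static_cast<std::uint8_t>(value)};
}

void appendBytes(Bytes& out, const Bytes& part) {
    out.insert(out.end(), part.begin(), part.end());
}

// Builds a frame that carries only the mandatory subject and command parameters.
Bytes buildFrame(const Bytes& target, const Bytes& sender, char subject, char command) {
    Bytes body;
    body.push_back(static_cast<std::uint8_t>(target.size()));
    appendBytes(body, target);
    body.push_back(static_cast<std::uint8_t>(sender.size()));
    appendBytes(body, sender);
    body.push_back(2);  // parameter_count: subject + command
    appendBytes(body, charParam('s', subject));
    appendBytes(body, charParam('c', command));

    // Assumption: the length covers its own 2 bytes, the body and the checksum byte.
    const std::uint16_t length = static_cast<std::uint16_t>(2 + body.size() + 1);
    Bytes frame;
    frame.push_back(static_cast<std::uint8_t>(length >> 8));    // assumed byte order: high byte first
    frame.push_back(static_cast<std::uint8_t>(length & 0xFF));
    appendBytes(frame, body);
    frame.push_back(checksum(frame));  // sum of all preceding bytes
    return frame;
}
```

Under these assumptions, buildFrame({0x00, 0x01}, {0x12, 0x34}, 'n', 'p') would produce a 16 byte frame addressed to the master.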

Struct

Type ‘t’ or ‘T’ represents a complex parameter type. A struct has the format: <parameter_count>[<parameter>*]

A struct may have at most 255 parameters.

The length field of the type represents the actual byte count of this particular complex item.

It is allowed to nest struct parameters into a struct parameter.
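
A minimal sketch of how a ‘t’ struct parameter might be encoded, including nesting. The helper names (byteParam, structParam) are hypothetical, and it is assumed here that the length byte counts the parameter_count byte plus all nested parameters; the text above only says that it is the byte count of the complex item.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Encodes a 'b' (8 bit unsigned integer) parameter: <name><type><value>.
Bytes byteParam(char name, std::uint8_t value) {
    return {static_cast<std::uint8_t>(name), static_cast<std::uint8_t>('b'), value};
}

// Wraps already encoded parameters into a 't' struct parameter:
// <name>'t'<length><parameter_count><parameter>*
Bytes structParam(char name, const std::vector<Bytes>& members) {
    if (members.size() > 255) {
        throw std::length_error("a struct may have at most 255 parameters");
    }
    Bytes content;
    content.push_back(static_cast<std::uint8_t>(members.size()));
    for (const Bytes& member : members) {
        content.insert(content.end(), member.begin(), member.end());
    }
    // Assumption: <length> covers the count byte and the nested parameters.
    if (content.size() > 255) {
        throw std::length_error("too long for type 't'; type 'T' has a two byte length");
    }
    Bytes out = {static_cast<std::uint8_t>(name), static_cast<std::uint8_t>('t'),
                 static_cast<std::uint8_t>(content.size())};
    out.insert(out.end(), content.begin(), content.end());
    return out;
}
```

Because structParam returns an ordinary encoded parameter, its result can be passed as a member of another structParam call, which is how nesting works in this sketch.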

Lists

A list can be created by giving the same name to several parameters. In this case all of these parameters must be of the same type, except that the s and S types may be mixed, as may the t and T types.
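
On the receiving side, a list can be read by walking the encoded parameters and collecting every value that carries the same name. The sketch below does this for ‘b’ type values. The function names (paramSize, collectByteList) are hypothetical, a boolean is assumed to occupy one byte, and two byte lengths are assumed to be sent most significant byte first; none of these details are fixed by the text.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Returns the total size (name + type + optional length + value) of the
// parameter starting at data[pos], or 0 if it is malformed or of unknown type.
std::size_t paramSize(const Bytes& data, std::size_t pos) {
    if (pos + 2 > data.size()) return 0;
    std::size_t header = 2;
    std::size_t valueLen = 0;
    switch (static_cast<char>(data[pos + 1])) {
        case 'B': case 'b': case 'c': valueLen = 1; break;  // 'B' assumed to be one byte
        case 'i': case 'I': valueLen = 2; break;
        case 'l': case 'L': valueLen = 4; break;
        case 's': case 't':
            if (pos + 3 > data.size()) return 0;
            header = 3;
            valueLen = data[pos + 2];
            break;
        case 'S': case 'T':
            if (pos + 4 > data.size()) return 0;
            header = 4;  // assumed byte order: high byte first
            valueLen = (static_cast<std::size_t>(data[pos + 2]) << 8) | data[pos + 3];
            break;
        default: return 0;
    }
    const std::size_t total = header + valueLen;
    return pos + total <= data.size() ? total : 0;
}

// Collects the values of all 'b' parameters called `name`, in order of arrival.
std::vector<std::uint8_t> collectByteList(const Bytes& params, char name) {
    std::vector<std::uint8_t> values;
    std::size_t pos = 0;
    while (pos < params.size()) {
        const std::size_t size = paramSize(params, pos);
        if (size == 0) break;  // stop at the first parameter that cannot be parsed
        if (params[pos] == static_cast<std::uint8_t>(name) && params[pos + 1] == 'b') {
            values.push_back(params[pos + 2]);
        }
        pos += size;
    }
    return values;
}
```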

Protocol

Various protocols can be implemented on top of the NetFef Data frame infrastructure. Networks can be anarchic or centralized, and there can be various methods for letting peers join the network and for preventing collisions.

Now I would like to introduce the protocol called “Obsidian”.

Obsidian

In the Obsidian protocol there is one master server and an unlimited number of peer devices.

The idea is that the master asks the peers one by one for interaction, and each peer responds to the master with the information it has to report. The master keeps polling a peer if no reply was received. This way it is likely that only one party at a time is about to use the bus for communication.

The master can send messages (commands) to a peer at any time according to the business logic, except when it is waiting for a reply. Such ad-hoc commands must also be replied to by the peer with some kind of acknowledge message.

A peer can send an instant message to any other device (most likely to the master), but in this case collisions are not handled, and the message is not guaranteed to be received.

The master regularly offers new peers the chance to join the network.

Specification

The timings in the specification depend very much on the physical implementation of the protocol. So instead of concrete numbers, named references are used here, and the configuration of the physical layer contains the actual numbers.

From now on all network management messages have the subject: ‘n’

0. Handshake

If a message needs to be replied to, a reply reference parameter ‘r’ (type ‘i’) is added to the message. The reply should contain the same reply reference number, but this time with the name ‘R’.

In the Obsidian implementation only the master can request replies.

If a reply-able message arrives, it needs to be replied to within REPLY_MAX_DELAY_MS milliseconds. The master waits for the reply, and does not send any other messages while waiting. If the master receives no answer, the send is repeated at most REPLY_REPEAT_COUNT times. A repeat may be postponed when another ad-hoc command needs to be sent.
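
The retry rule can be sketched on the master side as follows. This is an illustration under stated assumptions: sendReplyable and the sendAndWait callback are hypothetical names, the callback is assumed to send the message with reply reference ‘r’ and to return the ‘R’ value of a reply received within REPLY_MAX_DELAY_MS (or -1 on timeout), and "repeated at most REPLY_REPEAT_COUNT times" is read as one initial send plus up to REPLY_REPEAT_COUNT repeats.

```cpp
#include <cstdint>
#include <functional>

constexpr int REPLY_REPEAT_COUNT = 3;

// Hypothetical transport hook: sends the message carrying reply reference `r`
// and returns the 'R' value of the reply, or -1 if nothing arrived in time.
using SendAndWait = std::function<std::int32_t(std::uint16_t)>;

// Returns true once a reply with the matching reference has been received.
bool sendReplyable(std::uint16_t replyRef, const SendAndWait& sendAndWait) {
    // Attempt 0 is the original send; attempts 1..REPLY_REPEAT_COUNT are repeats.
    for (int attempt = 0; attempt <= REPLY_REPEAT_COUNT; ++attempt) {
        const std::int32_t received = sendAndWait(replyRef);
        if (received == static_cast<std::int32_t>(replyRef)) {
            return true;  // 'R' matches the 'r' that was sent
        }
        // A timeout or a reply with a different reference: try again.
    }
    return false;  // the caller decides what to do with a silent peer
}
```

This sketch blocks until it gives up; it does not model postponing a repeat in favor of another ad-hoc command.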

1. Joining the network

j – Every JOIN_OFFER_REPEAT_SECS seconds the master sends a broadcast message with command ‘j’ to offer new devices the chance to join the network. The message contains the following parameters:

  • ‘n’ (type ‘l’) – Network identity, that is unique for the network managed by this master.
  • ‘w’ (type ‘i’) – The peers should wait at most w seconds before responding to this broadcast message.

– A new peer that wants to join the network responds with the command ‘j’. The message must contain a parameter ‘i’ registration id (type ‘l’), if the peer has one (see later). When joining a new network (the ‘n’ network id is unknown to the peer), the peer is free to choose any random network address (except for 0x00/0x00 and 0x00/0x01), and must provide a random registration id as well.

J – The master sends the command ‘J’ to the previous address with the following parameters:

  • ‘d’ (type ‘c’) decision – Value can be ‘a’ accepted, or ‘d’ declined. A request is declined if the address is already in use. In this case, a new random address should be generated, and the join should be retried after the next join offer (command ‘j’).
  • ‘i’ (type ‘l’) registration id – The registration id provided by the peer (as a pseudo random id) is sent back to the peer. See note below.
  • ‘r’ reply reference – In case the join is accepted, the accept should be confirmed by the peer.

Note that this message can also be received by other peers that accidentally have the same address. Thus a message with command ‘J’ should be ignored if the ‘i’ registration id does not match the peer's own.

The accepted address and the registration id should be saved for later use.
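
The peer-side handling of ‘J’ might look like the sketch below. The type and function names (PeerState, JoinResult, handleJoinResponse, pickAddress) are hypothetical, and a 2 byte address is represented here as a single 16 bit number, so the reserved addresses 0x00/0x00 and 0x00/0x01 become 0 and 1.

```cpp
#include <cstdint>
#include <random>

enum class JoinResult { Ignored, Accepted, Declined };

struct PeerState {
    std::uint16_t address;         // address used in the join request
    std::uint32_t registrationId;  // random id sent as parameter 'i'
    bool joined;
};

// Picks a random address, skipping the broadcast (0) and master (1) addresses.
std::uint16_t pickAddress(std::mt19937& rng) {
    std::uniform_int_distribution<std::uint16_t> dist(2, 0xFFFF);
    return dist(rng);
}

// Applies a 'J' message with registration id 'i' and decision 'd' to the peer.
JoinResult handleJoinResponse(PeerState& peer, std::uint32_t receivedId, char decision) {
    if (receivedId != peer.registrationId) {
        return JoinResult::Ignored;  // meant for another peer with the same address
    }
    if (decision == 'a') {
        peer.joined = true;  // address and registration id are kept for later use
        return JoinResult::Accepted;
    }
    peer.joined = false;  // declined: the caller picks a new address and retries
    return JoinResult::Declined;
}
```

On JoinResult::Declined the caller would assign peer.address = pickAddress(rng) and wait for the next join offer.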

– The peer should reply to the command ‘J’ only if the registration was accepted. The reply message should include the following parameters:

  • Parameter ‘d’ description (type ‘s’) should be used to help the master (and humans) identify the device type.
  • Parameter ‘v’ (type ‘s’) informs the master about the network protocol version implemented in the peer.
  • Parameter ‘n’ (type ‘i’) – Next poll interval. See Polling peers.

2. Polling peers

The master asks the peers one by one for any messages that should be handled by the master.

p – Command ‘p’ says to the peer: it is your turn to say something.

If the peer has nothing to send, command ‘p’ is replied to with a parameter ‘n’ (type ‘i’): the next poll should arrive after n seconds. The value of ‘n’ must not be less than NEXT_POLL_MIN_SECS, and must not be greater than NEXT_POLL_MAX_SECS seconds.

– When the peer does have something to send, the reply to ‘p’ should not use the subject ‘n’. The subject and the command of the message are set according to the business logic maintained by the peer device. The parameter ‘n’ (type ‘i’) is still reserved for telling the master the next poll delay, as described above.

The time of the last poll should be saved for later use.
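
The bounds on ‘n’ can be enforced with a small helper. This is a sketch only: clampNextPoll is a hypothetical name, and the constants use the values listed under "Specific protocol values".

```cpp
#include <cstdint>

constexpr std::uint16_t NEXT_POLL_MIN_SECS = 20;
constexpr std::uint16_t NEXT_POLL_MAX_SECS = 5 * 60;

// Forces a requested poll interval into the range allowed for parameter 'n'.
std::uint16_t clampNextPoll(std::uint16_t requestedSecs) {
    if (requestedSecs < NEXT_POLL_MIN_SECS) return NEXT_POLL_MIN_SECS;
    if (requestedSecs > NEXT_POLL_MAX_SECS) return NEXT_POLL_MAX_SECS;
    return requestedSecs;
}
```

A peer could apply this before sending ‘n’, and a master could apply it to a received value so that a misbehaving peer cannot disturb the polling schedule.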

3. Failures

If a device does not respond to the poll, the poll will be repeated for NEXT_POLL_MAX_SECS seconds at intervals of POLL_RETRY_DELAY_SECS seconds. After 2×NEXT_POLL_MAX_SECS seconds of inactivity the peer is deactivated at the master, and no further polls are performed until the peer re-registers.

On the peer side, if the last poll was received more than 2×NEXT_POLL_MAX_SECS seconds ago, a re-registration should be started at the next ‘j’ command, using the previous address and registration id.
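
The timing rules above reduce to simple arithmetic, sketched below. The names inactiveTooLong and MAX_POLL_RETRIES are hypothetical, timestamps are assumed to be second counters that may wrap around, and the retry count is derived on the assumption that retries are evenly spaced over the retry window.

```cpp
#include <cstdint>

constexpr std::uint32_t NEXT_POLL_MAX_SECS = 5 * 60;
constexpr std::uint32_t POLL_RETRY_DELAY_SECS = 30;

// Number of retries that fit into the NEXT_POLL_MAX_SECS retry window.
constexpr std::uint32_t MAX_POLL_RETRIES = NEXT_POLL_MAX_SECS / POLL_RETRY_DELAY_SECS;

// True when more than 2 x NEXT_POLL_MAX_SECS seconds have passed since the
// last activity. Unsigned subtraction keeps this correct across wrap-around.
bool inactiveTooLong(std::uint32_t nowSecs, std::uint32_t lastActivitySecs) {
    return nowSecs - lastActivitySecs > 2 * NEXT_POLL_MAX_SECS;
}
```

The same check serves both sides: the master would call it with the time of the peer's last reply to decide on deactivation, and the peer would call it with the time of the last received poll to decide on re-registration.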

4. Network reset

The master may send a reset message to the peers that have joined the network. This is a broadcast message. After receiving the reset, the peers should act as if they had never joined the network. This is especially useful when the master restarts. The reset command is ‘r’.

Network commands overview

Command  Direction                              Description
j        master->peers (broadcast)              Join offer
j        peer->master                           Join request
J        master->peer (registrationId match!)   Join response
J        (response) peer->master                Join acknowledge
p        master->peer                           Poll
p        (response) peer->master                Poll response (in case no other info needs to be populated)
r        master->peers (broadcast)              Reset

Physical layers

A physical layer puts data on the channel and receives the data that arrives.

NetFefRs485

This is an implementation for sending data on an RS485 bus.

TODO

Specific protocol values

REPLY_MAX_DELAY_MS 1000
REPLY_REPEAT_COUNT 3
JOIN_OFFER_REPEAT_SECS 2×60
NEXT_POLL_MIN_SECS 20
NEXT_POLL_MAX_SECS 5×60
POLL_RETRY_DELAY_SECS 30