The Bird Modem:

This document describes the Bird modem, a layer 1/2 protocol for the unreliable transmission of data cells over a low-bandwidth audio channel such as an FM voice radio. It is intended to serve as a replacement for the Bell 202 AFSK commonly used in packet radio and APRS, primarily in ham radio applications. The Bird modem offers improved bit rate (approximately doubled) and error detection, though at the expense of inferior noise immunity.
 - Raw bit rate 2450bps, usable approximately 270B/s after overheads. Bell 202, for comparison, has a throughput of 120B/s when used with HDLC framing.
 - Cell-based transport with integrity checking.
 - 3kHz bandwidth - suitable for use with unmodified voice radios.
 - Zero-length preamble: Minimum transmission duration limited only by transmitter stabilisation time.

This is not the fastest data mode for the VHF/UHF channels, but it may be the fastest which can be transmitted using an unmodified voice-only radio. The popular G3RUH modem, for example, can achieve 9600bps, but only on a radio which allows direct access to the FM modulator and demodulator, a feature found only on more expensive radios. Please understand that a voice radio repurposed for data will never be able to offer performance anywhere close to that of a radio purpose-designed for data communication in terms of spectral efficiency, bitrate or noise immunity.

This specification document describes the physical and data-link layers, along with the C++ reference implementation. This can be used alone to transport data in the form of a serial byte stream, or as a foundation on which a higher layer can implement packet-based protocols.


1. Overview.
2. Modulation.
3. Timing recovery.
4. Demodulation.
5. Higher layers.


---
Overview.

In this document I describe a means for modulating data onto an audio signal for transmission over unmodified FM radios designed for voice communication. These radios are band-filtered - the blocking of low frequencies prevents the use of simple baseband data, and requires the use of a subcarrier. The most common solution to this is an FSK signal, typically based upon the Bell 202 modem standard - this works very well at 1200bps, but is unsuited to higher speeds within the tight bandwidth constraint. The obvious alternative, PSK, can achieve higher data rates but requires the use of elaborate carrier recovery and symbol classification techniques.

The alternative proposed here is a form of phase-coherent, filtered and band-limited BPSK with a demodulation solution that is highly resistant to the specific distortion introduced by the band-pass filters used in voice radios. Testing shows it easily achieves a raw bit rate of 2450bps. Some of this is required as overhead for framing and alignment. The usable data rate can be approximately 2400bps, but is never perfectly defined due to the use of bit stuffing.

The 2450bps rate was chosen for ease of implementation: When using a standard PC sound interface to generate or decode the subcarrier signal at the usual 44100Hz sample rate, a 2450bps line speed makes every symbol precisely 18 samples long.

One problem with this modulation is that, once it has been through the band-pass filter of a radio, carrier recovery for synchronisation is nearly impossible. There is a very simple solution to this described below.

---

Modulation:
During development of this idea attempts were made using BPSK with bandwidth filtering. These were semi-successful, but led to the accidental discovery of an improved means of demodulation.

Before modulation, the data is divided into cells, a checksum included in each, and differential coding applied. The reasons for these will become apparent when considering timing recovery and demodulation.

The generated signal is in the form of a binary phase-shift keyed square or sine wave - I am uncertain if one offers any advantage over the other - in which the carrier frequency is equal to the symbol rate (i.e. one bit per cycle). This is then filtered with a 3kHz low-pass filter. A BPSK modulated square wave where carrier frequency matches symbol rate is identical to Manchester encoding, so the signal may be regarded as filtered BPSK or filtered Manchester encoding interchangeably.
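As a rough illustration of the unfiltered square-wave form, the sketch below emits one carrier cycle per symbol at 18 samples per symbol. The function name, the float sample format and the choice of which phase represents which symbol are my assumptions for this example; they are not taken from the reference implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// 44100 Hz sample rate / 2450 symbols per second = 18 samples per symbol.
constexpr int kSamplesPerSymbol = 44100 / 2450;
static_assert(44100 % 2450 == 0, "symbol must be a whole number of samples");

// Hypothetical helper: turn already differentially coded symbols (0 or 1)
// into an unfiltered BPSK square wave with one carrier cycle per symbol.
// ASSUMPTION: symbol 1 starts high, symbol 0 starts low. The polarity is
// arbitrary because differential coding removes the ambiguity.
std::vector<float> modulate_square(const std::vector<uint8_t>& symbols) {
    std::vector<float> out;
    out.reserve(symbols.size() * kSamplesPerSymbol);
    for (uint8_t s : symbols) {
        assert(s <= 1);
        const float phase = s ? 1.0f : -1.0f;
        for (int i = 0; i < kSamplesPerSymbol; ++i) {
            // First half-cycle at +phase, second half-cycle at -phase,
            // which is the same waveform as Manchester encoding.
            out.push_back(i < kSamplesPerSymbol / 2 ? phase : -phase);
        }
    }
    return out;
}
```

The output of such a function still contains strong harmonics, so it would need to pass through the low-pass filter before reaching the radio.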

It is essential that this low-pass filter has a minimum of phase distortion - even a small amount of phase distortion can render the signal impossible to demodulate. For this reason it is very difficult to carry out in analog, and the radio's internal filters cannot be depended upon. Fortunately, while this form of zero-phase-shift filter is very difficult in analog, it is one of the most trivial filters to implement in DSP: A simple convolution with the sinc function. Once the signal is suitably filtered of all high-frequency components, it passes through typical radio filtering with ease. Testing utilised a Baofeng UV-5R and a Yaesu FT-897: Data passed without issue. Testing over an audio link, speaker to microphone, also proved possible with only a slightly higher error rate.
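A minimal sketch of such a filter follows, assuming a plain truncated sinc kernel. The function names and the half_width parameter are hypothetical, and the reference convolver class may choose its kernel length or windowing differently. Because the kernel is symmetric and the convolution is centred, each output sample stays aligned with its input sample.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Build a truncated sinc low-pass kernel with 2*half_width+1 taps.
// half_width is a hypothetical tuning parameter for this sketch.
std::vector<double> make_sinc_kernel(double cutoff_hz, double sample_rate_hz,
                                     int half_width) {
    assert(half_width > 0 && cutoff_hz > 0.0 && cutoff_hz < sample_rate_hz / 2);
    const double pi = std::acos(-1.0);
    const double fc = cutoff_hz / sample_rate_hz;  // cycles per sample
    std::vector<double> k(2 * half_width + 1);
    double sum = 0.0;
    for (int n = -half_width; n <= half_width; ++n) {
        const double v =
            (n == 0) ? 2.0 * fc : std::sin(2.0 * pi * fc * n) / (pi * n);
        k[n + half_width] = v;
        sum += v;
    }
    for (double& v : k) v /= sum;  // normalise to unity gain at DC
    return k;
}

// Centred convolution: output sample i lines up with input sample i.
// Samples beyond either end of the input are treated as zero.
std::vector<double> convolve_centred(const std::vector<double>& x,
                                     const std::vector<double>& k) {
    assert(k.size() % 2 == 1);
    const std::ptrdiff_t half = static_cast<std::ptrdiff_t>(k.size() / 2);
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(x.size());
    std::vector<double> y(x.size(), 0.0);
    for (std::ptrdiff_t i = 0; i < n; ++i) {
        double acc = 0.0;
        for (std::ptrdiff_t j = -half; j <= half; ++j) {
            const std::ptrdiff_t idx = i - j;
            if (idx >= 0 && idx < n) acc += x[idx] * k[j + half];
        }
        y[i] = acc;
    }
    return y;
}
```

For the Bird modem the kernel would be built with a 3000Hz cutoff at a 44100Hz sample rate, for example make_sinc_kernel(3000.0, 44100.0, 64).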

One small complication during modulation is the special case required at start of transmission: As it takes some time - up to a second - for radios to activate VOX and stabilise operation, a modulation program must be able to ensure any new transmission is prefixed with a sufficient preamble of padding bytes (0x7E repetitions) to allow for start-up time. This preamble plays no role in timing recovery, it is simply to allow transmitter and receiver to stabilise operation.

---

Timing recovery.
One practical problem with this approach is that, once the signal has passed through the band-pass filter of the radio channel, the lower frequency components essential for carrier recovery are lost. The data is still there, but cannot be recovered without a timing reference. As it is intended to be generated and decoded using PC audio hardware, it must be assumed that the clocks may have some slight variation in speed - PCs are not equipped with a TCXO. The typical accuracy for a low-cost, commodity quartz crystal of the type used within budget consumer audio hardware is 50ppm calibration accuracy, with another 50ppm stability. It must be assumed the clock at transmitter and receiver may differ by as much as 100ppm.

There is a very simple solution to this: Brute force. With a 44100Hz sample rate, standard in digital audio, the symbol duration is eighteen samples. It is trivial to operate eighteen demodulators in parallel, each offset by one sample from the last. One of these will always be in near-perfect synchronisation. Thus the need for carrier recovery is replaced with the need to determine which of these eighteen demodulators is providing a valid output - a much simpler challenge which can be addressed by incorporating a checksum into the transmitted data. CRC16 is not sufficiently robust, so CRC32 is used. The CRC should be placed within the last four bytes of a frame, though may be omitted if a higher level protocol provides its own error detection or correction capability.
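The checksum test that selects a valid demodulator output could look something like the sketch below. It uses the common CRC-32 (reflected polynomial 0xEDB88320), which I am assuming is what the reference implementation's HDLC_CSUM_CRC32B refers to. The byte order of the CRC within the last four bytes is not stated in this document, so little-endian is an assumption here, and all function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bitwise CRC-32: reflected polynomial 0xEDB88320, initial value and
// final XOR both 0xFFFFFFFF.
uint32_t crc32_ieee(const uint8_t* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < len; ++i) {
        crc ^= data[i];
        for (int b = 0; b < 8; ++b)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

// ASSUMPTION: the CRC occupies the last four bytes, least significant first.
std::vector<uint8_t> append_crc(std::vector<uint8_t> payload) {
    const uint32_t c = crc32_ieee(payload.data(), payload.size());
    for (int i = 0; i < 4; ++i)
        payload.push_back(static_cast<uint8_t>(c >> (8 * i)));
    return payload;
}

// A candidate cell from any of the eighteen demodulators is accepted only
// if the trailing CRC matches the bytes that precede it.
bool cell_is_valid(const std::vector<uint8_t>& cell) {
    if (cell.size() < 4) return false;
    const std::size_t n = cell.size() - 4;
    uint32_t stored = 0;
    for (int i = 0; i < 4; ++i)
        stored |= static_cast<uint32_t>(cell[n + i]) << (8 * i);
    return crc32_ieee(cell.data(), n) == stored;
}
```

A demodulator running at the wrong sample offset produces candidate cells that almost always fail this test, so only the well-aligned outputs get through.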

Due to the expected clock difference of up to 100ppm, it must also be assumed that phase drift will occur, causing the optimal choice of demodulator to shift over time. To address this the data is transmitted in short cells, each incorporating its own checksum. In the worst case, up to 1/8 of a symbol of drift - enough to impair demodulation significantly - may occur after only 157 bytes. This is the reason for transmitting data in cells rather than frames: To ensure resilience against clock drift, large frames must be broken into small cells for transmission. The largest supported cell size is 255 bytes, but to ensure reliability cells larger than 160 bytes should be avoided.
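The drift figure can be checked with a short calculation. The sketch below (the function name is hypothetical) assumes one bit per symbol and a constant clock error: at 100ppm, each symbol slips by 0.0001 of a symbol period, so 1/8 of a symbol is reached after 1250 symbols, or roughly 156 bytes, in line with the figure quoted above.

```cpp
#include <cassert>

// Bytes that can be sent before accumulated clock drift reaches
// `symbol_fraction` of one symbol period, at one bit per symbol.
double bytes_until_drift(double symbol_fraction, double clock_error_ppm) {
    assert(symbol_fraction > 0.0 && clock_error_ppm > 0.0);
    // Fraction of a symbol period gained or lost on every symbol.
    const double drift_per_symbol = clock_error_ppm * 1e-6;
    const double symbols = symbol_fraction / drift_per_symbol;
    return symbols / 8.0;  // eight symbols (bits) per byte
}
```

Calling bytes_until_drift(0.125, 100.0) gives a value of about 156.25.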

The use of cell transmission is key to the design of this modem. As well as solving the clock recovery issue it also provides an erasure channel to facilitate the use of error correction mechanisms at a higher level. As it means no byte can exit the receiver that did not first enter the transmitter - noise manifests only as lost frames - it can aid in the operation of a transported protocol using flag bytes for synchronisation.

---
Demodulation.

Extracting data from the received signal is remarkably simple.

First, the signal (sampled at 44,100Hz) is filtered using a sinc convolution identical to that used in generation. This removes all components above 3kHz. This filtered signal is then passed into each of the eighteen demodulators, each offset by one sample from the last.

The individual demodulator can be implemented in a single line of C code. Once the filter is run, there is no need for advanced techniques such as least-squares estimation or correlation calculations to identify the transmitted symbol. It can be done purely by looking at the instantaneous gradient mid-symbol and classifying it as either positive or negative.
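The gradient test might be sketched as follows. The exact samples compared (one either side of the symbol midpoint) and the function names are my assumptions for illustration; the reference implementation may measure the gradient differently. Which slope maps to which symbol value does not matter, since differential coding resolves it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kSamplesPerSymbol = 18;  // 44100 Hz / 2450 bps

// Classify each symbol by the sign of the slope across its midpoint.
// `offset` (0..17) selects which of the eighteen sample alignments to use.
std::vector<uint8_t> demodulate(const std::vector<float>& filtered,
                                std::size_t offset) {
    assert(offset < kSamplesPerSymbol);
    const std::size_t mid = kSamplesPerSymbol / 2;
    std::vector<uint8_t> symbols;
    for (std::size_t start = offset;
         start + kSamplesPerSymbol <= filtered.size();
         start += kSamplesPerSymbol) {
        // The whole decision: rising through the midpoint gives 1, else 0.
        symbols.push_back(
            filtered[start + mid + 1] > filtered[start + mid - 1] ? 1 : 0);
    }
    return symbols;
}

// Run all eighteen alignments; a checksum later decides which is usable.
std::vector<std::vector<uint8_t>> demodulate_all_offsets(
        const std::vector<float>& filtered) {
    std::vector<std::vector<uint8_t>> bank;
    for (std::size_t off = 0; off < kSamplesPerSymbol; ++off)
        bank.push_back(demodulate(filtered, off));
    return bank;
}
```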

This makes it possible to distinguish the two symbols, but not to identify which is intended as a one and which a zero. This is easily overcome by using a differential encoding.

The need for framing and differential encoding is familiar to anyone who has implemented the lowest level components of packet radio, and may be addressed in precisely the same manner: HDLC-style framing with bit stuffing and differential coding where a repeated symbol indicates a '1' and a change of symbol a '0'.
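A small sketch of the two steps, using hypothetical helper names: bit stuffing inserts a 0 after every run of five 1 bits, and the differential coder repeats the previous symbol for a 1 and toggles it for a 0. The starting symbol value is an arbitrary assumption; decoding depends only on whether consecutive symbols match, so inverting every symbol changes nothing beyond the first decoded bit.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Bits = std::vector<uint8_t>;  // one element per bit, each 0 or 1

// HDLC-style bit stuffing: insert a 0 after every run of five 1 bits.
Bits stuff(const Bits& in) {
    Bits out;
    int ones = 0;
    for (uint8_t b : in) {
        assert(b <= 1);
        out.push_back(b);
        ones = b ? ones + 1 : 0;
        if (ones == 5) { out.push_back(0); ones = 0; }
    }
    return out;
}

// Differential encode: a 1 repeats the previous symbol, a 0 toggles it.
Bits diff_encode(const Bits& bits, uint8_t initial = 0) {
    Bits out;
    uint8_t cur = initial;
    for (uint8_t b : bits) {
        if (!b) cur ^= 1;
        out.push_back(cur);
    }
    return out;
}

// Differential decode: compare each symbol with the one before it.
Bits diff_decode(const Bits& symbols, uint8_t initial = 0) {
    Bits out;
    uint8_t prev = initial;
    for (uint8_t s : symbols) {
        out.push_back(s == prev ? 1 : 0);
        prev = s;
    }
    return out;
}
```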

Each demodulator detects the presence of the 0x7E bit sequence used for frame synchronisation. A valid frame consists of the bits between two flags, provided these bits are a multiple of eight in length, fall within the length range allowed by the carried protocol, and have a valid checksum.
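The flag search and the length tests might be sketched as below. This is only an illustration: the class name is hypothetical, bits are assumed to be packed least significant bit first as in conventional HDLC, and the checksum test is left to the caller. The reference hdlc_receiver may be organised quite differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-demodulator framer fed with differentially decoded bits.
class FlagFramer {
public:
    FlagFramer(std::size_t min_bytes, std::size_t max_bytes)
        : min_(min_bytes), max_(max_bytes) {
        assert(min_bytes <= max_bytes);
    }

    // Returns true and fills `frame` when this bit completes a closing flag
    // around a candidate of acceptable length. Checksum is checked elsewhere.
    bool push_bit(uint8_t bit, std::vector<uint8_t>& frame) {
        assert(bit <= 1);
        window_ = static_cast<uint8_t>((window_ << 1) | bit);
        raw_.push_back(bit);
        if (window_ != 0x7E) {
            // Discard runaway noise that cannot be a legal frame.
            if (raw_.size() > max_ * 10 + 16) raw_.clear();
            return false;
        }
        const bool ok =
            raw_.size() > 8 && unstuff_and_pack(raw_.size() - 8, frame);
        raw_.clear();  // this flag also opens the next frame
        return ok;
    }

private:
    bool unstuff_and_pack(std::size_t nbits, std::vector<uint8_t>& frame) const {
        std::vector<uint8_t> bits;
        int ones = 0;
        for (std::size_t i = 0; i < nbits; ++i) {
            if (ones == 5) { ones = 0; continue; }  // drop the stuffed bit
            bits.push_back(raw_[i]);
            ones = raw_[i] ? ones + 1 : 0;
        }
        if (bits.size() % 8 != 0) return false;     // not whole bytes
        const std::size_t nbytes = bits.size() / 8;
        if (nbytes < min_ || nbytes > max_) return false;
        frame.assign(nbytes, 0);
        for (std::size_t i = 0; i < bits.size(); ++i)  // ASSUMPTION: LSB first
            frame[i / 8] |= static_cast<uint8_t>(bits[i] << (i % 8));
        return true;
    }

    std::size_t min_, max_;
    uint8_t window_ = 0;         // last eight bits seen
    std::vector<uint8_t> raw_;   // bits since the previous flag
};
```

In the eighteen-demodulator arrangement, each demodulator would own one such framer, and any candidate it returns would then face the checksum test.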


---
Higher layers.

The Bird modem specification here specifies only how to move cells from a transmitter to a receiver. It does not cover the higher level protocols required for packet fragmentation and reassembly or forward error correction (if any), as the choice of design in these areas is highly application specific. In the simplest use, cell boundaries may be ignored entirely, in which case the modem emulates the behaviour of any byte-orientated channel.

The Bird modem can be used to carry a user-interactive protocol, such as for a BBS. In this situation just make sure to use line buffering. This brings higher overheads (five bytes per frame, plus FEC), but is necessary for real-time communication.



=====
Using the reference implementation.

The reference birdmodem implementation consists of two C++ include files and an example application. Please excuse the poor code: I usually write in C, seldom C++.

HDLC.cpp
  hdlc_composer
    Generates an HDLC-framed bitstream, including checksum generation. Does not support FEC, but this may be added by setting a callback.
  hdlc_receiver
    Identifies frames within an HDLC bitstream, including checksum verification. Does not support FEC, but this may be added by setting a callback.
birdmodem.cpp
  convolver
    3kHz low-pass filter DSP. Used internally.
  birdmodem_RAW
    Primitive bit-orientated modem support. Used internally.
  hdlc_receiver_modem
    Used internally - subclass of hdlc_receiver. A birdmodem_RX creates eighteen of these.
  birdmodem_RX
    Main receiver class.
  birdmodem_TX
    Main transmitter class.


Of these classes, most applications will need to create only birdmodem_TX and birdmodem_RX. All others exist only for internal use by these classes. Those in HDLC.cpp are completely separate from the DSP and birdmodem-specific code, and may be used to create and decode standard AX.25 Bell-tone modulation together with suitable modulation and demodulation code.

  birdmodem_RX:
    bool stdio_mode
      Used for testing.
    bool finished
      Set true to exit cleanly.
    birdmodem_RX(uint32_t max_size)
      Constructor. Initialise with the maximum cell size you expect to support, including overheads of checksum and FEC data. Frame sizes in excess of 160 bytes are inadvisable due to possible clock synchronisation problems.
    void set_callback(void (* newGotFrameCallback)(hdlc_receiver*))
      Sets the callback function. This function will be called upon detection of a valid cell. This is required if the object is to do anything useful.
    void set_post_receive_function(void (* post_receive_function)(hdlc_receiver*))
      A function which is called when a cell has been detected that may or may not be valid, and which may modify the potential cell. This is intended as a hook in which forward error correction can be added.
    void set_checksum(int checksum)
      Set the type of checksum to use. Currently the only supported checksum is the default, HDLC_CSUM_CRC32B.
    void receive_thread()
      Call when ready to receive - whichever thread calls this will become the receiver thread, and will not return unless the 'finished' variable is set true.

  birdmodem_TX:
    birdmodem_TX(uint32_t max_size)
      Constructor. Specify the largest cell size you expect to support, including checksum and FEC overhead. Frame sizes in excess of 160 bytes are inadvisable due to possible clock synchronisation problems.
    uint8_t *frame
      Contents of the frame to send.
    uint32_t frame_length
      Length of the frame to send.
    void send_frame()
      Sends the contents of the *frame buffer, length frame_length bytes.
    uint8_t checksum
      Set the type of checksum to use. Currently the only supported checksum is the default, HDLC_CSUM_CRC32B.
    void (* pre_send_function)(hdlc_composer*)
      Sets a callback for a function which will be called immediately prior to sending. This is intended as a hook in which forward error correction can be added.