Parsing UDP in Elixir with Binary Pattern Matching

By Alex Solo on May 9th 2016

In your (clandestine) consulting work for ACME Spy Corporation, you've been tasked with the following:

  • Listen for UDP packets on port 21337
  • Parse said messages according to the specification
  • Log the message contents for later review

The specification of each message is as follows:

Message Header, 30 bytes
Message Body, 45 bytes
  - Priority Code, 1 byte, string
  - Agent Number, 4 bytes, unsigned integer
  - Message, 40 bytes, string

Bytes are in little-endian format.

Let's Get Started

First, acquaint yourself with Elixir. Then when you're ready, make a new project with Mix:

mix new acme_udp_logger
cd acme_udp_logger

Implement the Application and Supervisor patterns in the AcmeUdpLogger module. This lets us supervise the code that handles the UDP packets: if that code crashes, the supervisor restarts it automatically (a great feature of Elixir, by the way).

defmodule AcmeUdpLogger do
  use Application

  def start(_type, _args) do
    import Supervisor.Spec, warn: false

    children = [
      # We will add our children here later
    ]

    # Start the main supervisor, and restart failed children individually
    opts = [strategy: :one_for_one, name: AcmeUdpLogger.Supervisor]
    Supervisor.start_link(children, opts)
  end
end

You will also have to add your application to the mix.exs file, within the application/0 function.

def application do
  [mod: {AcmeUdpLogger, []},
  applications: [:logger]]
end

Listening for UDP Packets

Elixir makes it very easy to start listening for UDP packets. You'll need to use the :gen_udp Erlang module.

First we will need a module to handle this task for us. Let's call it MessageReceiver. This module should implement GenServer, so that it can be supervised by the application and run in its own process.

defmodule AcmeUdpLogger.MessageReceiver do
  use GenServer

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, :ok, opts)
  end

  def init(:ok) do
    {:ok, _socket} = :gen_udp.open(21337, [:binary]) # binary mode; the default delivers charlists
  end

  # Handle UDP data
  def handle_info({:udp, _socket, _ip, _port, data}, state) do
    {:noreply, state}
  end

  # Ignore everything else
  def handle_info(_message, state) do
    {:noreply, state}
  end
end

Don't forget to add this module to the list of supervised children in AcmeUdpLogger (the children list in start/2, line 7 of that module).
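The listing for this step is not reproduced here, so the following is only a sketch of what the children list might look like, not the project's exact code. It assumes the Supervisor.Spec helpers imported at the top of start/2. On newer Elixir versions, where Supervisor.Spec is deprecated, listing the module name itself (AcmeUdpLogger.MessageReceiver) as the child should work instead.

```elixir
# Fragment of AcmeUdpLogger.start/2 (assumed shape, see note above)
children = [
  # worker/2 comes from the Supervisor.Spec import at the top of start/2;
  # the empty list means start_link/1 is called with its default options
  worker(AcmeUdpLogger.MessageReceiver, [])
]
```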

At this point, you can test to see if everything is set up correctly:

  1. Add an IO.puts inspect(data) on line 14 of MessageReceiver (just before {:noreply, state} in the first handle_info/2 clause)
  2. Start your application with mix run --no-halt
  3. In a separate terminal session, use netcat to send UDP packets to localhost, port 21337: nc -u 127.0.0.1 21337. After you run this command, netcat will let you send messages via UDP by simply typing some text and hitting Enter. You should see each message logged in your first terminal session (the one running mix run).
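If netcat is not available, roughly the same check can be done from Elixir itself. The sketch below uses :gen_udp.open/1, :gen_udp.send/4 and :gen_udp.close/1; the UdpProbe module and its send_text/2 function are hypothetical names introduced for illustration and are not part of the project.

```elixir
defmodule UdpProbe do
  # Hypothetical helper, not part of the original project.
  # Sends `text` as a single UDP datagram to localhost on `port`.
  def send_text(port, text) do
    # Port 0 asks the OS for any free source port
    {:ok, socket} = :gen_udp.open(0)
    result = :gen_udp.send(socket, {127, 0, 0, 1}, port, text)
    :ok = :gen_udp.close(socket)
    result
  end
end

# With the application running in another terminal, from an IEx session:
# UdpProbe.send_text(21337, "hello from Elixir")
```

Unlike netcat, this sends exactly the bytes given, with no trailing newline.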

Parsing UDP with Binary Pattern Matching

Okay, here's the good stuff. In the first handle_info/2 function of MessageReceiver, we will parse the UDP binary data using pattern matching, and log it to the console using Logger:

defmodule AcmeUdpLogger.MessageReceiver do
  use GenServer
  require Logger

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, :ok, opts)
  end

  def init(:ok) do
    {:ok, _socket} = :gen_udp.open(21337, [:binary]) # binary mode; the default delivers charlists
  end

  # Handle UDP data
  def handle_info({:udp, _socket, _ip, _port, data}, state) do
    message = parse_packet(data)
    Logger.info "Received a secret message! " <> inspect(message)

    {:noreply, state}
  end

  # Ignore everything else
  def handle_info(_message, state) do
    {:noreply, state}
  end

  def parse_packet(data) do
    <<
      _header        :: size(240), # 30 bytes * 8 = 240 bits
      priority_code  :: bitstring-size(8),
      agent_number   :: little-unsigned-integer-size(32),
      message        :: bitstring-size(320)
    >> = data

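    # Return the parsed fields as a map
    %{
      priority_code: priority_code,
      agent_number: agent_number,
      # String.rstrip/1 is deprecated; trim_trailing/1 also strips trailing whitespace
      message: String.trim_trailing(message)
    }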
  end
end

Here is what a test for the above would look like: messagereceivertest.exs.

In the binary pattern matching block (<< >>), the pattern on the left of = declares the variables, and they are bound from the bytes of data on the right. Within each segment, the name before :: is the variable, and the modifiers after it give the type, the size (in bits here), and the endianness. The order of the segments corresponds to the order of the bytes in the message, so you can see how nicely this lines up with the specification at the beginning of this post. For more on bitstring/binary pattern matching syntax, check the Elixir documentation for <<>>.
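To see the layout and the little-endian agent number in action, it can help to build a sample packet and parse it back. The sketch below is illustrative only: PacketSketch, build/3 and parse/1 are hypothetical names, the header is filled with zero bytes, and padding the message with spaces is an assumption, since the specification does not say how short messages are padded.

```elixir
defmodule PacketSketch do
  # Hypothetical helper module for illustration; not part of the original project.

  # Builds a 75-byte packet: 30 zero bytes of header, then the 45-byte body.
  # Assumes `priority` is a 1-byte string and `text` is ASCII, at most 40 bytes.
  # Space padding is an assumption; the spec does not define the padding.
  def build(priority, agent_number, text) do
    header = :binary.copy(<<0>>, 30)
    padded = String.pad_trailing(text, 40)

    <<
      header::binary-size(30),
      priority::binary-size(1),
      agent_number::little-unsigned-integer-size(32),
      padded::binary-size(40)
    >>
  end

  # Same field layout as parse_packet/1, with string sizes written in bytes.
  def parse(<<
        _header::binary-size(30),
        priority::binary-size(1),
        agent_number::little-unsigned-integer-size(32),
        message::binary-size(40)
      >>) do
    %{
      priority_code: priority,
      agent_number: agent_number,
      message: String.trim_trailing(message)
    }
  end
end

packet = PacketSketch.build("A", 7, "The eagle has landed")
# Agent number 7 is stored least-significant byte first, i.e. as <<7, 0, 0, 0>>
IO.inspect(binary_part(packet, 31, 4))
IO.inspect(PacketSketch.parse(packet))
```

Writing the string fields as binary-size(n) counts bytes rather than bits, which some readers may find easier to compare against the specification.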

Getting Ready for Prime Time

Before deploying code like this into production (with potentially massive amounts of incoming packets), I would recommend separating the receiving and parsing code. You will likely want to have a pool of parsers with a library like poolboy. The receiver would hand off a packet to an available parser, which would process it asynchronously. This would prevent backups from occurring if the parsing code takes too long (or crashes). You should also check the length of each packet to see if it's the length your code expects (otherwise pattern matching will fail), and handle invalid/irrelevant packets appropriately.
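One way to handle the length check is to let pattern matching do it in a function head, with a fallback clause for everything else. This is a minimal sketch under that assumption, not the project's code: SafeParser, parse/1 and the error tuples are hypothetical names chosen for illustration.

```elixir
defmodule SafeParser do
  # Hypothetical module; returns tagged tuples instead of raising a MatchError.

  # This clause only matches binaries of exactly 30 + 1 + 4 + 40 = 75 bytes.
  def parse(<<
        _header::binary-size(30),
        priority_code::binary-size(1),
        agent_number::little-unsigned-integer-size(32),
        message::binary-size(40)
      >>) do
    {:ok,
     %{
       priority_code: priority_code,
       agent_number: agent_number,
       message: String.trim_trailing(message)
     }}
  end

  # Any other binary has the wrong length for this protocol.
  def parse(other) when is_binary(other) do
    {:error, {:unexpected_length, byte_size(other)}}
  end

  # Anything that is not a binary at all (for example a charlist).
  def parse(_other), do: {:error, :not_a_binary}
end
```

The receiver could then use a case expression on the result, logging {:ok, message} results and dropping or counting {:error, reason} ones, so a malformed packet never crashes the process.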

For expert help, drop us a line.

Wrapping Up

I hope this primer has been helpful to you in your UDP parsing endeavors. I think that Elixir is a great language for this sort of work, with its tasteful syntax and highly functional process architecture.

If you have any questions, feel free to DM me on Twitter @civilframe

Code for this guide: https://github.com/civilframe/acme-udp-logger

RokkinCat is a software engineering agency. We build applications and teach businesses how to use new technologies like machine learning and virtual reality. We write about software, business, and business software culture.