Monitoring your Elixir Application on Heroku with Librato

By Jason Stiebs on June 5th 2015

One of the most powerful parts of the BEAMn ecosystem is that many of the libraries come with industrial tooling and ways to connect. Erlang and by proxy Elixir comes with powerful tools for visualizing processess, messages and mailboxes(queues)n.

Another typical way of monitoring an Erlang application is to setup a tool to collect and graph metrics and pipe data into it via exometer or folsom.

In this case I am using Heroku, so I cannot connect directly to a running instance, and I cannot connect multiple instances to each othern. There is an add-on available for Heroku called Librato and they have many libraries available to help get the data into Librato so they can graph/analyze it.

In this case though, Erlang and Elixir does not have a library. Librato has got an excellent solution for us, Custom Log Based Metrics, basically if we use the builtin Elixir Logger correctly we can pipe all the metrics we like into Librato.

Ecto Monitoring

So lets begin with an easy example, Ecto. Ecto has a built in function called Repo.log/1n. This function is overridable and you probably see the default implemenetation when it logs queries it executes.

defmodule MyApp.Repo do
  use Ecto.Repo, otp_app: :my_app

  def log(entry) do
    Logger.info(fn ->
      [log_time("measure#database.query", entry.query_time, true), log_time("measure#database.queue", entry.queue_time, false)]
    end)
    super(entry)
  end

  defp log_time(_label, nil, _force), do: []
  defp log_time(label, time, force) do
    ms = div(time, 100) / 10
    if force or ms > 0 do
      [?\s, label, ?=, :io_lib_format.fwrite_g(ms), ?m, ?s]
    else
      []
    end
  end
end

In this example I am overriding the built in log function to also log some info. In this case I am building an iolist of measure# database.query=10ms measure# database.queue=0.1ms which Librato will pick up and treat as a measurement.

Some important things to note.

  1. Use the function style of logging. Using this style it makes sure the Logger.info call does not block the process. You will find this behaviour in most of the elixir libraries where performance is key, like the original Ecto repo.log
  2. I am using an iolist. Basically whenever you have some data, usually a binary or string that is going to go directly to IO you want to generate an iolist. It is an array of characters, binaries, or anything that can be flattened and joined. It saves allocating extra memory and is good for performance
  3. The log time function is from the Ecto library and mostly formats floats.
  4. I am calling super so I can also still have the dev debug level logging.

How can you use this data?

This will generate several metrics you can graph and display in Librato, such as min, max, mean, stdev, 95th percentile and so on. If, for example, you see a spike in the database.queue you can check many things:

  1. Is my database down or hanging?
  2. Do I have long running queries holding up my connection pooling? (Check database.query)
  3. Do I have enough connections pooled for Ecto?

Hackney Monitoring

Lets look at a larger example, hackney. Hackney is a HTTP client library which offers a multitude of metrics. All you need to do is implement a single module and your data is made available to you. This is particularly useful because many time we rely on third party HTTP API's. And knowing when we are slow vs when the third party is slow, is handy.

Here is the module:

defmodule MyApp.Util.HackneyLibratoMetrics do
  require Logger

  def new(_, _) do
    :ok
  end

  def delete(_name) do
    :ok
  end
  def delete(_name, _value) do
    :ok
  end

  @spec increment_counter(any()) :: :ok | {:error, term()}
  def increment_counter(name) do
    increment_counter(name, 1)
  end

  @spec increment_counter(any(), pos_integer()) ::  :ok | {:error, term()}
  def increment_counter(name, value) do
    log(:count, name, value)
  end

  @spec decrement_counter(any()) ::  :ok | {:error, term()}
  def decrement_counter(name) do
    log(:count, name, -1)
  end

  @spec decrement_counter(any(), pos_integer()) ::  :ok | {:error, term()}
  def decrement_counter(name, value) do
    log(:count, name, -value)
  end

  def update_histogram(name, fun) when is_function(fun, 0) do
    begin = :os.timestamp()
    result = fun.()
    duration = div(:timer.now_diff(:os.timestamp(), begin), 1000)
    log(:measure, name, [:io_lib_format.fwrite_g(duration), ?m, ?s])
    result
  end

  def update_histogram(name, value) when is_number(value) do
    log(:measure, name, value)
  end

  @spec update_gauge(any(), number()) ::  :ok | {:error, term()}
  def update_gauge(name, value) do
    log(:sample, name, value)
  end

  @spec update_meter(any(), number()) ::  :ok | {:error, term()}
  def update_meter(name, value) do
    log(:sample, name, value)
  end

   @spec notify(any(), any(), atom()) ::  :ok | {:error, term()}
   def notify(_name, _value, _op) do
     :ok
   end

   defp name(name) when is_list(name) do
     Enum.map(name, fn (key) when is_atom(key) -> Atom.to_string(key)
                       (key) -> key end)
     |> Enum.intersperse(?.)
   end
   defp name(name) when is_binary(name), do: name

   defp log(op, name, value) when is_integer(value) do
     log(op, name, Integer.to_string(value))
   end

   defp log(op, name, value) when is_float(value) do
     log(op, name, [:io_lib_format.fwrite_g(value), ?m, ?s])
   end
   defp log(op, name, value) do
      key = [[Atom.to_string(op), ?#], name(name)]
      log(key, value)
   end
   defp log(key, value) do
    Logger.info(fn ->
      [key, ?=, value]
    end)
    :ok
   end
end

This module follows the spec defined in the folsom/exometer module in hackney.

In our case, we don't need to explicitly delete or create any metrics. We only need counts, measures and samples, hackney doesn't use the rest. It is pretty straight forward so I am not going to discuss at length. But I hope it is handy do you.

To configure simply add a line to your existing config.exs:

config :hackney,
mod_metrics: MyApp.Util.HackneyLibratoMetrics

Wrapping up

And this is just scratching the surface of what's possible. If you needed to keep an eye on certain processes you could simply have them log metrics. Or have a process that only watches the VM and logs specific information. Sky is the limit kemosabe!


  1. Name for the Erlang Virtual Machine

  2. See :observer.start()

  3. See Erlang Distribution

  4. Please note I am running v0.12.0-rc

RokkinCat

is a software engineering agency. We build applications and teach businesses how to use new technologies like machine learning and virtual reality. We write about software, business, and business software culture.