Monitoring your Elixir Application on Heroku with Librato
June 5th 2015
One of the most powerful parts of the BEAMn ecosystem is that many of the libraries come with industrial tooling and ways to connect. Erlang and by proxy Elixir comes with powerful tools for visualizing processess, messages and mailboxes(queues)n.
Another typical way of monitoring an Erlang application is to setup a tool to collect and graph metrics and pipe data into it via exometer or folsom.
In this case I am using Heroku, so I cannot connect directly to a running instance, and I cannot connect multiple instances to each othern. There is an add-on available for Heroku called Librato and they have many libraries available to help get the data into Librato so they can graph/analyze it.
In this case though, Erlang and Elixir does not have a library. Librato has got an excellent solution for us, [Custom Log Based Metrics](https://devcenter.heroku.com/articles/librato# custom-log-based-metrics), basically if we use the builtin Elixir Logger correctly we can pipe all the metrics we like into Librato.
Ecto Monitoring
So lets begin with an easy example, Ecto. Ecto has a built in function called Repo.log/1n. This function is overridable and you probably see the default implemenetation when it logs queries it executes.
defmodule MyApp.Repo do
use Ecto.Repo, otp_app: :my_app
def log(entry) do
Logger.info(fn ->
[log_time("measure#database.query", entry.query_time, true), log_time("measure#database.queue", entry.queue_time, false)]
end)
super(entry)
end
defp log_time(_label, nil, _force), do: []
defp log_time(label, time, force) do
ms = div(time, 100) / 10
if force or ms > 0 do
[?s, label, ?=, :io_lib_format.fwrite_g(ms), ?m, ?s]
else
[]
end
end
end
In this example I am overriding the built in log function to also log some info. In this case I am building an iolist of measure# database.query=10ms measure# database.queue=0.1ms
which Librato will pick up and treat as a measurement.
Some important things to note.
- Use the function style of logging. Using this style it makes sure the Logger.info call does not block the process. You will find this behaviour in most of the elixir libraries where performance is key, like the original [Ecto repo.log](https://github.com/benoitc/hackney# metrics)
- I am using an
iolist
. Basically whenever you have some data, usually a binary or string that is going to go directly to IO you want to generate aniolist
. It is an array of characters, binaries, or anything that can be flattened and joined. It saves allocating extra memory and is good for performance - The log time function is from the Ecto library and mostly formats floats.
- I am calling super so I can also still have the dev debug level logging.
How can you use this data?
This will generate several metrics you can graph and display in Librato, such as min, max, mean, stdev, 95th percentile and so on. If, for example, you see a spike in the database.queue
you can check many things:
- Is my database down or hanging?
- Do I have long running queries holding up my connection pooling? (Check
database.query
) - Do I have enough connections pooled for Ecto?
Hackney Monitoring
Lets look at a larger example, hackney. Hackney is a HTTP client library which offers a multitude of [metrics](https://github.com/benoitc/hackney# metrics). All you need to do is implement a single module and your data is made available to you. This is particularly useful because many time we rely on third party HTTP API’s. And knowing when we are slow vs when the third party is slow, is handy.
Here is the module:
defmodule MyApp.Util.HackneyLibratoMetrics do
require Logger
def new(_, _) do
:ok
end
def delete(_name) do
:ok
end
def delete(_name, _value) do
:ok
end
@spec increment_counter(any()) :: :ok | {:error, term()}
def increment_counter(name) do
increment_counter(name, 1)
end
@spec increment_counter(any(), pos_integer()) :: :ok | {:error, term()}
def increment_counter(name, value) do
log(:count, name, value)
end
@spec decrement_counter(any()) :: :ok | {:error, term()}
def decrement_counter(name) do
log(:count, name, -1)
end
@spec decrement_counter(any(), pos_integer()) :: :ok | {:error, term()}
def decrement_counter(name, value) do
log(:count, name, -value)
end
def update_histogram(name, fun) when is_function(fun, 0) do
begin = :os.timestamp()
result = fun.()
duration = div(:timer.now_diff(:os.timestamp(), begin), 1000)
log(:measure, name, [:io_lib_format.fwrite_g(duration), ?m, ?s])
result
end
def update_histogram(name, value) when is_number(value) do
log(:measure, name, value)
end
@spec update_gauge(any(), number()) :: :ok | {:error, term()}
def update_gauge(name, value) do
log(:sample, name, value)
end
@spec update_meter(any(), number()) :: :ok | {:error, term()}
def update_meter(name, value) do
log(:sample, name, value)
end
@spec notify(any(), any(), atom()) :: :ok | {:error, term()}
def notify(_name, _value, _op) do
:ok
end
defp name(name) when is_list(name) do
Enum.map(name, fn (key) when is_atom(key) -> Atom.to_string(key)
(key) -> key end)
|> Enum.intersperse(?.)
end
defp name(name) when is_binary(name), do: name
defp log(op, name, value) when is_integer(value) do
log(op, name, Integer.to_string(value))
end
defp log(op, name, value) when is_float(value) do
log(op, name, [:io_lib_format.fwrite_g(value), ?m, ?s])
end
defp log(op, name, value) do
key = [[Atom.to_string(op), ?#], name(name)]
log(key, value)
end
defp log(key, value) do
Logger.info(fn ->
[key, ?=, value]
end)
:ok
end
end
This module follows the spec defined in the folsom/exometer module in hackney.
In our case, we don’t need to explicitly delete or create any metrics. We only need counts, measures and samples, hackney doesn’t use the rest. It is pretty straight forward so I am not going to discuss at length. But I hope it is handy do you.
To configure simply add a line to your existing config.exs:
config :hackney,
mod_metrics: MyApp.Util.HackneyLibratoMetrics
Wrapping up
And this is just scratching the surface of what’s possible. If you needed to keep an eye on certain processes you could simply have them log metrics. Or have a process that only watches the VM and logs specific information. Sky is the limit kemosabe!
- Name for the Erlang Virtual Machine↩