Privacy and performance update

It’s been a year since we introduced metrics in mave.io, and we’ve been handling millions of views since then. We open-sourced it to stay accountable to our promise of protecting the privacy of your customers. It runs on the Phoenix Framework, built on Elixir, and PostgreSQL with TimescaleDB: a great cocktail for building a simple package that lets you measure views and aggregate video usage data. We find Elixir and Phoenix really well suited to handling a lot of concurrent connections, with multiplexed websockets that carry data from multiple videos on your site, and PostgreSQL with TimescaleDB is great for handling that time series data. But it was time for an update.

The problem

Not all data in our metrics product was managed by TimescaleDB. A lot of it was not time series data, and we kept that in plain PostgreSQL. This was not a problem at the beginning, but as we grew, we started to run into performance issues. We were also storing a lot of data that was never used, and paying a performance penalty for it. We needed to structure and compress the data and shrink the database to make queries faster.

Besides the growing data, we also hadn’t found an efficient way to query which seconds of a video people watched. Showing engagement is what differentiates video metrics from other metrics tools. We needed to improve the way we store and query this data.

Additionally, we noticed that some customers wanted to disable metrics completely, as they care even more about the privacy of their customers and really don’t want to track anything at all. Our efforts have always been about ensuring the privacy of your customers, so we also want to give you the ability to take it a step further. We needed to provide a way to disable metrics completely.

The solution

It’s a bit technical, and it’s a lot, so here’s a bullet list that makes it easier to digest what we’ve done so far:

  • For each video you watch (or start metrics on) we create a Phoenix Channel, which is represented as a session, combined with some normalized data about the browser and device (not enough to build a fingerprint from). We turned this into a hypertable in TimescaleDB so it can be compressed over time.
  • For each session we capture the play and pause events. Disconnecting (meaning: closing the tab) also triggers a pause event on the server, which is why we use websockets in the first place.
  • A GenServer in Elixir handles all events in batches, analyzing them and making sure the data is valid before adding it to the database. Elixir can handle a lot of events concurrently, while the database can’t.
  • In that process we create what we call Durations: a representation of how long someone watched a video, including the range of seconds they watched. Which is really cool! PostgreSQL provides a type that makes this easy to store: int4range.
  • Based on that range we can also determine whether the session already has a duration that includes that range (or part of it), and so determine the uniqueness of the duration.
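To make the Durations idea more concrete, here is a minimal JavaScript sketch of the logic described in the list above. It is not the code from maveio/metrics (which does this in Elixir and PostgreSQL); the event shape and the function names `toDurations`, `uncoveredParts` and `uniqueSeconds` are assumptions made up for illustration. Ranges are half-open, `[from, to)`, which is also how PostgreSQL canonicalizes int4range values.

```javascript
// Sketch only: event shape and function names are hypothetical,
// not the maveio/metrics schema or API.

// Pair play/pause events of one session into half-open ranges [from, to).
// A disconnect is assumed to arrive as a regular "pause" event.
function toDurations(events) {
  const durations = [];
  let startedAt = null;
  for (const event of events) {
    if (event.type === "play" && startedAt === null) {
      startedAt = event.second;
    } else if (event.type === "pause" && startedAt !== null) {
      // Ignore empty ranges (paused at the same second playback started).
      if (event.second > startedAt) durations.push([startedAt, event.second]);
      startedAt = null;
    }
  }
  return durations;
}

// Return the parts of `range` that are not covered by any range in `existing`.
function uncoveredParts(range, existing) {
  let parts = [range];
  for (const [lo, hi] of existing) {
    parts = parts.flatMap(([from, to]) => {
      if (hi <= from || lo >= to) return [[from, to]]; // no overlap
      const rest = [];
      if (lo > from) rest.push([from, lo]); // piece before the overlap
      if (hi < to) rest.push([hi, to]); // piece after the overlap
      return rest;
    });
  }
  return parts;
}

// Count the seconds of a video that were watched at least once in a session.
function uniqueSeconds(durations) {
  const seen = [];
  let total = 0;
  for (const duration of durations) {
    for (const [from, to] of uncoveredParts(duration, seen)) total += to - from;
    seen.push(duration);
  }
  return total;
}
```

With this sketch, a viewer who watches seconds 0 to 10, seeks back and then watches 5 to 15 produces two durations, of which only the part from 10 to 15 of the second one is new.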

In mave.io we show data aggregated per space, but also per video (obviously). It’s a simple representation of all those events that gives you a better understanding of how your video is being watched. Besides this information we also show the browsers, devices and URLs where you embedded the video.

Figure: how we integrate metrics in the video view in mave.io

But now the question is: can you run your own metrics server, build your own aggregation and use it in combination with mave.io? Well, yes! We now provide a way to configure metrics and other generic things in our components library:

<script crossorigin type="module">
  import {
    Player,
    setConfig,
  } from "//cdn.video-dns.com/npm/@maveio/components/+esm";

  setConfig({
    metrics: { enabled: true, socket: "wss://your_own_metrics_instance" },
  });
</script>
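The same call can presumably be used to switch metrics off. The fragment below is a sketch that assumes `enabled: false` is the value the components check; it has not been verified against a specific release of the components library. It belongs inside the same kind of `<script crossorigin type="module">` tag as the snippet above.

```javascript
import { setConfig } from "//cdn.video-dns.com/npm/@maveio/components/+esm";

// Assumption: `enabled: false` disables metrics for all mave elements on the page,
// so no metrics socket is needed.
setConfig({
  metrics: { enabled: false },
});
```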

As you can see, you can also disable metrics completely for all mave elements on your site. If you point the socket at your own instance, data doesn’t end up in mave.io, so the metrics there will remain empty, but you can run your own server and query whatever you want to see. You don’t even need mave.io at all to use metrics:

Metrics
maveio/metrics on GitHub

We also turned it into a monorepo for both server and client, so you won’t get confused about what you need to get started if you want to run it yourself. Let us know if you have questions or are interested in running your own metrics server by reaching out to our support or chatting on our Discord.

Published on March 19, 2024