Speed Up Your Streaming With Low-Latency DRM

If you sell encoders, organize live events, develop media players, or simply watch a lot of big games on the weekend, you most probably know that Low Latency is a term that describes how fast the content starts playing after you hit the play button.

Why is that important, you may ask? Well, the most obvious reason is the simple frustration of waiting or, worse, missing crucial information or events as a result.

But there is more to it. According to data collected by companies like Conviva, slow startup times cause users to exit the playback session. That may result in refunds for SVOD users and, in AVOD services, lost revenue for publishers because of lower playback volume.

Additional latency at playback start may also lead to negative reviews on social media and to frustrated users turning to customer support more often than you would like.

The reasons to speed up the playback start are obvious. How do you achieve Low Latency Streaming, then? Of course, there is no single technology to thank for it. You have:

CDNs that bring the content closer to users,
protocols and content formats like LL-HLS, LL-CMAF, or WebRTC,
players that rely on fine-tuned heuristics to adapt to bandwidth conditions and deliver a great experience with as little delay from real time as possible.

What else is there?

Whilst server technologies, protocols, formats, and players are vitally important, other factors are easily overlooked and can add critical milliseconds, or even seconds, to stream start times.

One of the things that may add time is DRM. DRM is part of playback in the sense that the player must first inspect the content and, if it is a DRM-protected asset, acquire a license for it. Steps like end-user authentication/authorization, content checkup, security token generation, and license acquisition each add milliseconds of delay, some of which you can shave from the process.

Walk Me Through The Playback Process:

To fix something, you first need to understand how it works. We will discuss how a player behaves in those first hundreds of milliseconds, when it tries to determine whether the content is DRM protected, and what happens next. We will try to generalize as much as possible, i.e. not rely on a specific player or a specific authentication/authorization model, but discuss what happens in most cases and what you can do to improve things.

Let us begin. Say the player got the manifest from the server next to you and is ready to start the playback. Here are things the player will do that are related to the DRM.

Check whether full DRM signaling is available in the manifest, i.e. whether PSSH/PRO data is present for MPEG-DASH/MSS, as shown below (values are truncated):

CodeSnippet1

CodeSnippet2

Note: HLS always has full DRM signaling, either in the master manifest or in the variant manifest.

If full DRM signaling is available, the player starts the License Acquisition Process (LAP).
If full DRM signaling is not available in the manifest, the player downloads the initialization chunks, identifies the encrypted media and begins LAP.
Create a Playback Session (also called a Key Session, or simply a Session) and request the license from a DRM Platform. The Session provides a context for the message exchange with the CDM, as a result of which the key(s) are made available to the CDM.
Acquire a DRM license and store it within the Session context. If a non-persistent license is acquired, it is valid only for the current Session, until that Session is closed. If a persistent license is acquired, it is stored locally and tied to a specific Session that has an ID; the player can later use that ID to retrieve information about the Session and the license stored within its context.

When everything above is complete, the playback starts.
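The manifest check at the start of this process can be sketched in a few lines of Python. This is only an illustration of the decision, not how any particular player implements it: the two sample manifests, the placeholder PSSH value, and the function names are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# XML namespaces commonly seen in DASH manifests that carry DRM signaling.
NS = {
    "cenc": "urn:mpeg:cenc:2013",       # <cenc:pssh> (PSSH box, base64)
    "mspr": "urn:microsoft:playready",  # <mspr:pro> (PlayReady Object, base64)
}
CONTENT_PROTECTION = "{urn:mpeg:dash:schema:mpd:2011}ContentProtection"


def has_full_drm_signaling(mpd_xml: str) -> bool:
    """Return True if any ContentProtection element carries PSSH or PRO data."""
    root = ET.fromstring(mpd_xml)
    for protection in root.iter(CONTENT_PROTECTION):
        for tag in ("cenc:pssh", "mspr:pro"):
            node = protection.find(tag, NS)
            if node is not None and (node.text or "").strip():
                return True
    return False


def next_step(mpd_xml: str) -> str:
    """Decide what the player does right after parsing the manifest."""
    if has_full_drm_signaling(mpd_xml):
        return "start license acquisition"
    return "download initialization chunks first"


# Hypothetical, heavily shortened manifests; the PSSH value is a placeholder.
WITH_PSSH = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" xmlns:cenc="urn:mpeg:cenc:2013">
  <Period><AdaptationSet>
    <ContentProtection schemeIdUri="urn:uuid:edef8ba9-79d6-4ace-a3c8-27dcd51d21ed">
      <cenc:pssh>PLACEHOLDER</cenc:pssh>
    </ContentProtection>
  </AdaptationSet></Period>
</MPD>"""

WITHOUT_PSSH = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period><AdaptationSet>
    <ContentProtection schemeIdUri="urn:mpeg:dash:mp4protection:2011" value="cenc"/>
  </AdaptationSet></Period>
</MPD>"""

print(next_step(WITH_PSSH))
print(next_step(WITHOUT_PSSH))
```

In the second case the manifest only says that the content is encrypted, so the player has to read the PSSH boxes from the initialization chunks before it can build a license request.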

The DRM on the stream may add anywhere from milliseconds to seconds to the playback start. Let us look into different general scenarios BuyDRM customers may implement and talk about how to improve them.

Please note that within the KeyOS platform, customers must generate a security token which we call the Authentication XML. This security token is essential in a license request if you want to get the license from the KeyOS MultiKey License API. For security reasons, tokens must be generated on the server side only. For more information, please visit our blog post on this topic:

Generating and Importing Authentication XML Signing Keys for the KeyOS MultiKey Service

Direct License Acquisition (Security Token request from the Backend API)

Figure 1. Direct license acquisition

The first case is a Direct License Acquisition (DLA) by the client from the DRM Platform. The security token is not available to the player until the player requests it from the backend API.

This may be the web page use case, or the native player use case, when an Android or iOS application starts and needs the authentication XML in order to request a license. It may also be an STB, a Smart TV, etc. In other words, it is any case in which customers cannot pass the Authentication XML into the player beforehand.

Let us describe each step on the diagram and look into bottlenecks and possible solutions.

The player is set up and contains the following attributes: the URL to the asset, the URL to the license service, and the URL to the backend API that will return the authentication XML.
The player requests the authentication XML from the backend API based on some ID it sends to it.
The player downloads the manifest and checks it for the full DRM signaling.
The player downloads initialization chunks. Those contain valuable information like codecs used, bitrates etc. as well as PSSH boxes that contain DRM related data. If the manifest is missing full DRM signaling, the player will take this data from the initialization chunks.
In the case of Widevine DRM (web player) and FairPlay DRM (web and native players), the player will require the public certificate to later generate a valid license request. It is usually requested from the DRM Platform.
The player generates the license challenge, or simply the payload.
The player sends the payload, plus a custom POST header containing the base64-encoded authentication XML, to the DRM Platform.
The DRM Platform generates the license and sends it back.
The player pre-buffers data based on settings.
The playback starts.
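As a rough illustration of the license request step above, the sketch below assembles the payload and the custom header. The header name `customdata`, the example URL, and the placeholder XML are assumptions made for the example; use the header name and token format your DRM Platform documents.

```python
import base64
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class LicenseRequest:
    url: str
    headers: Dict[str, str]
    body: bytes


def build_license_request(
    license_url: str,
    challenge: bytes,
    auth_xml: str,
    header_name: str = "customdata",  # assumed name, check your platform docs
) -> LicenseRequest:
    """Attach the base64-encoded authentication XML to the license challenge."""
    token = base64.b64encode(auth_xml.encode("utf-8")).decode("ascii")
    return LicenseRequest(url=license_url, headers={header_name: token}, body=challenge)


# Hypothetical values: a fake endpoint, a fake challenge, and a placeholder token.
request = build_license_request(
    "https://license.example.invalid/acquire",
    b"\x08\x04",
    "<placeholder/>",
)
print(request.headers)
```

The challenge itself travels unchanged as the POST body; only the header is added by the application code.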

In the above case, the following factors may add to the stream startup time. They are listed in descending order of how much time they may add:

Security token request or asset request - Usually, in order to generate the Authentication XML, you need to authenticate/authorize your user, which adds to the playback startup time. A possible solution is to use caching on the server side so that the process completes quicker.
Player setup (pre-buffering) - Players usually let you configure how much data they must pre-buffer before starting playback. For Low Latency streaming this value must be as small as possible. Please note, however, that a small data buffer may result in frequent rebuffering for the end user.
Manifest missing full DRM signaling - When the manifest contains full DRM signaling with PSSH/PRO data, the player may initiate the LAP immediately after parsing the manifest. It does not need to rely on data gathered from the initialization chunks, and it downloads those in parallel with the certificate/license requests. However, you cannot get by without the initialization chunks: even if your manifest has full DRM signaling and the LAP finishes before the initialization chunks are downloaded, the player still has to wait for them before it can begin playback.
Public certificate request - As you saw in the diagram above, there are cases when the player requires the public certificate to generate a valid license challenge. If this certificate is not available to the player, it is usually requested from the DRM Platform. The customer may speed up the process by embedding the certificate into the player.

This approach must be implemented with care, though. With an invalid certificate, the player may generate a license challenge that fails on the DRM Platform side. If you choose this approach, the player must sync its certificate periodically.
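One way to combine an embedded certificate with periodic syncing is sketched below. The class name, the 24-hour default, and the injected `fetch` callable are assumptions made for the example; how the certificate is actually downloaded depends on your player and DRM Platform.

```python
import time
from typing import Callable, Optional


class CachedCertificate:
    """Serve an embedded certificate immediately and re-sync it when it gets old."""

    def __init__(
        self,
        embedded: bytes,
        fetch: Callable[[], bytes],           # hypothetical download function
        max_age_s: float = 24 * 3600,         # assumed sync interval
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._cert = embedded
        self._fetch = fetch
        self._max_age_s = max_age_s
        self._clock = clock
        self._synced_at: Optional[float] = None  # None means never synced

    def get(self) -> bytes:
        """Return the current certificate without any network round trip."""
        return self._cert

    def sync_if_stale(self) -> bool:
        """Refresh the certificate if it is too old; return True if refreshed."""
        now = self._clock()
        if self._synced_at is not None and now - self._synced_at < self._max_age_s:
            return False
        try:
            self._cert = self._fetch()
        except OSError:
            return False  # keep the current copy and retry on the next call
        self._synced_at = now
        return True
```

The idea is to call `get()` on the playback path, where time matters, and `sync_if_stale()` in the background, for example at application start or after playback has begun.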

Indirect License Acquisition Through A Proxy

Figure 2. Indirect license acquisition using Proxy

Another case is license acquisition through a Proxy. The player makes license requests directly to the Proxy, treating it as the DRM Platform endpoint.

This approach is GDPR-friendly, secure, robust, and the fastest model to implement. You are in full control of who gets a license, and you have one point of management for all DRM licensing.

It may be used with any player like a web or native player. It is especially valuable if players don’t allow you to set custom POST headers when you make a license request.

This model is also useful when you want to hide the Authentication XML, create your own statistics about licenses issued or failed and in general, secure your DRM processes for end-user playback sessions.

Though the setup is very flexible, you will need to know a thing or two about high-load distributed setups, because end users hit your proxy before it sends the license request on to the DRM Platform.

Let us dissect the model step-by-step and check it for bottlenecks and possible solutions.

The player is set up and contains the following attributes: the URL to the asset and the URL to the license proxy.
The player downloads the manifest and checks it for the full DRM signaling.
The player downloads initialization chunks. Those contain valuable information like codecs used, bitrates etc. as well as PSSH boxes that contain DRM related data. If the manifest is missing full DRM signaling, the player will take this data from the initialization chunks.
In the case of Widevine DRM (web player) and FairPlay DRM (web and native players), the player will require the public certificate to later generate a valid license request. It is requested through the proxy.
The player generates the license challenge, or simply the payload.
The player sends the payload to the license proxy. Note that the player sends the payload only; the security token is not necessary at this step.
The proxy authenticates/authorizes the request, generates the authentication XML and sends the request to the actual license service.
The DRM Platform generates the license and sends it back to the proxy.
The proxy sends it back to the player.
The player pre-buffers data based on settings.
The playback starts.
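The proxy's part of this flow can be sketched as a single function. Everything here is an assumption made for illustration: the `session_id` used to identify the end user, the three injected callables, the `customdata` header name, and the status codes. A real proxy would sit behind an HTTP framework and use your own authorization logic.

```python
import base64
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class ProxyResponse:
    status: int
    body: bytes


def handle_license_request(
    session_id: str,
    challenge: bytes,
    is_authorized: Callable[[str], bool],
    make_auth_xml: Callable[[str], str],
    forward: Callable[[bytes, Dict[str, str]], bytes],
    header_name: str = "customdata",  # assumed name, check your platform docs
) -> ProxyResponse:
    """Authorize the caller, attach the authentication XML, and relay the license."""
    # Reject early so unauthorized requests never reach the DRM Platform.
    if not is_authorized(session_id):
        return ProxyResponse(status=403, body=b"")
    # The token is generated on the server side and never exposed to the player.
    auth_xml = make_auth_xml(session_id)
    token = base64.b64encode(auth_xml.encode("utf-8")).decode("ascii")
    # Forward the untouched payload plus the token, then relay the license back.
    license_bytes = forward(challenge, {header_name: token})
    return ProxyResponse(status=200, body=license_bytes)
```

Because the player only ever sees the payload and the license, this shape also works for players that cannot set custom headers.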

The main difference from the DLA is how the authentication XML is added to the license request. In the first case, it is requested from the backend API and set in a custom POST header for the license request. In the second case, the authentication XML is generated on the way to the DRM Platform and added to the license request in the proxy, which is faster.

What are the bottlenecks for this model? They are again listed in descending order, starting with the ones that may add the most time to the startup.

Player set up (pre-buffering)
Authentication/Authorization in a proxy - As with the backend API, the authentication/authorization process may take time on the proxy side. Most of the time this can be resolved by caching values and working against the cache instead of the DB or internal services. One more thing you can do in the Proxy model that you cannot do with Direct License Acquisition is use a static Authentication XML. If your business model allows issuing every license with the same set of rights, you can hardcode the authentication XML in the proxy and reuse it without generating a new one.
Server-to-server communication - When the Proxy sends a request to the DRM Platform, server-to-server communication is used. The connection between the two servers can stay alive for some time, which avoids the initial handshakes on connection setup. If your Proxy is not called frequently and you wish to shave off the milliseconds those handshakes take, you can send HEAD requests from your Proxy to the DRM Platform once every 30 seconds, for example, to keep the connection alive.
Manifest missing full DRM signaling
Public certificate request
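The keep-alive idea from the server-to-server item can be sketched as follows. The class and function names are assumptions made for the example, and the sketch only helps if the HEAD requests go over the same connection (or connection pool) that your license requests use.

```python
import http.client
import time
from typing import Callable


def make_head_sender(conn: http.client.HTTPSConnection, path: str = "/") -> Callable[[], None]:
    """Build a callable that sends one HEAD request over an existing connection."""
    def send() -> None:
        conn.request("HEAD", path)
        conn.getresponse().read()  # drain the response so the connection can be reused
    return send


class IdleKeepAlive:
    """Send a HEAD request only when the connection has been idle for too long."""

    def __init__(
        self,
        send_head: Callable[[], None],
        interval_s: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._send_head = send_head
        self._interval_s = interval_s
        self._clock = clock
        self._last_used = clock()

    def mark_used(self) -> None:
        """Call this whenever a real license request goes out."""
        self._last_used = self._clock()

    def tick(self) -> bool:
        """Send a HEAD if the idle time reached the interval; return True if sent."""
        now = self._clock()
        if now - self._last_used < self._interval_s:
            return False
        self._send_head()
        self._last_used = now
        return True
```

A background timer or scheduler can call `tick()` every few seconds. While real license traffic is flowing, `mark_used()` keeps resetting the idle timer and no extra requests are sent.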

What Are Other DRM Bottlenecks?

We have talked about Playback related bottlenecks. But wait, there is more.

3rd-party Content Encryption Key (CEK) acquisition process - When you use products that encrypt your content on the fly, you may find that they take more time to produce DRM-protected output. The reason may be the CEK acquisition process: the DRM Platform may respond slower than expected, and the encoder or streaming server simply doesn’t have a CEK to use for encryption. That is why we suggest that our partners implement key pre-buffering and storage. In this case, if the product has issues contacting the DRM Platform, it has a buffer of unused keys to work with.
Packaging/Encryption VoD vs On-The-Fly (OTF) - When going OTF, the product you use may first take time to request the CEK, and then lag in producing the content because it runs on a slow server setup without enough resources. The solution could be selecting an appropriate setup for the product and pre-buffering CEKs. Or you may go with VoD if, in that particular case, additional storage expenses are not a problem.
DLA or Proxy - How you acquire a license may make a difference. As we saw above, the proxy is a faster way to deliver a license to your end users and may help shave off some milliseconds while also adding security and flexibility to your setup.
Persistent vs Non-persistent licenses - Playback will start quicker if the device already has a license stored, so use persistent licensing whenever possible.
Using key rotation for live streaming - Key rotation may be your choice when doing Live, or it may be forced upon you, but it may add latency to the overall playback process if done poorly. Usually, end users don’t even notice that the keys changed, because players acquire the new license while still rendering data from the buffer. This works well with relatively big buffers of 4+ seconds, for example. But with Low Latency you try to shave off those seconds to keep the viewing experience as close to real time as possible. Buffers in this case are smaller, and your end users may find the player halting for some time to get a new license before it can continue playback.
DRM Platform optimization
- Authentication/Authorization processes - Your DRM Vendor must be quick at authentication/authorization and at generating licenses.
- Geographical and network proximity - Your DRM Vendor must be deployed around the world to bring its license services closer to your end users.
- Shared vs Dedicated vs On-prem - Some DRM Vendors, like BuyDRM, may offer multi-DRM Servers that run in your network. This may lower license generation and delivery latency as well.
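The CEK pre-buffering suggested earlier in this list can be sketched as a small key pool. The names, the pool size, and the injected `fetch_key` callable are assumptions made for the example; a real encoder or streaming server would also need to store the buffered keys securely.

```python
from collections import deque
from typing import Callable, Deque, Tuple

Key = Tuple[str, bytes]  # (key ID, content encryption key)


class KeyPool:
    """Keep a buffer of unused CEKs so encryption never waits on the DRM Platform."""

    def __init__(self, fetch_key: Callable[[], Key], target_size: int = 8) -> None:
        self._fetch_key = fetch_key      # hypothetical call to the DRM Platform
        self._target_size = target_size  # assumed buffer size
        self._unused: Deque[Key] = deque()

    def __len__(self) -> int:
        return len(self._unused)

    def refill(self) -> int:
        """Top the pool up to target_size; return how many keys were added."""
        added = 0
        while len(self._unused) < self._target_size:
            try:
                self._unused.append(self._fetch_key())
            except OSError:
                break  # platform slow or unreachable: keep what we already have
            added += 1
        return added

    def take(self) -> Key:
        """Hand out a buffered key at once; fetch inline only if the pool is empty."""
        if self._unused:
            return self._unused.popleft()
        return self._fetch_key()
```

Calling `refill()` in the background while content is being produced means that `take()` returns immediately in the common case, even when the DRM Platform is temporarily slow to respond.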