Streaming has conquered the world.
According to Grand View Research, the global live streaming market reached $50.1 billion in 2020 and is predicted to grow by an average of 21% a year until 2028.
The more video content there is on the market, the higher the competition and, therefore, the higher the demands on quality. So not only the content itself but also the broadcast should be excellent. Few people will finish watching a video if it loads slowly, plays in poor quality, or has huge delays. And all of these aspects largely depend on which streaming platform you choose.
What should advanced streaming systems be like in 2021? Below are the five key trends.
Imagine you’re an online quiz host. The players have exactly one minute to answer the question. You note the time, and stop accepting answers on exactly the last second. But half the players still haven’t filed their answers, because the quiz is broadcast with a 30-second latency. And when you think the time has run out, the participants think they still have half a minute to go.
To avoid such annoyances, the broadcast should be as close to real time as possible. This is particularly important for:
However, other spheres are trying to minimize latency as well. Low Latency Streaming is gradually becoming a compulsory requirement for any broadcast.
To understand this, one has to know how video broadcasts work in general.
To minimize the delay, it is necessary to try and reduce time at all stages.
Low Latency Streaming is streaming with a delay of no more than 4 seconds. The main mechanisms currently used for it include Chunked CMAF, HTTP CTE, Low Latency HLS, SRT, and Low Latency CDN.
CMAF (Common Media Application Format) is a video streaming format commissioned by Apple and Microsoft in 2017.
It then served as a foundation for an extended format, Chunked CMAF.
In CMAF, video is divided into segments lasting 2 to 6 seconds, which are listed in a playlist. While one segment is playing, others are being loaded into the buffer, which usually holds 3 to 4 segments. Playback doesn’t start until the buffer is filled, which leads to a latency of 10 to 30 seconds.
In Chunked CMAF, segments are further divided into chunks that are much shorter and can be played back even before the entire segment is transmitted. This reduces latency several-fold.
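As a rough illustration of the arithmetic above, the sketch below estimates only the delay added by the player’s buffer. The function name and the 0.5-second chunk duration are assumptions made for illustration; real end-to-end latency also includes encoding, packaging, and delivery time, which is why the totals quoted above are higher.

```python
def buffer_delay(unit_duration_s: float, units_buffered: int) -> float:
    """Seconds of video the player must hold before playback can start."""
    return unit_duration_s * units_buffered


# Regular CMAF: the player waits for whole segments.
segment_delay_min = buffer_delay(2.0, 3)  # shortest segments, smallest buffer
segment_delay_max = buffer_delay(6.0, 4)  # longest segments, largest buffer

# Chunked CMAF: the player can start after a few short chunks.
# The 0.5 s chunk duration is an assumed value, not taken from a specification.
chunk_delay = buffer_delay(0.5, 3)

print(f"segment buffering: {segment_delay_min:.0f}-{segment_delay_max:.0f} s")
print(f"chunk buffering:   {chunk_delay:.1f} s")
```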
CTE (Chunked Transfer Encoding) is a data transfer mechanism of the HTTP protocol, introduced in HTTP/1.1.
Its principles are the same as those of Chunked CMAF: CTE divides a file into smaller elements of any size, even down to one frame.
This mechanism is very convenient when the overall size of the message is unknown. Without HTTP CTE, Content-Length would have to be stated for each message so that the client could find its end. And in the case of online broadcasts, it’s not always possible to predict precisely when the broadcast ends.
In CTE, each fragment is sent prefixed with its size, and the end of the transmission is marked by a final fragment of zero length.
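The framing described above can be sketched in a few lines. This is a minimal illustration of the HTTP/1.1 chunked format (size in hexadecimal, CRLF, data, CRLF, then a zero-size chunk); the function names are hypothetical, and chunk extensions and trailers are deliberately ignored.

```python
from typing import Iterable, List


def encode_chunked(parts: Iterable[bytes]) -> bytes:
    """Frame each part as: size in hex, CRLF, data, CRLF; end with a zero-size chunk."""
    out = bytearray()
    for part in parts:
        if not part:
            continue  # an empty chunk would signal the end of the message too early
        out += f"{len(part):X}".encode("ascii") + b"\r\n" + part + b"\r\n"
    out += b"0\r\n\r\n"  # the final chunk of zero length marks the end
    return bytes(out)


def decode_chunked(body: bytes) -> List[bytes]:
    """Split a chunked body back into its parts (no extensions or trailers)."""
    parts = []
    pos = 0
    while True:
        line_end = body.index(b"\r\n", pos)
        size = int(body[pos:line_end], 16)
        if size == 0:
            return parts
        start = line_end + 2
        parts.append(body[start:start + size])
        pos = start + size + 2  # skip the data and its trailing CRLF


frames = [b"frame-1", b"frame-22"]
wire = encode_chunked(frames)
print(wire)
print(decode_chunked(wire))
```

The sender never needs to know the total length in advance: it keeps emitting chunks as video data becomes available and sends the zero-length chunk only when the broadcast is over.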
The updated version of HLS will support Low Latency. Its difference from the previous version is that, similarly to Chunked CMAF, segments are divided into small parts. The minimum part length is 200 ms.
Besides, the new version has an updated system of working with playlists. The client receives a playlist update as soon as it appears instead of waiting for its next periodic request, and only the changed part of the playlist is sent instead of the full playlist. Thus, the client keeps the initial playlist and only receives the updated part.
A combination of Low Latency HLS with CDN can help shorten the delay to 4 seconds.
SRT (Secure Reliable Transport) is a UDP-based data transfer protocol, designed by Haivision specifically for video transfer in unpredictable networks.
UDP is simpler than TCP. Its data transfer model does not contain any “handshakes” to order data and check its integrity, so it can send packets quicker than TCP.
But since UDP neither verifies the order and integrity of packets nor corrects errors, it might transmit data incorrectly in unpredictable networks. Some packets might be lost, and others might arrive in the wrong order.
SRT has been designed to solve this problem. It transfers data as fast as UDP, but it can also restore lost packets, monitor network conditions, and correct errors.
Integrity and correct order are ensured in the following way:
This principle is reliable, and at the same time it ensures a higher transfer speed than TCP. In a similar situation, TCP works differently: like SRT, it confirms the reception of packets, but it does so only after a particular series. If something is missing from the series, TCP notifies the server of the error, and the entire series is sent again instead of a single packet.
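The difference in retransmission cost can be shown with a toy model. This is a simplified sketch of the comparison made in the text, not the actual SRT or TCP implementation; the function names and the sequence numbers are hypothetical.

```python
from typing import List, Set


def missing_packets(received: Set[int], first: int, last: int) -> List[int]:
    """Sequence numbers in [first, last] that never arrived."""
    return [seq for seq in range(first, last + 1) if seq not in received]


def resend_selective(received: Set[int], first: int, last: int) -> List[int]:
    """Resend only the lost packets (the per-packet approach attributed to SRT)."""
    return missing_packets(received, first, last)


def resend_whole_series(received: Set[int], first: int, last: int) -> List[int]:
    """Resend the entire series if anything in it is missing (the text's TCP model)."""
    if missing_packets(received, first, last):
        return list(range(first, last + 1))
    return []


arrived = {1, 2, 3, 5, 6, 8}  # packets 4 and 7 were lost on the way
print(resend_selective(arrived, 1, 8))     # [4, 7]
print(resend_whole_series(arrived, 1, 8))  # [1, 2, 3, 4, 5, 6, 7, 8]
```

In this model, two lost packets cost two retransmissions in the selective scheme and eight in the series-based one.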
SRT allows for 0.5 to 3 seconds of latency. And apart from reducing delays, it also ensures reliable data transfer.
A CDN (content delivery network) is a set of interconnected servers (points of presence) that receive data from the source server, cache it, and send it to end consumers. CDNs were invented to provide uninterrupted and maximally quick content delivery to any number of end users, be it 1 user, 1,000, or 1,000,000+ users simultaneously.
For example, if you’re broadcasting from Germany and you want your video to be available worldwide, with CDN all users will receive it equally quickly.
CDN servers take the video from the source server and send it to end users. Meanwhile, the points of presence are located as close to users as possible. The result is a distributed network with a large number of routes and nodes.
In the case of static content, the information that the user requested previously is cached on CDN servers and quickly sent back on subsequent requests. In the case of online broadcasts, a CDN works as follows:
The load is thus distributed evenly, and if a CDN server is overloaded, the clients will receive the video from another one nearby. That’s particularly important if your streams are watched by millions of people.
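The fallback to a nearby server can be sketched as a simple selection rule. This is only an illustration: the class, the field names, the 0.9 load threshold, and the example cities are assumptions, and real CDNs route requests with more elaborate mechanisms than a distance comparison.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PointOfPresence:
    name: str
    distance_km: float  # distance to the viewer (hypothetical metric)
    load: float         # current load as a fraction of capacity, 0.0 to 1.0


def pick_server(pops: List[PointOfPresence], max_load: float = 0.9) -> Optional[PointOfPresence]:
    """Return the nearest point of presence that is not overloaded, if any."""
    available = [p for p in pops if p.load < max_load]
    if not available:
        return None
    return min(available, key=lambda p: p.distance_km)


pops = [
    PointOfPresence("frankfurt", 20.0, 0.97),   # nearest, but overloaded
    PointOfPresence("amsterdam", 360.0, 0.40),
    PointOfPresence("new-york", 6200.0, 0.10),
]
chosen = pick_server(pops)
print(chosen.name if chosen else "no server available")
```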
The standard CDN caching mechanism stores content on HDDs as complete, finished segments lasting 6 to 10 seconds. However, this is not optimal for online broadcasts: reading the content from HDDs takes time, which leads to delays.
To avoid that, we cache the video in RAM and divide segments into smaller chunks, so that they are delivered to users faster.
Ultra-Low Latency is a delay of under 1 second. The main technology that can ensure such speed is called WebRTC.
It’s a communication standard that makes it possible to transmit video directly between browsers, without any extra extensions or apps.
The technology can be used both for video calls and for online broadcasts.
How it works:
WebRTC also uses UDP. In the case of online broadcasts or video chats, the correct packet order is not as important, because video is sent in real time. So a less reliable but faster and simpler connection is the best solution in this case.
With WebRTC, it’s very easy to organize a video chat or an online conference, because there’s no need to install and set up extra software.
Since the standard was initially intended for video calls, it is designed to ensure the minimum possible latency.
Any unique content must be well protected; otherwise pirates will be able to easily copy your video, take possession of it, or post it online for free. In that case, your product will lose its uniqueness, and you will lose your profits.
A modern streaming platform should use efficient content protection mechanisms.
Some technologies that allow you to reliably protect information include:
You can read more about these technologies in the article “How to protect video content efficiently.”
An advanced streaming platform must make sure the broadcast is available at every internet connection quality, and stream high-quality video.
To make sure videos load quickly at any internet connection rate, adaptive bitrate is used. This method makes it possible to adjust video quality to each individual user’s connection speed.
If users have a good connection, the broadcast will be in 4K/8K. If the connection is poor, the video will still be available, but at a lower quality.
At the same time, only some fragments of the video might be of lower quality. For example, if a user watches a video while traveling, some parts of their route will have 4G coverage, and the video will be available in maximum quality. On the stretches where the connection is worse, the quality will go down.
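A minimal sketch of how a player might choose the quality for each fragment is shown below. The bitrate ladder, the 0.8 safety factor, and the function name are illustrative assumptions, not values used by any particular player.

```python
from typing import List, Tuple

# (label, required bandwidth in Mbps); illustrative values, not a standard ladder
LADDER: List[Tuple[str, float]] = [
    ("360p", 1.0),
    ("720p", 3.0),
    ("1080p", 6.0),
    ("4K", 20.0),
]


def choose_rendition(measured_mbps: float, safety: float = 0.8) -> str:
    """Pick the highest rendition whose bitrate fits within a safety margin."""
    budget = measured_mbps * safety
    best = LADDER[0][0]  # always fall back to the lowest quality
    for label, needed in LADDER:
        if needed <= budget:
            best = label
    return best


# Connection speed measured along a trip: good coverage, then worse, then better.
trip_mbps = [30.0, 5.0, 0.5, 8.0]
print([choose_rendition(speed) for speed in trip_mbps])
```

Because the choice is made again for every fragment, the quality follows the connection: it drops on the weak stretches and recovers as soon as the bandwidth returns.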
High-quality video files are very large. If they were transmitted “as is,” without compression, it would take a very long time to send them to viewers. Besides, there just would be no space to store them. To make sure videos are easier to transmit and store, they are compressed before sending.
There are different methods of reducing file size. For example, some elements might be removed from the file, such as part of the sound or color data.
That doesn’t mean the video will be of lower quality. Usually, the removed bits include the insignificant elements that a person does not perceive. For example, our eyes are more sensitive to brightness than to color, so the number of bits allocated for color encoding can be reduced. A regular viewer without professional-grade equipment won’t notice any difference.
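One widely used way to reduce the color data is chroma subsampling, where brightness is stored for every pixel but color is stored once per group of pixels. The sketch below compares raw frame sizes for a Full HD frame; the function name is hypothetical, 8 bits per sample is assumed, and the text above does not say which exact scheme a given codec uses.

```python
def raw_frame_bytes(width: int, height: int, chroma_div_x: int = 1, chroma_div_y: int = 1) -> int:
    """Bytes for one 8-bit frame: a full-resolution brightness plane plus two color planes."""
    luma = width * height
    chroma_plane = (width // chroma_div_x) * (height // chroma_div_y)
    return luma + 2 * chroma_plane


full = raw_frame_bytes(1920, 1080)           # color stored for every pixel
reduced = raw_frame_bytes(1920, 1080, 2, 2)  # color stored once per 2x2 pixels
print(full, reduced, reduced / full)
```

Under these assumptions, the frame shrinks to half its size before any further compression is applied.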
Let us show you a simple example of how that works. Imagine we have a tower of blocks: 3 blue blocks in the bottom, then 4 green blocks, and 2 red ones on top—9 blocks overall.
Instead of showing all 9 blocks, we can leave just 1 block of each color and say how many of them will be there. Therefore, 3 blocks will be left instead of 9.
The object has become smaller. However, the figures allow us to realize what it was like originally and restore its original state.
The same happens to videos. Instead of storing every pixel, the algorithm determines the number of identical pixels in a row and their positions, saving only the data of unique pixels.
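The block-tower example corresponds to a technique known as run-length encoding. The sketch below is a minimal illustration with hypothetical function names; real video codecs use far more elaborate methods, but the principle of storing a value and a count is the same.

```python
from itertools import groupby
from typing import List, Tuple


def rle_encode(items: List[str]) -> List[Tuple[str, int]]:
    """Replace each run of identical items with a (value, count) pair."""
    return [(value, len(list(run))) for value, run in groupby(items)]


def rle_decode(runs: List[Tuple[str, int]]) -> List[str]:
    """Restore the original sequence from (value, count) pairs."""
    return [value for value, count in runs for _ in range(count)]


tower = ["blue"] * 3 + ["green"] * 4 + ["red"] * 2  # 9 blocks, bottom to top
encoded = rle_encode(tower)
print(encoded)                       # [('blue', 3), ('green', 4), ('red', 2)]
print(rle_decode(encoded) == tower)  # True
```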
A video compression algorithm is called a video codec. It reduces video size in megabytes while preserving its visual content.
The most popular codecs at the moment are AVC (H.264), HEVC (H.265), VP8, and VP9. AV1 and VVC (H.266) are also increasingly discussed and used.
1. AVC (H.264). Licensed video compression standard, designed by a group of ITU-T video encoding experts together with the ISO/IEC Moving Picture Experts Group (MPEG) in 2003. It can cut a file size by over 80%, without any damage to the video quality.
This codec is very important for low-latency broadcasting. It can transmit video at up to 10 Mbps.
The videos are compressed by unifying identical elements. Instead of transmitting every pixel of the unicolored background, AVC joins them in a macroblock and says that the entire fragment has the same color. At playback, the pixels are restored the same way they were positioned in the original. This technique is called intraframe prediction.
At the same time, the codec’s ability to combine fragments into macroblocks is not limited to a single frame. It can look at several frames and determine the areas that do not change. For example, if you’re broadcasting a webinar, and the speaker is sitting at the table and reading a lecture, the only things that change are the speaker’s face and lips, while the table and background remain the same. AVC determines the parts of the image that change and sends only them, leaving everything else unchanged. That is called interframe prediction.
The maximum size of macroblocks is 16×16 pixels.
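The idea of sending only the changed areas can be illustrated by comparing two frames block by block. This is a deliberately simplified sketch with hypothetical names: an actual AVC encoder uses motion-compensated prediction and encodes residuals rather than performing an exact comparison.

```python
from typing import List, Tuple

Frame = List[List[int]]  # grayscale pixels, one inner list per row


def changed_blocks(prev: Frame, cur: Frame, block: int = 16) -> List[Tuple[int, int]]:
    """Top-left (row, column) corners of the blocks that differ between two frames."""
    height, width = len(cur), len(cur[0])
    changed = []
    for y in range(0, height, block):
        for x in range(0, width, block):
            rows = range(y, min(y + block, height))
            if any(prev[r][x:x + block] != cur[r][x:x + block] for r in rows):
                changed.append((y, x))
    return changed


# Two 32x32 frames: a static background in which a single pixel changes.
previous = [[0] * 32 for _ in range(32)]
current = [row[:] for row in previous]
current[20][5] = 255  # falls into the block that starts at row 16, column 0

print(changed_blocks(previous, current))  # [(16, 0)]
```

Of the four 16×16 blocks in this frame, only one would need to be transmitted.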
AVC is one of the most widespread codecs right now, supported by all browsers and any devices.
2. HEVC (H.265). This codec was designed as an improved version of AVC. Its goal was to halve video size while maintaining the same quality. It supports formats of up to 8K and resolutions of up to 8192×4320 pixels.
The technology is particularly important for 4K broadcasts. It allows one to maintain high quality while sending the video stream as fast as possible.
Like AVC, HEVC uses inter- and intraframe prediction technologies, albeit much more efficiently. Macroblocks can reach the size of 64×64, many times bigger than in AVC. That means HEVC can send fewer fragments and keep the overall video size lower.
According to various trials, H.265 is 35% to 45% superior to its predecessor. The codec is supported by iOS and Android, most TV boxes and Smart TVs, Safari, Edge, and Internet Explorer, but not by Chrome or Firefox.
3. VP8. It was designed by On2 Technologies and announced in 2008.
The compression principle of this codec is similar to AVC, and its efficiency is approximately the same. It also uses the methods of intra- and interframe prediction. The maximum macroblock size is 16×16. So, just like AVC, it is much inferior to HEVC in the efficiency of Full HD and 4K video transmission.
However, VP8 is good at real-time compression, so it is used as the default WebRTC video codec.
VP8 is supported by all popular browsers and Android devices; however, on iOS, it only works in third-party applications.
4. VP9. Video codec designed by Google in 2013. The next step in the VP8 standard evolution. Supported by YouTube since 2014.
VP9 is largely similar to HEVC: they are equally efficient and suitable for 4K video transmission. Like HEVC, VP9 supports macroblocks of up to 64×64. But while in HEVC the blocks have to be square, VP9 can also combine pixels into rectangular blocks. Such blocks can be processed more efficiently, which gives VP9 an advantage.
On the other hand, VP9 only has 10 prediction options, while HEVC has 35. The higher number of prediction options gives HEVC a visual advantage: its quality is slightly better. However, VP9 offers equally fast video broadcast.
A 2014 trial showed that VP9, like HEVC, is 40–45% more efficient than AVC and VP8.
The codec is supported by all mobile devices, most set-top boxes and Smart TVs, and all popular browsers except Safari.
5. AV1. Next step in video compression technology after VP9. It was created in 2018 by the Alliance for Open Media, which includes Google, Microsoft, Apple, Netflix, Amazon, and other large brands from the spheres of electronics, video production, and browser development.
In the development of the codec, they used VP9, as well as the groundwork of the Cisco Thor project and Mozilla Daala. Their goal was to create a codec that would be radically better than the existing solutions, and they succeeded. In April 2018, Facebook Engineering ran a trial and found that AV1 is 30% better than VP9 and 50% superior to AVC.
The rise in efficiency was achieved due to its improved intraframe prediction.
HEVC and VP9 both have very good interframe prediction. Remember our webinar example: the video starts with the speaker sitting at a table and delivering a lecture. The codec transmits the first frame in full, with the picture of the speaker, the table, and the background. In the following frames, only the changes are transmitted. This first frame is called the anchor frame. There might be a lot of them in the video, depending on how dynamic the action is and how often the scenes change.
And while the intermediate frames (with only the changes) are really light, the anchor frames are still quite massive. The AV1 developers decided they should focus on reducing the size of the anchor frames, and thus they paid more attention to intraframe prediction.
For example, AV1 predicts color from brightness. Only the brightness data is encoded in full, and the colors are restored from it. This reduces the amount of color information and, with it, the file size.
To make sure the minimum information about color doesn’t lead to inaccuracies, the prediction is done fragment-by-fragment. For example, in a shot of the sky, some parts of it will be lighter, and some parts will be darker. In order to avoid encoding all the shades of blue, the sky is encoded as a single fragment. The brighter parts of it will be light blue, and lower brightness will encode darker shades, up to navy.
Such prediction makes it possible to combine pixels into larger fragments. The larger the fragments, the fewer of them will be, and the lighter the video will be in weight.
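The sky example can be illustrated with a simple linear model in which the color of a fragment is described by just two numbers and then rebuilt from the brightness values. This is a sketch of the general idea, not the AV1 algorithm itself: the least-squares fit, the function names, and the sample values are assumptions made for illustration.

```python
from typing import List, Tuple


def fit_chroma_model(luma: List[float], chroma: List[float]) -> Tuple[float, float]:
    """Least-squares fit of chroma ~ alpha * (luma - mean_luma) + mean_chroma."""
    n = len(luma)
    mean_l = sum(luma) / n
    mean_c = sum(chroma) / n
    var = sum((l - mean_l) ** 2 for l in luma)
    if var == 0:
        return 0.0, mean_c  # flat brightness: predict a single color value
    cov = sum((l - mean_l) * (c - mean_c) for l, c in zip(luma, chroma))
    return cov / var, mean_c


def predict_chroma(luma: List[float], alpha: float, mean_c: float) -> List[float]:
    """Rebuild the color values of a fragment from its brightness values."""
    mean_l = sum(luma) / len(luma)
    return [alpha * (l - mean_l) + mean_c for l in luma]


# One fragment of sky: brighter pixels are also a lighter shade of blue.
sky_luma = [100.0, 120.0, 140.0, 160.0]
sky_chroma = [90.0, 100.0, 110.0, 120.0]

alpha, mean_c = fit_chroma_model(sky_luma, sky_chroma)
print(alpha, mean_c)                              # 0.5 105.0
print(predict_chroma(sky_luma, alpha, mean_c))    # [90.0, 100.0, 110.0, 120.0]
```

In this idealized fragment, two parameters are enough to restore all four color values exactly; in real footage the prediction is approximate, and the remaining difference still has to be encoded.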
The only disadvantage of this codec is that it is not yet widespread. Because the codec is relatively young, few set-top boxes and Smart TVs support it. But it already works in the most popular browsers, including Mozilla Firefox, Edge, and Chromium-based browsers.
There are different ways to earn money on streaming.
One can sell videos or access to broadcasts, or one can make free content and earn money on advertising. If you choose the latter, the question is how to make sure Adblock doesn’t block your ads.
There are several ways ad breaks can be incorporated into videos:
Ad breaks can be triggered by DTMF/SCTE-35 markers in the stream or by a schedule in the chunks. VAST/VPAID protocols can be used as well.
Modern video display technologies allow you not only to insert ads but also to obtain statistics on how viewers interact with them. This way, you’ll be able to understand how many people have watched the video, which ads they skip more often, which ads they finish watching, and whether the ad positioning in the video affects that.
There are dedicated anti-Adblock apps, too. Their mechanisms are different: some just block the blocker itself, and others are set up to show the viewers a banner asking them to turn off Adblock to support content creators.
These methods partly help fight ad blocking. But there’s a more efficient mechanism: insertion of ads in the video.
Server-Side Ad Insertion is a dedicated module that allows you to show video ads even to AdBlock users.
The mechanism puts ads inside the video content, so that the blocker can’t discern between the original content and advertising. It is more technologically complex, but it provides a lot of advantages and monetization methods.
If your streaming platform doesn’t support this feature, most users just won’t see your advertising, and you’ll lose a part of your income.
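For an HLS stream, server-side insertion can be pictured as the server stitching ad segments into the same media playlist as the content. The sketch below is an assumption-laden illustration, not a description of any specific product: the function name and segment file names are hypothetical, and all segments are assumed to have the same duration.

```python
import math
from typing import List


def stitch_ad(content: List[str], ad: List[str], after_segment: int, duration: float = 6.0) -> str:
    """Build an HLS media playlist with the ad placed after `after_segment` content segments."""
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        f"#EXT-X-TARGETDURATION:{math.ceil(duration)}",
        "#EXT-X-MEDIA-SEQUENCE:0",
    ]

    def add(uris: List[str]) -> None:
        for uri in uris:
            lines.append(f"#EXTINF:{duration:.1f},")
            lines.append(uri)

    add(content[:after_segment])
    lines.append("#EXT-X-DISCONTINUITY")  # encoding parameters may change at the ad
    add(ad)
    lines.append("#EXT-X-DISCONTINUITY")  # and change back when the content resumes
    add(content[after_segment:])
    lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"


playlist = stitch_ad(["seg_000.ts", "seg_001.ts", "seg_002.ts"], ["seg_ad0.ts"], after_segment=2)
print(playlist)
```

From the player’s point of view, the ad is just another segment delivered from the same place as the content, which is what makes it hard for a blocker to filter out.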
Do you need a full-scale platform supporting all stages of broadcast, from video capture to playback? Or maybe you have a system already, but you want to add certain technologies to modernize it?
You should always have a choice: whether to buy an entire system or just to connect individual features, without paying for what you don’t need.
For example, our Streaming Platform supports all stages of broadcast:
But if you don’t need the whole platform, we can integrate separate modules into your business.
Our platform meets all modern requirements. We use state-of-the-art tech for video broadcast and latency reduction. We can ensure latency within 4 seconds or 1 second.
We protect content from illegal access and copying via AES, DRM, CORS, and secure links.
We use all the modern codecs described in this article, and we can broadcast video at the qualities of up to 8K. At the same time, thanks to adaptive bitrate, your broadcast will be accessible from any device and at any internet connection.
We organically insert advertising, bypassing Adblock. We support all four ad modes: pre-roll, mid-roll, post-roll, and pause-roll.
1. Transcoding. We convert the video from one format to another, so that it meets the requirements of the device on which it’s watched.
For example, smartphones or tablets do not support some formats. In order not to create two broadcasts, we use transcoding, converting the videos to the required formats for different devices.
The transcoding system also adjusts video quality to various users. For example, a viewer might have an old PC that doesn’t support Full HD; for them, the video would be transcoded to HD. Or somebody might be watching the broadcast over a poor connection; then, in order to avoid lags every ten seconds, the quality is reduced to the optimal level.
Our streaming platform can deliver videos to any device in HD, Full HD, 4K, and 8K. The transcoding takes place at our powerful servers.
How this works:
Your broadcast will be accessible to anyone, on any device, and with any connection.
2. HTML5 player.
3. DVR. Records and rewinds live video. If a viewer has skipped an important part, don’t worry: they’ll always be able to rewind the broadcast and rewatch the part they missed. It’s very convenient for viewers.
The feature makes it possible to record up to 4 hours of live video.
4. Single control panel for Streaming and CDN. If you already have our CDN, you don’t need a dedicated account for the streaming platform. Control the products through a single panel with an intuitive interface.
5. Statistics and analytics. Learn the number of viewers or unique users, list of referrals, and other important information. You can regularly obtain any statistics through your account.
6. Rebroadcast to any social media. You can host your stream on several social media websites at once. You won’t need to connect anything extra—just add the required social media, and the video will be broadcast there automatically.
7. 24/7 technical support. In organizing online broadcasts, it’s crucial that everything should work well: all users must have good sound and clear image, video should load quickly and never lag. And if any issues arise, they should be eliminated as fast as possible.
Our specialists will quickly solve any issues. No matter at what time of day or night you’re streaming, our support works 24/7.
Test our streaming platform, check how it works, and see for yourself that it is an efficient solution. Or just start with a free consultation session.