Monday, January 6, 2025

AV1 Adventures

I've been trying to use AV1 to recompress home videos and Blu-rays so they stay available on the network. My home videos are taken with a Moto G7 Potato, which requires a 4K 40 Mbps HEVC source to look decent at all. After a light denoise filter, they look OK enough. My goal is to save these as ~3 Mbps 1080p and ~1.5 Mbps 480p for live use and archive the originals offline. This is my first attempt at understanding AV1 quality. I had hoped to just jump in with suggested settings and get good results, but this was not the case.

Considering everything I've read, these were the assumptions I started out with:

  1. Most use-cases need an RF between 25-35. Start at 35 and work from there.
  2. Lower resolutions need lower RF (higher quality settings), as there's less source data to work with to keep a good picture.
  3. AV1 is a slower codec overall, but at the same speed as x264/x265 it will still give better results. AV1 will improve even more against x264/x265 if you encode with slower settings. Ergo, lacking another concern, AV1 should be used.
  4. Constant Quality (RF) settings give better results than average bitrate, but source material will drastically impact final bitrate.
  5. Mixed messaging on 10-bit. Some say use it in all cases; it has little overhead but helps reduce banding due to less rounding. Others say it's a waste on 8-bit source.

So far, here are my experiences with those assumptions:

1. Most use-cases need an RF between 25-35

    This remains to be seen. The Blu-ray source I'm using certainly does not bear this out. SVT-AV1 at Preset 5, RF25 only reached 1.5 Mbps and had terrible issues with dark sections and shadows: flat blocks and banding across dark zones, and dancing blockiness in shadowed areas. Yet RF25 is the highest-quality setting in the normal range? I eventually used RF21 to get decent (but not really great) quality near 3 Mbps. A 720p encode at RF21 Preset 4 handled the dark sections better than the 1080p version, using almost half the bitrate (1.7 Mbps).
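For anyone wanting to reproduce the settings above, here is a minimal sketch of how such an encode could be launched. The post does not say which front end was used, so this assumes ffmpeg built with its libsvtav1 wrapper, where the RF value maps to `-crf`. The function name and file names are hypothetical.

```python
from typing import List


def svtav1_command(src: str, dst: str, rf: int, preset: int,
                   ten_bit: bool = True) -> List[str]:
    """Build an ffmpeg argument list for a constant-quality SVT-AV1 encode.

    Assumes ffmpeg with libsvtav1; "RF" in the post is passed as -crf here.
    """
    if not 0 <= rf <= 63:
        raise ValueError("rf must be between 0 and 63")
    # 10-bit 4:2:0 output when requested, otherwise plain 8-bit 4:2:0.
    pix_fmt = "yuv420p10le" if ten_bit else "yuv420p"
    return [
        "ffmpeg", "-i", src,
        "-c:v", "libsvtav1",
        "-preset", str(preset),
        "-crf", str(rf),
        "-pix_fmt", pix_fmt,
        "-c:a", "copy",  # leave the audio untouched
        dst,
    ]


# Hypothetical file names; the numbers are the RF25 / Preset 5 attempt above.
print(" ".join(svtav1_command("bluray.mkv", "bluray_av1.mkv", rf=25, preset=5)))
```

Passing the returned list to `subprocess.run(..., check=True)` would start the encode. Building the command as a list keeps it easy to vary only RF or only the preset between runs.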

    I then tried starting with a far better source, a never-compressed 10-second 1080p YUV 4:4:4 clip, encoded at 1080p with RF21 Preset 5. It resulted in 60 Mbps video...wth. This does verify another assumption, though: Constant Quality (RF) settings give better results than average bitrate, but source material will drastically impact final bitrate. Unfortunately, it looks like AV1 will be a rather finicky codec to use. In the past, using CRF settings on x264, most encodes landed around the same bitrate, with some content using as much as twice the expected amount. Going from 1.5 Mbps to 60 Mbps is a whole other story. I could use the maximum bitrate option, but then the highest quality sources would always max out their bitrate (when in fact they should need the least, as they don't have prior artifacts to re-encode against). It seems there are just no good encoding options that can be used blindly against a set of videos.

2. Lower resolutions need lower RF (higher quality settings)
 
    480p definitely needs lower RF than 720p and 1080p. I ended up using RF9 to get ~1.7 Mbps out of 480p, yet it still isn't as good as x264 ABR at 1.7 Mbps in the dark areas (the rest of the frame beats x264, though).
    720p needed RF21 to hit 1.7 Mbps using 10-bit with Preset 4. This looked better than the 480p 8-bit RF9 encode at ~1.7 Mbps, and fully beat x264. It seems that without Preset 4, AV1 cannot handle dark zones as well as x264. This is a problem at 1080p, as the encoding time for Preset 4 is way too long on my antiquated system.
    1080p needed RF21 to hit 3.2 Mbps, using 10-bit with Preset 5. This video looked worse than the 720p version, which used Preset 4 but about half the bitrate. Once again, the brighter sections of the frame killed x264, but many dark sections contained distracting artifacts. It's sad that the same RF value does not produce consistent output when other settings change. RF is apparently required for good results, yet it offers zero predictive power of actual results across inputs or other settings. With a mode presented as "Constant Quality", we might understand that quality will differ with inputs and resolutions, but we would still expect the same quality at differing sizes when changing only encoding speed through presets. Instead we get differing sizes *and* quality.
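One rough way to put the three encodes above on a common scale is bits per pixel per frame. The sketch below uses the bitrates reported above; the frame sizes (854x480, 1280x720, 1920x1080) and the 24 fps frame rate are assumptions, since the post does not state them.

```python
def bits_per_pixel(bitrate_mbps: float, width: int, height: int,
                   fps: float = 24.0) -> float:
    """Average encoded bits spent on each pixel of each frame."""
    return bitrate_mbps * 1_000_000 / (width * height * fps)


# (bitrate in Mbps, assumed width, assumed height) for the encodes above.
ENCODES = {
    "480p RF9": (1.7, 854, 480),
    "720p RF21 Preset 4": (1.7, 1280, 720),
    "1080p RF21 Preset 5": (3.2, 1920, 1080),
}

for label, (mbps, width, height) in ENCODES.items():
    print(f"{label}: {bits_per_pixel(mbps, width, height):.3f} bits per pixel")
```

Under these assumptions the 720p encode gets roughly 20% more bits per pixel than the 1080p one (the ratio does not depend on the frame rate), which may account for part of the difference alongside the slower preset. It is a crude measure and says nothing about how the encoder distributes those bits between bright and dark regions.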
    Clearly, this means any encode must be a cycle of encode, change settings, encode, change settings...until you find the quality/size tradeoff you're looking for. I don't yet understand why targeting bitrate alone can't do the same. Complexity does indeed spike for some scenes, requiring higher bitrates; a constant quality metric would handle that, while an average bitrate over a short span may not. But isn't that just a matter of widening the min/max bitrate range and doing better analysis with a first pass? Or perhaps selectively turning on slower encoder features during complex scenes?
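The encode, change settings, encode cycle described above can at least be automated for the size half of the tradeoff. This is a minimal sketch, assuming bitrate only goes down as RF goes up; `measure_kbps` is a hypothetical callback that would run a (short sample) encode at the given RF and return its bitrate, and the 1..63 RF range is an assumed default. It does nothing to judge visual quality, which still needs eyes on the dark scenes.

```python
from typing import Callable


def find_rf_for_bitrate(measure_kbps: Callable[[int], float],
                        target_kbps: float,
                        rf_min: int = 1,
                        rf_max: int = 63) -> int:
    """Binary-search the lowest RF whose bitrate fits within target_kbps.

    Assumes bitrate never increases as RF increases. If even rf_max is
    over budget, rf_max is returned and the caller should check for that.
    """
    lo, hi = rf_min, rf_max
    while lo < hi:
        mid = (lo + hi) // 2
        if measure_kbps(mid) <= target_kbps:
            hi = mid          # fits the budget; try a lower RF (higher quality)
        else:
            lo = mid + 1      # too big; a higher RF is needed
    return lo


def fake_encode(rf: int) -> float:
    """Stand-in for a real encode: a made-up curve, not measured data."""
    return 6000 * 0.9 ** rf


print(find_rf_for_bitrate(fake_encode, target_kbps=1700))
```

With the default range this needs about six encodes instead of stepping through every RF value, which matters when each encode is slow.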

    
3. AV1 is a slower codec overall, but at the same speed as x264/x265 it will still give better results.

    This may in fact be the case if we don't include the dark scene problems. I have yet to see AV1 beat x264 on dark scenes and shadows without Preset 4 and RF mode, when x264 runs in ABR mode with a similar encode time. This is also comparing 10-bit AV1 against 8-bit x264. These dark regions are such a nuisance that they ruin what really is a far crisper video overall. It might be that AV1 without Preset 4 or slower is just not worth it; yet that is the opposite of what every source I've read says.

4. Constant Quality (RF) settings give better results than average bitrate...

     This has been the story of the day, so I won't belabor it here. It is true, but it is weird that it's true.

5. Mixed messaging on 10-bit.

    I've seen cases where 10-bit did not help with banding or blockiness at all. I've seen cases where it definitely seemed to help, but resulted in a somewhat low-detail, smooth region. It was less jarring than the banding, but not good. There is a limit to what can be done at low bitrates, though, so tradeoffs must occur. So far the penalty for 10-bit seems to be around 10% in encoder speed and insignificant in output size. I do believe I'll use 10-bit for everything.
