I have one-second H.265-encoded videos at 30 fps coming into a server for processing. The server needs to decode each video losslessly into individual frames. The videos arrive very quickly, so performance is of the utmost importance. The server has an H.265-capable Nvidia GPU, and I have built ffmpeg with support for CUDA. The following is the configuration output from ffmpeg:
ffmpeg version N-100479-gd67c6c7f6f Copyright (c) 2000-2020 the FFmpeg developers
built with gcc 8 (Ubuntu 8.4.0-3ubuntu2)
configuration: --enable-nonfree --enable-cuda-nvcc --enable-nvenc --enable-opencl --enable-shared --enable-pthreads --enable-version3 --enable-avresample --enable-ffplay --enable-gnutls --enable-gpl --disable-libaom --disable-libbluray --disable-libdav1d --disable-libmp3lame --enable-libopus --disable-librav1e --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --disable-libopenjpeg --enable-librtmp --enable-libspeex --enable-libsoxr --disable-videotoolbox --disable-libjack --disable-indev=jack --extra-cflags=-I/usr/local/cuda/include --extra-ldflags=-L/usr/local/cuda/lib64
I decode the videos into PNGs, and am using the following command:
ffmpeg -y -vsync 0 -hwaccel cuvid -hwaccel_output_format cuda -hwaccel_device 0 -c:v hevc_cuvid \
    -i 0.mp4 -vf hwdownload,format=nv12 -q:v 1 -qmin 1 -qmax 1 -start_number 0 f%d.png
This command successfully uses hardware acceleration for the H.265 decode, but the PNG encode is still done on the CPU.
Does CUDA support encoding lossless images? The format does not need to be PNG, but it does need to be lossless. CUDA has the nvJPEG library, but JPEG is a lossy format. Is there a similar image encoding library in CUDA for a lossless format (that is also integrated with ffmpeg)?
Edit: Some more context.
I am currently using PNGs because they compress well. The images are 2560x1280. On one hand, it is this compression that costs the CPU cycles. On the other hand, I am also limited by how quickly (and how much aggregate data) I can upload these frames to the upstream consumer. So it's basically a tradeoff between:
- We want to extract these frames as quickly as possible.
- We want to keep the image size as small as possible.
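
To put rough numbers on that tradeoff, here is a back-of-the-envelope calculation of the uncompressed data per frame and per one-second clip. It assumes 8-bit samples and the nv12 layout from my command (12 bits per pixel), with rgb24 shown only for comparison; the variable names are just for illustration.

```shell
#!/bin/sh
# Back-of-the-envelope uncompressed sizes for the frames described above.
# Assumption: 8 bits per sample; variable names are illustrative only.
width=2560
height=1280
fps=30          # one-second clip at 30 fps -> 30 frames per clip

pixels=$(( width * height ))

# nv12 is 4:2:0 with 12 bits per pixel: a full-size luma plane
# plus a half-size interleaved chroma plane.
nv12_frame_bytes=$(( pixels * 3 / 2 ))

# rgb24 stores 3 bytes per pixel.
rgb24_frame_bytes=$(( pixels * 3 ))

nv12_clip_bytes=$(( nv12_frame_bytes * fps ))
rgb24_clip_bytes=$(( rgb24_frame_bytes * fps ))

echo "nv12:  $nv12_frame_bytes bytes/frame, $nv12_clip_bytes bytes/clip"
echo "rgb24: $rgb24_frame_bytes bytes/frame, $rgb24_clip_bytes bytes/clip"
```

That works out to roughly 147 MB (nv12) or 295 MB (rgb24) of raw data per one-second clip, which is the baseline any lossless format would have to beat on the upload side.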