NB: Try to make all cuts at I-frame keyframes, if possible.
Links: 1) PiP Pt II 2) capture commands 3) settings
Editing video in Linux becomes a mental health issue after a decade or more of teeth grinding with Linux GUI video editors. There are basically two backends: ffmpeg and MLT. After a lost 10 years, some users like me resign themselves to command line editing with ffmpeg and melt (the MLT CLI editor).
This post deconstructs a simple PiP screencast, perhaps 6 minutes long. A small project like this exposes nearly all the Linux editing problems which appear in a production length film. This is the additional irony of Linux video editing -- having to become practically an expert just to do the simplest things; all or nothing.
At least five steps are involved, even for a 3.5 minute video.
- get the content together and laid out, an impromptu storyboard. What order do I want to provide information?
- verify the video inputs work
- present and screencapture - ffplay, ffmpeg CLI
- cut clips w/out render - ffmpeg CLI
- assemble clips with transitions - ffmpeg CLI
The command-line PiP video setup requires 3 terminals to be open, 1) for the PiP, 2) for the document cam, 3) for the screen capture. Each terminal has a command. 1) ffplay, 2) ffplay, 3) ffmpeg.
1. ffplay :: PiP (always on top)
The inset window of the host narrating is a PiP that should always be on top. Open a terminal and get this running first. The source is typically the built in webcam, trained on one's face.
$ ffplay -i /dev/video0 -alwaysontop -video_size 320x240
The window always seems to open at 640x480, but it can then be resized down to 160x120 and moved anywhere on the desktop. And then to dress it up with more brightness, some color sat, and a mirror flip...
$ ffplay -i /dev/video0 -vf eq=brightness=0.09:saturation=1.3,hflip -alwaysontop -video_size 320x240
2. ffplay :: document cam
I start this secondly, and make it nearly full sized, so I can use it interchangeably with any footage of the web browser.
$ ffplay -i /dev/video2 -video_size 640x480
3. ffmpeg :: screen and sound capture
Get your screen size with xrandr, eg 1366x768, then subtract the bottom 30 pixels (20 on some systems) to omit the toolbar. Since the toolbar is then outside the captured area, it can still be used during recording to switch windows. Syntax: put the 3 flags in this order:
-video_size 1366x738 -f x11grab -i :0
...else you'll probably get only a small left corner picture or errors. Then come all your typical bitrate and framerate options.
$ ffmpeg -video_size 1366x738 -f x11grab -i :0 -r 30 output.mp4
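The toolbar arithmetic and the flag ordering above can be kept in one place with a small helper. This is only a sketch: the function name and its defaults (30 pixel toolbar, 30 fps, display :0) are my own assumptions taken from the example, and it just builds the argument list rather than running anything.

```python
def x11grab_command(width, height, toolbar_px=30, fps=30,
                    display=":0", output="output.mp4"):
    """Build the screen-capture argument list, leaving out the toolbar strip."""
    capture_height = height - toolbar_px
    if capture_height <= 0:
        raise ValueError("toolbar is taller than the screen")
    return [
        "ffmpeg",
        # input options, in the order given above: size, format, then input
        "-video_size", f"{width}x{capture_height}",
        "-f", "x11grab",
        "-i", display,
        # output options
        "-r", str(fps),
        output,
    ]


cmd = x11grab_command(1366, 768)
print(" ".join(cmd))
```

Feeding it the 1366x768 screen from the example reproduces the command above; the list can be handed to subprocess.run as-is.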
This will encode a cleanly discernible screen at a cost of about 5M every 10 minutes. The native encoding is h264. If a person wanted to instead be "old-skool" with MPEG2 (codec:v mpeg2video), the price for the same quality is about 36 times larger: about 180M for the same 10 minutes. For MPEG2, we set a bitrate around 3M per second (b:v 3M), to capture similarly to h264 at 90K.
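The size figures follow from bitrate times duration. A rough check, assuming the bitrates quoted above are bits per second for the video stream alone (audio and container overhead are ignored, so the results only land in the same ballpark as the observed file sizes):

```python
def megabytes(bitrate_bits_per_s, seconds):
    """Approximate stream size in MB: bits/s * s, 8 bits per byte, 1e6 bytes per MB."""
    return bitrate_bits_per_s * seconds / 8 / 1_000_000


ten_minutes = 600
h264_mb = megabytes(90_000, ten_minutes)       # h264 at about 90K
mpeg2_mb = megabytes(3_000_000, ten_minutes)   # MPEG2 at b:v 3M
print(h264_mb, mpeg2_mb, round(mpeg2_mb / h264_mb, 1))
```

The ratio comes out in the low thirties, close to the "about 36 times" observed.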
Stopping the screen capture is CTRL-C. However: A) be certain CTRL-C is entered only once. The hard part is, it doesn't indicate any change for over a minute so a person is tempted to CTRL-C a second time. Don't do that (else untrunc). Click the mouse on the blinking terminal cursor to be sure the terminal is focused, and then CTRL-C one time. It could be a minute or two and the file size will continue to increase, but wait. B) Before closing the terminal, be certain ffmpeg has exited.
If you CTRL-C twice, or you close the terminal before ffmpeg exits, you're gonna get the dreaded "missing moov atom" error. 1) install untrunc, 2) make another file about as long as the first but which exits normally, and 3) run untrunc against it.
Explicitly setting the screencast bitrate (eg, b:v 1M b:a 192k) typically spawns fatal errors, so I only set the frame rate.
Adding sound...well you're stuck with PulseAudio if you installed Zoom, so just add -f pulse -ac 2 -i default...I've never been able to capture sound in a Zoom meeting however.
$ ffmpeg -video_size 1366x738 -f x11grab -i :0 -f pulse -ac 2 -i default -r 30 output.mp4
manage sound sources
If a person has a Zoom going and attempts to record it locally, without benefit of the Zoom app, they typically only hear sound from their own microphone. Users must switch to the sound source of Zoom itself to capture the conversation. This is the same with any VOIP, of course. This can create problems -- a person needs to make a choice.
Other people will say that old school audio will be 200mV (0.2V), p-p (peak-to-peak). Unless all these signals are changed to digital, gain needs to be set differently for each. One first needs to know the names of the devices. Note that one strange video tells more about computer mic input than I've seen anywhere.
Link: Cuts on keyframes :: immense amounts of information on cut and keyframe syntax
Ffmpeg can make non-destructive, non-rerendered cuts, but they may not occur on an I-frame (keyframe) unless seek syntax and additional flags are used. I first run $ ffprobe foo.mp4 or $ ffmpeg -i foo.mp4 on the source file to get its bitrate, frame rate, audio sampling rate, etc. Typical source video might be 310 kb/s h264(high), with 128 kb/s, stereo, 48000 Hz aac audio. Time permitting, one might also want to obtain the video's I-frame (keyframe) timestamps, and send them to a text file to reference during editing...
$ ffprobe -loglevel error -skip_frame nokey -select_streams v:0 -show_entries frame=pkt_pts_time -of csv=print_section=0 foo.mp4 >fooframesinfo.txt 2>&1
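Once the timestamps are in a text file, picking a cut point is a lookup: find the last keyframe at or before the time you want. A minimal sketch, assuming the file holds one timestamp in seconds per line (possibly with a trailing comma or blank lines); the function names are my own.

```python
from bisect import bisect_right


def parse_keyframe_times(text):
    """Turn ffprobe's csv output into a sorted list of floats."""
    times = []
    for line in text.splitlines():
        field = line.strip().split(",")[0]
        try:
            times.append(float(field))
        except ValueError:
            continue  # blank lines or stray messages in the file
    return sorted(times)


def keyframe_at_or_before(times, target):
    """Return the latest keyframe timestamp that is <= target."""
    i = bisect_right(times, target)
    if i == 0:
        raise ValueError("no keyframe at or before target")
    return times[i - 1]


# made-up sample data, standing in for fooframesinfo.txt
sample = "0.000000\n2.500000\n5.000000,\n\n7.500000\n"
times = parse_keyframe_times(sample)
print(keyframe_at_or_before(times, 6.2))
```

With real data, read the file with open("fooframesinfo.txt").read() and pass the result to -ss.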
- no recoding, save tail, delete leading 20 seconds. This method places seeking before the input, and it will go to the closest keyframe to 20 seconds.
$ ffmpeg -ss 0:20 -i foo.mp4 -c copy output.mp4
- no recoding, save beginning, delete trailing 20 seconds. In this case, the duration flag comes after the input. Suppose the example video is 4 minutes duration, but I want it to be 3:40 duration.
$ ffmpeg -i foo.mp4 -t 3:40 -c copy output.mp4
Do not forget "-c copy" or it will render. Obviously, some circumstances require frame-level precision, and then a person has little choice but to render.
$ ffmpeg -i foo.mp4 -t 3:40 -strict 2 output.mp4
This gives cleaner transitions.
- save an interior 25 second clip, beginning 3:00 minutes into a source video
$ ffmpeg -ss 3:00 -i foo.mp4 -t 25 -c copy output.mp4
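The three cuts above differ only in whether -ss goes before the input and whether -t follows it, so one helper can cover all of them. A sketch only: cut_command is a hypothetical name, and it builds the argument list without running ffmpeg.

```python
def cut_command(src, dst, start=None, duration=None):
    """Build a stream-copy cut: optional seek before -i, optional length after it."""
    cmd = ["ffmpeg"]
    if start is not None:
        cmd += ["-ss", str(start)]      # before -i: seek on the input
    cmd += ["-i", src]
    if duration is not None:
        cmd += ["-t", str(duration)]    # after -i: limit the output length
    cmd += ["-c", "copy", dst]          # no re-encode
    return cmd


drop_head = cut_command("foo.mp4", "output.mp4", start="0:20")
drop_tail = cut_command("foo.mp4", "output.mp4", duration="3:40")
interior = cut_command("foo.mp4", "output.mp4", start="3:00", duration=25)
print(" ".join(interior))
```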
...split-out audio and video
$ ffmpeg -i foo.mp4 -vn -ar 44100 -ac 2 sound.wav
$ ffmpeg -i foo.mp4 -c copy -an video.mp4
...recombine (requires an audio render) with mp3 for sound, raised slightly above neutral "256", for transcoding loss
$ ffmpeg -i video.mp4 -i sound.wav -acodec libmp3lame -ar 44100 -ab 192k -ac 2 -vol 330 -vcodec copy recombined.mp4
Ffmpeg doesn't allow for frame number cutting. If you set a time without recoding, it will rough cut to a number of seconds and a decimal. This works poorly for transitions. So what you'll have to do is recode it and enforce strict time limits, then multiply the time by the frame rate to get the number of frames. You can always bring the clip into Blender to see the exact number of frames. Even though Blender is backended with Python and ffmpeg, it somehow counts frames a la MLT.
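The time-to-frames conversion is just multiplication by the frame rate. A small sketch, assuming a constant frame rate (screen captures with a fixed -r qualify; variable frame rate sources do not); the function names are mine, and Fraction is used so rates like 30000/1001 don't pick up rounding error.

```python
from fractions import Fraction


def frame_to_seconds(frame_index, fps):
    """Timestamp, in seconds, at which the given frame starts."""
    return float(Fraction(frame_index) / Fraction(fps))


def seconds_to_frame(seconds, fps):
    """Index of the frame being shown at the given time (rounds down)."""
    return int(Fraction(str(seconds)) * Fraction(fps))


print(seconds_to_frame(25, 30))     # frames in a 25 second clip at 30 fps
print(frame_to_seconds(75, 30))     # where frame 75 starts
```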
other effects (+1 render)
Try to keep the number of renders as low as possible, since each is lossy.
fade in/out
...2 second fade-in. It's covered directly here, however, it requires the "fade" and "afade" filters which don't come standardly compiled in Arch, AND, it must re-render the video for this.
$ ffmpeg -i foo.mp4 -vf "fade=type=in:duration=2" -c:a copy output.mp4
For the fade-out, the start time must be given in seconds. Most recommend using ffprobe to get the duration, then entering a start time 2 seconds before the end. This video was 7:07.95, or 427.95 seconds. Here it is embedded with some other filters I was using for color balancing and de-interlacing.
$ ffmpeg -i foo.mp4 -max_muxing_queue_size 999 -vf "fade=type=out:st=426:d=2,bwdif=1,colorbalance=rs=-0.1,colorbalance=bm=-0.1" -an foofinal.mp4
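The fade-out start is the clip duration minus the fade length, which is easy to get wrong by hand when the duration is in minutes and seconds. A sketch with hypothetical helper names; it only builds the filter string. Note the command above rounds the start to 426, whereas this keeps the exact value.

```python
def to_seconds(timestamp):
    """Convert 'S', 'M:S' or 'H:M:S' (seconds may have decimals) to seconds."""
    seconds = 0.0
    for part in timestamp.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def fade_out_filter(duration_timestamp, fade_seconds=2):
    """Fade filter string that ends exactly at the end of the clip."""
    start = to_seconds(duration_timestamp) - fade_seconds
    return f"fade=type=out:st={start:.2f}:d={fade_seconds}"


print(fade_out_filter("7:07.95"))
```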
text labeling +1 render
A thorough video 2017,(18:35) exists on the process. Essentially a filter and a text file, but font files must be specified. If you install a font manager like gnome-tweaks, the virus called PulseAudio must be installed, so it's better to get a list of fonts from the command line
$ fc-list
...and from this pick the font you want in your video. The filter flag will include it.
-vf "[in]drawtext=fontfile=/usr/share/fonts/cantarell/Cantarell-Regular.otf:fontsize=40:fontcolor=white:x=100:y=100:enable='between(t,10,35)':text='this is cantarell'[out]"
... which you will want to drop into the regular command
$ ffmpeg -i foo.mp4 -vf "[stuff from above]" -c:v copy -c:a copy output.mp4
...however this cannot be done because streamcopying cannot be accomplished after a filter has been added -- the video must be re-encoded. Accordingly, you'll need to drop it into something like...
$ ffmpeg -i foo.mp4 -vf "[stuff from above]" output.mp4
Ffmpeg will copy most of the settings, but I do often specify the bit rate, since ffmpeg occasionally doubles it unnecessarily. This would just be "q:v "(variable), or "b:v "(constant). It's possible to also run multiple filters; put a comma between each filter statement.
$ ffmpeg -i foo.mp4 -vf "filter1,filter2" -c:a copy output.mp4
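When the filter list gets long, it can be easier to keep each filter as its own string and join them with commas at the end. A minimal sketch (chain_filters and vf_command are names I made up); it assumes the individual filter strings are already valid and does no escaping of its own.

```python
def chain_filters(*filters):
    """Join individual filter statements into one -vf value, skipping empties."""
    return ",".join(f for f in filters if f)


def vf_command(src, dst, *filters):
    """Re-encode video through the filter chain, stream-copy the audio."""
    return ["ffmpeg", "-i", src, "-vf", chain_filters(*filters),
            "-c:a", "copy", dst]


cmd = vf_command("foo.mp4", "output.mp4",
                 "eq=saturation=1.5", "fade=type=in:duration=2")
print(cmd[4])
```

Passing the command as a list (for example to subprocess.run) also sidesteps the shell quoting around the filter string.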
saturation
This great video (1:08), 2020, describes color saturation.
$ ffmpeg -i foo.mp4 -vf "eq=saturation=1.5" -c:a copy output.mp4
1. slow entire, or either end of clip (+1 render)
The same video shows slow motion. Note that setpts only slows the video stream; the audio is left at its original speed.
$ ffmpeg -i foo.mp4 -filter:v "setpts=2.0*PTS" -c:a copy output.mp4
OR
$ ffmpeg -i foo.mp4 -vf "setpts=2.0*PTS" output.mp4
Sometimes the bitrate is too low on recode. Eg, ffmpeg is likely to choose around 2,000Kb if the user doesn't specify a bitrate. Yet if there's water in the video, it will likely appear jerky below a 5,000Kb bitrate...
$ ffmpeg -i foo.mp4 -vf "setpts=2.0*PTS" -b:v 5M output.mp4
2. slowing a portion inside a clip (+2 render)
Complicated. If we want to slow a 2 second portion of a 3 minute normal-speed clip, but those two seconds are not at either end of the clip, then ffmpeg must slice-out the portion, slow the portion (+1 render), then concatenate the pieces again (+1 render). Also, since the single clip temporarily becomes more than one clip, a filter statement with a labeling scheme is required. It's covered here. It can be covered in a single command, but it's a big one.
Suppose we slow-mo a section from 10 through 12 seconds in this clip. Slowing those 2 seconds to half speed adds 2 seconds to the output video.
$ ffmpeg -i foo.mp4 -filter_complex "[0:v]trim=0:10,setpts=PTS-STARTPTS[v1];[0:v]trim=10:12,setpts=2*(PTS-STARTPTS)[v2];[0:v]trim=12,setpts=PTS-STARTPTS[v3];[v1][v2][v3] concat=n=3:v=1" output.mp4
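Since the filter string repeats the same two timestamps in several places, generating it avoids mismatches. A sketch under the same layout as the command above (three trimmed pieces, video only); the function names are hypothetical, and it assumes the slow section starts after 0 and ends before the clip does.

```python
def slowmo_filter(start, end, factor=2):
    """filter_complex string: normal head, slowed middle, normal tail, then concat."""
    return (
        f"[0:v]trim=0:{start},setpts=PTS-STARTPTS[v1];"
        f"[0:v]trim={start}:{end},setpts={factor}*(PTS-STARTPTS)[v2];"
        f"[0:v]trim={end},setpts=PTS-STARTPTS[v3];"
        "[v1][v2][v3] concat=n=3:v=1"
    )


def slowed_duration(total, start, end, factor=2):
    """Length of the output: the slowed section grows by (factor - 1) times its length."""
    return total + (end - start) * (factor - 1)


print(slowmo_filter(10, 12))
print(slowed_duration(180, 10, 12))   # a 3 minute clip
```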
supporting documents
Because of the large number of command flags and commands necessary for even a short edit, we can benefit from making a text file holding all the commands for the edit, or all the text we are going to add to the screen, or the script for TTS we are going to add, and a list of sounds, etc. With these documents we end up sort of storyboarding our text. Finally, we might want to automate the edit with a Python file that runs through all of our commands and calls to TTS and labels.
basic concatenation txt
Without filters, file lists (~17 into video) are the way to do this with jump cuts.
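A file list for ffmpeg's concat demuxer is one "file 'name'" line per clip, in playing order. A sketch that produces the list text and the matching command; the function names and clip names are placeholders, it assumes file names contain no single quotes, and stream copy only works if the clips share the same codec and parameters.

```python
def concat_list(paths):
    """Text of a concat list file: one quoted file entry per clip, in order."""
    return "".join(f"file '{p}'\n" for p in paths)


def concat_command(list_file, dst):
    """Join the listed clips without re-encoding."""
    return ["ffmpeg", "-f", "concat", "-safe", "0",
            "-i", list_file, "-c", "copy", dst]


text = concat_list(["clip1.mp4", "clip2.mp4", "clip3.mp4"])
print(text, end="")
print(" ".join(concat_command("clips.txt", "joined.mp4")))
```

Write the text to clips.txt (or whatever name you like) before running the command.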
python automation
Python ffmpeg scripts are a large topic requiring a separate post; just a few notes here. A relatively basic video 2015,(2:48) describes Python basics inside text editors. The IDE discussion can be lengthy also, and one might want to watch this 2020,(14:06), although if you want to avoid running a server (typically Anaconda), you might want to run a simpler IDE (Eric, IDLE), PyCharm, or even avoid IDEs 2019,(6:50). Automating ffmpeg commands with Python doesn't require Jupyter since the operations just occur on one's desktop OS, not inside a browser.
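As a starting point, an edit can be a plain list of ffmpeg argument lists that a script walks through in order. This is a sketch, not a full tool: run_all is a made-up name, the file names are placeholders, and by default it only prints what it would run (dry run) so the list can be checked before anything is rendered.

```python
import shlex
import subprocess


def run_all(commands, dry_run=True):
    """Show each command; if dry_run is False, also run it and stop on failure."""
    shown = []
    for cmd in commands:
        shown.append(shlex.join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code
            subprocess.run(cmd, check=True)
    return shown


edit = [
    ["ffmpeg", "-ss", "0:20", "-i", "foo.mp4", "-c", "copy", "clip1.mp4"],
    ["ffmpeg", "-i", "clip1.mp4", "-vf", "eq=saturation=1.5",
     "-c:a", "copy", "final.mp4"],
]
for line in run_all(edit):
    print(line)
```

Call run_all(edit, dry_run=False) once the printed commands look right.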
considerations
We want to have a small screen of us talking about a larger document or some such and not just during recording
- we want the small screen PiP to always be on top :: use -alwaysontop flag
- we'd like to be able to move it
- we'd like to make it smaller than 320x240
link: ffplay :: more settings
small screen
$ ffplay -f video4linux2 -i /dev/video0 -video_size 320x240
...or, to keep it always on top
$ ffplay -i /dev/video0 -alwaysontop -video_size 320x240
commands
The CLI commands run long. This is because ffmpeg defaults run high. Without limitations inside the commands, ffmpeg pulls 60fps, h264(high), at something like 127K bitrate. Insanely huge files. For a screencast, we're just fine with
- 30fps
- h264(medium)
- 1K bitrate
flag | note
b:v | 4Kb. If movement in the PiP is too much, up this.
f | x11grab. Must be followed immediately by a second option, "i", and eg "desktop"; this will also bring the h264 codec.
framerate | 30. Some would drop it to 25, but I keep with YouTube customs even when making these things. Production level would be 60fps.
b:v | 1M. If movement in the PiP is too much, up this.
Skype | 1-1, MicroSoft data collection for the US Govt
video4linux2
This is indispensable for playing one's webcam on the desktop, but it tends to default to the highest possible bitrates (14,000Kbs), and to a 640x480 window size, though the latter is resizable. The thing is, it's unclear whether this is due to the video4linux2 settings, or to the ffplay which uses it. So is there a solid configuration file to reset these? This site does show a file to do this.
You might want to run a series of commands. The key issue is figuring out the chaining. Do you want to start 3 programs at once, one after the other, one after the other as each one finishes, or one after the other with the output of the prior program as the input for the next?
Bash Scripting (59:11) Derek Banas, 2016. Full tutorial on Bash scripting.
Linking commands in a script (Website) Ways to link commands.
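The chaining choices map onto a few patterns: start everything at once, run in order and stop on the first failure, or pipe one program into the next. Here is a sketch of those three in Python's subprocess module, with the Bash equivalent in each docstring. The helper names are mine, and tiny Python one-liners stand in for the real ffplay/ffmpeg commands so it runs anywhere.

```python
import subprocess
import sys


def py(code):
    """Stand-in command: run a one-line Python snippet as a separate process."""
    return [sys.executable, "-c", code]


def start_together(commands):
    """Like `a & b & c` then `wait`: launch everything, then collect exit codes."""
    procs = [subprocess.Popen(c) for c in commands]
    return [p.wait() for p in procs]


def run_until_failure(commands):
    """Like `a && b && c`: run in order, stop at the first non-zero exit code."""
    codes = []
    for c in commands:
        codes.append(subprocess.run(c).returncode)
        if codes[-1] != 0:
            break
    return codes


def pipe(first, second):
    """Like `a | b`: feed the first command's stdout to the second's stdin."""
    p1 = subprocess.Popen(first, stdout=subprocess.PIPE)
    p2 = subprocess.Popen(second, stdin=p1.stdout, stdout=subprocess.PIPE)
    p1.stdout.close()  # so p1 notices if p2 exits early
    out = p2.communicate()[0]
    p1.wait()
    return out


print(start_together([py("pass"), py("pass")]))
print(run_until_failure([py("import sys; sys.exit(1)"), py("pass")]))
print(pipe(py("print('abc')"),
           py("import sys; print(sys.stdin.read().upper(), end='')")))
```

For the PiP setup, the two ffplay windows and the ffmpeg capture would be the "start together" case.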
$ nano pauseandtalk.sh (don't need sh, btw)
#!/bin/bash
There are several types of scripts. You might want a file that sequentially runs a series of ffmpeg commands, or you might want to just have a list of files for ffmpeg to look at to do a concatenation, etc.
Sample Video Editing Workflow using FFmpeg (19:33) Rick Makes, 2019. Covers de-interlacing to get rid of lines, cropping, and so on.
Video Editing Comparison: Final Cut Pro vs. FFmpeg (4:44) Rick Makes, 2019. Compares editing on the two interfaces, using scripts for FFmpeg
Text-to-speech has been covered in another post; however, there are commonly times when a person wants to talk over some silent video. For this, $ yay -S audio-recorder. The open question is how to pause the video and speak at a point, and still be able to concatenate.
inputs
If you've got a desktop with HDMI output, a 3.5mm hands-free mic won't go into the video card; use the RED 3.5mm mic input, then filter out the 60 Hz hum. There are ideal mics with phantom power supplies, but even a decent USB mic is $50.
For syncing, you're going to want to have your audio editor running and Xplayer running on the same desktop. This is because it's easier to edit the audio than the video; there's no rendering to edit audio.
Using only Free Software (12:42) Chris Titus Tech, 2020. Plenty of good audio information, including Auphonic starting at 4:20, mics (don't use the Yeti - 10:42), and how to sync (9:40) at .4 speed.
Best for less than $50 (9:52) GearedInc, 2019. FifinePNP, Blue Snowball. Points out that once we get to $60, it's an "XLR" situation with preamps and so forth to mitigate background noise.
Top 5 Mics under $50 (7:41) Obey Jc, 2020. Neewer NW-7000.
find the microphone - 3.5mm
Suppose we know we're using card 0
$ amixer -c0
$ arecord -l
These give us plenty of information. However, it's still likely in an HDMI setup to hit the following problem
$ arecord -Ddefault test-mic.wav
ALSA lib pcm_dsnoop.c:641:(snd_pcm_dsnoop_open) unable to open slave
arecord: main:830: audio open error: No such file or directory
This means there is no "default" configured in ~/.asoundrc. There would be other errors too, if the parameters are not specified. The minimum command specifies the card, coding, number of channels, and rate.
$ arecord -D hw:0,0 -f S16_LE -c 2 -r 44100 test-mic.wav
subtitles/captions