Saturday, March 9, 2013

[solved] incorrect duration converting WAVs to MP3s

problem

A user creates a 3:15 screencast, "smite.mp4", and extracts the WAV soundtrack to optimize or edit it. Afterward, suppose they want to convert the WAV to a space-saving MP3 before recombining it with the video. In this case, imagine a simple 192k constant bitrate (CBR) is OK and no Variable Bit Rate (VBR) audio is required. They may also want to keep the separate audio file and slow it down, speed it up, or change its quality. These things can typically be done with lame and/or sox.

We do our conversion, say with:

$ ffmpeg -i smite.wav -ab 192k -af "volume=1.1" smite.mp3

Note: The volume was tweaked because it's sometimes decreased in conversion. Great resource.

but the duration is incorrect

Now we run a test of the MP3 file in some player, perhaps Audacious. On occasion, we'll find that the duration stamp is corrupted, perhaps appearing as 28:15. Typically, the slider can't move in a corrupted timeline either.
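
The duration stamp can also be checked without opening a player. A minimal sketch, assuming ffprobe (which ships with ffmpeg) is installed; the helper names mp3_duration and fmt_mmss are made up for this example:

```shell
# Print the container duration of a media file in seconds (needs ffprobe).
mp3_duration() {
    ffprobe -v error -show_entries format=duration \
        -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Turn seconds (possibly fractional, e.g. 195.026) into m:ss.
fmt_mmss() {
    secs=${1%.*}                      # drop any fractional part
    printf '%d:%02d\n' $((secs / 60)) $((secs % 60))
}

# Usage: fmt_mmss "$(mp3_duration smite.mp3)"
```

A healthy copy of the 3:15 screencast audio should come out as 3:15, while a corrupted stamp like the one described above would come out as something like 28:15.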

Incorrect duration stamps in media files are an occasional problem. The original media file might already have one. It happens more often when a person speeds up or slows down the audio, which we can do anywhere between half speed (0.5) and double speed (2.0) with the atempo filter.

$ ffmpeg -i smite.mp4 -af "atempo=0.85,volume=1.1" -vn -ar 44100 -ac 2 smiteslower.wav
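
Because a tempo change alters the playing time, it helps to know what duration to expect before deciding the stamp is wrong. A small sketch of the arithmetic (the function name is hypothetical): the new length is the original length divided by the tempo factor.

```shell
# Expected length in seconds after a tempo change: original / factor.
expected_duration() {
    awk -v secs="$1" -v factor="$2" 'BEGIN { printf "%.2f\n", secs / factor }'
}

# A 3:15 (195 s) clip slowed with atempo=0.85:
expected_duration 195 0.85
```

For the 0.85 example this gives roughly 229 seconds, or about 3:49, so a slowed file showing that duration is fine.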

the reason

Ffmpeg and avconv use the bitrate setting as part of their duration calculations. Both will calculate the duration correctly if we don't specify a bitrate. Unfortunately, in that case they fall back to their default bitrate, which for both happens to be a low-quality 128 kb/s. So how do we get the 192 kb/s bitrate we want in the example above and still obtain a correct duration stamp on the resulting MP3?
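
For a constant-bitrate file, a duration can be estimated from the file size and the bitrate, which shows how a wrong bitrate assumption turns into a wrong duration. This is only an illustration of the arithmetic, not a description of what ffmpeg or avconv do internally, and the function name is made up:

```shell
# Estimated CBR duration in whole seconds: bytes * 8 / (kbit/s * 1000).
estimate_duration() {
    echo $(( $1 * 8 / ($2 * 1000) ))
}

# A 3:15 clip at 192 kb/s is about 4,680,000 bytes.
estimate_duration 4680000 192   # bitrate assumed correctly
estimate_duration 4680000 128   # same file, bitrate assumed to be 128 kb/s
```

The first call gives 195 seconds (3:15); the second gives 292 seconds for the very same file, simply because the assumed bitrate is off.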

solution

Install lame. For example, to get 192 kb/s with a correct duration stamp, and with the volume scaled a bit above 100%, we could use:

$ lame --scale 1.2 -b 192 smite.wav

I can change the bitrate to whatever I want, or even switch to VBR, and the resulting duration stamp is accurate. As for the volume, if I wanted to double it, I would use a scale of "2", and so on. Finally, the output file, in this case smite.mp3, is named automatically after the input WAV file. Alternatively, one can force an output name by giving it as a second file argument. Now, when we recombine our audio with our video, they will be properly synced, since the timestamps are correct.
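
When there are several WAV files, the same lame call can be looped with the output name passed explicitly. A minimal sketch, assuming lame is installed and the files sit in the current directory; mp3_name and encode_all are names invented for this example:

```shell
# Derive the MP3 name from a WAV name: smite.wav -> smite.mp3
mp3_name() {
    printf '%s\n' "${1%.wav}.mp3"
}

# Encode every WAV in the current directory at 192 kb/s CBR, scale 1.2,
# passing the output name explicitly as the last argument.
encode_all() {
    for wav in *.wav; do
        [ -e "$wav" ] || continue      # no WAV files: nothing to do
        lame --scale 1.2 -b 192 "$wav" "$(mp3_name "$wav")"
    done
}

# Usage: encode_all
```

For VBR, the -b 192 can be swapped for a quality setting such as -V 2.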

solution going mp3 to wav

$ lame --decode file.mp3 output.wav

memory issues

Sometimes there's not enough /tmp space to process a large media file. You'd imagine the solution is to increase the size of /tmp beyond the GBs of installed RAM, so that the system overflows into swap. This will not work, because systems now have both /tmp and tmpfs, and tmpfs is what is actually being used: /tmp is mounted as a tmpfs filesystem held in RAM. Its default size is half of the installed RAM. Systems put tmpfs in RAM to save on resources; it makes the system more efficient. However, when dealing with a large media file, I modify /etc/fstab as below, and then reboot.

# nano /etc/fstab
tmpfs /tmp tmpfs rw,nodev,nosuid,size=10G 0 0

When the media work is complete, I comment out that fstab line and reboot again.
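
Before editing fstab, it can help to confirm that /tmp really is tmpfs and how large it currently is. A sketch, assuming a Linux system with GNU df; show_tmp and default_tmpfs_kb are made-up helper names, and the remount line is an alternative that should resize an existing tmpfs mount for the current session only, so it is left commented out here:

```shell
# Show the filesystem type, size and usage of /tmp.
show_tmp() {
    df -hT /tmp
}

# Default tmpfs size under the half-of-RAM rule, given MemTotal in kB.
default_tmpfs_kb() {
    echo $(( $1 / 2 ))
}

# Usage:
#   show_tmp
#   default_tmpfs_kb "$(awk '/MemTotal/ {print $2}' /proc/meminfo)"

# Resize for the current session only (run as root, no fstab edit):
# mount -o remount,size=10G /tmp
```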

slowing tempo

links: sox cheat sheet

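Slowing the tempo can be done with ffmpeg, lame or sox. Lame only takes WAV and MP3 files as inputs. Sox can read and manipulate almost any file type; when it cannot infer the type from the file extension, the type needs to be specified with the -t flag. Its output format follows the extension of the output file name.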

In this example, I take an input OPUS file, slow it to 80% of its original speed, and boost the volume 10%, while converting to MP3 at a bitrate of 320K. The syntax is counter-intuitive, IMO. E.g., there's no hyphen before "speed".

$ sox -v 1.1 foo.opus -C 320 foo.mp3 speed 0.8

If the result clips in a few places due to the increased bass of slower speed, the output can be equalized to decrease the bass.
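
As a sketch of that equalization step, sox's bass effect can be appended after speed in the same command. The -3 dB cut is an arbitrary starting value, not a tested setting, and the function name is made up:

```shell
# Slow to 80%, boost input volume 10%, cut bass by 3 dB, write 320 kb/s MP3.
slow_and_tame() {
    sox -v 1.1 "$1" -C 320 "$2" speed 0.8 bass -3
}

# Usage: slow_and_tame foo.opus foo.mp3
```

Note that speed lowers the pitch along with the tempo. Sox also has a tempo effect (for example, tempo 0.8 in place of speed 0.8) that slows playback while keeping the pitch, which may avoid the extra bass in the first place.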

video note

When converting the video of a screencast, the only way I've found to get the proper duration is to be sure to use the switch:
-target ntsc-dvd
