Tuesday, October 26, 2021

slowing video and audio

If possible, concatenate the entire video project into a single MP4, and then do a final pass for speed and color changes.

concatenation method

To combine clips by simple concatenation, with hard cuts, a list of clips in a TXT file and the "concat" demuxer will do the trick without rendering.

$ ffmpeg -f concat -safe 0 -i mylist.txt -c copy output.mp4

Put the text file in the directory with the clips or use absolute paths to them. The file is simple; each video is one line with the syntax:

$ nano mylist.txt
# comment line(s)
file video1.mp4
file '/home/foo/some-other-video.mp4'

Obviously, transitions which fade or dissolve require rendering; either way, starting with a synced final video, with a consistent clock (tbr), makes everything after easier.

concat codec

If using ffmpeg, then mpeg2video is the fastest encoder, but it also creates the largest files. Bitrate is the number one determiner of file size: what libx264 can do at 3M takes 5M in mpeg2video. Videos with a mostly static image, like a logo, may only require 165K video encoding. One 165K example with 125K audio came to 6.5MB for about 5 minutes.

biggest problems

If I'm going to change the speed of the video, and I only need the audio to stay synced, without a change in pitch, then it's trivial. But if I want the audio to drop or rise in pitch and still stay synced, the video typically must be split apart and the audio processed in, e.g., sox. That said...

  • video artifacts: sometimes revealed -- interlace lines or tearing. The ffmpeg yadif filter helps.
  • playback: will the recoding still play back on the original device? For ffmpeg, mpeg2video is the most reliable.
  • complexity and time investment: speed ramps, portions, complex filters -- all difficult with Linux software.
  • audio pitch: easiest is to change speed without changing the audio's sound. Pitch is more complicated. sox changes pitch when slowing; asetpts and atempo do not. Which will you need?

verify the streams in containers

We need this step to sort out which streams the nav packets, video, and audio reside on. In Linux, ffmpeg is nearly unavoidable, so I assume its use. Soliciting stream information with ffmpeg -i returns two numbers: the input file number and the stream number. For example, "0:0" means first file, first stream; "0:1" means first file, second stream; and so forth. MP4 files can contain multiple streams. Examining an AVI will only turn up a single audio and video stream per file.

deinterlace and tear solutions

Deinterlacing addresses the horizontal-lines problem; tearing is the displaced chunks. There are extremely good, but extremely slow, deinterlace filters; for me, the low-CPU yadif filter is typically fine. For tearing, I increase the bitrate to at least 4M. Here's a way to do both. For simplicity, audio is not shown here.

$ ffmpeg -i input.mp4 -f dvd -vf "setpts=1.8*PTS,yadif" -b:v 4M -an sloweroutput.mp4

Although this uses multiple filters, 'filter_complex' isn't needed because there's only one input.

single command slowing

The decision point here is about audio pitch -- do you want the pitch to change when changing the speed? If no pitch change, either asetpts or atempo works fine. I like atempo. If a recode is needed, the basic mpeg2video codec is the fastest, lightest lib going.

  • pitch change: So far, I can't do it. To get it right with a pitch change, I have to split out the audio and video, use sox on the audio, and recombine. I can never get the ffmpeg filter "atempo" to accomplish the pitch change. E.g., I slow the video to half speed using "setpts=2*PTS", then attempt to drop the audio pitch in half with "-af atempo=0.5". It processes without errors, and syncs, but with zero pitch change.
  • no pitch change: "asetpts" or "atempo". Either will adapt the audio to the new speed, sometimes with strange effects, but the pitch will stay whatever it was. The timebase and so on will still be in perfect sync with the video. An example follows.
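For the no-pitch-change case, a one-command sketch (hypothetical filenames; mpeg2video per the codec notes above) that halves the speed while atempo holds the pitch:

$ # slow video 2x; atempo matches the audio tempo without changing pitch
$ ffmpeg -i input.mp4 -vf "setpts=2*PTS" -af "atempo=0.5" -b:v 4M -vcodec mpeg2video halfspeed.mp4

Note that a single atempo instance traditionally accepts 0.5-2.0; chain instances (e.g., "atempo=0.5,atempo=0.5") for larger changes.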

pitch change with sox (audio only)

Sox's default is to change pitch when slowing or accelerating. Speeding up audio is easier to match with video, since slowing often lands on a repeating decimal (e.g., 1/1.2 = 0.8333...). Sox can do four decimal places of precision(!)

17% slowing, w/pitch change.

$ ffmpeg -i input.mp4 -vf "setpts=1.2*PTS,yadif" -b:v 5M -vcodec mpeg2video -an slowedoutput.mp4
$ sox input.wav slowedoutput.wav speed 0.8333
$ ffmpeg -i slowedoutput.mp4 -i slowedoutput.wav -vcodec copy recombined.mp4

30% slowing and pitch change.

$ ffmpeg -i input.mp4 -vf "setpts=1.5*PTS,yadif" -b:v 4M -vcodec mpeg2video -an slowedoutput.mp4
$ sox input.wav slowedoutput.wav speed 0.667
$ ffmpeg -i slowedoutput.mp4 -i slowedoutput.wav -vcodec copy recombined.mp4

An output made 25% faster (0.8×PTS, i.e., sox speed 1.25), with pitch change, that syncs perfectly.

$ ffmpeg -i input.mp4 -vf "setpts=0.8*PTS" -b:v 4M -vcodec mpeg2video -an fasteroutput.mp4
$ sox input.wav fasteroutput.wav speed 1.25
$ ffmpeg -i fasteroutput.mp4 -i fasteroutput.wav -vcodec copy recombined.mp4

bitrate C flag

Sox defaults to 128 kbps for MP3 output, so we need the "-C" flag. E.g., to get 320 kbps and slow to 92%...

$ sox foo.wav -C 320 foo92.mp3 speed 0.92

pulling from DVD

Be sure to install libdvdcss first.

Single title, "VTS_01_2.VOB":

$ vobcopy -i /run/media/foo/SOMETITLE -O VTS_01_2.VOB -o /home/foo/SOMEFOLDER

Entire disc (vobcopy's mirror mode):

$ vobcopy -m -i /run/media/foo/SOMETITLE -o /home/foo/SOMEFOLDER

problems

On occasion, when you back up a DVD, you no longer need the nav stream, and you'll have extra work on filters if you leave it in.
$ ffmpeg -i input.mp4
Input #0, mpeg, from 'input.mp4':
 Duration: etc
 Stream #0:0[0x1e0]: Video: mpeg2video (Main), yuv420p(tv, etc) 
 Stream #0:1[0x80]: Audio: ac3, 48000 Hz, stereo, fltp, 192 kb/s
 Stream #0:2[0x1bf]: Data: dvd_nav_packet

Keep only the video, deleting the audio and data streams:

ffmpeg -i input.mp4 -map 0:v -c copy output.mp4
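To keep the audio and drop only the nav data, map both kept streams explicitly:

$ ffmpeg -i input.mp4 -map 0:v -map 0:a -c copy output.mp4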

Thursday, October 21, 2021

local machine (localhost) LAPP, email

Both of these must be entered for the monitor to remain on during inactivity. Neither command by itself is enough to keep the monitor alive.

$ xset -dpms
$ xset s noblank

With these two entered, the screen will still be receiving a signal, but it's just for the backlight, not for any display content. If we want the display content to remain during inactivity, we must do the two above AND add the following.

$ xset s off

Do we even need this project? Re the LAPP: "no, but helpful". Re the email: probably "yes, for understanding dashboard alerts". LAPP or any other monolithic CMS stack (XAMPP, LAMP) that requires learning PHP might be a waste if we can chain cloud services and so on (e.g., read the comments under this video).

Since LAPP elements are servers, they typically involve user switching, individual configuration, and other permission issues. A separate post will deal with production configuration like that. I wrote this one aiming for a light localhost development environment (besides Docker). Additionally, I've attempted to view each LAPP element independently, in case we learn of an app that requires only one element, eg the PHP server, or just a relational database. I also subbed out the Apache "A" for another "L", lighttpd: LLPP. More commonly though, even browser-based apps (eg Nextcloud - go to about 8:30) still use a stack: LAMP, LAPP, LNPP, etc. Electron, Go, and Docker won't be covered here.

LAPP

Linux (L)

For both LAPP and email, verify /etc/hosts is properly configured with IPv4 and IPv6 entries for localhost smoothness. Otherwise, we typically already log in as some user, so we shouldn't have permission issues if everything else is well configured.
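A minimal /etc/hosts sketch covering both protocols:

127.0.0.1   localhost
::1         localhost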

PostgreSQL (P)

# pacman -S postgresql
# pacman -Rsn postgresql

Lunatic designers really really really don't want a person running this on their own system under their own username. There's no security advantage to this, just poor design. This is almost more difficult than the design of the database itself.

After installing with pacman, add one's username to the postgres group in /etc/group, then initialize the database. Use...

$ initdb -D /home/foo/postgres/data

...instead of the default:

$ initdb -D /var/lib/postgres/data

Attempt startup.

$ postgres -D '/home/foo/postgres/data'

If starting-up gives the following error...

FATAL: could not create lock file "/run/postgresql/.s.PGSQL.5432.lock": No such file or directory

...run the obvious

# mkdir /run/postgresql
# chown foo:foo /run/postgresql

Just do the above, don't believe anything about editing service files or the postgresql.conf file in /data. None of it works. Don't even try to run it as a daemon or service. Just start it like this:

$ postgres -D '/home/foo/postgres/data'

Now that the server is up and running, you can open another terminal and run psql commands. If you run as user postgres (see below), you'll have admin authority.

$ psql postgres

If a lot of data has been written, a good exit phrase to make sure all is safely written:

$ psql -d postgres -c CHECKPOINT && pg_ctl stop -D /home/foo/postgres/data -m fast

PHP (P)

Link: CLI flags ::

# pacman -S php php-pgsql
# pacman -Rsn php php-pgsql

PHP server setup (8:04) Dani Crossing, 2015. Old, but not outdated.
php.ini basics (7:25) Program With Gio, 2021.

If we've got a dynamic website, we typically need PHP (server) and then some JavaScript (browser/client) to help display it. The less JavaScript the better (it slows the browser). I consulted the Arch PHP wiki.

standalone PHP - start/stop

Standalone PHP is easier to troubleshoot than starting Apache simultaneously. Stopping is CTRL-C in the terminal where it was started. Starting on the web-typical TCP port 80, however, leads to a permission issue.

$ php -S localhost:80
[Fri Oct 23 11:20:04 2020] Failed to listen on localhost:80 (reason: Permission denied)

Port 80 works when used with an HTTP server (Apache, NGINX, etc), but not standalone PHP: ports below 1024 are privileged and need root. So, use any other port, eg port 8000, and it works.

$ php -S localhost:8000
[Fri Oct 23 11:20:04 2020] PHP 8.0.12 Development Server (http://127.0.0.1:8000) started

See this to determine when best to use index.html, or index.php. But there are at least 3 ways for PHP to locate the index.html or index.php file:

  1. before starting the server, CD to the folder where the servable files are located
  2. specify the servable directory in the startup command
    $ php -S 127.0.0.1:8000 -t /home/foo/HTML
    $ php -S localhost:8000 -t ~/HTML
  3. edit /etc/php/php.ini to indicate file locations...
    # nano /etc/php/php.ini
    doc_root = "/home/foo/HTML"
    ... however this strategy can lead to mental illness and/or lost weekends: sometimes the ini file will not be parsed. Good luck.

First, take a breath. Then verify which ini file is being used.

$ php -i |grep php\.ini
Configuration File (php.ini) Path => /etc/php
Loaded Configuration File => /etc/php/php.ini

If you have modified the correct ini file and a setting still isn't honored, hours and hours of hunting for the correct syntax may follow.

standalone PHP - configuration and files

Links: PHP webserver documentation :: Arch PHP configuration subsection

PHP helpfully allows us to configure a document root, so I can keep all HTML files (including index.htm) inside my home directory. The open_basedir variable inside the configuration file (/etc/php/php.ini) is like a PATH for PHP to find files. Also, when pacman installs PHP-dependent programs like phpwebadmin or nextcloud, it links them by default to /etc/webapps, because this is a default place PHP tries to find them, even though they are installed into /usr/share/webapps. So if I had a folder named "HTML" inside my home directory, I'd want at least:

# nano /etc/php/php.ini
open_basedir = /srv/http/:/var/www/:/home/foo/HTML/:/tmp/

email local

Link: simple setup :: another setup :: notmuch MTA setup

$ mkdir ~/.mail

We also need a mail server on the machine, but one which only sends to localhost, and then only to one folder, /home/foo/.mail/. However, instead of setting up local email alerts, why not skip all of that (for now) and run a log analysis program like nagios?

We want to configure the lightest localhost email setup that we can read in a browser. Once we can reliably do alerts on, e.g., systemd timers, or some consolidating log app (e.g., Fluentd), or some Python/Bash script triggers, other things are possible. Our simplest model takes timer script outputs and sends email to the local system (out via SMTP port 25).

CLI suite

Linux has some default mail locations: mail to root is kept at /var/mail/root, user mail typically in ~/mail. It appears we'll need a webserver; no small email system has one built in. It appears Alot can be our MUA. It is available in the repositories, and can be configured with other light localhost services as described here.

cli configuration

browser MUA

Once the above is configured, we can use a browser MUA. For Chromium, the Chrome store has an extension for notmuch, called Open Email Client. Configuration information is provided.

database options

The right tool for the right job. An easy case might be a password database: essentially a flat-file database, an Excel 2-D type of thing. A more difficult case is a video edit. You've got clips, which might expand in number; you've got recipes or methods for various transitions and combinations; and you've got an overall project view which pulls from these. Whatever we have, we'd like it to be ACID compliant. Try to use UMLet or some such to do the UML.

7 Database Paradigms (9:52) Fireship, 2020. Seven different ways of looking at data, with suggested applications.
issues univ of utah (19:49) Mark Durham, 2018. Has a great chart.

relational algebra, calculus

Relational Algebra is just a form of logical operators, easy to understand for those who've taken logic courses.

Linguistically, Relational Algebra imparted its first name, "relational", to databases constructed around its principles -- ergo "relational databases": databases built around Relational Algebra.

Relational Algebra itself was created by Edgar Codd. His main book on this is The Relational Model for Database Management, and his Turing Award paper on relational algebra (1981) is available online (PDF). The most important skill level to reach is the capacity for fast database normalizing. This takes practice, similar to the logic skill for taking statements, text or otherwise, and symbolizing.

Relational Algebra Lecture 1 (50:21) Database Psu, 2015. Penn State. The first in their amazing free online course. Lecture begins about 19:12.
Relational Algebra Pt 3 lecture (51:18) Database Psu, 2015. The various operators and their uses.

Relational calculus is another way of expressing queries over the same operations; it's less atomic (more declarative) than relational algebra.

relational keys (primary, foreign, composite, blank)

Link: postgres constraints ::

Relationals are a bit of a pain. See my later post on how to configure PostgreSQL for local access. Restrictions make searches easier. The main idea: a unique column (Cust ID) can then access a unique row in the next table. In the pic below, the primary key is green, the foreign key red. A column with a "references constraint" on it is a foreign key.

Primary and foreign keys (2:08) Udacity, 2015. Primary and foreign keys explained. Foreign key is column with references constraint. No lookup table is used, but it might be possible by indexing somehow.
Primary and foreign keys (5:24) Prescott Computer Guy, 2011. This has an intermediate lookup table. He wants every table to have an ID column, to be sure to have a unique value for that table.
Postgres version (5:42) Trent Jones, 2020. Unfortunately uses PgAdmin not DBeaver.
SQL from teaching perspective (10:12) Mike Dane, 2017. Exam configuration

  • primary key: Postgres/SQL constraint equivalent - NOT NULL, UNIQUE
  • foreign key: Postgres/SQL constraint equivalent - none. Verifies that a column value from the keyed table always has a match in another table (is linked).
constraints

PostgreSQL                      MySQL
PRIMARY KEY                     PRIMARY KEY
FOREIGN KEY                     FOREIGN KEY
UNIQUE (column)                 UNIQUE
NOT NULL/NULL (null allowed)    NOT NULL
CHECK                           CHECK
DEFAULT                         DEFAULT
CREATE INDEX                    CREATE INDEX
EXCLUDE                         (no MySQL equivalent)
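A minimal psql sketch of the keys and constraints above, using hypothetical customer/orders tables -- the REFERENCES column is the foreign key:

postgres=#: CREATE TABLE customer (
    cust_id INTEGER PRIMARY KEY,       -- implies NOT NULL and UNIQUE
    name VARCHAR(30) NOT NULL
);
postgres=#: CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    cust_id INTEGER REFERENCES customer (cust_id),  -- foreign key
    total NUMERIC CHECK (total >= 0)
);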

start-up/shutdown (postgresql)

See my other post for how to configure as user. That done, it's pretty simple...

$ postgres -D '/home/foo/postgres/data'

With the server up and running, you can open another terminal and run psql commands. If you run as user postgres (see below), you'll have admin authority.

$ psql postgres

If a lot of data has been written, a good exit phrase to make sure all is safely written:

$ psql -d postgres -c CHECKPOINT && pg_ctl stop -D /home/foo/postgres/data -m fast

Wednesday, October 6, 2021

dashboard options

We'll first want to run a dashboard on our local system, using a light webserver and PHP server, before attempting it over a LAN or WAN. Step one is to inventory all our databases (including the browser's IndexedDB), logs, mail, and daemons (units/services) on our systems prior to attempting the dashboard. That's because a dashboard will add to what's already present; we need a system baseline. For example, the simplest MTAs add a database of some sort: SMTP4dev adds an SQLite db in its directory, same with notmuch (or maybe an XML database). If we go so far as Postfix, it requires a full relational database. So we need a pre-dashboard inventory of databases, logs, and mail, written on a static HTML page.

why dashboard

We may want to monitor certain applications or system parameters, financial counts, student attendance, anything. If we start with our own system as a model there are 4 things which regularly change: logs, timers/cronjobs, system values (temp, ram usage, hdd usage, net throughput, etc), and application information. We might even want to craft a database to store some values. Postgres is best, but we can get general theory from MySQL models.

MySQL table basics (14:26) Engineer Man, 2020. Great overview for the time invested.

Prior to a real-time dashboard, a slower process of daily summary emails is a good start, even if we just mail them to localhost, disconnected from the Web.

Once we can update a daily localhost email, we can attempt to expand that to internet email. And/or we can add a light local webserver + PHP. We need this to have dynamic webpages; opening an HTML webpage from a file directory is "load once", not updatable.

configuration models

Security camera configurations were created around updating pages, and these configurations might be adaptable to system monitoring. More than security models, the DevOps/BI models seem glean-able. These servers might be, eg Fluentd, Prometheus, and Grafana. Production server software, but localhost possibilities. Most are in the AUR. Prometheus is sort of a DAQ for software -- with email alerts possible -- and then Grafana is typically the display for it. But neither require the other. Grafana can display from any data source, or take its info from Prometheus. For Postgres built DB's, TimescaleDB has a lot of videos that might apply. We might even be able to modify a Moodle setup, now that we can upload quizzes using the Aiken model.

services

We can attempt several daemons on a local machine and see which ones are too resource-intensive. We can also use timer scripts to start those services only as needed, stopping them after.

local system dashboard

logs and timers

Nearly all logs are inside /var/log, but we need to evaluate our system carefully at least once for all relevant log locations. Some logs are ASCII, others are binaries that require an application or command to obtain their info. Once tallied, systemd timers and scripts are the simplest, with possible output via postfix. If we then add a webserver and PHP, we could run a systemd timer script every hour on logs and which updates a localhost webpage. To see running timers...

# systemctl list-timers
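As a sketch, a hypothetical oneshot service plus timer pair (names and script path invented) that summarizes logs hourly:

# /etc/systemd/system/logsummary.service (hypothetical)
[Unit]
Description=Summarize logs for localhost mail

[Service]
Type=oneshot
ExecStart=/home/foo/bin/logsummary.sh

# /etc/systemd/system/logsummary.timer (hypothetical)
[Unit]
Description=Run logsummary hourly

[Timer]
OnCalendar=hourly
Persistent=true

[Install]
WantedBy=timers.target

Enable with # systemctl enable --now logsummary.timer. A oneshot service runs the script and exits, so nothing stays resident between runs.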

Fluentd is a log aggregator available in the AUR, but may also need a DB to write to.

Timer files (18:16) Penguin Propaganda, 2021. This guy makes one to run a script. However, what about running a program and then shutting down the service after? 10:00 restarting systemctl daemons and timers.
Systemd start files. (7:52) Engineer Man, 2020. How to make unit files, which we would need anyway for a timer file.
Dealing with system logs (10:00) Linux Learning, 2018. Mostly about /var/log. Explains some are ascii others binary.

email inside our system (mua + mta)

Link: localhost email setup - scroll down to email portion.

For a graphical MUA, I used (in Chromium) an extension called Open Email Client. Some configuration information is provided by the developer.

various

ISP Monitoring (8:04) Jeff Geerling, 2021. Jeff (St. Louis) describes typical frustrations and getting information on power and internet usage. Shelly plugs ($15 ebay) are one answer. However there are excursions into several dashboard options.
How Prometheus Works (21:30) Techworld with Nana, 2020. Why and how Prometheus is used.
Grafana Seminar (1:02:50) TimescaleDB, 2020. Avthar Sewrathan (S.Africa). Demonstration and some of his use cases.
Grafana Seminar for DevOps (1:01:59) Edureka!, 2021. Grafana half of a Edureka seminar on DevOps with Prometheus and Grafana. Thorough description including how to create a systemd service file to run it locally.
Prometheus and Grafana (54:06) DevOpsLearnEasy, 2020. Adam provides a description of a server deployment of Prometheus and Grafana. This guy even shows the setup of a VM on AWS. He seems confused at times, but we learn all the same.

Thursday, September 30, 2021

investment api dash -- various methods

sections -- dashboard, dividends, retirement

Having failed to create a trading bot, I look with interest on the middle ground of a dashboard. Think of it as BI for home finances. Google Sheets, API scrapers, and so forth. I ran across the ThinkStocks video below. The project seems worthy of a post here: there are building blocks within this video that I think will allow other business and investing analysis.

Equities dashboard - Google Sheets  (1:06:24) ThinkStocks, 2021. Mileage may vary.

api status

We'd like to have access to an API in all sectors: equities (incl. dividends), fixed income, crypto, forex, commodities. And also possibly to derivatives of these.

equities and bonds: Firstrade. Good discount BD, with fixed-income and retirement account options. Clearing handled by Apex Clearing. API keys = unknown.
crypto: CoinbasePro. Coinbase acquired GDAX. The former GDAX accounts qualified for the slightly upgraded (leveraged trades, etc) CoinbasePro. Maker/taker percentages reasonable. Explained here (18:58). API keys = yes.
forex: API keys = yes

dividends

Dividend information takes some digging. Which equities (even REITs) and ETFs pay them? How much? Have they suspended payments?

Dividend History (8:51) Passive Income Investor, 2019. Uses some tools such as marketbeat to get history info on dividends.
His Top REIT's (14:14) Passive Income Investor, 2021. These have probably gone down, but we can see his process for valuation.

retirement

social security $1,200. They send a thing once a year showing what would come in on a monthly basis. According to this article, no one pays income tax on more than 85% of their total social security.
veterans $1,200. missed it by 9 weeks.
california $700. PERS
dividends $100. quarterly. currently taxed

Saturday, September 25, 2021

csv, sms, postgres, dbeaver, python project?

I began this post 10 years ago only thinking about a LAPP or LAMP. Even that was too wide a scope. This post only covers importing a CSV into a stand-alone PostgreSQL instance. We can then take the data for additional processing or practice or whatever. Or for Business Intelligence (BI), since every database is an opportunity for BI meta-analysis. There are several pre-made tools for this, but they may not address our focus. They're still worth a look.

dash and plotly introduction  (29:20) Charming Data, 2020. A bit outdated, but the fundamental concepts still helpful.
PBX - on-site or cloud  (35:26) Lois Rossman, 2016. Cited mostly for source 17:45 breaks down schematically.
PBX - true overhead costs  (11:49) Rich Technology Center, 2020. Average vid, but tells hard facts. Asteriks server ($180) discussed.

file/record plan issues

We can save psql and sql scripts as *.psql and *.sql files respectively. If we use, eg DBeaver, it will generate SQL scripts, which we can subsequently transform, with some work, into PSQL scripts. We may wish to learn more Math also, though this can be difficult on one's own.

Back to our project. At the lowest levels, we need a system which

  • provides document storage: folders receive a number relevant to their path, and we save documents in them without changing the file names. Of course, this has to account for updates when new files are added or any are retired/deleted.
  • queries either on meta information (dates) or upon document content (word search)
  • reports on the database itself: its size, category structures, number of files pointed to, and so on. This is good BI app territory.
  • configurable back-up or cloud hosting
  • if office access, a possible GUI to delimit employees' queries. Browser-friendly desktop development is required for a GUI (PyGObject - GTK, or Qt), maybe even Colab.

Python is now flexible enough to establish client-side connections with a database, something which used to require PHP.

Grafana apparently works best with time series, but could perhaps also let us display counts of minutes or numbers of texts tied to various numbers. Pandas and even R have some visualization options.
Dash is another API similar to Grafana. We can use one or the other, or GraphQL.
Fluentd can help us if we need to manage several relevant logs around this database project.
Logs are somewhat specific, but Prometheus monitors selected system traits - memory use, HDD access. It can also do this over HTTP in a distributed system, as described here. These kinds of time series integrate well with Grafana.
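As a sketch, a minimal prometheus.yml scrape configuration, assuming node_exporter runs on its default port 9100:

# prometheus.yml (hypothetical minimal config)
scrape_configs:
  - job_name: 'node'
    static_configs:
      - targets: ['localhost:9100']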


No particular tool for this one, but SMS-related: being able to run some kind of program against the database to find mean, mode, and standard deviation, even in a table. For example, a report that shows average number of texts, word length, top 3 recipients and their tallies, etc. Easily run queries against a number and report to screen. Use a browser to access the DB.

SMS

A possible project for document management may be to establish SMS message storage, query, and retrieval. The data are row entries from a CSV, rather than disparate files with file names. A reasonable test case.

  • pacman postgresql dbeaver python. By using DBeaver, we don't have to install PHP and PgAdmin. Saves an extra server.
  • activate postgresql server
  • import CSV into Gnumeric and note column names and data types. This can even be codified with UML.
  • populate the database with correct column names and formats. Also check the data file with a text editor to verify comma counts are accurate, etc.
  • Python script or manual DBeaver import of the CSV file(s).
  • rclone backup of data files to the Cloud

PostgreSQL - install and import CSV  (11:19) Railsware Product Academy, 2020. Also PgAdmin install and walkthrough.

1 & 2. postgresql install/activate (1.5 hrs)

Link: Derek Banas' 3.75 hr PostgreSQL leviathan

See my earlier post on the steps for this. I rated the process at 2 hrs at the time. However, with the steps open in a separate page, 1.5 hrs seems reasonable.

At this point, we should have a PostgreSQL server (cluster) operating on a stand-alone system out of our home directory, with our own username, eg "foo". We can CREATE or DROP databases as needed as we attempt to hit upon a workable configuration.

$ postgres -D /home/foo/postgres/data
$ psql -U foo postgres
postgres=#: create database example;
postgres=#: drop database example;
OR
postgres=#: drop database if exists example;
postgres=#: exit

It would be nice to have a separate cluster for each import attempt, so I don't have to carefully name databases, but I don't want to run "initdb" a bunch of times and create a different data directory for each cluster, which is required. So I'm leaving everything under "postgres" cluster and when I get the final set of databases I like, I'll do a script (eg. *.psql) or some UML so I can re-install the solution after deleting the test project.

3. csv import

Most things on a computer can be done 100 different ways, so I started with the simplest -- psql -- and moved on to other ways. This is all covered below. But my first step was to pick a CSV of backed-up SMS's for the process, and clean it.

3a. csv cleaning

I first selected "backup.csv" with 7 columns and 456 rows. I simply opened it with Gnumeric and had a look. I noted advertising in the first and last rows and stripped those rows in Geany. This left me a 7x454, with the first row the column titles. What's interesting here is some of the texts had hard returns in their message contents, so there were 581 lines. I therefore made a second version, 7x10, with no returns in the message contents, "backup02.csv", for simplest processing.

3b. data types

The efforts below have taught me that we need to understand several basic data types. There were a lot of failures until then. I found this video and this post helpful. Matt was only making a table for a two column price list, but it gives us a start.

And here is a more complex scenario, which Raghav skillfully addresses with multiple tables.

3c. monolithic import attempt

Without considering keys or any normalization, let's try to just bring in an entire CSV, along the lines of Matt's import above. Of note, the times were in the format, "2021-03-07 21:40:25", apparently in the timezone where the call occurred.

$ psql -U foo postgres
postgres=#: create database test01;
postgres=#: \c test01
test01=#: create table try01 (between VARCHAR, name VARCHAR(30), phone VARCHAR(20), content VARCHAR, date TIMESTAMP, mms_subject VARCHAR(10), mms_link VARCHAR(20));
test01=#: select * from try01;
test01=#: COPY try01 FROM '/home/foo/backup01.csv' DELIMITER ',' CSV HEADER;

This accurately brought in and formatted the data; however, it also brought in a first row that was all dashes, and I don't really need the 1st, 6th, and 7th columns. I looked at the PostgreSQL documentation for COPY. The columns looked easiest to fix, so I created a smaller table without them.

test01=#: create table try01 (name VARCHAR(30), phone VARCHAR(20), content VARCHAR, date TIMESTAMP);
test01=#: COPY try01 FROM '/home/foo/backup01.csv' DELIMITER ',' CSV HEADER;

4. dbeaver tweaks (1 hr)

The most important thing in DBeaver, after we've connected to our server, is to remember to R-click on the server, go into settings, and select "Show all Databases". If you forget this step, you will go insane. I didn't know about that step and... just do it. The other thing is a helpful "Test Connection" button down in the lower-left corner.

Some columns have dates, times, and numbers -- and do we want to use the telephone number as the primary key? Once we have a working concept, we'll want to UML it.

dbeaver - some basics  (16:23) QAFox, 2020. Begins about 10:00. Windoze install but still useful.

python

As noted above, however, we are now interested in letting Python do some of the scripting, so that we don't need two languages. To do this, we install PL/Python on the PostgreSQL side. Other options are available for other languages too -- for example, if we want to run statistical "R" programs against a PostgreSQL database, we'd install PL/R. Or we can write out PL/pgSQL commands and put them into Python or R scripts if we wish.

On the Python side, we obtain modules from PyPi, such as Psycopg2 (also available in pacman repos in Arch). With PL/Python in the database, and Psycopg2 modules in our Python programs, our connections and commands to PostgreSQL become much simpler. And of course, one can still incorporate as much or as little PHP as they wish, and we'd still use judicious amounts of JavaScript in our Apache-served HTML pages. To summarize, in a LAMP we might want:
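A minimal Psycopg2 sketch, reusing the test01 database and try01 table from the import section above (connection details assumed from that stand-alone setup):

import psycopg2

# assumes the stand-alone server from earlier posts is running as user foo
conn = psycopg2.connect(dbname="test01", user="foo", host="localhost")
cur = conn.cursor()
cur.execute("SELECT name, phone FROM try01 LIMIT 5;")
for row in cur.fetchall():
    print(row)
cur.close()
conn.close()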

server-side

  1. PostgreSQL - open source database.
  2. Python - general purpose language, but with modules such as Psycopg2, we can talk to the database and also serve dynamic webpages in HTML, replacing PHP, or even create some GUI application which does this from our laptop.
  3. Apache - still necessary, since this is what interacts with the client. Python or PHP feeds the dynamic content into the HTML, but Apache feeds these pages to the client (browser).
  4. PHP - still available, if we want to use it.

Saturday, August 28, 2021

cell - motorola droid turbo2 - xt1585 4G

2024 Note: These can be used as WiFi Cams for video production. G Pure (xt2163) and G Play (xt2093) are updated versions of this phone with slightly stronger performance, but still without 5G; only VoLTE is available. Further, providers are no longer supporting these phones.

Need a new phone and not in the income category of an investment banker or a first responder? T-Mobile traditionally uses SIM cards (GSM).

These 2015 phones run about $60 on Ebay and have 32GB for apps and files, plus a person could add an SD storage card. The SD card goes in next to the SIM. The camera creates JPG pics and handles low light reasonably well.
  • outward facing phone: 5344 x 3006 (5M per pic)
  • self-facing phone: is 2592x1458 (1.1M per pic)
  • video: 1920x1080, h264 high, 17,000kbs. (1M per second).
  • processor: Snapdragon 805, Krait 450 chipset
  • battery: a 3550 mAh lithium-poly battery (see below).

Battery FB55

tools: spudger and micro spudgers, torx t3

The phone seems to go about 2 years on a battery. Replacement battery is $10 delivered. They're not easy to access. It's about a one hour job, at a slow pace. Seventeen micro TORX screws. That's 17 chances to strip one. Keep downward pressure on the tool.

The little ribbon cable on the lower right is key. Some I've ordered arrive *without* enough length on that connector and are thus worthless.

Battery Replacement (2:12) Wit Rigs, 2016. by far the best video. All others go unnecessarily further. It's bad enough: the 17 x T3 screws cannot be avoided.

VoLTE status

We want to use the phone in LTE mode for the fastest data and a solid voice connection. The phone will otherwise default to GSM voice and EDGE data. This way we can talk on 3G and still text on EDGE simultaneously. But these are 3G-type functions, so once providers sunset 3G in 2022, there will be no way to use voice and data at the same time, or GSM voice at all. We need LTE. How to ensure the phone is set for it?

I installed the Hidden Android Settings app. It doesn't require root b/c it doesn't have special functions. What it does is give quick access to functions which typically require navigating menu after menu to locate.

For LTE, I went into RadioInfo, then found the screen you see below. Although it appears relevant, just ignore the "Volte Provisioned" slider. Go instead to the drop-down arrow. There are 20+ options. I selected "LTE/CDMA/UMTS auto (PRL)". As you can see, this resulted in both my Voice and Data Network types switching to LTE. For me, this is good enough.

Tuesday, August 24, 2021

video - workflow

One has to back out to a philosophical level to think about video work in common terms, so there is more of that in this post than in my typical posts. There are recipes, there are ingredients, and there are logs of the work. Some of what might need to happen is to also have a database philosophy, since each edit could be a database entry. I should reference a database post.

Video is problematic. I outline further down the workflow elements which seem most consistent across projects. It's hackneyed though, because I "meta-up and drill down" almost uniquely for each clip or overall project. So I thought of listing similar project challenges rather than a list of positive steps. But even challenges shape-shift somewhat in each project. Difficult to patternize. Aieee.

Over the years, the one universal so far is -- the more vast and standardized one's storage and retrieval system is, the better. Storage for both video (ingredients) and production logs (recipes). An arguable second bedrock may be video vocabulary. Video thought seems unique to each person's visual imagination; a word, in place of an explanation, can be a small efficiency gain when dealing with others.

creative process

Sample a few other creative pursuits: writing, music, cooking. Of these, writing is the simplest, since we only need creativity and language. Creativity varies by time of day or time available, but compare this to video. Variation applies universally. Creative opportunities vary by time, inclination, premise, budget, location, expertise, chance, equipment, software, etc. Also across functions. One video might have various percentages of creativity in its shooting, its editing, its vision of the outcome, and so on. And these variances might also differ in emphasis for a hobbyist versus a paid videographer. It's difficult to proceduralize video universally.

For me, most videos are sifting and reflection processes. The end result is a kind of report of an investigation upon the video, stills, text, sounds, and effects at hand. Rarely is the outcome (even in a commercial video) entirely known in advance, no matter how clear the sponsor's (implicit or explicit) instructions or desires.

project ingredients

Mostly there are 1) conceptual elements and 2) technical elements. The latter support the former. By analogy video creation is perhaps more like music or cooking than writing. The tech of writing is fairly simple; sometimes only pen and paper or simple software are necessary. Meanwhile the tech of video can become so complex that it prevents the video entirely, or mangles the intended outcome. Many many video ideas are thwarted by production limitations (software, skills, etc).

the shooting

We don't always know why we shoot. One pole of the continuum is "shoot first and sift later". The other pole is "gather shots to support X concept". Usually it's a mixture: a vague concept ("I'll bring my camera along on this trip") followed by a sifting process. Sometimes though it's a very clear concept ("I'm making a soda ad, and need to make several shots for my/producer's idea of the advert"). The latter might even turn the former on its head: "I need to make this trip with these cameras to get these shots for the project".

Ideally we'd prefer infinite latitude, time, and/or money to change our shooting on the fly. Creative whims develop in situ with the events and cameras. We know some things in advance of editing. We know the type of camera(s) we used. We know why we brought them -- even if for multiple reasons (financial limits, waterproof, low-light, etc). And we often remember some of our intention(s) when shooting X thing, though often these will be layered: "bored", "capture X event", "capture spirit of X event", "document X event beauty highlights", "wow". So the shooting varies from desirably undefined to a conscious attempt to capture the blocks to complete a preconceived schema.

Sometimes the theme is a trip, but only assembled afterwards. Perhaps I was able to capture several meaningful sunrises, or I was struck by a person's expression and tried to capture it. Other times, when sifting through footage later, some surprise nuggets turn into a major or supporting theme. This can be organized into the final product (the highest meta level), and several versions with different emphasis or support might be necessary.

scratch sheet documenting

Whether a person organizes their work in advance or lets it organically develop while sifting through clips, some scratch sheet -- sometimes many sheets -- will be sketched with times, effects, and transitions. Like writing, some schemes are eliminated, some are modified in progress, some can be done nearly all in one's head, and very few can be rigidly determined in advance. To me, each video is more a sifting and investigation process; only a few steps can be done on the fly, with experience. Plan the timing and nature of cuts, transitions, and other effects. For each clip, the editor needs to know the timing of what's taking place, either by frame number, time, keyframe, or some combination.

  1. first cut at organizing result:
  2. normalize clips and record results: evaluate raw clips (eg. ffmpeg -i, personal knowledge). Relevant features vary by project, but typically at least the timebase and fps must match from one clip to the next. Keyframe times or frame numbers may also be relevant. In the maximum case, enter relevant details about all clips into a spreadsheet or other form, since renders will need to be optimized to a minimum number and we'll want to combine events. Alternatively to step 1, an editor can just pick up the relevant info on each clip as they go along (see the sketch after this list).
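A quick shell sketch for step 2, printing each clip's timebase and fps (glob and filenames assumed):

$ for f in *.mp4; do echo "== $f"; ffprobe -v error -select_streams v:0 -show_entries stream=time_base,r_frame_rate -of default=noprint_wrappers=1 "$f"; done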

storage

Pantry analogy: a chef not only produces food, but needs to save his recipes and buy ingredients. The problem with video is it has an eternal shelf life and might have nuggets which someone wants to draw from later. And as for the recipes, they need to be kept indefinitely and sometimes refined.

Where to store gigs and gigs of images and video, and the various versions of rendered results? Where to store the meta documents of each edit? How to even find clips again, or the method that went with a

Monday, August 23, 2021

google scripting

Links: Google App Scripting

Another easy access point is from Drive. Go to "+" to create a new file, then down to "more", and you can open a script from there. The file extension is .gs.

This is only a scripting-level language, similar to, and based on, JavaScript. The syntax is JavaScript-like, although the methods are mostly unique to Google. Create a folder in your Drive for automations and then they'll handle these things for you. Each script appears to be attached to a particular file, so I'm not sure what happens if one deletes the file itself.

Google Sheet interactions  (22:50) Ryan Stewart, 2017. SEO firm apparently. Some technical features of Google interactivity may have changed however work-flow is well demonstrated. Underlying scripts are only implied and not shared.
Introduction to Document Control Video Preview  (7:05) patonprofessional, 2010. Excellent overview of media types and uses within an organization.
Google App Scripts  (23:53) saperis, 2020. Ways to get the Google sheets to spawn and combine. Chanel Greco
Sheet formatting for teachers  (16:47) Hello, Teacher Lady, 2019. Starting around 11:00 gives complex report creation. Conditional formatting and merges.

Monday, May 17, 2021

page size - chromium printing

Most federal and state tax instructions have, over the past 3 years, been transformed from PDFs into (unhelpful) web pages.

This means that citizen taxpayers must now themselves save instruction webpages to PDF. Once saved to PDF, citizens may again accomplish quick searches, print worksheets, or preserve the PDF as a record of the instructions they used that year.

Sounds quick, but not always: how the webpage is saved to PDF determines how it can be printed, and computer settings determine how the browser may save webpages. All together then, three additional failure possibilities have been added to tax preparation. Perhaps the idea is to encourage citizens to purchase parasitic 3rd party tax preparation.

chromium solution

Let's solve this browser PDF issue if we can, starting with Chromium. Chromium saves/converts webpages to PDF in a default A4 paper size, but most US printers can only print letter-size. Chromium, as installed, provides no method for changing the paper size of a saved PDF.

With a significant time investment, I was able to determine that Chromium relies upon the locale variable LC_PAPER for its PDF paper size setting. Now, locale variables are complicated multi-effect variables which one doesn't wish to risk modifying for a simple paper setting. I was unable to find any other way to proceed.

1a. in-session

The hack below has an immediate effect in Chromium, allowing one to select the type of paper on a drop-down, whereas there was previously no option to change paper.

# localectl set-locale LC_PAPER=en_US.UTF-8

1b. in-session

If the above doesn't work, it may be important to know that Chromium takes its settings from GTK2's GtkPaperSize variable, which we would like to at least be able to switch between letter and A4.

2. persistent configuration

In Arch, I found two files, depending on whether I wanted global settings or user settings. Both must be created, and there's a skeleton in /etc/skel/.config/locale.conf.

  • user: ~/.config/locale.conf, which can be activated by logging out and then back in. However, whether or not Chromium bothers with user-level settings seems questionable.
  • global: /etc/locale.conf.
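A minimal sketch of the user-level file, setting only the paper variable alongside a base LANG:

$ cat ~/.config/locale.conf
LANG=en_US.UTF-8
LC_PAPER=en_US.UTF-8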

good information on the subject.

  • disable the LC_ALL and/or LC_CTYPE variables, as they will override all subsequent individual variables such as paper size. But where? Arch prefers us to use locale-gen, which means uncommenting inside /etc/locale.gen. However, locale-gen sets monolithically: all variables get the same value.

change what?

en_US

locale

Just as with the pernicious problem of pulseaudio, locale information is stored in several places, and they need to be narrowed down to a single source. Further, they're exported to the kernel as a group, but we only want LC_PAPER corrected; otherwise we will have problems with fonts. Typically, they are held consistent across all variables.

/etc/locale.conf
/etc/profile.d/locale.sh
~/.config/locale.conf
/etc/environment (PAM only)

localectl

locale-gen

A worthless application. It causes problems familiar from the similarly abstracted GRUB configuration, once that began auto-generating. locale-gen generates monolithic LC settings, not individual ones; there is no setting just one variable. Running # locale-gen after a user modifies /etc/locale.gen overwrites all the LC variables to one single language, so it's worthless for attempting to change just the paper size.

Of particular note: users will be punished with A4 if they don't want to declare a locale, since declaring locales causes problems with various fonts.

Tuesday, March 23, 2021

fintech

Who is who in the payment-platform war on cash. Fintechs have Money Transmitter Licenses, not bank charters. The money must be kept in a partner bank, which has a bank charter. Regardless, they must collect plenty of AML and KYC information to do their ACH transactions, and comply with FTC and international regulations, and with law enforcement.

some common fintech operators

PayPal (Venmo): P2P. Partner/holding bank: Synchrony Financial (NYSE: SYF). Scroll down a bit in this article for information that PayPal apparently doesn't want to be a bank. Also see video below.
Cash App (Square): P2P, POS. Partner/holding bank: Sutton Bank (privately held Sutton Bancshares, Inc., Ohio).
Google Pay: *P2P. Partner/holding banks: Bank of Montreal (Canada), Banco Bilbao Vizcaya Argentaria SA (USA, NYSE: BBVA). Apparently, Google is likely mining purchase data for ad sales. *As of April 5, 2021, this will be confusing b/c one can only have Google Pay on one device, and P2P is no longer free (transfer fee).
Apple Pay: P2P, B2C. Partner/holding bank: Green Dot Bank. However, Goldman Sachs backs the Apple credit card.
Zelle: P2P, C2B, C2G, and exploring POS (2021). Zelle is different because it was founded by banks, so its members are chartered, with no apparent need for an MTL. JP Morgan, BofA, US Bank, Citi, Wells Fargo, Capital One, BB&T, and PNC Bank launched Zelle from clearXchange in 2017. Guessing they profit-share somehow via Early Warning Services, which they own.

How Venmo Makes Money  (11:29) CNBC, 2019. Startup to Braintree to acquisition by PayPal. Partner banking and scary AML/fraud obligations. How Zelle emerged.
Features Comparison (page)

Thursday, March 11, 2021

bonds

Dealing in bonds is a rich universe of calculations and ratings. Within such an environment, a large terminology set is expected. One can go in-depth and attempt to memorize the vocabulary. IME, an easier start on useful bond terminology is to know the six elements we use for pricing a bond. Once pricing, the difficult kernel, is understood, the peripheral terms become clearer. The main pricing terms are shown in the graphic below, lacking only the "market price". The market price is another name for a type of benchmark, a concept slightly more complex than can be explained in the graphic anyway.

"Coupon rate" is also sometimes called the "contract rate". It's the annual interest rate the issuer pays on the bond. Accordingly, a "zero-coupon" bond pays no interest.

primary or secondary

With these elements, how does one determine how to purchase a bond? Unlike the equity market, we can calculate what to offer for a bond on any given day. This bond price will be moving if we're looking at the secondary market.

  • primary market: revenues from the bonds go directly to the issuing organization as capital. Typically, these are sold at par.
  • secondary market: the issuing organization receives no proceeds -- these are sales between customers. Prices fluctuate by the second, same as equities. The categories are discount, par, and premium.

discount, par, premium

Our price calculation gives the bond's current price a designation -- discount, par, or premium -- depending on how the price corresponds to the face value of the bond.

Bonds are considered interesting because "the price goes down when the yield goes up". This inverse relationship is what determines the bond's description at that second. As we know from JHS math, when a fraction's denominator increases, the overall value of the fraction decreases. Yield is in the denominator, and the price is the overall quotient.
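For reference, the standard present-value formula behind that quotient, with C the coupon payment, F the face value, y the per-period yield, and n the number of periods:

P = \sum_{t=1}^{n} \frac{C}{(1+y)^t} + \frac{F}{(1+y)^n}

Raising y anywhere in those denominators can only lower P; that is the entire inverse relationship.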

discount bond

Zero-coupon bonds are always discount bonds. One of the more straightforward descriptions of a discount bond (its YTM) comes from this Reddit post.

So this makes sense insofar as the yield (interest rate) is in the denominator. The larger the denominator, the smaller the quotient; the quotient is the selling price. So the price will go down as the interest rate increases. This is not even mildly interesting mathematically. However, it's pivotal in terms of trading spreads and so forth.

Pricing Bond w/YTM: Mithril 13  (11:04) MithrilMoney, 2013. The relationship of various features of bonds that lead to the interest rate up, price down, when pricing for Yield to Maturity.
More about YTM: Mithril 14  (14:29) MithrilMoney, 2013. Excel manipulations to iterate (basically Newton's method) to reverse engineer what the price of the initial offering should be to get to a target yield. The inverse pricing relationship: yield up, price down.

More interesting is why money is discounted the day it is sold. Why is the selling price less than the value of a bond? I have no idea.

Here's what I mean. A corporation (Ace) selling a 10-year $100 bond with a 2% interest rate not only pays less interest over 10 years ($20), but also receives $82.03 today. Whereas a corporation (Bass) that has to pay bond purchasers a 6% rate only receives $55.84 today and has to pay $60 over 10 years. Ace and Bass are both selling $100 bonds, so you'd think they'd each receive $100 today, but they receive different amounts, because the interest rate is factored into the price. Why? I don't know. In neither case is it simply a $100 IOU.
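One possible resolution to the puzzle: $82.03 and $55.84 are exactly what you get by discounting the $100 face value alone, i.e., zero-coupon pricing:

100/(1.02)^{10} \approx 82.03, \qquad 100/(1.06)^{10} \approx 55.84

A bond that also pays a coupon matching the market yield would price at par ($100); the figures above leave the coupon stream out of the price entirely.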

worse

Here's another way it's interesting. Say we both buy $100 10-year bonds. I paid $55.84 for the 6% Bass and you paid $82.03 for your 2% Ace. But to make this easier, let's round to $56 and $82.

Ok, you and I each start with $100. This means I have $44 left in my hand after paying $56 for the 10-year Bass. You have $18 left in your hand after buying the 10-year Ace. You're going to get $2 per year for 10 years, or $20, plus your principal of $100, so 18+20+100 or $138 total at the end of 10 years. I start with $44 in hand, receive $60 interest, plus $100 = $204. I made $66 more than you.

The company is also a weird scenario. In your case, it received $82 and had to pay back $120, so that money cost them $38 (120-82) for 10 years' use of $82. In my case, it received $56 and had to pay back $160, so that money cost them $104 (160-56) for 10 years' use of $56.

This means Bass paid about 20% on a 6% IOU, and Ace paid about 5% on a supposed 2% IOU. Wow.

You'd think it worked much more directly, where we each loaned them $100 for 10 years; in my case I made $60, in your case you made $20, and that's all it cost the company. So a bond is not exactly like a loan. They lose 5% on you and 20% on me.

explanation

There are three types of bonds in the initial market (not the secondary market): bonds issued at par, at a discount, or at a premium. The scenario above, the most common one, where "yields go up when price goes down", is, I believe, for discounted bonds.

Discount, Premium, Par  (6:41) Notepirate, 2015. Mileage may vary.
Accounting on discounts and premiums  (7:15) Prof. Elbarrad, 2019. Straight line accrual accounting on the premium.

secondary market

The secondary market exists because there may be a holder who doesn't keep a bond to maturity.

Bonds explained: primary market  (6:41) Killik & Co, 2013. Mileage may vary.
Bonds explained: secondary market  (10:28) Killik & Co, 2013. Straightforward description but without much math. It's noted at 4:26 how a bond's Yield to Maturity (gross redemption yield) is calculated.

trading

Traders need to look at curves and know *which* curve. First is just the basics of understanding a yield curve. Second, understand yield spreads. Also, we can trade options or futures on bonds, so there are derivatives.

yield curve

The plot is of bonds of equal credit quality and different maturity dates. A curve can therefore be drawn at the macro level, or down to the curve for a particular company.

Yield curve  (5:10) IronHawkResearch, 2021. Rudimentary but useful.

yield spread

We may wish to trade the yield spread. For this we'd want to understand the yield curve.

Yield Curve  (8:58) The Plain Bagel, 2018. Bases a description in the secondary market.
Bond Trading 101  (7:31) Will Armstrong, 2018. Very simple description and a simple view of a website's purchase interface and how to use it. Bonds are often issued when organizations need more than a typical loan amount.

trading dashboard

Equities Dashboard Excel  (27:05) MyOnlineTrainingHub, 2021. Although this is done in Excel365, it can nearly be duplicated in Google Sheets. This lady Minta has many vids on spreadsheet dashboards
Streamlit and his coding  (56:10) Part Time Larry, 2021. Using Streamlit to avoid a webserver when visualizing with Python.
Google Sheets version of Stock Dash  (1:06:24) Think Stocks, 2021. Goes through all elements including coding of various formulas.

api scrape

Just as with equities, we may wish to use an API to scrape information from websites to make various bond transactions.


ratings

As you'll find if you open or buy a business, the business will eventually acquire a credit rating, for better or worse. It will also acquire a DUNS number and a Dun and Bradstreet rating used to rate your bonds.

yield curve/benchmark

The yield curve is also one of the benchmarks. Taken by itself, it's considered an economic indicator. But it is also an input into yield-to-maturity calculations. We need to consider the bond rating and some benchmark when, e.g., discounting zero-coupon bonds, or when making the calculation for daily pricing.

Thursday, January 21, 2021

sample video edit (less than 1 minute)

Side Note: TIME CALCULATOR. This post is FFMPEG, but there's a UAV guy who has a YT channel who works nearly exclusively with SHOTCUT. Some of the effects are amazing, eg his video on smoothing. There's also an FFMPEG webpage with pan, tilt, and zooming info not discussed in this post. For smoother zooming, look here at pzoom possibilities. Finally, for sound and video sync, this webpage is the best I've seen. Sync to the spike.


Suppose I use a phone for a couple short videos, maybe along the beach. One is 40 seconds, the other 12. On the laptop I later attempt to trim away any unwanted portions, crossfade them together, add some text, and maybe add audio (eg. narration, music). This might take 2 hours the first or second attempt: it takes time for workflows to be refined/optimized for one's preferences. Although production time decreases somewhat with practice, planning time is difficult to eliminate entirely: every render is lossy and the overall goal (in addition to aesthetics) is to accomplish editing with a minimum number of renders.

normalizing

The two most important elements to normalize before combining clips of different sources are the timebase and the fps. Ffmpeg can handle most other differing qualities: aspect ratios, etc. There are other concerns for normalizing depending on what the playback device is. I've had to set YUV on a final render to get playback on a phone before. But this post is mostly about editing disparate clips.
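
To see what actually needs normalizing, a quick check (assuming ffprobe, which ships alongside ffmpeg) prints just the video stream's timebase and frame rate:

$ ffprobe -v error -select_streams v:0 -show_entries stream=time_base,r_frame_rate -of default=noprint_wrappers=1 clip.mp4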

Raw video from the phone is in 1/90000 timebase (ffmpeg -i), but ffmpeg natively renders at 1/11488. Splicing clips with different timebases fails, eg crossfades will exit with the error...

First input link main timebase (1/90000) do not match the corresponding second input link xfade timebase (1/11488)

Without a common timebase, the differing "clocks" cannot achieve a common outcome. It's easy to change the timebase of a clip, however it's a render operation. For example, to 90000...

$ ffmpeg -i video.mp4 -video_track_timescale 90000 output.mp4

If I'm forced to change timebase, I attempt to do other actions in the same command, so as not to waste a render. As always, we want to render our work as few times as possible.
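
For example (a sketch, filter values arbitrary), folding a deinterlace and a mild gamma fix into the same render as the timebase change:

$ ffmpeg -i video.mp4 -video_track_timescale 90000 -vf "yadif,eq=gamma=1.05" -b:v 5M output.mp4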

separate audio/video

Outdoor video often has random wind and machinery noise. We'd like to turn it down or eliminate it. To do this, we of course have to separate the audio and video tracks for additional editing. Let's take our first video, "foo1.mp4", and separate the audio and video tracks. Only the audio is rendered if we remember to use "-c copy" on the video portion to prevent a video render.

$ ffmpeg -i foo1.mp4 -vn -ar 44100 -ac 2 audio.wav
$ ffmpeg -i foo1.mp4 -c copy -an video1.mp4

cropping*

*CPU intensive render, verify unobstructed cooling.

This happens a lot with phone video. We want some top portion but not the long bottom portion. Most of my stuff is 1080p across the narrow portion, so I make it 647p tall, about 1.67:1, close to the golden ratio (1.618:1). 2:1 would also look good.

$ ffmpeg -i foo.mp4 -vf "crop=1080:647:0:0" -b 5M -an cropped.mp4

The final zeroes indicate to start measuring pixels in upper left corner for both x and y axes respectively. Without these, the measurement starts from center of screen. Test the settings with ffplay prior to the render. Typically anything with action will require 5M bitrate on the render, but this setting isn't needed during the ffplay testing, only the render.
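
A sketch of that ffplay test, previewing the crop window without rendering anything:

$ ffplay -vf "crop=1080:647:0:0" foo.mp4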

cutting

Cuts can be accomplished without a render if the "-c copy" flag is used. Copy cuts occur on the nearest keyframe. If a cut requires the precision of a non-keyframe time, the clip needs to be re-rendered. The last one in this list is an example.

  • no recoding, save tail, delete leading 20 seconds. This method places seeking before the input, and it will go to the keyframe closest to 20 seconds.
    $ ffmpeg -ss 0:20 -i foo.mp4 -c copy output.mp4
  • no recoding, save beginning, delete trailing 20 seconds. In this case, the duration flag (-t) comes after the input. Suppose the example video is 4 minutes duration, but I want it to be 3:40 duration.
    $ ffmpeg -i foo.mp4 -t 3:40 -c copy output.mp4
  • no recoding, save an interior 25 second clip, beginning 3:00 minutes into a source video
    $ ffmpeg -ss 3:00 -i foo.mp4 -t 25 -c copy output.mp4
  • a recoded precision cut
    $ ffmpeg -i foo.mp4 -t 3:40 -strict 2 output.mp4

combining/concatenation

Also see further down the page for final audio and video recombination. The section here is primarily for clips.

codec and bitrate

If using ffmpeg, then mpeg2video is the fastest lib, but also creates the largest files. Videos with a mostly static image, like a logo, may only require 165K video encoding. A 165K example w/125K audio, 6.5MB for about 5 minutes. That said, bitrate is the primary determiner of rendered file size. Codec is second but important, eg, libx264 can achieve the same quality at a 3M bitrate for which mpeg2video would require a 5M bitrate.

simple recombination 1 - concatenate (no render)

The best results come from combining files with the least number of renders. This way does it without rendering... IF files are the same pixel size and bit rate, this way can be used. Put the names of the clips into a new TXT file, in order of concatenation. Absolute paths are a way to be sure. Each clip takes one line. The list here shows both without and with an absolute path.

$ nano mylist.txt
# comment line(s)
file video1.mp4
file '/home/foo/video2.mp4'
The command is simple.
$ ffmpeg -f concat -safe 0 -i mylist.txt -c copy output.mp4

Obviously, transitions which fade or dissolve require rendering; either way, starting with a synced final video, with a consistent clock (tbr), makes everything after easier.

simple recombination 2 - problem solving

Most problems come from differing tbn, pixel size, or bit rates. TBN is the most common. It can be tricky though, because the video after the failing one appears to cause the fail. Accordingly, comment out files in the list to find the fail, then try replacing the one after it; a sketch of that hunt follows.
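
The concat list accepts "#" comments, so clips can be disabled one at a time to isolate the failure, eg:

$ nano mylist.txt
file video1.mp4
#file video2.mp4
file video3.mp4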

  1. tbn: I can't find in the docs whether the default ffmpeg clock is 15232 or 11488; I've seen both. Most phones are on a 90000 clock. If the method above "works" but reports many errors, and the final time stamp is hundreds of minutes or hours long, then it must be re-rendered. Yes, it's another render, but proper time stamps are a must. Alternatively, I suppose a person could re-render each clip with the same clock; I'd rather do the entire file. As noted higher up in the post, raw clips from a phone usually use 1/90000 but ffmpeg uses 1/11488. It's also OK to add a gamma fix or anything else, so as not to squander the render. In the example here, I added a gamma adjustment.
    $ ffmpeg -i messy.mp4 -video_track_timescale 11488 [or 15232] -vf "eq=gamma=1.1:saturation=0.9" output.mp4

combine - simple, one file audio and video

I still have to specify the bitrate or it defaults too low. 3M for sports, 2M for a normal person talking.

$ ffmpeg -i video.mp4 -i audio.wav -b:v 3M output.mp4

combine - simple, no audio/audio (render)

If the clips differ in type, pixel size, anything -- rendering is required. Worse, mapping is required. Leaving out audio makes it slightly less complex.
$ ffmpeg -i video1.mp4 -i video2.flv -an -filter_complex \
"[0:v][1:v] concat=n=2:v=1 [outv]" \
-map "[outv]" out.mp4
Audio adds an additional layer of complexity.
$ ffmpeg -i video1.mp4 -i video2.flv -filter_complex \
"[0:v][0:a][1:v][1:a] concat=n=2:v=1:a=1 [outv] [outa]" \
-map "[outv]" -map "[outa]" out.mp4

combining with effect (crossfade)

*CPU intensive render, verify unobstructed cooling.

If you want a 2 second transition, set the offset 1.75 - 2 seconds before the end of the fade-out video. So, if foo1.mp4 is a 12 second video, I'd set the offset to 10, so it begins fading in the next video 2 seconds prior to the end of foo1.mp4. Note that I have to use filter_complex, not vf, because I'm using more than one input. Secondly, the offset can only be in seconds. This means that if the first video were 3:30 duration, I'd start the crossfade at 3:28, so the offset would be "208".

$ ffmpeg -i foo1.mp4 -i foo2.mp4 -filter_complex xfade=transition=fade:duration=2:offset=208 output.mp4

If you want to see how it was done prior to the xfade filter, look here, as there's still a lot of good information on mapping.

multiple clip crossfade (no audio)

Another scenario is multiple clips with the same transition, eg a crossfade. In this example, 4 clips (so three transitions), each clip 25 seconds long. A good description.

$ ffmpeg -y -i foo1.mp4 -i foo2.mp4 \
-i foo3.mp4 -i foo4.mp4 -filter_complex \
"[0][1:v]xfade=transition=fade:duration=1:offset=24[vfade1]; \
[vfade1][2:v]xfade=transition=fade:duration=1:offset=48[vfade2]; \
[vfade2][3:v]xfade=transition=fade:duration=1:offset=72" \
-b:v 5M -s wxga -an output.mp4
Some additional features in this example: -y to overwrite the prior file, 5M bitrate, and size wxga, eg if reducing quality slightly from 4K to save space. Note that each transition's duration reduces the running offset cumulatively. I touched a TXT file, entered the values for each clip and its offset, then just "cat"ted the file to see all the offset values when I built my command. Suppose I had 20 clips? The little 10ths of a second add up. Offset numbers off by more than a second will not combine with the next clip, even though the syntax is otherwise correct.
$ cat fooclips.txt
foo1 25.07 - 25 (24)
foo2 25.10 - 50 (48)
foo3 25.10 - 75 (72)
foo4 (final clip doesn't matter)

multiple clip crossfade (with audio)

This is where grown men cry. I have a feeling that if I can get it once, it won't be so bad going forward but, for now, here's some information and a sketch below. It appears some additional filters are needed besides crossfade and xfade.
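
Until then, a sketch of what I'd try (untested here; assumes both clips carry audio): xfade handles the two video streams while the acrossfade audio filter handles the audio.

$ ffmpeg -i foo1.mp4 -i foo2.mp4 -filter_complex \
"[0:v][1:v]xfade=transition=fade:duration=2:offset=10[v]; \
[0:a][1:a]acrossfade=d=2[a]" \
-map "[v]" -map "[a]" output.mp4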

fade-in/out*

*CPU intensive render, verify unobstructed cooling.

If you want a 2 second fade-out, set the start time (st) about 2 seconds before the end of the video. Let's say we had a 26 second video, so the fade starts at 24 seconds.

$ ffmpeg -i foo.mp4 -max_muxing_queue_size 999 -vf "fade=type=out:st=24:d=2" -an foo_out.mp4

color balance

Recombining is also a good time to do color work, even if just a basic eq. I've found that the general color settings (eg. 'gbal' for green) have no effect, but the fine-grained settings (eg. 'gs' for green shadows) do.

$ ffmpeg -i video.mp4 -i sound.wav -vf "eq=gamma=0.95:saturation=1.1" recombined.mp4

Note that a video filter forces a re-encode, so "-codec:v copy" can't be combined with "-vf"; specify a codec instead, as in the next example.

There's an easy setting called "curves"; eg, taking an older video and moving the midrange from .5 to .6 helps a lot (sketch below). Also, if bitrate is specified, give it before any filters; bitrate won't be detected after the filter.

$ ffmpeg -i video.mp4 -i sound.wav -b:v 5M -codec:v mpeg2video -vf "eq=gamma=0.95:saturation=1.1" recombined.mp4
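
A sketch of the curves idea mentioned above, lifting the midrange from .5 to .6 on all channels (point values illustrative):

$ ffmpeg -i video.mp4 -b:v 5M -vf "curves=all='0/0 0.5/0.6 1/1'" output.mp4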

Color balance adjusts the intensity of colors. There are 9 settings, one each for RGB (in that order) across shadows, midtones, and highlights, separated by colons. For example, if I wanted to decrease the red in the highlights and leave all others unchanged...

$ ffmpeg -i video.mp4 -b:v 5M -vf "colorbalance=0:0:0:0:0:0:-0.4:0:0" output.mp4

A person can also add -pix_fmt yuv420p to make it most compatible with Windows.

FFMpeg Color balancing  (3:41) The FFMPEG Guy, 2021. At 2:12, color balancing shadows, middle, high for RGB.
why films are shot in 2 colors  (7:03) Wolfcrow, 2020. Notes that skin is the most important color to get right. The goal is often to go with two colors on opposite ends of the color wheel or which are complementary.

adding text

*CPU intensive render, verify unobstructed system cooling.

For one or more lines of text, we can use the "drawtext" ffmpeg filter. Suppose we want to display the date and time of a video, in Cantarell font, for six seconds, in the upper left hand corner. If we have a single line of text, we can use ffmpeg's simple filtergraph (noted by "vf"). 50 pt font should be sufficient size in 1920x1080 video.

$ ffmpeg -i video.mp4 -vf "[in]drawtext=fontfile=/usr/share/fonts/cantarell/Cantarell-Regular.otf:fontsize=50:fontcolor=white:x=100:y=100:enable='between(t,2,8)':text='Monday\, January 17, 2021 -- 2\:16 PM PST'[out]" videotest.mp4

Notice that a backslash must be added to escape special characters: colons, semicolons, commas, left and right parens, and of course apostrophes and quotation marks. For this simple filter, we can also omit the [in] and [out] labels. Here is a screenshot of how it looks during play.

Next, suppose we want to organize the text into two lines. We'll need one filter for each line. Since we're still only using one input file to get one output file, we can still use "vf", the simple filtergraph. 10 pixels seems enough to separate the lines, so I'm placing the second line down at y=210.

$ ffmpeg -i video.mp4 -vf "drawtext=fontfile=/usr/share/fonts/cantarell/Cantarell-Regular.otf:fontsize=50:fontcolor=white:x=100:y=150:enable='between(t,2,8)':text='Monday\, January 18\, 2021'","drawtext=fontfile=/usr/share/fonts/cantarell/Cantarell-Regular.otf:fontsize=50:fontcolor=white:x=100:y=210:enable='between(t,2,8)':text='2\:16 PM PST'" videotest2.mp4

We can continue to add additional lines of text in a similar manner. For more complex effects using 2 or more inputs, this 2016 video is the best I've seen.

Ffmpeg advanced techniques pt 2 (19:29) 0612 TV w/NERDfirst, 2016. This discusses multiple input labeling for multiple filters.

PNG incorporation

If I wanted to do several lines of information, an easier solution than making additional drawtexts is to create a template the same size as the video, in this case 1920x1080. Using, say, GIMP, we could create a picture with an alpha background and several lines that we might use repeatedly, and save it in Drive. There is then an ffmpeg command to superimpose a PNG over the MP4; a sketch below.
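
A sketch of that superimposing step (assuming a same-sized overlay.png with transparency):

$ ffmpeg -i video.mp4 -i overlay.png -filter_complex "[0:v][1:v]overlay=0:0" -b:v 5M output.mp4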

additional options (scripts, text files, captions, proprietary)

We of course have other options for skinning the cat: adding calls to text files, creating a bash script, or writing python code to call and do these things.

The simplest use of a text file is calling it from the drawtext filter (textfile=) in place of writing the text out in each filter; a sketch below.
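
A sketch (notes.txt is a hypothetical file holding the caption text):

$ ffmpeg -i video.mp4 -vf "drawtext=fontfile=/usr/share/fonts/cantarell/Cantarell-Regular.otf:fontsize=50:fontcolor=white:x=100:y=100:textfile=notes.txt" videotest3.mp4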

viddyoze: proprietary, online video graphics option. If no time for Blender, pay a little for the graphics and they will rerender it on the site.

viddyoze review  (14:30) Jenn Jager, 2020. Unsponsored review. Explains most of the 250 templates. Renders to QuickTime (if alpha) or MP4 if not. ~12 minute renders.

adding text 2

Another way to add text is by using subtitles, typically with an SRT file. As far as I know, these are controlled by the viewer, meaning they're not "forced subtitles" which override viewer selection. Here's a page. I've read some sites on forced subtitles but haven't yet been able to do this with ffmpeg. Two sketches below.
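
Two sketches: the first muxes the SRT as a selectable track (no video render; mov_text is the MP4 subtitle codec), the second burns the text into the picture, which requires an ffmpeg built with libass and a full render.

$ ffmpeg -i video.mp4 -i subs.srt -c copy -c:s mov_text softsubs.mp4
$ ffmpeg -i video.mp4 -vf "subtitles=subs.srt" hardsubs.mp4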

audio and recombination

Ocenaudio makes the simple edits sufficient for most sound editing. It's user friendly, along the lines of the early Windows GoldWave app from 25 years ago. I get my time stamp from the video first.

$ ffmpeg -i video.mp4

Then I can add my narration or sound, after being certain that the soundtrack exactly matches the timestamp of the video. I take the sound slightly above neutral (256 is unity for ffmpeg's -vol; I use 330) when going to MP3, to compensate for transcoding loss. 192K is typically clean enough.

$ ffmpeg -i video.mp4 -i sound.wav -acodec libmp3lame -ar 44100 -ab 192k -ac 2 -vol 330 -vcodec copy recombined.mp4

I might also resize it for emailing, down to VGA or SVGA size. Just change it thusly...

$ ffmpeg -i video.mp4 -i sound.wav -acodec libmp3lame -ar 44100 -ab 192k -ac 2 -vol 330 -s svga recombined.mp4

For YouTube, there's a recommended settings page, but here's a typical setup:

$ ffmpeg -i video.mp4 -i sound.wav -vcodec copy recombined.mp4

ocenaudio - no pulseaudio

If a person is just using ALSA, without any pulse, they may (ironically) have difficulty using ocenaudio if HDMI is connected. A person has to go into Edit->Preferences, select the ALSA backend, and then play a file. Keep trying your HDMI ports until you land on the one with an approved EDID.

audio settings for narration

To separately code the audio in stereo 44100, 192K, 2 channels, see some settings below for Ocenaudio: just open it and hit the red record button. Works great. Get your video going, then do the audio.

Another audio option I like is to create a silent audio file exactly the length of the video, and then start dropping sounds into the silence, hopefully in the right places, using audio I may already have. Suppose my video is 1:30.02, or 90.02 seconds.

$ sox -n -r 44100 -b 16 -c 2 -L silence.wav trim 0.0 90.02

Another audio option is to use text to speech (TTS) to manage some narration points. The problem is how to combine all the bits into a single audio file to render with the video. The simplest way seems to be to create the silence file, then blend: run the video in a small window, open ocenaudio, and paste at the various time stamps. A sketch of the TTS step is below; the link that follows is by far the most comprehensive espeak video I've seen.
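
A sketch of the TTS step (speed and pitch flags to taste; the text is a placeholder):

$ espeak -s 150 -p 45 -w narration.wav "Narration for the first time stamp."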

How I read 11 books  (6:45) Mark McNally, 2021. Covers commands for pitch, speed, and so on.

phone playback, recording

The above may or may not playback on a phone. If one wants to be certain, record a 4 second phone clip and check its parameters. A reliable video codec for video playback on years and years of devices:

-vcodec mpeg2video -or- -codec:v mpeg2video

Another key variable is yuv. Eg,...

$ ffmpeg -i foo.mp4 -max_muxing_queue_size 999 -pix_fmt yuv420p -vf "fade=type=out:st=24:d=2" -an foo_out.mp4

To rotate phone video, often 90 degree CCW or CW, requires the "transpose" vf. For instance this 90 degree CCW rotation...

$ ffmpeg -i foo.mp4 -vf transpose=2 output.mp4

Another issue is shrinking the file size. For example, mpeg2video at 5M will handle most movement and any screen text, but creates a file that's 250M for 7 minutes. Bitrate is the largest determiner of size, but also check the frame rate (fps), which can sometimes be cut down to 30fps (-vf "fps=30") if it's insanely high for the action. You can always check stuff with ffplay to get these correct before rendering. Also, if the player will do H264, then encoding in H264 (-vcodec libx264) at 2.5M bitrate looks similar to 5M in MPEG2, which means about 3/5 the file size.