
Saturday, December 2, 2023

media backup - checklist

I have 2.5 other posts on this backup project, so nothing in-depth is covered here -- just the steps I can recall and a few tips, with a recap of the challenges at the top.

The Challenges

It's likely not a mistake that there's a laborious manual process instead of a simple software solution for the common need of backing up media. I smell entertainment attorneys.

  • every backed-up CD becomes a folder of MP3's. To recreate playing the CD, a person would have to sit at their computer and click each MP3 file in CD sequence, or else rip the entire CD into a single large file.
  • M3U files play MP3 files in sequence, eg in the same order as the CD. A large catch (also probably DRM-related) is that M3U files must contain hard links -- complete, system-specific paths -- to the media files for the M3U to function. Thus, any portable, relative-link solution is prevented. Further, entering hard links into M3U's must be done manually, and these long links increase the chance of fatigue and entry errors.
  • Most browsers refuse (probably due to industry pressures) to open M3U files, and will only download the M3U rather than playing these laboriously entered MRL links

NB Time: if a person has the real estate, the industry has made it easier to simply leave media on the shelf and pull it off when a person wants to listen. Backing up a 100-CD collection takes about 75 hrs (4500 mins), ie, about two work weeks. It's worth it, of course, if there's any attachment to the collection.

NB Hard Links: an HTML interface will provide access similar to the original physical disks, with a 'forever and in a small space' fillip. However, the first job is to find a browser that will open links to M3U's. This is probably a moving litigation target, but currently Falkon opens them, albeit with an additional confirmation step in each instance.

NB M3U's: these carry a lot of information in addition to links. Additional listens over the years allow a person to flesh out comments on every track, as much as they want, without affecting playback or being displayed. They are a private mini-blog for the listener to add info, times, additional versions, or to make new M3U mixes, etc. Protect at all costs.


configure (possibly 1 work day)

  • partition(s) for external backup disk, probably using XFS these days (2023), and a micro usb.
  • fstab is a PITA. It has to be modified to mount the drive holding the media in order for the hard links in M3U's to work. However, a modified fstab will cause boot to fail into maintenance mode if I boot/reboot the system without that drive (I usually specify /dev/sdd for the USB drive) connected.
    So at boot, return fstab to default. After boot, re-modify fstab with the dev and run "mount -a" (see the sketch after this list). Anyway, that's how f'ed up these WIPO organizations have made having a media drive.
  • Touchscreen device with a 3.5mm connector (speakers) and USB-C (for the backed-up external drive)
  • consider file structure: music, images, m3u's, booklet art, video versions, slower versions (for mixes, etc).
  • configure XDG for xdg-open to open M3U's with preferred player
  • review/establish ~/.config/abcde.conf and verify an operational CDDB. The CDDB saves at *least* 5 mins per disk; without it, one must enter all track names, artists, etc., by hand.
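For reference, a sketch of the fstab line itself, reusing the UUID from the mount-point post further down (yours will differ). Adding noauto,nofail is an assumption I haven't battle-tested -- not what I describe above -- but it should let boot proceed when the drive is absent, at the cost of mounting manually:

# nano /etc/fstab
UUID=ba60a72e-0db3-4a5f-bea5-c3be0e04cda1 /mnt/bigdata xfs rw,user,noauto,nofail 0 0
# mount /mnt/bigdata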

backing up (30mins per CD)

  • abcde the shit to 320kb
    $ abcde -d /dev/sr0 -o mp3:"-b 320" -c ~/.config/abcde.conf
  • while abcde runs, scan (xsane) the cover art and booklet at 200 or 300 dpi, as square as possible
  • create a PDF of the booklet/insert (convert the jpgs; see the sketch after this list), and save the front cover for easytag and HTML thumbnails
  • << above 3 should be done simultaneously, aiming for 15 mins per disk >>
  • easytag the files and clean their names, attach cover jpg/png.
  • create M3U's for each disk (geany, gedit, whatever; example in the sketch after this list)
  • << above 2 roughly 15 mins per disk >>
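A sketch of the two artifacts from the list above, with hypothetical file names. ImageMagick's convert binds the scans into a PDF (if the ImageMagick security policy permits PDF output), and a hand-built M3U needs only the header plus one hard link per track:

$ convert page1.jpg page2.jpg booklet.pdf
$ cat rumours.m3u
#EXTM3U
#EXTINF:257,Fleetwood Mac - Dreams
/mnt/bigdata/music/rumours/02.Dreams.mp3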

post-processing (15 mins per CD)

  • download a browser that will not block M3U's, eg Falkon.
  • start building your HTML page
  • enter each file's relevant info into the schema
  • create 100x100 thumbnails for faster webpage loading
    $ mogrify -format png -path /home/foo/thumbs -thumbnail 100x100 *.jpg

tips

  • keep MP3 file names short, since they have to be hand entered into the M3U. Longer names can be in the ID3 tag, and/or M3U file. Both the ID3 info, and especially the M3U file, can accept additional information later, at one's leisure.
  • typos waste a lot of time and break links. Cut and paste whenever possible for accuracy.
  • leave the HTML interface file continually open in the same text editor as for the M3U's. Geany is an example. There are continual modifications and touch-ups to the HTML page, even as I open and close various M3U's. And a lot of copying and pasting from the M3U's into the HTML file.
  • Geany has a search and replace function. When creating the M3U for a CD, a person can copy an existing M3U into the folder of the CD they are working on, and use it as a template. Just rename it for the current CD, and then use search and replace to update all the links inside the M3U in a single pass. A person can then start editing the song names without having to re-enter all the hard link information. Saves time.
  • run the scanner every so often without anything and look at result to see if glass needs cleaning
  • make an M3U template else continually waste 5 mins eliminating prior entries from the one copied over. Every CD will need an M3U to play all of its songs.
  • This is good software cuz it has melt included for combining MP4's

Monday, November 27, 2023

xdg mime, usb mounts

If we have to jettison our physical CD's and DVD's in the name of space, we unfortunately must back them up first. At that point, we lose the convenience of...

  1. easy playback is lost (pull the CD/DVD off the shelf, put it in a player, and press 'play')
  2. one-click play is lost: in lieu of pressing 'play' on a player, how do we play an entire CD's set of MP3's with a single click?
  3. global selection is lost: how do we easily survey our entire collection of CD's/DVD's, as we used to on a shelf?
  4. the portability of a bookshelf CD player is gone; we now require a device with an interface to select and play the music

solution

Turns out that 1, 2, and 4 are related questions. We can create an M3U for each CD (not a trivial task), then create an HTML page with hyperlinks to the M3U's. So when we click the HTML link in our browser, the M3U is opened by the default application (eg. VLC), which plays the CD's MP3's in the order they appeared on the CD.

This fundamentally solves problems 1 and 2. And since HTML pages open on nearly any device with a web browser, we have a good start on solving problem 4.

To solve problem 3, perhaps we can also eventually add thumbnails -- a thumbnail for each CD -- to our HTML page, and then embed an M3U link into the thumbnail: see a thumbnail for a CD, click the thumbnail. Since we can place any number of thumbnails on an HTML page, we can likely see our entire collection on a single webpage, eg the sketch below. At that point, we'd only need to consider what device and speakers to connect, the hardware.
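A minimal sketch of one such thumbnail link, with hypothetical paths (the absolute-path requirement for the href is covered under "troubleshooting" below):

<a href="/mnt/bigdata/music/rumours/rumours.m3u">
  <img src="thumbs/rumours.png" alt="Fleetwood Mac - Rumours" width="100" height="100">
</a>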

This is a fairly simple schema, and attainable, but it's a significant investment of work: we must create an intuitive HTML page, and multiple M3U's. The CD's song order and file locations cannot be determined by the application (eg, VLC) without an M3U, so an individual M3U must be created for each CD, and for any mixes.

nested additional problem

We want to open our HTML file in a browser and click a link to the CD's M3U. However, links to M3U's have no default application and thus do not natively work when clicked in browsers. So now our job is twofold.

  • We must create functional M3U files
  • We must configure our browser or OS to make hyperlinks to M3U's click-to-play. That is, we must associate an application with the HTML link. The OS uses XDG to manage these associations.

xdg 'desktop' files

The XDG system is a set of scripts which connects file types and applications. Suppose our browser is Chromium and we click on a website link to a PDF. Chromium makes a call to the XDG system (xdg-open). If we've registered an app for our PDF files with XDG, the application (eg. Evince) opens the PDF.

It's a chain, so if we haven't registered a default for PDF's in XDG, Chromium's call to XDG produces no information. In these circumstances, Chromium simply downloads the PDF. XDG itself has its own files and file types with which it makes these connections. We'll configure XDG to connect the M3U to VLC, the same way it connects a PDF to Evince.

This seems simple, but later we will find out that Chromium refuses to open M3U's even when XDG is properly configured for it. See "troubleshooting" further down the page.
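For reference, a 'desktop' file is just an INI-style text file in /usr/share/applications/ or ~/.local/share/applications/. A minimal, hypothetical entry for a player looks something like:

[Desktop Entry]
Type=Application
Name=VLC media player
Exec=vlc %U
MimeType=audio/x-mpegurl;audio/mpeg;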

m3u xdg registration

Our clickable schema depends on M3U's being played from the browser. However, XDG does not typically have a default application for M3U's. Until we configure one, browsers that contact XDG get no information. As noted above, browsers typically just download the M3U file. In order for the browser to process a click on an M3U hyperlink (without downloading), we must create an association between M3U's and an application. XDG manages this.

  • add a file type (arch): scroll down to xdg-open and perl-mime-types. Perl mime types is straightforward, and this worked IME. Informative arch page; see also their additional page.
  • add a file type (stack): add an existing file type.
  • add a file type (stack): the most thorough description. Includes syntax for any file type.
  • add a file type (superuser): another method, slightly more superficial, for existing file types. Create a desktop file, then add it to mimeapps.list or run xdg-register.
  • add a file type (askubuntu): have an existing file type and need to associate it with an application.
  • list of associations (unix exchange): how to get a list of default file apps.

configure m3u

Verify M3U is already defined within the XDG system.

$ xdg-mime query filetype foo.m3u
audio/x-mpegurl

...or...

# pacman -S perl-mime-types [incl mimetype]
$ mimetype foo.m3u
foo.m3u: audio/x-mpegurl

...then, to associate it to vlc, or whatever player....

$ mimeopen -d foo.m3u

...verify that (in this example) vlc was associated with it...

$ xdg-mime query default audio/x-mpegurl
vlc.desktop

...then refresh the databases (system-wide as root, or per-user with the local path)...

# update-desktop-database
# update-mime-database /usr/share/mime

...or...

$ update-mime-database ~/.local/share/mime

verify file opens natively via xdg

$ xdg-open foo.m3u

it should open with vlc.
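Alternatively, skip mimeopen and set the default in one step:

$ xdg-mime default vlc.desktop audio/x-mpegurl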

thumbnails

We need thumbnails of CD insert/booklet art for our omnibus music page. Imagemagick is our friend for processing an entire directory of photos into thumbnails. NB: convert writes a new output file, while mogrify overwrites images in place (destructive) unless given a -path to write elsewhere, as below.

$ mogrify -format gif -path /home/foo/thumbs -thumbnail 100x100 *.jpg

troubleshooting

1. M3U access through browser

Install a browser, such as Falkon, which respects XDG settings for M3U's.


Chromium will not open an M3U. Probably a DMCA protection, since M3U's can be built to do streaming, not simply play local files the way I use them. Priority (top of the foodchain) is supposed to go to ~/.config/mimeapps.list, but Chromium does not honor any XDG M3U settings or files.

IME, the simplest, fastest solution to this Chromium problem is to install a browser such as Falkon, which respects xdg-open settings. For our music schema to work, we need a browser to open our HTML files.

$ cat .config/mimeapps.list
[Added Associations]
application/pdf=org.gnome.Evince.desktop;
image/jpeg=geeqie.desktop;
text/plain=org.gnome.gedit.desktop;
image/png=geeqie.desktop;
image/gif=vlc.desktop;geeqie.desktop;
video/mp4=xplayer.desktop;
video/mpeg=xplayer.desktop;
application/octet-stream=org.gnome.gedit.desktop;

[Default Applications]
application/pdf=org.gnome.Evince.desktop
image/jpeg=geeqie.desktop
text/plain=org.gnome.gedit.desktop
image/png=geeqie.desktop
image/gif=geeqie.desktop
video/mp4=xplayer.desktop
video/mpeg=xplayer.desktop
audio/x-mpegurl=vlc.desktop;

2. browser path requirements lead to permanent mount point naming

Create a mountpoint and an identical /etc/fstab entry on every device that needs access to the USB external drive. All links in our music setup will use this path.


Seems impossible but, when a browser opens an HTML page, the links to M3U's cannot be just the file name, eg. "foo.m3u", even if the M3U is in the same folder as the HTML file. We're not used to this: HTML files easily display photos in the same directory or in a subfolder such as 'images'. But for the M3U to open, it must be called with the complete path to the file starting from its mount point, eg, "/run/media/[USER]/[LABEL]/music/foo.m3u".

This poses a problem for the user. Each computer has a different username, and the "run" mountpoint is temporary. Gvfs or fusermount inserts the USER and partition LABEL when it mounts the drive, eg, /run/media/[USER]/[LABEL]/. But we can't change the HTML links to our 100+ M3U files every time we mount the USB back-up drive in a different system.

Passing the environment variable '$USER' into our URL is also not easy, due to security problems with URL's on non-local systems that connect to the internet. I tried USER, $USER, %USER%, 'USER', '$USER', '%USER%', `USER`, `$USER`, and `%USER%`. None worked.

To obtain USER, we can simply run whoami, or print a larger list of environment variables with printenv. To determine LABEL, we can of course use 'lsblk', or the more complete...

$ lsblk -o name,mountpoint,label,size,uuid

The next level is a udev rule or an fstab configuration that I would place on any machine I use with the backup drive. But GVFS is extremely powerful, and udev, fstab, etc may only unreliably/unpredictably override it.

I decided to try an fstab addition since this post (scroll down) made it seem the simplest solution. If I had done the udev rule, the persistent naming setup would have been from kernel detection.

In either case, we basically want to override gvfs when the UUID or LABEL of the backup USB is detected. Unfortunately, we can never be sure: GVFS might be fickle on some system and refuse to be overridden by /etc/fstab. But we must attempt it; otherwise we cannot use HTML and a browser to manage the media collection. The process is from this post.

  1. Create a permanent mount point. "run/media" is a temporary file system used by GVFS. I decided to create /mnt/[label], where 'label' is the label of the partition. In this case...
    # mkdir -p /mnt/bigdata
  2. update /etc/fstab, then do systemctl daemon-reload
    # nano /etc/fstab
    # UUID=ba60a72e-0db3-4a5f-bea5-c3be0e04cda1 LABEL=bigdata
    UUID=ba60a72e-0db3-4a5f-bea5-c3be0e04cda1 /mnt/bigdata xfs rw,auto,user 0 0
    # systemctl daemon-reload
    # mount -a
  3. With the "mount all", the device should mount at that directory with proper permissions. We can verify...
    $ cat /etc/mtab
    /dev/sdd1 /mnt/bigdata xfs rw,nosuid,nodev,noexec,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
    ...and of course try a test write and erase to the drive to verify user permissions.
  4. Now whenever we create a hyperlink in our music oversight HTML file, we can use a persistent, cross-platform link. Eg, for the M3U, we might have an address of /mnt/bigdata/foo.m3u in the link. To connect any other system: 1) create /mnt/bigdata, and 2) modify its fstab. All links to music and M3U's in our HTML page should then work.
  5. The USB drive will *not* appear in our temporary drive list in our file manager. We'll have to navigate to /mnt/bigdata to see or edit the drive's contents.

Saturday, November 25, 2023

crucial x6 (Micron - 0634:5602)

This is a $199 (early 2023)/$150 (late 2023) 4TB USB-A (3.2) <--> USB-C (Android) external SSD. We're capped at about 800MB/s of data transfer, but being external obviates annoying NVMe (PCI only) issues. With SSD's, only writing, not reading, decreases lifespan. I'm thinking of these in terms of 10-year lifespans.

$ lsusb
Bus 003 Device 009: ID 0634:5602 Micron Technology, Inc. CT4000X6SSD9
$ lsblk
sdc 8:32 0 3.6T 0 disk
├─sdc1 8:33 0 128M 0 part
└─sdc2 8:34 0 3.6T 0 part /run/media/foo/Crucial X6
$ cat /etc/mtab
/dev/sdc2 /run/media/foo/Crucial\040X6 exfat rw,nosuid,nodev,relatime,uid=0500,gid=0500,fmask=0022,dmask=0022,iocharset=utf8,errors=remount-ro 0 0

FS considerations

The X6 comes with an extended FAT (exfat) 3.6TB partition, and thus syncs to MSoft devices, with an added file-management area ("System Volume Information") of about 128MB. Should we keep the exfat? Extended FAT has an upper recommended partition limit of 512TB, so the drive is well within its useful partition size, and it's also compatible with Apple devices. YMMV but, in my systems, I prefer zero MSoft, whether in applications, file systems, or anything else.

A person might next consider formatting to ext2. I've found it reliable for more than a decade. However, 2038 is the final time-stamp date available for ext2 files -- ext2 was released in 1993 -- so ext2 is unfortunately nearing its expiration.

Luckily, any FS we want will do fine: external drives don't need to boot, and we only need a single partition to store data. BtrFS supposedly has flexible inode sizing (so it can manage smaller and larger files side by side), but Oracle was involved in its development. Currently, I believe xfs is worth a try in the absence of ext2. Xfs is the default on RedHat systems, so it has active development. As for ext4, this Reddit post compares ext4 and xfs. We may also soon want to consider zfs, since it's gaining momentum with large stakeholders.

XFS

Pretty sure xfs will not natively mount on Windows or macOS systems; it has the option of encryption, but also a drawback: it appropriates journaling space. RedHat has info, of course, and the Arch guide is also good. XFS.org also has its own FAQ.

# umount /dev/sdc
# gparted /dev/sdc [eliminate present stuff, add xfs partition]
# mkfs.xfs -L "bigdrive" /dev/sdc1
# xfs_repair /dev/sdc1
# chown 0500:0100 /run/media/foo/bigdrive

The reason for the xfs_repair is to verify it formatted successfully.

The reason for the chown: for some reason, gvfs mounts xfs as root:root (0:0) instead of the standard user:group setup. Since gvfs mounts other USB drives to group 100, I chown-ed the drive to UID:group, that is to 0500:0100. Can also chown to 0500:0500 if desired, that is user:user (supposing, eg one's UID were 500).

I only had to chown it once and gvfs subsequently auto-mounted the drive with the correct permissions.

If stripe unit and stripe width parameter warnings appear, read this and ignore them. Those are for RAIDs.

Monday, November 20, 2023

media back-up

This post deals with audio CD's and DVD video -- I have no BluRay media. And it's sort of a worst-case scenario, one where a person can't physically keep their media. I've learned an immense amount, because it looks like a straightforward project, but it's not. I've written another blog post covering most of the remainder of it.


I recently cleared-out my storage area. I hadn't been in there in 17 years. A few boxes of CD's and DVD's were part of the contents, and the memories of the times when I purchased the media surged in an instant. They welled-up so quickly and powerfully that I choked-up. How had I become so weak and disfigured? Uncomfortable insights.

In spite of such anguish, a person is unlikely to trash the media. It's natural for a person to honor their history. At some point, I want to light a cigar, take a keepsake media disc off a shelf, and put it in a player for a rewatch/relisten.

If there's no home with a shelf, very small storage areas can still be rented for what we used to pay for large ones. Ie, the rent for what used to be an 8x10' might now get a 4x4'. A 4x4 can hold a few CD's and a few papers, etc. Maybe $45.

Minus a millionaire's floorplan or an affordable storage area, a person realizes their options are down to one: back the media up, probably in some mediocre fashion/bitrate without cover art, and probably without individual tracks or chapters. Is it even worth it at all? As in every other post on this blog, the answer is "it's entirely up to you", and "the info here pertains to Linux". I found it was worth it for some discs, and others I just threw away. It's about 20 mins per CD, and about X mins per DVD.


TLDR

audio: 20 mins per CD

  • abcde to back up the files and to import online CDDB data for ID3 tags. Verify the CDDB fetch URL inside abcde.conf
  • Easytag (or some ppl prefer picard) to clean up ID3 tags.
  • Sometimes groups of MP3's will require renaming. Gprename can hasten this.
  • Cover art files do not always download, and will never include the complete inner notes. While abcde is working, scan the inner liner at 200-300 dpi, as close as possible to a square (1:1) crop
  • Later, can make thumbnails for the HTML oversight page.
$ abcde -d /dev/sr0 -o mp3:"-b 320" -c ~/.config/abcde.conf

dvd: handbrake to backup the files.


playback note

See my post, which suggests an HTML/M3U file solution. Some may wish to install LibreELEC and Kodi in an old laptop to work as a "CD player". All of the engineered "media manager" products are bullsh*t IMO -- more concerned with phoning home your collection than any other task. A simple HTML page works much faster, and more reliably and configurably.

If we formatted our external back-up drive with XFS, then it will not work with a Windows or Mac device. So we will at least need some device to access the files, and a 3.5mm jack in the device to move the audio to a computer speaker system or some headphones. Alternatively, we could design some static web pages and make thumbnails of the CD covers with built-in links to playlists, or which open the folders and play the contents. The specs for m3u's are here.

Although I run icewm, my MIME settings are handled by xdg. So it will play an MP3 link, but download an M3U link.

$ xdg-settings get default-web-browser
chromium.desktop

Applications are represented by .desktop files. Not sure of all the details.
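To poke around, the .desktop files live in the standard locations:

$ ls /usr/share/applications/ ~/.local/share/applications/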

audio 90-100Mb per CD

Time- and disc-wise, a CD like Fleetwood Mac's Rumours, about 40 mins long, takes about 12 min to rip and transcode at 320k. The 320k bitrate leads to a 90MB folder of files: 320 kbit/s ÷ 8 = 40 KB/s, or roughly 2.4MB per minute. I find 90MB per CD unproblematic at today's drive prices.

Quality-wise, I prefer a full 320k MP3, esp if the music is nuanced with flanging or orchestration. IMO, 192k can be OK, but 128k is definitely not OK unless it's just speech or some such.

Probably abcde is still the easiest attack on a stack of backup CD's. We need some way to save id3 information, and probably edit it (eg, easytag). Installing abcde will also pull-in python-eyed3. I installed glyph out of an abundance of caution for photo embedding.

$ yay -S abcde id3 python-eyed3
# pacman -S glyph id3v2

audio - abcde use and config

Abcde has an old-school text conf file. Get the skeleton from /etc and copy it to wherever, eg ~/.config. The main edit is the CDDB URL, which is where abcde finds the ID3 information to populate the files.

$ cp /etc/abcde.conf ~/.config/abcde.conf
$ nano ~/.config/abcde.conf
CDDBURL="gnudb.gnudb.org/~cddb/cddb.cgi"

When executing, the config file is called with the "-c" flag and path (leave the tilde unquoted so the shell expands it).

$ abcde -c ~/.config/abcde.conf

Abcde does not rip directly to MP3; it rips the entire CD to WAV's (10 mins), then converts each WAV to whatever number of other formats we want. From rip to conversion to one other format is about 12 mins per CD. Native MP3 conversion bit rate is 128k. Of course, if a person simply wants the WAV's, then use the defaults, eg $ abcde -d /dev/sr0.

For this project I'm likely to specify MP3's at 320k, though a person can specify what they like (or leave off the ':"-b 320"' for 128k).

$ abcde -d /dev/sr0 -o mp3:"-b 320"

audio - ID3 info

ID3 is no longer reliably developed, and there are different versions which might conflict. Of the versions, I consider ID3v2 compatible with nearly any player, and therefore a reliable ID3 format for my MP3 files. ID3v2 also appears to be the version EasyTag uses. So after first tagging some MP3 files with EasyTag, I verified the ID3 version by installing id3v2. Then I ran its utility in a terminal, $ id3v2 -l [name].mp3, to verify my ID3 tags were version 2, ie, ID3v2.

$ id3v2 -l 02.Dreams.mp3
id3v2 tag info for 02.Dreams.mp3:
TIT2 (Title/songname/content description): Dreams
TPE1 (Lead performer(s)/Soloist(s)): Fleetwood Mac
TALB (Album/Movie/Show title): Rumours
TYER (Year): 1977
TRCK (Track number/Position in set): 02/11
TCON (Content type): Rock (17)
02.Dreams.mp3: No ID3v1 tag

Even with a functional CDDB URL in our ~/.config/abcde.conf , we often need to do the ID3 and art manually.

# pacman -S easytag
$ yay -S gprename
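Besides the GUI tools, the id3v2 utility can also write tags from the CLI -- a sketch with hypothetical values:

$ id3v2 -a "Fleetwood Mac" -A "Rumours" -t "Dreams" -T 2 02.Dreams.mp3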

audio - ID3 art

It appears EasyTag only recognizes photos if gdk-pixbuf2 is installed.

# pacman -S gdk-pixbuf2

I arrange the scanner (xsane) for 300dpi, and preview the image close to a square. The resulting scan I then scale: get the WxH to about 1400x1400. This makes about a 500KB file to add to each music file in that folder, so if there are 10 files on the CD, it adds 5MB to the entire folder. The ~1400x1400 pic will display pretty sharp when running the file in VLC; not sure about other players. If I have an internal booklet in the CD, I scan it all and make a PDF to put in the folder.
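The scaling step is one ImageMagick line (hypothetical file names; convert leaves the original scan untouched):

$ convert scan.png -resize 1400x1400 cover.png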

Adding the art: put the image in the directory with the MP3's, open EasyTag in that directory, and select all the files that should receive the image (typically the cover image). Then go to the right, select the "images" tab, add the pic with the "+" icon, and save it to the files. A person will figure it out fast.

Removing art is fickle. I can do it in EasyTag, but the art seems to stay in the VLC cache, so I can never be sure. A sure way is to 1) delete VLC's album-art cache directory, ~/.cache/vlc/, and 2) go nuclear by "cd"ing into the music files directory and then...

$ id3v2 -r "APIC" *.mp3

audio - manual tasks

If I find old directories full of WAV's, like CD's I made back when, or whatever, I can batch convert the WAV's into 320k cbr MP3's.

$ for f in *.wav ; do lame -b 320 "$f" ; done

If I find a track that I want to slow down (I don't mind the pitch change), and put a 0.8-speed version in the CD directory, perhaps to use with mixes, then...

$ sox foo.mp3 -C 320 foo8.mp3 speed 0.8

If I want to combine all the tracks in the CD into a blob, so I can play easily when driving or whatever...

$ sox foo1.mp3 foo2.mp3 foo3.mp3 output.mp3

There's a lot more about sox here.

Then I can get the art for it online or design something. Either way, put it in the folder with the files and add the art and info via EasyTag.

dvd 1.3G each

dvd - art

Use at least 300 ppi and roughly square dimensions. Give it the same name as the film file, but a JPG extension. When company arrives, they can easily peruse a collection with a file manager, or by browsing the photos in something like Geeqie.

dvd - content

Most movies (1.7 hrs) are 1.1GB (480p) and take about 20 mins to back up, assuming a person continues their other computer tasks throughout the encode. Episodes of TV shows are typically smaller. If I have a higher-res recent DVD, I use the Fast (not Super-Fast) 720p setting in HandBrake; it's 2 passes. Otherwise I use 480p Fast. The HandBrake software install is about 80MB.

The rip is two parts, extraction and transcoding, and DVD structure is described here.


First, don't forget the obvious...

# pacman -S libdvdcss libdvdread libdvdnav

...since literally *none* of the players will do anything but spawn errors without explanation if these are lacking.

One fairly simple Arch solution is HandBrake.

# pacman -S handbrake

To execute, run ghb, which, along with its icon, a person can find with ye olde...

$ pacman -Ql handbrake

...and there are flags of course, though I'm not sure where they're documented.

I start HandBrake from a terminal (again, "ghb") even though it's a GUI, because the CSS keys take a minute or two to propagate and I can't see when they're finished from within the application. Once I see in the terminal that the keys have propagated, I can return to the GUI selection options.

dvd - storage

Probably need to think about file structure. At least something like "movies, bbc, adultswim, sitcoms, detective", or some such. Then include the scan of the cover art with the file.

Tuesday, July 7, 2020

rclone details

In a prior post, I'd found that using rclone to upload RESTful (rclone uses REST, not SOAP) data had become more complex -- by at least three steps -- than shown in two foundational videos from 2017:
1. Rclone basics   (8:30) Tyler, 2017.
2. Rclone encrypted   (10:21) Tyler, 2017.
These videos are still worthy for concepts, but additional steps -- choices, actually -- must be navigated for both encrypted and unencrypted storage, whichever one desires. Thus, a second post. Instead of signing in and out of one's various Google and OneDrive accounts, all are accessed from a single rclone client. Rclone is written in Go, so building it from source pulls in the immense (~500Mb) Go toolchain; the packaged binary is self-contained.

across devices

To install rclone on multiple devices, including one's Android phone (RCX), save one's ~/.config/rclone/rclone.conf. For each installed client, simply duplicate this file and one can duplicate the features of the original installation. If one has encryption, losing this file would be very bad.

deleted configurations

  1. ~/.config/rclone/rclone.conf (client). If this file is lost, duplicate it from another device. If lost entirely, access must be re-established from scratch, and the encrypted files will be lost permanently.
  2. scope (google). Google requires authentication, and keeps the access details on its side. Documentation is difficult to find, other than the OAuth info in the prior sentence. It appears that users cannot directly edit any of the 11 defined access scopes (files), but can only work through a Google dialog screen. When installing rclone, 5 of the 11 scopes are available, of which I typically like "drive.file".

command usage

For simplest use, to the root directory...

$ rclone copy freedom.txt mygoogleserv:/

Not all commands work on all servers, so use...

$ rclone help

instead of...

$ rclone --help

The former will display only the commands available in the installed version of rclone. The latter shows all commands, but not every compilation has all of them.

$ rclone about mygoogleserv:
Total: 15G
Used: 10.855k
Free: 14.961G
Trashed: 0
Other: 40.264M
Of course, there's also the GUI, rclone-browser.

encryption notes

Rclone documentation notes strong encryption, especially if salt is used. Minimally, we're talking 256-bit. Of course governments can read it, but what can't they read?
  • unencrypted accounts must be established first. Encryption is an additional feature superimposed onto unencrypted accounts.
  • remember the names of uploaded encrypted files; even the names of files are encrypted on the server and the original filename is necessary for download.
  • keep the same encryption password on all devices on which rclone is installed.

glossary

  • application data folder (Google) a hidden folder in one's Drive (not on one's PC). The folder cannot be accessed directly via a web browser, but can be accessed by authorized (eg OAuth) apps, eg rclone. The folder holds "scope" information for file permissions.
  • authorization (OAuth, JWT, OpenID) protocols for letting a third-party REST app (rclone) move files in and out of a cloud server (Google, AWS, Azure, Oracle); there's an authorization process between the two, even though you are authenticated with both.
    What is OAuth (10:56) Java Brains, 2019.
    What is JWT (10:34) Bitfumes, 2018.
  • scope (Google). the permissions granted inside Drive to RESTful data uploaded by users using, eg, rclone.
  • REST Representational State Transfer, an API style for server-to-client data transfer. Wikipedia notes this is an industry term, not a concept copyrighted by Oracle or Google. It refers to data exchanged between applications or databases by user-authorized third-party apps, as opposed to data directly entered by users, or data exchanged between servers without user authorization.

    REST API concepts and examples (8:52) WebConcepts, 2014. Conceptually sound on this HTTP API, even though dated with respect to applications. Around 7:00 covers OAuth comprehensibly.

  • SOAP Simple Object Access Protocol. This is the older API for server to client data transfer.

    SOAP v. REST API (2:34) SmartBear, 2017. Very quick comparison.


Google 15GB

Users can personally upload and save files in Google Drive through their browser, as we all know. However, Google treats rclone as a third-party app doing a RESTful transfer and uses OAuth to authorize it. Additional hidden files are created by Google and placed into one's Drive account to limit or control the process.
Within that process, there are two ways to rclone with Google Drive, slower or faster. The faster method requires Google Cloud services (with credit card) and a ClientID (personal API key). The slower way uses rclone's generic API connection.

1. Slower uploads

Faster to set up, but slower uploads. Users regularly backing up only a few MB of files can use this to avoid set-up hassles. It bypasses the Cloud Services API and uses the built-in rclone ID to upload as directed.
  1. $ rclone config
    ... and just accept all defaults. For scope access, I chose option "3", which gives control over whatever's uploaded.
  2. verify function by uploading a sample file, and by looking in ~/.config/rclone/rclone.conf to see that the entry looks sane
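A quick sanity check, assuming the remote was named "mygoogleserv" during config:

$ rclone copy test.txt mygoogleserv:/
$ rclone ls mygoogleserv:/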

2. Faster uploads

This method requires a lengthier set-up but, once configured, rclone transfers files more quickly than the generic method above. Users need a credit card for a Google Cloud Services account, which in turn supplies them with a ClientID or API key for rclone or other 3rd party access into Drive.
  1. get a Google email
  2. sign-up for Google Cloud services
  3. register one's project ("app", in this case just rclone) with the Google API development team
  4. wait for their approval -- up to 2 weeks
  5. receive a Client ID and Client Secret, which allow faster uploading and downloading through one's Drive account

These two videos move very quickly; however, they show the preferred Client ID and Client Secret method that supposedly speeds the process over the built-in ID's.

Rclone with Google API (6:38) Seedit4me, 2020. The first four minutes cover creating a remote and the 5 steps in creating the Client ID and Secret.
Get Client ID and Secret (7:29) DashSpan.me, 2020. Download and watch at 40% speed.

OneDrive 2GB

This primer is probably the best for OneDrive; however, it also applies to many of the other providers.

metadata and scope

These are hidden files within one's Google Drive. This is part of the Google Drive API v3, which is what rclone uses to connect and transfer files. In particular, you will want to know about the Application Data Folder.
Google API v3 usage (5:28) EVERYDAY BE CODING, 2017.
Get Client ID and Secret (7:29) DashSpan.me, 2020. Download and watch at 40% speed.
RESTFUL resources and OAuth (55:49) Oracle Developers, 2017.

Tuesday, June 30, 2020

backup - texts (sms/mms)

Unlike years prior to Patriot Act, BSA, and DMCA, both criminal and civil agencies seem to operate assuming anything on citizens' cell phones or computers will eventually be discoverable, if not somehow prosecutable. These forces are much larger than us and there's little way we can protect ourselves from their creepiness.

In the face of this, we'd prefer to delete all of our data every day but our lives would be even more negatively impacted by these forces if we do. We'd lose track of birthdays, anniversaries, important receipts, and so forth. So we still need to retain some data for our daily affairs in these absurd times.

Understanding that protecting data is impossible for less than a team of experts, what can we individuals do to retain some data and somewhat mitigate its exposure? If we can do some selective capture, we might be able to maintain our activities without gov't and info-agency stalkers seeing 100% of our private affairs. Texts are possibly a good test case.

  • application my two primary considerations are 1) include all conversations in a single back-up, 2) have a clear-text, non-proprietary format. There was a great app, "Email My Texts", which collated a selected period of texts into TXT and included attachment file names. Google mysteriously removed this app from the PlayStore. It seemed like a killer app that customers were choosing over all alternative$, and maybe Google disapproved, not sure. The remaining apps have awkward formats which parse conversations or use proprietary XTML, PDF's (immensely inefficient), and so on. The best option I've found of the remaining PlayStore apps in 2020 is...
    ... which is worthy of upgrading to Pro, for something like $4.
  • format CSV is the only option in text apps (2020). If a TXT app returns that can put all conversations in a single file and name the attachments, I'd take it over CSV, but that's 2020 for you.
  • storage run rclone to encrypt CSV files to cloud storage. I cover this thoroughly in next month's rclone post. Meanwhile...
    $ rclone listremotes
  • searching obviously, the reason to back up texts is the same reason to back up emails: you might need some information downstream. How can we search encrypted CSV files in such a way that we can easily find keywords, and then print the date and parties to the interaction? Not easily. Perhaps a Python script which displays the results in a browser, sequentially as it processes a CSV file.

storage

Assuming this is encrypted via rclone, watch these videos first...

1. Rclone basics (8:30) Tyler, 2017.
2. Rclone encrypted (10:21) Tyler, 2017.

...which can be followed verbatim. There are some new details since these videos, discussed in one of my other blog entries, but the core of the setup is the same. Additionally, this video (8:19, 2017) has some good basic commands.

search

In order to find information in a haystack of encrypted CSV files, each file must be decrypted and grepped individually. Since we have many encrypted CSV files, this is unlikely to be efficient. It's probably worthwhile to have a passel of encrypted CSV files on the Cloud, and then a local backup for parsing with one's Python script.
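A sketch of the one-at-a-time search, assuming a crypt remote named "cryptserv" (hypothetical): rclone cat decrypts the stored file on the fly, and grep does the rest.

$ rclone cat cryptserv:texts/2020-06.csv | grep -i "birthday"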

emails

A second question arising from text retention: how do we save the emails we want to keep, and encrypt them? And how do we display them?

Tuesday, June 9, 2020

system - server - hosting

We want a system for learning management (LMS), and another for general usage. I like the Moodle LMS and Nextcloud. The problem is that, for years, both of these had to be run locally (VPN); you couldn't really webface them. New solutions are making it possible to do both. I've previously had webhosting, and I think that's been part of the problem. This time around I want a VPS. I would still put Nextcloud on a VPN, but I think Moodle can reasonably be done on a VPS at this point, with TOTP. So we can host Moodle on Google, but the question is which tech stack (see below). The idea is there are 3 layers: the hosting (Google), the http server (Apache), and the system (Moodle, Nextcloud).

  • VPS - Virtual Private Server. Cloud server. Google, UpCloud
  • VPN - Virtual Private Network. Home server. Unlimited storage, only limited by HDD space. I am uninterested in the typical web usage of VPN's for anonymity and so on. These are mostly useless (see vid from Wolfgang's Channel below). Thinking here of the much more prudent usage of a home network for a VPN. It's possible to make it web-facing also, but this should not be done without 2FA and SSL.
  • Backup Critical files need this. Probably anything paper that's irreplaceable, eg, DD214, grades, etc. This shouldn't need more than about 1-5 GB anyway, but it's critical. Chris Titus uses BackBlaze. BackBlaze however relies on Duplicity, which in turn relies upon the dreaded gvfs, one of the top 5 no-no items (pulse audio, gvfs, microsoft, oracle, adobe). Use some other provider with rclone, rsync, remmina, cron.

plan

Current A-Plus costs: $5/month x 2 sites ($120/yr) + annual 2 x domain w/privacy ($30); only one site has MySQL.

  1. DNS - Google ($12 yr x 2 incl.privacy)
  2. rclone some criticals to Drive
  3. Moodle VPS on Google LXC
    • $ yay -S google-cloud-sdk 282MB
    • go to Google Cloud and provide credit card
    • follow Chris Titus' instructions in video below

    Host on Google (30:32) Chris Titus Tech, 2019. Do an inexpensive, shared kernel setup. Uses Ubuntu server and Wordpress in this case.
    Moodle 3.5 Install (22:47) A. Hasbiyatmoko, 2018. Soundless. Steps through every basic setup feature. Ubuntu 18.04 server.

  4. Nextcloud VPS on Skysilk ($60)

1. transfer DNS to Google

Chatted with the old provider and obtained the EPP's for both domains, and began registration with the new registrar. Once these are established, we'll have to change the A-records, and perhaps the "@" and CNAME records, to point to current hosting. Each possible VPS provider handles DNS differently: some manage the entire process under the hood; at others, a person must manually make any changes to their A-records.

Rsync Backup on Linux (9:19) Chris Titus Tech, 2019. Great rundown plus excellent comments below.
New DNS Update (7:18) Without Code, 2018. Proprietary, but a transparent example of what is involved in the process.

server blend

Nextcloud is not an actual server itself; the underlying server should be something like Apache or Nginx. Nextcloud overlays these and serves files via the server underneath it. The logins and so forth are accomplished in Nextcloud the same way we used to do with, eg, Joomla or Wordpress (optimized for blogs).

Nextcloud: Setting Up Your Server (17:43) Chris Titus Tech, 2019. Uses Ubuntu as underlying server on (sponsored) Xen or Upcloud. Rule of thumb $0.10 month per GB, eg $5 for 50G.
What are Snaps for Linux (4:47) quidsup, 2018. These are the apps that are installable across distros.

2. existing storage for backup

We can use free storage such as Drive or Dropbox to back up data. The key is that it should be encrypted on these data-mining, big-tech servers.

RClone encryption (10:21) Tyler, 2017. Methods to encypt with rclone. Also good idea to download rclone-browser, for an easy GUI.
Rsync Backup on Linux (9:19) Chris Titus Tech, 2019. Great rundown plus excellent comments below.
Using Cloud Storage (22:55) Chris Titus Tech, 2019. Easy ways to encrypt before dropping into Google Drive, etc. (sponsor:Skysilk)

choosing a VPS

One can of course select Google, but what virtualization do they typically employ? Skysilk uses LXC containers via ProxMox.

Rsync Backup on Linux (9:19) Chris Titus Tech, 2019. Great rundown plus excellent comments below.
Using Cloud Storage (7:31) Wolfgang's Channel, 2019. Be sure to pick a provider that uses Xen or KVM, rather than OpenVz-based virtual machines.

tech stack

I used to use a LAMP stack, but I am trying to avoid MySQL (Oracle's RDBMS) and use PostgreSQL (an ORDBMS) as a minimum update (LAPP), and have looked at some other stuff (see below). I may try a PERN stack if I can get it going with Moodle.

Various Tech Stacks (48:25) RealToughCandy, 2020. Decent rundown plus large number of comments below. Narrator skews "random with passion" over "methodical presentation", but useful. PostgreSQL around 38:00.
Using Arch as Server (33:11) LearnLinuxTV, 2019. He's running on Linode (sponsor), but the basics the same anywhere. Arch is rolling, but just keep it as the OS for one app.

Tuesday, June 23, 2015

extra laptop storage - hdd in optical slot, notes: external usb, thunar

Backups are a pain. They can be managed more easily in a laptop by purchasing a $10 drive caddy (photo) into which a backup SATA HDD can be placed.
Once the HDD is in the caddy, the laptop's DVD drive is removed, and the caddy (with the HDD drive) goes into its place.

software steps

First, determine the names of drives in your system. This is easily done with, say, fdisk -l. Using the name of the drive, add a line for it in fstab:
# fdisk -l
[suppose result is "sdb1" for caddy drive]

# nano /etc/fstab
/dev/sdb1 /home/media ext3 rw,relatime 0 1

Now the drive will automatically mount each time the system is booted. Once mounted, the drive is an available repository for back-ups. Files can be transferred between the drives using a file manager, or a user might implement a backup schema or program (such as rsync, eg. with cron).

Note: if you decide to put the optical drive back in the slot, comment the fstab entry for the removed HDD before rebooting, otherwise, it will seek the drive and take several minutes to boot.

external (usb) drives

For the setup above, no special applications are necessary. However, if one is going to use a USB stick or drive, the typical rule applies: you will need to install udisks2, fuse, gvfs, or similar bullsh*t if you don't want to deal with manually mounting these or moving in and out of root. Such applications cause a permission kludge, and may have memory-hogging notification daemons that continually poll your system (I dislike .gvfs; notification-daemon is slightly better), but there's little doubt some permutation of these is necessary if you're copying to a thumb drive or other USB block device regularly and want to use a GUI file manager in user-space without sudoing-up and some CLI skills. In Arch, I use udisks2 in tandem with udiskie (for userspace). Taken together, these are 20Mb:
# pacman -S udisks2 ntfs-3g parted dosfstools udiskie
With these, I can mount any format USB drive, including HFS (Mac).

udisks2 and udiskie note

Links: manual policykit/udiskie config :: systemctl udiskie config
This is a useful app for avoiding fuse, samba, .gvfs, and some others; not needed on stand-alone systems, but it requires configuring. First be sure you're in group "storage"; then, for udisks2:
# nano /etc/polkit-1/rules.d/50-udisks.rules
polkit.addRule(function(action, subject) {
    var YES = polkit.Result.YES;
    // only required for udisks2:
    var permission = {
        "org.freedesktop.udisks2.filesystem-mount": YES,
        "org.freedesktop.udisks2.filesystem-mount-system": YES,
        "org.freedesktop.udisks2.encrypted-unlock": YES,
        "org.freedesktop.udisks2.eject-media": YES,
        "org.freedesktop.udisks2.power-off-drive": YES,
    };
    if (subject.isInGroup("storage")) {
        return permission[action.id];
    }
});
Then, for udiskie, somewhere near the end of .xinitrc, but ahead of dbus activation:
$ nano .xinitrc
udiskie &

Thunar note

Supposing Thunar is your file manager and you've connected a USB drive, you'll also need to install thunar-volman, and to set the permissions (see below) for Thunar to display it.


Select at least the two mounting options (mounting removable drives and removable media). The path to this dialog box is: Edit, Preferences, Advanced (tab), Configure.

Finally, if you did install .gvfs, don't forget to exclude it from any dynamic backups or you're in for a world of pain.

Sunday, February 15, 2015

[solved] distributed install (Tex Live info also)

Typically, the details I need to operate within Linux are difficult to find on the Net, yet what I don't need seems written again and again nearly everywhere on the Net. I often must acknowledge to myself later that, what I couldn't find at the time was too simple for anyone to even bother typing.

Recent example. For years, I've wanted to back up my data directory quickly, so I could have a cron script automate it. "Quickly" to me also meant "dd" (data destroyer, lol) instead of "rsync" or "cp". In turn, "dd" meant "unmounted" -- I don't want "live acquisition" for a directory as important as "/home". But I could not grasp how a separate /home partition, one that could be unmounted, would work exactly.

The allocations for each partition seemed easy: I used...
$ du -ch
... to determine usage, and formulated a plan for splitting up the drive: 10G swap, 30G install, the rest to /home. But I couldn't figure out how to do it. How would applications find the partition containing the data files? Dual-booters run into that problem, for example.

fstab - the key

One day, I was struggling with the problem when I finally recalled a Linux basic: everything is a network, everything is a file. For example, how does a worker in Building A access his home directory on a server in Building B? Of course! fstab would simply mount it. Fstab was the solution, and it was so simple it's little wonder no one had bothered to explain it in their distributed-install instructions.

new Arch install

Solution in hand, the new install had 3 pieces: "/home", "/", and "swap" (some add a separate boot partition also). Using cfdisk, I sized each partition as noted above. Then...
# mkswap /dev/sda3
# swapon /dev/sda3
# mount -rw -t ext3 /dev/sda1 /mnt
# mkdir /mnt/home
# mount -rw -t ext3 /dev/sda2 /mnt/home
# genfstab -p /mnt >> /mnt/etc/fstab
...and all was good. The rest of the install was normal. Knowing the mounting commands and their order was the key: the root directory had to be mounted first, then other directories, such as /home.

I also put the TexLive distro (4.5G) into /home, since it's so large. I don't use the Arch repo version, since the full install is more complete. To install, create a directory called, eg, "/home/foo/latex" and, using command "D" during the install, supply the directory information. TL will create the necessary environment within your userspace, no root required. You will just have to update your PATH variables afterwards (see below).
$ cd /home/foo/latex/install-tl-20150525/
$ ./install-tl
command: D
/home/foo/latex/2015

After install, TexLive provides a reminder about paths.
Add /home/foo/latex/texlive/2015/texmf-dist/doc/info to INFOPATH.
Add /home/foo/latex/texlive/2015/texmf-dist/doc/man to MANPATH
(if not dynamically found).

Most importantly, add /home/foo/latex/texlive/2015/bin/x86_64-linux
to your PATH for current and future sessions.

Welcome to TeX Live!
You can test the PATH by attempting to compile, say, a small test TEX file with "$ pdflatex test.tex". If the command isn't found, then bash needs the PATHs. You could 1) make a small executable to add paths in /etc/profile.d/ or, 2) add:
$ nano .bashrc
export PATH=/home/foo/latex/texlive/2015/bin/x86_64-linux:$PATH
export INFOPATH=/home/foo/latex/texlive/2015/texmf-dist/doc/info:$INFOPATH
export MANPATH=/home/foo/latex/texlive/2015/texmf-dist/doc/man:$MANPATH
Exit the X session and logout of the user (eg, "foo"), then log back in. The bash paths should be updated and TexLive normally available from non-X terminal, xterm, geany, etc.

backups

Simple now. The format of "dd":
dd if=[source] of=[target] bs=[byte size]
Essentially, "dd" goes from a device to a file. The easiest large file is probably an ISO. One other thing, "dd" copies the entire device, including the empty areas -- it's a copy -- so the target device has to be as large as the source, unless one compresses.
Steps: assume here that home directory /dev/sda2 is to be backed up to a usb drive, /dev/sdb1.
  • boot-up into CLI
  • determine the block/byte size of /dev/sda2 (typically 4k these days), by writing an extremely small file, far below the size of a full block (for example a file only containing the number "1"), and then checking its disk usage (du):
    $ echo 1 > test
    $ du -h test
      4.0K   test
  • Verify the file system format, eg Reiser, ext3, etc. You can use "lsblk -fs /dev/sda2" or "file -sL /dev/sd*".
  • # umount /dev/sda2 (no writing to the partition; we want a clean backup)
  • attach the usb drive (/dev/sdb) and mount its partition, eg, # mount /dev/sdb1 /mnt
  • dd if=[source] of=[target] bs=[byte size]
  • # dd if=/dev/sda2 of=/mnt/20150210.iso bs=4k conv=sync,noerror
  • profit.
Profit unless you inverted your source and target drive names ("i" and "o" are next to each other on the keyboard) -- in which case dd wrote from the back-up drive to your HDD, destroying your HDD data.

Wednesday, April 30, 2014

cp, dd, dump, rsync -- diff

I want to back-up some items, and have 3 requirements, plus 3 hopefuls.

1.) Back-up to an ISO, eg to "20141231_bak.iso"
2.) Able to exclude some directories from the process.
3.) Fast
4.) If possible, backup an unmounted partition to avoid any "live acquisition" write errors
5.) If possible, update the ISO with file changes.
6.) If possible, prior to, or following backup, an app which locates duplicate files.

cp solution

For a mounted partition, cp is a reasonable option. We can take an entire data directory, say /home/foo, and safely back it up to a (mounted) external HDD with a date. We would have to cycle the dumps, since there will be no index of changed files.
$ cp -a /home/foo/. /mnt/hdb/20140430
Problem: slow. Doesn't write to a file, eg to an ISO. Doesn't allow excluded directories. Only works on mounted drives.


dd solution

We prefer backing-up an unmounted partition since there's no chance of new data being written during the backup -- dd fits this bill. Also it's fast. Thirdly, it does exactly what we want for our format: writes from a device to a file
$ dd if=/dev/hda2 of=/home/foo/20141231_bak.iso
Problem: no directory exclusion, since dd backs up the entire partition. A second problem with duplicating the entire partition is that the image includes all free space on the partition, requiring backup space equivalent to the whole partition. Solution: possibly use the "conv=sparse" option to exclude blank space. This means, prior to backups, using something like zerofree to rapidly zero unallocated areas on the partition.

rsync solution

Rsync is not so fast the first time it's run, but subsequent incremental back-ups can be fast, and whole-file transfer (the -W flag, which skips the delta algorithm) may increase speed on a USB (as opposed to network) backup. Problem: it's live acquisition; the disk must be mounted. It doesn't natively write to a file.
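A sketch of such an incremental back-up, with hypothetical paths: -a preserves attributes, -W transfers whole files (often faster locally), and --exclude covers requirement 2.

$ rsync -aW --delete --exclude='.cache/' /home/foo/ /mnt/backup/foo/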