Thursday, July 27, 2023

data retention

Most people want perpetual storage of all their personal data, and they would probably prefer that it be stored under European data privacy laws. Of course, streaming one's data to a site out of the country means it will likely be copied by the NSA as it leaves the US. Still, it is likely to have better protections once it arrives in the EU.

For those who cannot afford to physically fly their hard drives around and have them securely duplicated elsewhere, what can we do? Is there a reasonable plan? Not in any fully secure sense. However, a partially secure solution could be to export a portion of one's data to a European cloud server. Perhaps the line could be drawn at storing personal photos and documents on the cloud. It's understood any government would be able to see them, run text analysis and facial recognition -- as is true anywhere data is stored -- but we'd at least have a portion of our data backed up against almost any non-nuclear catastrophe.

TL;DR: IME, pCloud (if one can afford it) is a reasonable unsecured solution, since it's perpetual, with no subscription fees. Here we can store photos and documents perpetually (one or two terabytes), though we can't afford enough space for video, audio, and so on. So our poor man's plan could look something like this...

  • photos and docs: pCloud. Managing docs and citations will need a database. In lieu of a database, one can possibly organize BIB files by collection and then use, e.g., JabRef to keep them clean and manageable. It's unclear yet how to cite specific SMS messages or emails. A database, or at least a spreadsheet, seems inevitable.
    Very small audio and video files (e.g. screen captures) which document something might also be kept on the cloud.
  • videos: HDD, SSD. There's no way for a poor man to store, or have available to edit, these media on the Cloud. What might be Cloud stored is some sort of filing spreadsheet or database table which allows the user to date-match vids (SSD), pics (cloud), and docs (cloud), if desired for a family reunion or forensics.
  • audio: HDD, SSD. Music and longer podcasts must be kept here. Too expensive for cloud storage.
  • sensitive: decision time. If documents have current PII, one might want to keep them on a USB key, backed up to SSD, or somewhere else off-cloud. It's a tough decision when it's OK to store on the cloud. Of course, it's more convenient to have them on the cloud, but I'm not sure that's advisable from a safety perspective.
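The date-matching table mentioned under videos could start out as a plain CSV rather than a real database. What follows is only a sketch, assuming every file name begins with a YYYYMMDD date (as in 20230705_anytownoffice.pdf). The function name index_line, the location labels, and the paths in the comments are made up for illustration.

```shell
# index_line LOCATION PATH
# Prints "YYYY-MM-DD,LOCATION,PATH" when the file name starts with an
# 8-digit date, and prints nothing otherwise.
# Assumption: paths contain no commas, so the output stays valid CSV.
index_line() {
    location=$1
    path=$2
    name=${path##*/}    # file name without its directories
    case $name in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]*) ;;
        *) return 0 ;;  # no leading date, so skip the file
    esac
    year=$(printf '%s' "$name" | cut -c1-4)
    month=$(printf '%s' "$name" | cut -c5-6)
    day=$(printf '%s' "$name" | cut -c7-8)
    printf '%s-%s-%s,%s,%s\n' "$year" "$month" "$day" "$location" "$path"
}

# Hypothetical usage: index a cloud folder and an SSD folder, sorted by date.
#   { for f in ~/pCloudDrive/photos/*; do index_line cloud "$f"; done
#     for f in /mnt/ssd/videos/*;      do index_line ssd   "$f"; done
#   } | sort > index.csv
```

Sorting the combined output puts vids, pics, and docs from the same day next to each other, which is all the date-matching really needs.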

oversight

It's as yet unclear how to database one's entire collection, but some attempt to herd the cats must be made. If one has the time and resources to implement NARA-level storage plans, then some version of those can be followed.

If a database is possible, either through Fiverr or some such, that's probably advisable, since all one would need to do is occasionally make a differential backup of the database and keep it on the cloud. But personal files are a wide-ranging "collection", and people often change file and folder names, and so forth. If so, a consistent system may be more important than capturing each file, though I'm not sure.
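One database-free way to cope with renamed files, offered as a sketch and not as a settled part of the plan, is a manifest keyed on content checksums. It assumes GNU coreutils (sha256sum) is installed; the function name manifest and the file names in the comments are hypothetical.

```shell
# manifest DIR
# Prints one "SHA256  ./relative/path" line per file under DIR, sorted by
# checksum. A renamed or moved file keeps its checksum, so it can still be
# recognized even though its path has changed.
manifest() {
    ( cd "$1" && find . -type f -exec sha256sum {} + | sort )
}

# Hypothetical usage: take a snapshot now and compare with an older one.
#   manifest /mnt/ssd/archive > manifest_new.txt
#   diff manifest_old.txt manifest_new.txt
```

In the diff, a line whose checksum appears on both sides with a different path is a rename. A checksum that appears only in the new file is new or edited content.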

jabref and bibtex

$ yay -S jabref-latest

Let's take the example of emails. These are a horrible security risk and a person typically wouldn't want to archive them on the cloud. It's equally true, however, that we sometimes need to archive them. Suppose we decide to archive some of them. Here is an example of how we could manage the metadata without a database.

Suppose we print-to-pdf an email conversation and give it the filename 20230705_anytownoffice.pdf. For metadata, let's create a text file called emails.bib. This is a BibTeX file, in the standard format used with LaTeX.

@misc{email000001,
title = "20230705_anytownoffice.pdf",
author = "homeoffice@gmail",
year = "2023",
booktitle = "{A}nytown {O}ffice {S}upply",
address = "Anytown, CA",
abstract = "April-June. Regarding shipping some reams of paper. Contact was Joe Foo",
note = "folder: emails, file: 20230705_anytownoffice.pdf"
}

And then if a person opens the BIB file using jabref, they will have all the relevant info displayed as in a spreadsheet. So jabref can work for more than articles and books.
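Typing these entries by hand gets old, so here is a small sketch of a helper that prints one. It assumes the file name starts with the four-digit year, as in the example above, and it only fills in some of the fields. The function name bib_entry and its argument order are invented for illustration.

```shell
# bib_entry KEY FILE AUTHOR ABSTRACT
# Prints a @misc entry for one archived file. The year is taken from the
# first four characters of the file name.
# Assumption: AUTHOR and ABSTRACT contain no double quotes or unbalanced braces.
bib_entry() {
    key=$1
    file=$2
    author=$3
    abstract=$4
    name=${file##*/}
    case $file in
        */*) folder=${file%/*} ;;
        *)   folder=. ;;
    esac
    year=$(printf '%s' "$name" | cut -c1-4)
    printf '@misc{%s,\n' "$key"
    printf 'title = "%s",\n' "$name"
    printf 'author = "%s",\n' "$author"
    printf 'year = "%s",\n' "$year"
    printf 'abstract = "%s",\n' "$abstract"
    printf 'note = "folder: %s, file: %s"\n' "$folder" "$name"
    printf '}\n'
}

# Hypothetical usage: append a second entry to the metadata file.
#   bib_entry email000002 emails/20230812_anytownoffice.pdf \
#       homeoffice@gmail "Follow-up on the paper order" >> emails.bib
```

Fields such as booktitle and address can still be added afterwards in the editor.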

old DVDs

Standardize the format. A lot of old movies and episodes are 480p (DVD) and that's fine. However, they're often in MKV or WEBM containers with VP9 encoding and so on. There's no way to reliably burn these back onto a DVD or even play them on most players.

Toying around, I've come up with...

ffmpeg -i oldthing.mkv -b:v 1.5M standard.mp4

... which yields about 1.3G and shows well on a large screen...

I prefer...

ffmpeg -i oldthing.mkv -b:v 2M standard.mp4

...but this yields perhaps 1.7G for a standard 1:40-1:45 film, which is a lot of space.
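Those sizes can be sanity-checked before encoding, since file size is roughly bitrate times running time. A rough sketch in shell arithmetic follows. The 128 kbit/s audio figure is my assumption, not something measured from these files, and -b:v is only a target average, so real files will land somewhat above or below the estimate. The function name estimate_mb is made up.

```shell
# estimate_mb VIDEO_KBPS AUDIO_KBPS MINUTES
# Prints the approximate file size in megabytes:
#   (video + audio) kbit/s * seconds / 8 = kilobytes, then / 1000 = megabytes
estimate_mb() {
    video_kbps=$1
    audio_kbps=$2
    minutes=$3
    echo $(( (video_kbps + audio_kbps) * minutes * 60 / 8 / 1000 ))
}

# For a 1:45 film with the assumed 128 kbit/s audio track:
#   estimate_mb 1500 128 105   # prints 1282, i.e. about 1.3G
#   estimate_mb 2000 128 105   # prints 1675, i.e. about 1.7G
```

Both estimates line up with the sizes seen above, so the formula is good enough for deciding how many films fit on a drive.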

If a person has the time, it's even more interesting to break these larger films into a folder of 20-minute segments.
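The split into 20-minute pieces could be sketched with ffmpeg's segment muxer. This copies the streams without re-encoding, so cuts fall on keyframes and each piece is only approximately 20 minutes long. The function name split_film and the FFMPEG override variable are hypothetical conveniences; the override exists only so the command can be previewed with echo.

```shell
# split_film INPUT [MINUTES]
# Creates a folder named after INPUT (without extension) and fills it with
# numbered segments of roughly MINUTES each (default 20).
split_film() {
    input=$1
    minutes=${2:-20}
    base=${input##*/}   # drop directories
    base=${base%.*}     # drop the extension
    mkdir -p "$base"
    "${FFMPEG:-ffmpeg}" -i "$input" -c copy \
        -f segment -segment_time $((minutes * 60)) -reset_timestamps 1 \
        "$base/${base}_%02d.mp4"
}

# Hypothetical usage:
#   split_film standard.mp4        # writes standard/standard_00.mp4, _01.mp4, ...
#   FFMPEG=echo; split_film standard.mp4   # only prints the ffmpeg arguments
```

Because nothing is re-encoded, the folder takes about the same space as the original file.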
