Monday, March 14, 2022

paperhater - classification issues (minus database)

We want to organize electronic media as much as possible without a database, for at least two reasons: 1) if we can reach file organization and "findability" goals without the database, then we've saved the expense; 2) if our situation ultimately requires a database, the need for one becomes more clearly defined. Either way, the first steps are the same, database or no database.

overview

We begin with an entirely non-homogeneous mess of files: a deck of scattered cards, but without names or suits. Our pile includes everything from receipts, code, and diplomas to research articles, manuals, correspondence, and old emails. Some of these files are active, some are reference, some are family history, and so on.

first cut

Our first cleavage separates out all the articles, texts, and manuals which might be used as a reference, or cited in a thesis, etc. Call this our library. The second pile (receipts, forms, photos) is... everything else. We'll overlay a universal file and folder naming convention on both categories. However, library items require an additional handling step due to standardized citation requirements.

There's help for both categories. 1) For library items, ISBN/ISSNs, Dewey Decimal (since 1876), and/or Library of Congress classifications already exist. We can organize these using BibTeX. 2) For everything else, think back to the paper era. Records management had proven ways to organize these. This process has become electronic records management (ISO 15489), and is just as helpful. See the first video below.

records basics (25:31) US National Archives Records Management, 2009 Doubling info every two years. 11:00 paper's been lost -- we used to know how to do it. sample basic structure. 17:00 She recommends a file plan.
file management (1:23:57) Nicholas Andre 2013. Windows-based lecture, but clear thinking fellow gives good context for what we're after. Backgrounder. Corning (NY) Community College course.

naming - files

3 part naming convention. Subject, date, code. These vary in order depending on what's most important to the user of that file.
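A minimal Python sketch of the 3 part convention, just to make it concrete. The function name `build_filename`, the underscore separator, and the `YYYYMMDD` date format are my assumptions, not anything prescribed by the videos; the `order` argument is there because the order of the parts depends on what matters most to the user.

```python
import re
from datetime import date


def slug(text: str) -> str:
    """Keep only letters and digits so the part is safe in a filename."""
    return re.sub(r"[^A-Za-z0-9]", "", text)


def build_filename(subject: str, when: date, code: str, ext: str,
                   order=("date", "code", "subject")) -> str:
    """Join the three parts with underscores in the requested order."""
    parts = {
        "subject": slug(subject),
        "date": when.strftime("%Y%m%d"),  # sorts chronologically as plain text
        "code": slug(code),
    }
    return "_".join(parts[key] for key in order) + "." + ext.lstrip(".")


if __name__ == "__main__":
    # Date first, for files mostly looked up by when they happened.
    print(build_filename("Lowell Equity", date(2022, 3, 14), "NY", "pdf"))
    # Subject first, for files mostly looked up by topic.
    print(build_filename("Lowell Equity", date(2022, 3, 14), "NY", "pdf",
                         order=("subject", "date", "code")))
```

Putting the date in `YYYYMMDD` form means an ordinary alphabetical sort in the file manager is also a chronological sort.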

file naming conventions (10:00) Simpletivity, 2018. Ad first 1:36. Uses 3 part naming, succinctly described. Probably best at .75 speed. Comments excellent.

naming - folders(directories)

Mostly the naming is the same as the files, but arranged vertically. 3 layer naming convention. Function, subfunction, action. Thinking of the deck of cards, we can arrange by suit, by number, by color. What's the fastest way to find a card if they're in folders? Might depend on my style of play. NARA notes that granularity of folders depends on number of docs for that folder.
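A small Python sketch of the 3 layer idea, assuming one folder level each for function, subfunction, and action. The root name `records` and the example folder names are made up for illustration.

```python
from pathlib import PurePosixPath


def folder_for(function: str, subfunction: str, action: str,
               root: str = "records") -> PurePosixPath:
    """Build root/function/subfunction/action, one layer per decision."""
    return PurePosixPath(root, function, subfunction, action)


if __name__ == "__main__":
    # Hypothetical example: household finance, taxes, already filed.
    print(folder_for("finance", "taxes", "filed"))
```

Which decision goes in which layer is the "style of play" question: the layer I search by most often should be the top one.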

folders - website (8:19) John Morris, 2018. Standard website folder organization.

plan - file & folder

This step combines the decisions on naming of files and folders. The government's suggestions are in the General Records Schedule on the NARA website.

records basics (25:31) US National Archives Records Management, 2014. Donna Read. Information doubling every two years. 11:00 paper's been lost -- we used to know how to do it. sample basic structure. 17:00 She recommends a file plan.
file plan basics (47:44) US National Archives Records Management, 2013. Jeff Benson. staff consensus, record retention,
holding to center (10:46) Luke Smith, 2020. Part of sticking to a file plan is understanding that what works for a person can be useful.
federal social media (43:37) US National Archives Records Management, 2013. Bethany Cron. Federal records, con

A. non-research example

This is the approach for, say, billings or other saved items.

B. research/library example

We rarely seem to get New Yorker articles directly related to Bay Area unless it's a controversial topic: presumably editors don't want to research on their home NYC turf and alienate locals. Let's say I want to keep one such article to cite later. The information panel.

Step 1 - gather info

Very difficult without a physical copy. There's no online index for magazine volume and numbers cross-referenced with date, at least that I've found. With a physical copy of the New Yorker, I can get the info -- ISSN 0028-792X, Volume 98, No 4, Mar 14, 2022. NYC, NY. The article of interest in this example, "The Access Trap", is by Nathan Heller, pp. 34-45. I've scanned these pages into a PDF, as yet unnamed. If online, we may also find other control numbers we want to include. E.g., if we had a dissertation to cite, we know that, "There is no single source for a comprehensive dissertation search." We might encounter a different control number at different sites and want to include them.

Step 2 - PDF name - first cut

Revisit the PDF name and evaluate the 3 part name. It should include a date, name, and subject code, in a way that at least hints at the file contents. In this case we might use, e.g.,
20220314_NY_LowellEquity.pdf
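To evaluate a name mechanically, here's a rough Python check. It assumes the date_code_subject layout of the example above, with underscores between parts; the pattern and the function name `parse_name` are my own, so adjust them to whatever order you settled on.

```python
import re
from datetime import datetime

# Assumed layout: YYYYMMDD_code_subject.ext, letters and digits only in each part.
PATTERN = re.compile(
    r"^(?P<date>\d{8})_(?P<code>[A-Za-z0-9]+)_(?P<subject>[A-Za-z0-9]+)"
    r"\.(?P<ext>[A-Za-z0-9]+)$"
)


def parse_name(filename: str):
    """Return the parts as a dict, or None if the name doesn't fit."""
    match = PATTERN.match(filename)
    if match is None:
        return None
    try:
        # Reject eight digits that aren't a real calendar date.
        when = datetime.strptime(match["date"], "%Y%m%d").date()
    except ValueError:
        return None
    return {"date": when, "code": match["code"],
            "subject": match["subject"], "ext": match["ext"]}


if __name__ == "__main__":
    print(parse_name("20220314_NY_LowellEquity.pdf"))
    print(parse_name("scan001.pdf"))  # None: still needs a proper name
```

Run over a folder, anything that comes back `None` is a file still waiting for its first-cut name.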

Step 3 - BibTeX it

Information for storage, retrieval, and citation (Chicago, MLA, APA) could come from a BibTeX BIB file. BibTeX entries can be rendered in the document in any citation format. The question we'll want to ask later is how many BIB files? Since we can't have just one immense BIB file for every article we have -- and of course we need to create a custom one for any document we create -- we're going to run into meta-problems. In this case we'd use the "@article" template, minimally...
@article{uniquecitekey,
author = "Heller, Nathan",
title = "The Access Trap",
journal = "The New Yorker",
year = 2022,
volume = "98",
number = "4",
month = "Mar",
pages = "34--45",
issn = "0028-792X",
doi = "20220314_NY_LowellEquity.pdf",
note = "Lowell HS equity clause"
}
We probably don't need 11 fields in a database, but BibTeX has these readymade. Review the DOI and see that it corresponds with the filename -- make any adjustments.
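The filename check can be scripted. Below is a rough Python sketch that pulls the fields out of a single entry and compares the doi field with the PDF name. The regular expression is a toy reader for simple entries like the one above, not a full BibTeX parser, and the function names are my own.

```python
import re

# Toy pattern: name = "value", name = {value}, or name = bareword.
FIELD = re.compile(r'(\w+)\s*=\s*(?:"([^"]*)"|\{([^}]*)\}|(\w+))')

ENTRY = '''@article{uniquecitekey,
author = "Heller, Nathan",
title = "The Access Trap",
journal = "The New Yorker",
year = 2022,
volume = "98",
number = "4",
month = "Mar",
pages = "34--45",
issn = "0028-792X",
doi = "20220314_NY_LowellEquity.pdf",
note = "Lowell HS equity clause"
}'''


def entry_fields(entry_text: str) -> dict:
    """Map each field name (lowercased) to its value as a string."""
    fields = {}
    for name, quoted, braced, bare in FIELD.findall(entry_text):
        fields[name.lower()] = quoted or braced or bare
    return fields


def names_match(entry_text: str, filename: str) -> bool:
    """True when the entry's doi field equals the PDF's filename."""
    return entry_fields(entry_text).get("doi") == filename


if __name__ == "__main__":
    print(len(entry_fields(ENTRY)), "fields")
    print(names_match(ENTRY, "20220314_NY_LowellEquity.pdf"))
```

Note this follows the convention used here of storing the filename in the doi field; if the article had a real DOI, the filename would need a field of its own.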

apa style bibtex (2:53) Charles Clayton, 2016. Also does IEEE. Important about making certain filenames match. Probably best at .75 speed. Comments excellent.
latex review (59:42) Derek Banas, 2019. Typical Banas killer review. He uses "TeXShop", which appears to have autocomplete.
bibtex citation (7:38) Center of Math, 2015 Have to run it twice, like a table of contents.

Step 4 - filename and folder again

Review the filename, folder, and BibTeX information. Or, if it's a random receipt, review retrieval issues again, such as the folder depth (no more than 3) and the filename (gives some info).
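The depth rule is easy to test in Python. This sketch assumes paths are written relative to the top of the archive, and the example folder names are invented.

```python
from pathlib import PurePosixPath

MAX_DEPTH = 3  # the "no more than 3" rule of thumb


def folder_depth(relative_path: str) -> int:
    """Count the folders between the archive root and the file itself."""
    return len(PurePosixPath(relative_path).parts) - 1


def too_deep(relative_path: str, limit: int = MAX_DEPTH) -> bool:
    """True when the file sits more than `limit` folders down."""
    return folder_depth(relative_path) > limit


if __name__ == "__main__":
    # Hypothetical layout for the library example.
    path = "library/magazines/newyorker/20220314_NY_LowellEquity.pdf"
    print(folder_depth(path), too_deep(path))
```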

C. database or not?

ISSN number searchable here, topic. Sometimes we have subtopics.
