Monday, December 15, 2014

boot crashes, x crashes, video crashes, mouse crashes, kernel panics

A friend has an old (c.2011) HP box that was running Arch.

boot hang

Endless possibilities. Here are some ways I've seen it solved:
  • Solution: # systemctl disable dhcpcd.service. Symptoms: boot hang with thousands of lines of failures, including cgroup failures, giving the appearance that the OS or systemd is corrupt. Real source: dhcpcd.service attempts to autoconnect at startup. If it times out, e.g., there's no ethernet cable plugged in, systemd interprets this nuisance as a fatal f*cking error. Nevertheless it continues to run its boot script, failing every subsequent step and generating thousands of lines of spurious errors. Time: an entire Saturday, until I lucked into it around line 1500 of an arch-chrooted (from USB) journalctl -r (rescue steps sketched below).
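For reference, the rescue dance from the live USB goes roughly like this; /dev/sda1 is just an assumption for the root partition, so substitute your own layout (and mount /var separately if you split it out):
# mount /dev/sda1 /mnt
# arch-chroot /mnt
# journalctl -r | less
# systemctl disable dhcpcd.service
# exit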

x crash

Buddy uses Wikimapia and Google Maps in two open browser tabs. He thought "these are just websites, why does my system keep locking up?", but of course he had selected perhaps the two most polygon-intensive sites on the Web about which to make such a claim, and he was moving his mouse at video game speeds. All this on a 2011 HP with onboard G31 chipset graphics. Little wonder the persistent failures, mostly of two varieties: 1) lockups, and 2) exits to the terminal with kernel panics. Some possibilities:
  1. xorg setting for the video card (sample snippet after this list)
  2. xorg setting for mouse
  3. browser leaks with multiple many-polygon pages open (maps)
  4. the old Intel 82G33/G31 integrated controller being overwhelmed
  5. ?
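If possibility 1 pans out, the usual knob is a device section for the intel driver. A minimal sketch, assuming xf86-video-intel is installed; the file name and the uxa choice are starting points, not gospel:
# nano /etc/X11/xorg.conf.d/20-intel.conf
Section "Device"
    Identifier "Intel G31"
    Driver     "intel"
    # flip between "uxa" and "sna" if lockups persist
    Option     "AccelMethod" "uxa"
EndSection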

logs

We're looking for two main log types: Xorg crashes and kernel crashes. Xorg is the easiest -- it hasn't changed much over the years. We can still find Xorg logs in /var/log/Xorg.0.log.old, copy them to our home directory, and read them in any text editor. Failures stand out, and there is usually enough information to start modifying one's xorg.conf to fix whatever's wrong. In the case of my friend, it appears nothing was failing and writing to the Xorg log, at least on first look.
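Xorg flags errors with (EE) and warnings with (WW), so a quick pass looks something like this (the .old file is the previous session, usually the one that died):
$ cp /var/log/Xorg.0.log.old ~/
$ grep -E '\(EE\)|\(WW\)' ~/Xorg.0.log.old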

Kernel crashes. During a crash, writing to disk can be inconsistent; we can't be certain we'll have information. Additionally, journald logs are written in binary and, further, compressed, and so require the journalctl client to read whatever's in there (and of course "whatever's in there" can become corrupted during a panic). So it's 50-50 whether we can see anything in these logs. IMO, verifying a corruption-free log is the place to begin.
# journalctl --verify
...in the case of my friend's logs, this revealed a plethora of corruptions at times I could not decipher, but which may have corresponded to the panics.

Given all of this, it appeared the best thing to do was to get some text-readable log information.
# nano /etc/systemd/journald.conf
Storage=none
ForwardToSyslog=yes
...next crash, I should be able to see if there's anything out there in human-readable text. One caveat: forwarding only produces files if a traditional syslog daemon is installed and enabled to catch the messages.
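A minimal sketch of that, assuming syslog-ng from the repos (rsyslog works too); double-check the package and unit names for your setup:
# pacman -S syslog-ng
# systemctl enable syslog-ng
# systemctl restart systemd-journald
...forwarded messages then land wherever the syslog config points, commonly something like /var/log/messages.log.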
Meanwhile, the current boot's journal is still readable:
# journalctl -b
... and look for errata.
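To cut through the noise, journalctl can filter by priority -- this shows only errors and worse from the current boot:
# journalctl -b -p err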

video card

On the video card front, his old Intel 82G33/G31 is onboard and shares system memory (fatal), though it also appears to have some dedicated VRAM. No discrete GPU.
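To see how the controller and its memory are actually presented, something like this works; the output wording varies by kernel and lspci version:
$ lspci -v | grep -i -A 12 vga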
