"X"...in a box: dns

There's a mix of ipv4, ipv6, and duid settings in here. On a home network -- our home LAN -- why would we ever need NSA level DUID (DHCP Unique IDentifier) fingerprinting of our system? If it's configured, but there's some conflict, dhcpcd will time out, or make a connection but fail to browse (no DNS), and so on. This is just another possible point of failure on a home system; why would we ever want duid in a million years? DUID settings are further down, but they seem to get set no matter what a person does to remove them.

pacman

One time, I began receiving 404 failures during # pacman -Syu. I dug deeper using # pacman -Syu --debug 2>&1 | tee debugfile.txt and found that either curl itself (used by pacman to retrieve packages), or the way in which pacman was using curl, was the source. This is not always a bed of roses to determine and repair, since the failures are often random glitches in what's become a decades-long, intermittent, annoying ipv6/ipv4 transition mess. It can involve dozens of kernel and application settings and confusing conf files that often interact badly. For example, one can no longer edit /etc/resolv.conf directly, since it will be overwritten by resolvconf during dhcpcd initialization. One must now edit /etc/resolvconf.conf to get at /etc/resolv.conf indirectly, and resolvconf.conf has its own syntax, different from resolv.conf, so both files must be verified following any change. And so on.

which ipv is working?

The most obvious start is to verify that curl works over both ipv4 and ipv6. Then, one can disable or repair the failed half or, possibly, force pacman to use the ip version that *is* working (I'm not sure the ip version can be specified during a pacman operation itself; the XferCommand option in /etc/pacman.conf, which hands downloads to an external program such as curl, might be one route).

First, the curl ipv functionality. Take a URL from one's mirror list and ping it to see that DNS is working. Then...

# curl aprs.ele.etsmtl.ca
Bienvenue | Welcome [curl works]
# curl --ipv4 aprs.ele.etsmtl.ca
Bienvenue | Welcome
[curl, and therefore pacman, is working ipv4]
# curl --ipv6 aprs.ele.etsmtl.ca
curl: (6) Could not resolve host: aprs.ele.etsmtl.ca
[curl, and therefore pacman, is failing ipv6]

In this case, it appears pacman is requesting both an ipv6 (AAAA) and an ipv4 (A) lookup via curl, but, at various points in its development, pacman could only accomplish ipv4. Strace shows further that IPv6 is not even able to open a socket.
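
The three curl checks above can be rolled into one small function. This is only a sketch: check_ip_versions is a made-up name, and the 10-second timeout is an arbitrary choice. It prints curl's exit status for each variant; 6 is curl's code for "could not resolve host".

```shell
# check_ip_versions HOST
# Fetches HOST three ways (default, forced IPv4, forced IPv6) and
# prints curl's exit status for each attempt.
check_ip_versions() {
    host=$1
    for flag in "" --ipv4 --ipv6; do
        # $flag is left unquoted on purpose so the empty case adds no argument
        curl --silent --max-time 10 --output /dev/null $flag "$host"
        status=$?
        printf '%-7s exit=%s\n' "${flag:-default}" "$status"
    done
}
```

Run as, for example, # check_ip_versions aprs.ele.etsmtl.ca. In the failing case above one would expect a non-zero exit on the --ipv6 line only.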

Another way to check the kernel network settings is # sysctl -A, which reveals all the kernel settings; grep the output for "net", "ipv4", or "ipv6".

Users could also run # ip addr list and look to see whether there's an ipv6 address, an ipv4 address, or both.
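
To see at a glance which interfaces have IPv6 switched off, the disable_ipv6 switches can be read straight out of /proc. A minimal sketch, assuming the usual /proc/sys/net/ipv6/conf layout; ipv6_disable_flags is a hypothetical helper name, and the optional directory argument is only there so it can be pointed somewhere else.

```shell
# ipv6_disable_flags [CONF_DIR]
# Prints "interface=value" for every disable_ipv6 switch found.
# 1 means IPv6 is off for that interface, 0 means it is on.
ipv6_disable_flags() {
    conf_dir=${1:-/proc/sys/net/ipv6/conf}
    for f in "$conf_dir"/*/disable_ipv6; do
        # Skip the unexpanded pattern when nothing matches
        [ -r "$f" ] || continue
        iface=$(basename "$(dirname "$f")")
        printf '%s=%s\n' "$iface" "$(cat "$f")"
    done
}
```

If it prints nothing at all, the directory is missing, which would suggest IPv6 is not available in the running kernel.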

~/.curlrc

As part of the curl man page, we can see some settings for the configuration file.

In the case above, I was getting proper ipv6 configuration, but still getting a curl fail on ipv6. I was only able to find a single post on this, here. The post indicates that it's really a glibc issue that could only be resolved by slowing down DNS resolution, using /etc/resolvconf.conf to manipulate /etc/resolv.conf. We want /etc/resolv.conf to say "options single-request" (with this option, glibc sends the A and AAAA queries one after the other instead of in parallel) and so, to achieve this, we write...

# nano /etc/resolvconf.conf
# Solve ipv4/v6 fails
resolv_conf_options="single-request"
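
After /etc/resolv.conf has been regenerated, it's worth confirming the option actually landed there. The helper below is a sketch (has_resolv_option is a made-up name, not part of resolvconf); it only reads the file and checks for the option as a whole word, so "single-request-reopen" does not count as "single-request".

```shell
# has_resolv_option OPTION [FILE]
# Succeeds if FILE (default /etc/resolv.conf) has an "options" line
# that lists OPTION as a whole word.
has_resolv_option() {
    option=$1
    file=${2:-/etc/resolv.conf}
    awk -v want="$option" '
        $1 == "options" { for (i = 2; i <= NF; i++) if ($i == want) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$file"
}
```

Usage would be something like # has_resolv_option single-request && echo present. With openresolv, # resolvconf -u should regenerate the file first.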

For a list of all resolvconf.conf options, see this man page.
Disabling IPv6 in Arch is no bed of roses -- one has either to add a boot parameter in GRUB, or to create an effective /etc/sysctl.d/40-ipv6.conf (see bottom). Further, we may instead need to enable ipv6 more thoroughly across all applications, not further cripple it. It's a typical ipv4/ipv6 trial and error mess due to opaque reliance upon multiple programs (eg. curl).

old fix

This one used to work, but nowadays, with ipv6 built in, it doesn't always work. Add a couple of lines to a blacklist file (modprobe only reads files ending in .conf), eg...

# nano /etc/modprobe.d/blacklist.conf
# stop pacman failure when encounter ipv6 sites
blacklist nf_conntrack_ipv6
blacklist nf_defrag_ipv6

grub

Use nano or whatever to add

/etc/sysctl.d/40-ipv6.conf

Here's one example, not guaranteed to work.

# Disable IPv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
net.ipv6.conf.wlan0.disable_ipv6 = 1

duid

I've removed all duid references but I still get some duid when I connect with dhcpcd. Be sure too to delete all old "leases" in /var/lib/dhcpcd. Note also that they put a duplicate version of dhcpcd.conf inside /var/lib/dhcpcd/etc/, and that this is the one used by the system. You could probably even delete /etc/dhcpcd. So "duid" needs to be commented out at that location.

# rm /etc/dhcpcd.duid
# rm /etc/dhcpcd.secret
# rm /var/lib/dhcpcd/duid
# nano /etc/dhcpcd.conf [comment duid and clientid lines]
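
Since duid keeps coming back, a quick check for directives that are still live can save some squinting. A sketch only: active_id_lines is a made-up name, and it looks at whichever dhcpcd.conf it is pointed at (default /etc/dhcpcd.conf), so run it against the copy under /var/lib/dhcpcd/etc/ too if that's the one in use.

```shell
# active_id_lines [FILE]
# Prints, with line numbers, any duid or clientid directive that is
# not commented out. Returns non-zero when none are found.
active_id_lines() {
    grep -nE '^[[:space:]]*(duid|clientid)([[:space:]]|$)' "${1:-/etc/dhcpcd.conf}"
}
```

No output (and a non-zero exit status) means both directives are commented out or absent in that file.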

So then, here's a truncated version of /etc/dhcpcd.conf. But even with all duid disabled, the router is apparently coaxed into collecting one.

# nano /etc/dhcpcd.conf
# A sample configuration for dhcpcd.
# See dhcpcd.conf(5) for details.

# Inform the DHCP server of our hostname for DDNS.
hostname

# duid

# Persist interface configuration when dhcpcd exits.
persistent

option rapid_commit

# A list of options to request from the DHCP server.
option domain_name_servers, domain_name, domain_search
option classless_static_routes

# Get the hostname
option host_name

# Most distributions have NTP support.
option ntp_servers

# Respect the network MTU. This is applied to DHCP routes.
option interface_mtu

# A ServerID is required by RFC2131.
require dhcp_server_identifier

# Prevent timeouts for ipv6 failures
noipv6
noipv6rs

Links: yum variables :: IPv4 address conversion :: yum commands
NB: This is a complicated post. It first addresses IPv6 (mostly successfully), but a second problem is revealed, specific to Fuduntu, that I could not circumvent. Since Fuduntu is defunct, I'm disregarding it and posting "solved" above. Hopefully, there's plenty of info below for others working on what might be a similar Fedora flavor of the Fuduntu release problem.

Consider the following problem -- if I can ping, I should be equally able to curl, but I'm not:

$ ping www.websense.com
PING www.websense.com (204.15.67.11) 56(84) bytes of data.
64 bytes from www.websense.com (204.15.67.11): icmp_seq=1 ttl=49 time=27.9 ms
64 bytes from www.websense.com (204.15.67.11): icmp_seq=2 ttl=49 time=27.7 ms
^C
$ curl www.websense.com
curl: (6) Couldn't resolve host 'www.websense.com'

This did more than just raise my curiosity; rpm/yum relies on curl for access to repositories. First I checked proxy and IPv6 settings. All looked normal: no proxy, and IPv6 set to ignore rather than to block or to force resolution. Let's look under the hood.

tcpdump

Here are portions of dumps for the successful ping and struggling curl:

# tcpdump -n -l -i eth0 -s0 -vvv udp port 53
[during ping]
192.168.1.20.34097 > 192.168.1.254.53: [udp sum ok] 11891+ A? www.websense.com. (34)
192.168.1.254.53 > 192.168.1.20.34097: [udp sum ok] 11891 q: A? www.websense.com. 1/0/0 www.websense.com. [5s] A 204.15.67.11 (50)
192.168.1.20.58651 > 192.168.1.254.53: [udp sum ok] 51147+ PTR? 11.67.15.204.in-addr.arpa. (43)
192.168.1.254.53 > 192.168.1.20.58651: [udp sum ok] 51147 q: PTR? 11.67.15.204.in-addr.arpa. 1/0/0 11.67.15.204.in-addr.arpa. [9h53m39s] PTR www.websense.com. (73)

[during curl]
192.168.1.20.41050 > 192.168.1.254.53: [udp sum ok] 26082+ A? www.websense.com. (34)
192.168.1.20.41050 > 192.168.1.254.53: [udp sum ok] 54668+ AAAA? www.websense.com. (34)
192.168.1.254.53 > 192.168.1.20.41050: [udp sum ok] 26082 q: A? www.websense.com. 1/0/0 www.websense.com. [5s] A 204.15.67.11 (50)
192.168.1.254.53 > 192.168.1.20.41050: [udp sum ok] 54668- q: AAAA? www.websense.com. 0/0/0 (34)
192.168.1.20.58040 > 192.168.1.254.53: [udp sum ok] 47978+ A? www.websense.com.localdomain. (46)
192.168.1.20.58040 > 192.168.1.254.53: [udp sum ok] 42568+ AAAA? www.websense.com.localdomain. (46)
192.168.1.254.53 > 192.168.1.20.58040: [udp sum ok] 47978 NXDomain- q: A? www.websense.com.localdomain. 0/0/0 (46)
192.168.1.254.53 > 192.168.1.20.58040: [udp sum ok] 42568 NXDomain- q: AAAA? www.websense.com.localdomain. 0/0/0 (46)

Ping queries the DNS server only for an IPv4 address (A?) and has success. Curl initially requests both IPv4 (A?) and IPv6 (AAAA?). Although curl receives a proper response (204.15.67.11) to its IPv4 request, nothing is returned for the IPv6 request: the AAAA reply comes back empty (0/0/0). Apparently due to some bug, curl ignores the IPv4 resolution and requests a second time in both formats. It also appends "localdomain" onto its query(!), most likely because the resolver treats the first lookup as failed and moves on to the search domain from /etc/resolv.conf.
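
When the dump is long, it helps to boil the replies down to record type, name, and number of answers. The sed one-liner below is a rough sketch that assumes the verbose reply format shown above ("q: TYPE? name N/N/N"); summarize_dns_replies is a made-up name.

```shell
# summarize_dns_replies
# Reads tcpdump -vvv DNS lines on stdin and prints one line per reply:
# record type, queried name, and the answer count from the N/N/N field.
# Query lines (which have no "q:" marker) are dropped.
summarize_dns_replies() {
    sed -n 's/.* q: \([A-Z]*\)? \([^ ]*\) \([0-9]*\)\/[0-9]*\/[0-9]*.*/\1 \2 answers=\3/p'
}
```

Fed the curl portion of the dump, it should give lines such as "A www.websense.com. answers=1" and "AAAA www.websense.com. answers=0", which makes the empty AAAA reply easy to spot.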

solution - /etc/hosts + release awareness

Links: IPv4 address conversion :: yum concerns :: cleaning old yum info

We should write a patch for curl and recompile it, but that's for programmers. I only know how to supply curl with the IPv6 information it wants. The site www.websense.com may not have an AAAA record in its DNS zone file, but I can still manually enter IPv6 info into /etc/hosts and force curl to use that.

# nano /etc/hosts
::ffff:cc0f:430b www.websense.com
204.15.67.11 www.websense.com

# nano /etc/host.conf
order hosts,bind

$ curl www.websense.com
[page loads normally]
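
The ::ffff:cc0f:430b entry is just 204.15.67.11 written as an IPv4-mapped IPv6 address: ::ffff: followed by the four octets in hex. A small sketch of the conversion (ipv4_to_mapped is a hypothetical helper name; it assumes plain decimal octets with no leading zeros):

```shell
# ipv4_to_mapped A.B.C.D
# Prints the IPv4-mapped IPv6 form, ::ffff:AABB:CCDD, each octet in hex.
ipv4_to_mapped() (
    # The body runs in a subshell so the IFS change does not leak out
    IFS=.
    set -f
    set -- $1
    printf '::ffff:%02x%02x:%02x%02x\n' "$1" "$2" "$3" "$4"
)
```

$ ipv4_to_mapped 204.15.67.11 prints ::ffff:cc0f:430b, matching the hosts entry above.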

Problem 1 solved. However, there is a second problem, one specific to Fuduntu, not curl. Fuduntu is a hybrid. It accordingly doesn't have typical Fedora values in its rpm variables, eg $releasever.

$ rpm -q fedora-release
package fedora-release is not installed

$ rpm -q fuduntu-release
fuduntu-release-2013-3.noarch

$ ls /etc/release*
ls: cannot access /etc/release*: No such file or directory

$ yum list fedora-release
Loaded plugins: fastestmirror, langpacks, presto, refresh-packagekit
Adding en_US to language list
Determining fastest mirrors
Could not retrieve mirrorlist http://packages.fuduntu.org/repo/mirrors/fuduntu-stable-rpms-2013 error was
14: PYCURL ERROR 6 - ""
Error: Cannot find a valid baseurl for repo: fuduntu

Glitches like this also hit Fedora users when version conflicts arise. In the case of Fuduntu however, the repos no longer exist -- one strategy might be to spoof Fuduntu version checking, as if it were being upgraded, when it accesses third-party repos. If we eliminate the locally-stored repo files and the rpm release file, we might be able to override them with third-party information. First let's do a debug dump with the current info (in case we need it later), then remove local information.

$ yum-debug-dump
Output written to: /home/~/yum_debug_dump-local-2013-07-12_20:50:30.txt.gz

$ rpm -q fuduntu-release
fuduntu-release-2013-3.noarch

# yum remove fuduntu-release-2013-3.noarch
[screens and screens of removal]

$ rpm -q fuduntu-release
fuduntu-release-2013-3.noarch

# ls /etc/yum.repos.d
dropbox.repo fuduntu.repo
# rm /etc/yum.repos.d/fuduntu.repo
# ls /etc/pki/rpm-gpg/
RPM-GPG-KEY-fuduntu RPM-GPG-KEY-fuduntu-i386
RPM-GPG-KEY-fuduntu-2013-primary RPM-GPG-KEY-fuduntu-x86_64
# rm /etc/pki/rpm-gpg/*

# yum clean all

$ rpm -q fuduntu-release
fuduntu-release-2013-3.noarch

$releasever

Links: replace $releasever using sed :: yum variables

The orphaned Fuduntu release has no access to Fuduntu repositories because they no longer exist. Fuduntu must rely on third-party repos to move forward. Fedora-related repositories are arranged with "f[$releasever]-[$basearch]". In Fuduntu this resolved to "2013-i386". That value worked in Fuduntu repos, but fails in third-party repos -- $releasever needs to be changed to something standard, such as "17", to create a more Fedora-standard "f17-i386" value.

But nothing worked. Not exporting releasever=17 into the shell environment, not changing /etc/yum.conf, not swapping out the value of "2013" for "17" in each /etc/*release. Nothing I could find globally changed this variable. Eventually, after a couple of lost days on the project, I gave up and brute forced the repo files individually. Before modifying, I eliminated the old cache and backed up all the unmodified repos into a new directory I called "default". Then I modified the repos, exchanging "$releasever" for "17" in each file.

# rm -r /var/tmp/*

# mkdir /etc/yum.repos.d/default
# cp /etc/yum.repos.d/* /etc/yum.repos.d/default/

# sed -i 's/\$releasever/17/g' /etc/yum.repos.d/*.repo
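
For anyone repeating this, a slightly more careful version of the same substitution touches only the .repo files and leaves the "default" backup directory alone. This is a sketch rather than the exact command used above: pin_releasever is a hypothetical name, and it assumes GNU sed for the -i switch.

```shell
# pin_releasever VERSION [REPO_DIR]
# Replaces the literal string $releasever with VERSION in every .repo
# file directly inside REPO_DIR. Subdirectories are not touched.
pin_releasever() {
    version=$1
    repo_dir=${2:-/etc/yum.repos.d}
    for f in "$repo_dir"/*.repo; do
        [ -f "$f" ] || continue
        # \$ makes sed match a literal dollar sign, not an anchor
        sed -i 's/\$releasever/'"$version"'/g' "$f"
    done
}
```

Called as # pin_releasever 17, it edits /etc/yum.repos.d/*.repo in place, so the backup copy should be made first.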

The repos finally loaded.

(non)solution - remove IPv6 functionality

Link: Disabling IPv6

This is not a solution, because curl/rpm/yum needs both IPv6 and IPv4 information and does not get what it needs with the process below. I'm including this info, however, because it provides insight into how other TCP clients (ie, not strictly curl/rpm) can be assisted in a mixed A/AAAA environment. Some readers are interested in Chromium, etc.

# nano /etc/sysconfig/network
NETWORKING_IPV6=no

# nano /etc/modprobe.d/blacklist.conf
blacklist nf_conntrack_ipv6
blacklist nf_defrag_ipv6

Appendix 1 - wireshark

A good link for using wireshark to check DNS problems. The wireshark GUI is more elegant than my CLI approach above, and arguably more user-friendly for those working on IPv4/IPv6 solutions.

Appendix 2 - strace

Straces are too long to regurgitate here; let's look at the relevant lines where ping succeeds and curl is unsuccessful. Of possible interest here is that, from inside the LAN, the DNS server, via DHCP, is simply the gateway at "192.168.1.254".

$ strace ping www.websense.com
socket(PF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 3
connect(3, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.254")}, 16) = 0
gettimeofday({1373430779, 7870}, NULL) = 0
poll([{fd=3, events=POLLOUT}], 1, 0) = 1 ([{fd=3, revents=POLLOUT}])
send(3, "\346\f\1\0\0\1\0\0\0\0\0\0\3www\10websense\3com\0\0\1"..., 34, MSG_NOSIGNAL) = 34
poll([{fd=3, events=POLLIN}], 1, 5000) = 1 ([{fd=3, revents=POLLIN}])
ioctl(3, FIONREAD, [50]) = 0
recvfrom(3, "\346\f\201\200\0\1\0\1\0\0\0\0\3www\10websense\3com\0\0\1"..., 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.254")}, [16]) = 50
close(3) = 0
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 3
connect(3, {sa_family=AF_INET, sin_port=htons(1025), sin_addr=inet_addr("204.15.67.11")}, 16) = 0
getsockname(3, {sa_family=AF_INET, sin_port=htons(53553), sin_addr=inet_addr("192.168.1.20")}, [16]) = 0
close(3)

And for the failing curl :

$ strace curl www.websense.com
socket(PF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 3
connect(3, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.254")}, 16) = 0
gettimeofday({1373429425, 713068}, NULL) = 0
poll([{fd=3, events=POLLOUT}], 1, 0) = 1 ([{fd=3, revents=POLLOUT}])
sendmmsg(3, {{{msg_name(0)=NULL, msg_iov(1)=[{"\215s\1\0\0\1\0\0\0\0\0\0\3www\10websense\3com\0\0\1"..., 34}], msg_controllen=0, msg_flags=0}, 34}, {{msg_name(0)=NULL, msg_iov(1)=[{"\34\305\1\0\0\1\0\0\0\0\0\0\3www\10websense\3com\0\0\34"..., 34}], msg_controllen=0, msg_flags=0}, 34}}, 2, MSG_NOSIGNAL) = 2
poll([{fd=3, events=POLLIN}], 1, 5000) = 1 ([{fd=3, revents=POLLIN}])
ioctl(3, FIONREAD, [34]) = 0
recvfrom(3, "\34\305\200\0\0\1\0\0\0\0\0\0\3www\10websense\3com\0\0\34"..., 2048, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.254")}, [16]) = 34
close(3) = 0
socket(PF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 3
connect(3, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.254")}, 16) = 0

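Curl never seems to leave port 53. It is tempting to read the recvfrom line as curl receiving the IP of the DNS server in answer to its query to that selfsame DNS server, but that address is only the sender of the datagram. The 34-byte payload ends in \34 (query type 28, AAAA) and is the same size as the question, so it looks like the empty AAAA reply, after which curl closes the socket and starts over. Perhaps this is related to curl's lookup sending both questions in a single sendmmsg call, as opposed to ping's simpler send. Additionally, ping uses a getsockname call not used by curl.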

More information: while Epiphany is running, we can check which calls are creating errors. Get its PID, open a terminal, and let strace run for several seconds while attempting to surf to an address in Epiphany. Then CTRL-C out and examine the data, eg...

$ strace -c -p 3811
Process 3811 attached
^CProcess 3811 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 54.08    0.000384           0      2167           writev
 14.79    0.000105           5        20           munmap
 13.24    0.000094           0      1526           clock_gettime
 13.24    0.000094           0      6690      4567 recv
  4.65    0.000033           0      4583           poll
  0.00    0.000000           0         1           restart_syscall
  0.00    0.000000           0        80         9 read
  0.00    0.000000           0        31           write
  0.00    0.000000           0        36           open
  0.00    0.000000           0        36           close
  0.00    0.000000           0         4           unlink
  0.00    0.000000           0        18           access
  0.00    0.000000           0         8           rename
  0.00    0.000000           0       262           gettimeofday
  0.00    0.000000           0         1           clone
  0.00    0.000000           0        14           _llseek
  0.00    0.000000           0        21           mmap2
  0.00    0.000000           0        58        46 stat64
  0.00    0.000000           0        84         8 lstat64
  0.00    0.000000           0        36           fstat64
  0.00    0.000000           0         2         1 madvise
  0.00    0.000000           0        70         6 futex
  0.00    0.000000           0         1           statfs64
------ ----------- ----------- --------- --------- ----------------
100.00    0.000710                 15749      4637 total

We see significant errors (4567 of them) on "recv" calls, as well as some on "stat64" and a few others.
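
To pull just the failing syscalls out of an strace -c summary, a little awk does the job. A sketch, assuming the six-column layout above where the errors column is left blank for clean syscalls; syscalls_with_errors is a made-up name.

```shell
# syscalls_with_errors
# Reads an `strace -c` summary on stdin and prints "syscall errors/calls"
# for each row that has a value in the errors column.
syscalls_with_errors() {
    # Rows with errors have six fields; clean rows and the total have five
    awk 'NF == 6 && $1 ~ /^[0-9.]+$/ && $5 ~ /^[0-9]+$/ { print $6, $5 "/" $4 }'
}
```

Saving the summary to a file and running $ syscalls_with_errors < summary.txt (summary.txt being whatever name the output was saved under) should list recv 4567/6690 first, followed by the smaller offenders such as stat64 46/58.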

Wish I could write in C and recompile curl.

"X"...in a box

Tuesday, June 12, 2018

pacman failure "404", duid (dhcpcd)