Tuesday, July 9, 2013

[solved] dns, yum, rpm, curl, ping

Links: yum variables :: IPv4 address conversion :: yum commands
NB: This is a complicated post. It first addresses IPv6 (mostly successfully), but then reveals a second problem, specific to Fuduntu, that I could not circumvent. Since Fuduntu is defunct, I'm disregarding that and posting "solved" above. Hopefully there's plenty of info below for others working on a similar release problem in a Fedora flavor.
Consider the following problem -- if I can ping, I should be equally able to curl, but I'm not:
$ ping www.websense.com
PING www.websense.com (204.15.67.11) 56(84) bytes of data.
64 bytes from www.websense.com (204.15.67.11): icmp_seq=1 ttl=49 time=27.9 ms
64 bytes from www.websense.com (204.15.67.11): icmp_seq=2 ttl=49 time=27.7 ms
^C
$ curl www.websense.com
curl: (6) Couldn't resolve host 'www.websense.com'
This did more than just raise my curiosity; rpm/yum relies on curl to access repositories. First I checked the proxy and IPv6 settings. All looked normal: no proxy, and IPv6 set to ignore, not to block or to force resolution. Let's look under the hood.

tcpdump

Here are portions of dumps for the successful ping and struggling curl:
# tcpdump -nlieth0 -s0 udp port 53 -vvv
[during ping]
192.168.1.20.34097 > 192.168.1.254.53: [udp sum ok] 11891+ A? www.websense.com. (34)
192.168.1.254.53 > 192.168.1.20.34097: [udp sum ok] 11891 q: A? www.websense.com. 1/0/0 www.websense.com. [5s] A 204.15.67.11 (50)
192.168.1.20.58651 > 192.168.1.254.53: [udp sum ok] 51147+ PTR? 11.67.15.204.in-addr.arpa. (43)
192.168.1.254.53 > 192.168.1.20.58651: [udp sum ok] 51147 q: PTR? 11.67.15.204.in-addr.arpa. 1/0/0 11.67.15.204.in-addr.arpa. [9h53m39s] PTR www.websense.com. (73)

[during curl]
192.168.1.20.41050 > 192.168.1.254.53: [udp sum ok] 26082+ A? www.websense.com. (34)
192.168.1.20.41050 > 192.168.1.254.53: [udp sum ok] 54668+ AAAA? www.websense.com. (34)
192.168.1.254.53 > 192.168.1.20.41050: [udp sum ok] 26082 q: A? www.websense.com. 1/0/0 www.websense.com. [5s] A 204.15.67.11 (50)
192.168.1.254.53 > 192.168.1.20.41050: [udp sum ok] 54668- q: AAAA? www.websense.com. 0/0/0 (34)
192.168.1.20.58040 > 192.168.1.254.53: [udp sum ok] 47978+ A? www.websense.com.localdomain. (46)
192.168.1.20.58040 > 192.168.1.254.53: [udp sum ok] 42568+ AAAA? www.websense.com.localdomain. (46)
192.168.1.254.53 > 192.168.1.20.58040: [udp sum ok] 47978 NXDomain- q: A? www.websense.com.localdomain. 0/0/0 (46)
192.168.1.254.53 > 192.168.1.20.58040: [udp sum ok] 42568 NXDomain- q: AAAA? www.websense.com.localdomain. 0/0/0 (46)
Ping only queries the DNS server for an IPv4 record (A?) and succeeds. Curl initially requests both IPv4 (A?) and IPv6 (AAAA?) records. Although curl receives a proper response (204.15.67.11) to its IPv4 request, the IPv6 request comes back empty (0/0/0). Apparently due to some bug, curl ignores the IPv4 resolution and asks a second time for both record types, this time with "localdomain" appended to the name(!). That suffix is most likely the resolver's search domain from /etc/resolv.conf, tried as a fallback after the first lookup was treated as a failure.
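The two lookup styles can be compared without ping or curl. The sketch below is only an illustration: the function name compare_lookups is made up, and it assumes glibc's getent, whose ahostsv4 database asks getaddrinfo for IPv4 only while ahosts asks for any address family (on a dual-stack host that usually means both A and AAAA queries).

```shell
# Compare an IPv4-only lookup with an any-family lookup for one host name.
# Usage: compare_lookups HOSTNAME
compare_lookups() {
    host=$1
    echo "== IPv4 only (getaddrinfo, AF_INET) =="
    getent ahostsv4 "$host" || echo "(no result)"
    echo "== any family (getaddrinfo, AF_UNSPEC) =="
    getent ahosts "$host" || echo "(no result)"
}

compare_lookups localhost
```

Running it against the failing name while tcpdump is capturing should show which of the two styles triggers the retry with the search domain. If only the any-family lookup fails, it may also be worth trying curl with its -4 (--ipv4) option, which restricts resolution to IPv4.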

solution - /etc/hosts + release awareness

Links: IPv4 address conversion :: yum concerns :: cleaning old yum info

We should write a patch for curl and recompile it, but that's for programmers. I only know how to supply curl with the IPv6 information it wants. The site www.websense.com may not have an AAAA record in its DNS zone file, but I can still manually enter IPv6 info into /etc/hosts and force curl to use that.
# nano /etc/hosts
::ffff:cc0f:430b www.websense.com
204.15.67.11 www.websense.com

# nano /etc/host.conf
order hosts,bind

$ curl www.websense.com
[page loads normally]
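The ::ffff:cc0f:430b entry in /etc/hosts is 204.15.67.11 written as an IPv4-mapped IPv6 address: each of the four decimal octets becomes two hex digits after the ::ffff: prefix. A minimal sketch of the conversion (the function name ipv4_to_mapped is my own, for illustration only):

```shell
# Convert a dotted-quad IPv4 address to its IPv4-mapped IPv6 form.
ipv4_to_mapped() (
    # Subshell body, so the IFS change does not leak into the caller.
    IFS=.
    # Split the address on "." into four positional parameters.
    set -- $1
    printf '::ffff:%02x%02x:%02x%02x\n' "$1" "$2" "$3" "$4"
)

ipv4_to_mapped 204.15.67.11
```

For the address used in this post it prints ::ffff:cc0f:430b, matching the line entered in /etc/hosts.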
Problem 1 solved. However, there is a second problem, one specific to Fuduntu rather than curl. Fuduntu is a hybrid, so it doesn't have typical Fedora values in its yum variables, e.g. $releasever.
$ rpm -q fedora-release
package fedora-release is not installed

$ rpm -q fuduntu-release
fuduntu-release-2013-3.noarch

$ ls /etc/release*
ls: cannot access /etc/release*: No such file or directory

$ yum list fedora-release
Loaded plugins: fastestmirror, langpacks, presto, refresh-packagekit
Adding en_US to language list
Determining fastest mirrors
Could not retrieve mirrorlist http://packages.fuduntu.org/repo/mirrors/fuduntu-stable-rpms-2013 error was
14: PYCURL ERROR 6 - ""
Error: Cannot find a valid baseurl for repo: fuduntu
Fedora users can hit the same error when version conflicts arise. In the case of Fuduntu, however, the repos no longer exist. One strategy might be to spoof Fuduntu's version checking, as if it were being upgraded, when it accesses third-party repos. If we eliminate the locally stored repo files and the rpm release file, we might be able to override them with third-party information. First let's do a debug dump with the current info (in case we need it later), then remove the local information.
$ yum-debug-dump
Output written to: /home/~/yum_debug_dump-local-2013-07-12_20:50:30.txt.gz

$ rpm -q fuduntu-release
fuduntu-release-2013-3.noarch

# yum remove fuduntu-release-2013-3.noarch
[screens and screens of removal]

$ rpm -q fuduntu-release
fuduntu-release-2013-3.noarch

# ls /etc/yum.repos.d
dropbox.repo fuduntu.repo
# rm /etc/yum.repos.d/fuduntu.repo
# ls /etc/pki/rpm-gpg/
RPM-GPG-KEY-fuduntu RPM-GPG-KEY-fuduntu-i386
RPM-GPG-KEY-fuduntu-2013-primary RPM-GPG-KEY-fuduntu-x86_64
# rm /etc/pki/rpm-gpg/*

# yum clean all

$ rpm -q fuduntu-release
fuduntu-release-2013-3.noarch


$releasever


Links: replace $releasever using sed :: yum variables
The orphaned Fuduntu release has no access to Fuduntu repositories because they no longer exist, so it must rely on third-party repos to move forward. Fedora-related repositories are arranged as "f[$releasever]-[$basearch]". In Fuduntu this expanded to "2013-i386". That value worked in Fuduntu repos but fails in third-party repos: $releasever needs to be changed to something standard, such as "17", to produce a more Fedora-standard "f17-i386".

But nothing worked. Not exporting $releasever=17 as an environment variable, not changing /etc/yum.conf, not swapping out the value of "2013" for "17" in each /etc/*release. Nothing I could find changed this variable globally. Eventually, after a couple of lost days on the project, I gave up and brute-forced the repo files individually. Before modifying anything, I eliminated the old cache and backed up all the unmodified repos into a new directory I called "default". Then I modified the repos, exchanging "$releasever" for "17" in each file.
# rm -r /var/tmp/*

# mkdir /etc/yum.repos.d/default
# cp /etc/yum.repos.d/*.repo /etc/yum.repos.d/default/

# sed -i 's/\$releasever/17/g' /etc/yum.repos.d/*.repo
The repos finally loaded.
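The substitution can be rehearsed on a scratch copy before touching /etc. The sketch below assumes GNU sed (for -i); the repo file, its section name, and its URL are invented purely for the demonstration.

```shell
# Work on a scratch copy so nothing under /etc is touched.
workdir=$(mktemp -d)
cat > "$workdir/example.repo" <<'EOF'
[example]
name=Example repo for release $releasever
baseurl=http://repo.example.invalid/f$releasever-$basearch/
enabled=1
EOF

# Single quotes stop the shell from expanding $releasever;
# the backslash makes sed treat "$" as a literal character.
sed -i 's/\$releasever/17/g' "$workdir"/*.repo

cat "$workdir/example.repo"
```

After the edit, the baseurl line reads f17-$basearch: only $releasever is hard-coded, and $basearch is still left for yum to expand.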

(non)solution - remove IPv6 functionality

Link: Disabling IPv6

This is not a solution, because curl/rpm/yum needs both IPv6 and IPv4 information and does not get what it needs from the process below. I'm including it anyway, because it provides insight into how other TCP clients (i.e., not strictly curl/rpm) can be assisted in a mixed A/AAAA environment. Some readers may be interested in this for Chromium, etc.
# nano /etc/sysconfig/network
NETWORKING_IPV6=no

# nano /etc/modprobe.d/blacklist.conf
blacklist nf_conntrack_ipv6
blacklist nf_defrag_ipv6


Appendix 1 - wireshark

A good link for using wireshark to check DNS problems. The wireshark GUI is more elegant than my CLI approach above, and arguably more user-friendly for those working on IPv4/IPv6 solutions.

Appendix 2 - strace

Straces are too long to regurgitate here; let's look at the relevant lines where ping succeeds and curl is unsuccessful. Of possible interest here is that, from inside the LAN, the DNS server, assigned via DHCP, is simply the gateway at "192.168.1.254".
$ strace ping www.websense.com
socket(PF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 3
connect(3, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.254")}, 16) = 0
gettimeofday({1373430779, 7870}, NULL) = 0
poll([{fd=3, events=POLLOUT}], 1, 0) = 1 ([{fd=3, revents=POLLOUT}])
send(3, "\346\f\1\0\0\1\0\0\0\0\0\0\3www\10websense\3com\0\0\1"..., 34, MSG_NOSIGNAL) = 34
poll([{fd=3, events=POLLIN}], 1, 5000) = 1 ([{fd=3, revents=POLLIN}])
ioctl(3, FIONREAD, [50]) = 0
recvfrom(3, "\346\f\201\200\0\1\0\1\0\0\0\0\3www\10websense\3com\0\0\1"..., 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.254")}, [16]) = 50
close(3) = 0
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 3
connect(3, {sa_family=AF_INET, sin_port=htons(1025), sin_addr=inet_addr("204.15.67.11")}, 16) = 0
getsockname(3, {sa_family=AF_INET, sin_port=htons(53553), sin_addr=inet_addr("192.168.1.20")}, [16]) = 0
close(3)

And for the failing curl:
$ strace curl www.websense.com
socket(PF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 3
connect(3, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.254")}, 16) = 0
gettimeofday({1373429425, 713068}, NULL) = 0
poll([{fd=3, events=POLLOUT}], 1, 0) = 1 ([{fd=3, revents=POLLOUT}])
sendmmsg(3, {{{msg_name(0)=NULL, msg_iov(1)=[{"\215s\1\0\0\1\0\0\0\0\0\0\3www\10websense\3com\0\0\1"..., 34}], msg_controllen=0, msg_flags=0}, 34}, {{msg_name(0)=NULL, msg_iov(1)=[{"\34\305\1\0\0\1\0\0\0\0\0\0\3www\10websense\3com\0\0\34"..., 34}], msg_controllen=0, msg_flags=0}, 34}}, 2, MSG_NOSIGNAL) = 2
poll([{fd=3, events=POLLIN}], 1, 5000) = 1 ([{fd=3, revents=POLLIN}])
ioctl(3, FIONREAD, [34]) = 0
recvfrom(3, "\34\305\200\0\0\1\0\0\0\0\0\0\3www\10websense\3com\0\0\34"..., 2048, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.254")}, [16]) = 34
close(3) = 0
socket(PF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 3
connect(3, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.254")}, 16) = 0

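Curl never seems to leave port 53. Note that the address shown in recvfrom is the sender of the datagram (the DNS server at 192.168.1.254), not the content of the answer. The payload ends in \34 (octal for record type 28, AAAA) and its answer count is zero, so this is the empty IPv6 reply seen in the tcpdump above; the A reply does not appear in these lines. Curl's resolver sends the A and AAAA queries together in one sendmmsg call, where ping sends a single A query with send. Additionally, ping makes a getsockname call not made by curl.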
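As a cross-check on these strace lines, the payload strings can be decoded by hand. The sketch below is a rough helper of my own (the name dns_header is made up): it reads the fixed 12-byte DNS header plus the query type that strace happens to show as the last two bytes. It assumes the string contains no "%" characters, because the string is passed to printf as a format.

```shell
# Decode the DNS header from a string in strace's octal-escaped form.
dns_header() {
    # Turn the escapes into raw bytes, then list every byte in decimal.
    set -- $(printf "$1" | od -An -v -tu1)
    id=$(( $1 * 256 + $2 ))
    flags=$(( $3 * 256 + $4 ))
    questions=$(( $5 * 256 + $6 ))
    answers=$(( $7 * 256 + $8 ))
    # strace cut the string right after the 2-byte query type, so it is last.
    shift $(( $# - 2 ))
    qtype=$(( $1 * 256 + $2 ))
    printf 'id=0x%04x response=%d rcode=%d questions=%d answers=%d qtype=%d\n' \
        "$id" $(( flags >> 15 )) $(( flags & 15 )) "$questions" "$answers" "$qtype"
}

# Reply received by ping, then the reply received by curl.
dns_header '\346\f\201\200\0\1\0\1\0\0\0\0\3www\10websense\3com\0\0\1'
dns_header '\34\305\200\0\0\1\0\0\0\0\0\0\3www\10websense\3com\0\0\34'
```

The first line reports answers=1 with qtype=1 (an A record), the second answers=0 with qtype=28 (AAAA) and rcode=0. In other words, the one reply visible in the curl excerpt is an error-free but empty answer to the IPv6 question.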

More information: while Epiphany is running, we can check which calls are producing errors. Get its PID, open a terminal, and let strace run for several seconds while attempting to surf to an address in Epiphany. Then CTRL-C out and examine the data, e.g.:
$ strace -c -p 3811
Process 3811 attached
^CProcess 3811 detached
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
54.08 0.000384 0 2167 writev
14.79 0.000105 5 20 munmap
13.24 0.000094 0 1526 clock_gettime
13.24 0.000094 0 6690 4567 recv
4.65 0.000033 0 4583 poll
0.00 0.000000 0 1 restart_syscall
0.00 0.000000 0 80 9 read
0.00 0.000000 0 31 write
0.00 0.000000 0 36 open
0.00 0.000000 0 36 close
0.00 0.000000 0 4 unlink
0.00 0.000000 0 18 access
0.00 0.000000 0 8 rename
0.00 0.000000 0 262 gettimeofday
0.00 0.000000 0 1 clone
0.00 0.000000 0 14 _llseek
0.00 0.000000 0 21 mmap2
0.00 0.000000 0 58 46 stat64
0.00 0.000000 0 84 8 lstat64
0.00 0.000000 0 36 fstat64
0.00 0.000000 0 2 1 madvise
0.00 0.000000 0 70 6 futex
0.00 0.000000 0 1 statfs64
------ ----------- ----------- --------- --------- ----------------
100.00 0.000710 15749 4637 total
Blog formatting squishes the data a little, but we see significant errors (4567 of them) on "recv" calls, as well as some on "stat64" and a few others.

Wish I could write in C and recompile curl.
