DNS & BIND

DNS & BINDSearch this book
Previous: 13.1 Is NIS Really Your Problem?Chapter 13
Troubleshooting DNS and BIND
Next: 13.3 Potential Problem List
 

13.2 Troubleshooting Tools and Techniques

We went over nslookup and the name server's debugging output in the last two chapters. Before we go on, let's introduce two new tools that can be useful in troubleshooting: named-xfer and name server database dumps.

13.2.1 How to Use named-xfer

named-xfer is the program named starts to perform zone transfers. named-xfer checks whether the slave's copy of the zone data is up-to-date, and transfers a new zone, if necessary. (In versions 4.9 and 8, named checks first if a zone is up to date in order to avoid starting up child processes when no transfer is necessary.)

In Chapter 12, Reading BIND Debugging Output, we showed you the debugging output a slave name server logged as it checked its zone. When the slave server transferred the zone, it started a child process (named-xfer) to pull the data to the local file system. We didn't tell you, however, that you can also start named-xfer manually, instead of waiting for named to start it, and that you can tell it to produce debugging output independently of named.

This can be useful if you're tracking down a problem with zone transfers but don't want to wait for named to schedule one. To test a zone transfer manually, you need to specify a number of command-line options:

% /etc/named-xfer
Usage: xfer
        -z zone_to_transfer
        -f db_file
        -s serial_no
        [-d debug_level]
        [-l debug_log_file]
        [-t trace_file]
        [-p port]
        [-S]
        [-C class]
        servers...

When named starts named-xfer, it specifies the -z option (the zone named wants to check), the -f option (the name of the db file that corresponds to the zone, from named.boot), the -s option (the zone's serial number on the slave, from the current SOA record), and the addresses of the servers the secondary was instructed to load from (the IP addresses from the masters substatement in the zone statement in named.conf, or from the secondary directive in named.boot). If named is running in debug mode, it also specifies the debug level for named-xfer with the -d option.

When you run named-xfer manually, you can also specify the debug level on the command line with -d. (Don't forget, though, that debug levels above three will produce tons of debugging output if the transfer succeeds!) You can also specify an alternate filename for the debug file with the -l option. The default log file is /usr/tmp/xfer.ddt.XXXXXX, where XXXXXX is a suffix appended to preserve uniqueness, or a file by the same name in /var/tmp. And you can specify the name of the host to load from, instead of its IP address.

For example, you could check to see whether zone transfers from terminator were working, with the following command line:

% /etc/named-xfer -z movie.edu -f /tmp/db.movie -s 0 terminator
% echo $?
1

In this command, we specified a serial number of zero because we wanted to force named-xfer to attempt a zone transfer, even if it wasn't needed. Zero is a special serial number - named-xfer will transfer the zone regardless of the actual zone serial number. Also, we told named-xfer to put the new zone file in /tmp, rather than overwriting the zone's working data file.

We can tell if the transfer succeeded by looking at named-xfer's return value, which has four possible values:

0

The zone data is up-to-date and no transfer was needed

1

Indicates a successful transfer

2

The host(s) named-xfer queried can't be reached, or an error occurred and named-xfer may have logged an error message

3

An error occurred and named-xfer logged an error message

13.2.2 How to Read a Database Dump

Poring over a dump of the name server's internal database - including cached information - can also help you track down problems. The INT signal causes named to dump its authoritative data, cache data, and hints data to named_dump.db in BIND's running directory (or in /usr/tmp/named_dump.db or /var/tmp/named_dump.db, for BIND 4). An example of a named_dump.db file follows. The authoritative data and cache entries, mixed together, appear first in the file. At the end of the file are the hints data:

; Dumped at Tue Jan  6 10:49:08 1998
;; ++zone table++
; 0.0.127.in-addr.arpa (type 1, class 1, source db.127.0.0)
;   time=0, lastupdate=0, serial=1,
;   refresh=0, retry=3600, expire=608400, minimum=86400
;   ftime=884015430, xaddr=[0.0.0.0], state=0041, pid=0
;; --zone table--
; Note: Cr=(auth,answer,addtnl,cache) tag only shown for non-auth RR's
; Note: NT=milliseconds for any A RR which we've used as a nameserver
; --- Cache & Data ---
$ORIGIN .
.   518375  IN      NS  G.ROOT-SERVERS.NET.   ;Cr=auth [128.8.10.90]
    518375  IN      NS  J.ROOT-SERVERS.NET.   ;Cr=auth [128.8.10.90]
    518375  IN      NS  K.ROOT-SERVERS.NET.   ;Cr=auth [128.8.10.90]
    518375  IN      NS  L.ROOT-SERVERS.NET.   ;Cr=auth [128.8.10.90]
    518375  IN      NS  M.ROOT-SERVERS.NET.   ;Cr=auth [128.8.10.90]
    518375  IN      NS  A.ROOT-SERVERS.NET.   ;Cr=auth [128.8.10.90]
    518375  IN      NS  H.ROOT-SERVERS.NET.   ;Cr=auth [128.8.10.90]
    518375  IN      NS  B.ROOT-SERVERS.NET.   ;Cr=auth [128.8.10.90]
    518375  IN      NS  C.ROOT-SERVERS.NET.   ;Cr=auth [128.8.10.90]
    518375  IN      NS  D.ROOT-SERVERS.NET.   ;Cr=auth [128.8.10.90]
    518375  IN      NS  E.ROOT-SERVERS.NET.   ;Cr=auth [128.8.10.90]
    518375  IN      NS  I.ROOT-SERVERS.NET.   ;Cr=auth [128.8.10.90]
    518375  IN      NS  F.ROOT-SERVERS.NET.   ;Cr=auth [128.8.10.90]
EDU  86393  IN      SOA A.ROOT-SERVERS.NET.  hostmaster.INTERNIC.NET. (
         1998010500 1800 900 604800 86400 )   ;Cr=addtnl [128.63.2.53]
$ORIGIN  0.127.in-addr.arpa.
0        IN    SOA cujo.movie.edu. root.cujo.movie.edu. (
         1998010600 10800 3600 608400 86400 )        ;Cl=5
         IN    NS  cujo.movie.edu.   ;Cl=5
$ORIGIN  0.0.127.in-addr.arpa.
1        IN    PTR localhost.    ;Cl=5
$ORIGIN EDU.
PURDUE   172787  IN  NS  NS.PURDUE.EDU.           ;Cr=addtnl [192.36.148.17]
         172787  IN  NS  MOE.RICE.EDU.            ;Cr=addtnl [192.36.148.17]
         172787  IN  NS  PENDRAGON.CS.PURDUE.EDU.  ;Cr=addtnl [192.36.148.17]
         172787  IN  NS  HARBOR.ECN.PURDUE.EDU.    ;Cr=addtnl [192.36.148.17]
$ORIGIN  movie.EDU.
;cujo    593     IN  SOA  A.ROOT-SERVERS.NET. hostmaster.INTERNIC. NET. (
;        1998010500 1800 900 604800 86400 );EDU.; NXDOMAIN  ;-$
   ;Cr=auth [128.63.2.53]
$ORIGIN   RICE.EDU.
MOE      172787  IN  A   128.42.5.4        ;NT=84 Cr=addtnl [192.36.148.17]
$ORIGIN   PURDUE.EDU.
CS       86387   IN  NS  pendragon.cs.PURDUE.edu.    ;Cr=addtnl [128.42.5.4]
         86387   IN  NS  ns.PURDUE.edu.              ;Cr=addtnl [128.42.5.4]
         86387   IN  NS  harbor.ecn.PURDUE.edu.      ;Cr=addtnl [128.42.5.4]
         86387   IN  NS  moe.rice.edu.               ;Cr=addtnl [128.42.5.4]
NS       172787   IN  A  128.210.11.5        ;NT=4 Cr=addtnl [192.36.148.17]
$ORIGIN   ECN.PURDUE.EDU.
HARBOR   172787  IN   A  128.46.199.76       ;NT=6 Cr=addtnl [192.36.148.17]
$ORIGIN   CS.PURDUE.EDU.
galt     86387   IN   A  128.10.2.39                 ;Cr=auth [128.42.5.4]
PENDRAGON  172787  IN  A  128.10.2.5         ;NT=20 Cr=addtnl [192.36.148.17]
$ORIGIN   ROOT-SERVERS.NET.
K        604775    IN  A  193.0.14.129       ;NT=10 Cr=answer [128.8.10.90]
A        604775    IN  A  198.41.0.4         ;NT=20 Cr=answer [128.8.10.90]
L        604775    IN  A  198.32.64.12       ;NT=8 Cr=answer [128.8.10.90]
B        604775    IN  A  128.9.0.107        ;NT=9 Cr=answer [128.8.10.90]
M        604775    IN  A  202.12.27.33       ;NT=20 Cr=answer [128.8.10.90]
C        604775    IN  A  192.33.4.12        ;NT=17 Cr=answer [128.8.10.90]
D        604775    IN  A  128.8.10.90        ;NT=11 Cr=answer [128.8.10.90]
E        604775    IN  A  192.203.230.10     ;NT=9 Cr=answer [128.8.10.90]
F        604775    IN  A  192.5.5.241        ;NT=73 Cr=answer [128.8.10.90]
G        604775    IN  A  192.112.36.4       ;NT=14 Cr=answer [128.8.10.90]
H        604775    IN  A  128.63.2.53        ;NT=160 Cr=answer [128.8.10.90]
I        604775    IN  A  192.36.148.17      ;NT=102 Cr=answer [128.8.10.90]
J        604775    IN  A  198.41.0.10        ;NT=21 Cr=answer [128.8.10.90]
; --- Hints ---
$ORIGIN .
.   3600           IN  NS  A.ROOT-SERVERS.NET.     ;Cl=0
    3600           IN  NS  B.ROOT-SERVERS.NET.     ;Cl=0
    3600           IN  NS  C.ROOT-SERVERS.NET.     ;Cl=0
    3600           IN  NS  D.ROOT-SERVERS.NET.     ;Cl=0
    3600           IN  NS  E.ROOT-SERVERS.NET.     ;Cl=0
    3600           IN  NS  F.ROOT-SERVERS.NET.     ;Cl=0
    3600           IN  NS  G.ROOT-SERVERS.NET.     ;Cl=0
    3600           IN  NS  H.ROOT-SERVERS.NET.     ;Cl=0
    3600           IN  NS  I.ROOT-SERVERS.NET.     ;Cl=0
    3600           IN  NS  J.ROOT-SERVERS.NET.     ;Cl=0
    3600           IN  NS  K.ROOT-SERVERS.NET.     ;Cl=0
    3600           IN  NS  L.ROOT-SERVERS.NET.     ;Cl=0
    3600           IN  NS  M.ROOT-SERVERS.NET.     ;Cl=0
$ORIGIN   ROOT-SERVERS.NET.
K     3600         IN   A  193.0.14.129      ;NT=11 Cl=0
L     3600         IN   A  198.32.64.12      ;NT=9 Cl=0
A     3600         IN   A  198.41.0.4        ;NT=10 Cl=0
M     3600         IN   A  202.12.27.33      ;NT=11 Cl=0
B     3600         IN   A  128.9.0.107       ;NT=1288 Cl=0
C     3600         IN   A  192.33.4.12       ;NT=21 Cl=0
D     3600         IN   A  128.8.10.90       ;NT=1288 Cl=0
E     3600         IN   A  192.203.230.10    ;NT=19 Cl=0
F     3600         IN   A  192.5.5.241       ;NT=23 Cl=0
G     3600         IN   A  192.112.36.4      ;NT=18 Cl=0
H     3600         IN   A  128.63.2.53       ;NT=11 Cl=0
I     3600         IN   A  192.36.148.17     ;NT=21 Cl=0
J     3600         IN   A  198.41.0.10       ;NT=13 Cl=0

The name server that created this named_dump.db file was authoritative only for 0.0.127.in-addr.arpa. Only two names have been looked up by this server: galt.cs.purdue.edu and cujo.movie.edu. In the process of looking up galt, this server cached not only the address of galt, but also the list of name servers for purdue.edu and the addresses for those servers. The name cujo.movie.edu, however, doesn't really exist (nor does the domain movie.edu, except in our examples), so the server cached the negative response. In the dump file, the negative response is commented out (the line starts with a semicolon), and the reason is listed (NXDOMAIN) instead of real data. You'll notice the TTL is quite low (593). Negative responses are only cached for 10 minutes (600 seconds).

The hints section at the bottom of the file contains the data from the db.cache file. The TTL of the hints data is decremented, and it may go to zero, but the hints are never discarded.

Note that some of the resource records are followed by a semicolon and NT=. You will only see these on the address records of name servers. The number is a round-trip time calculation that the name server keeps so that it knows which name servers have responded most quickly in the past; the name server with the lowest round-trip time will be tried first the next time.

The cache data is easy to pick out - those entries have a credibility tag (Cr=) and the IP address of the server the data came from. The zone data and hint data are tagged with (Cl=), which is just a count of the level in the domain tree. (root is level 0, foo is level 1, foo.foo is level 2, etc.) Let's digress a moment to explain the concept of credibility.

One of the advances between version 4.8.3 and 4.9 is the addition of a credibility measure. This allows a name server to make more intelligent decisions about what to do with new data from a remote server.

A 4.8.3 name server only had two credibility levels - locally authoritative data, and everything else. The locally authoritative data were data from your zone files - your name server knew better than to update its internal copy of what came from your zone file. But, all data from remote name servers were considered equal.

Here is a situation that could happen, and the way a 4.8.3 server would deal with it. Suppose that your server looked up an address for terminator.movie.edu and received an authoritative answer from the movie.edu name server. (Remember, an authoritative answer is the best you can get.) Sometime later while looking up foo.ora.com, your server receives another address record for terminator.movie.edu, but this time as part of the delegation info for ora.com (which terminator.movie.edu backs up). The 4.8.3 name server would update the cached address record for terminator.movie.edu, even though the data came from the com name server instead of the authoritative movie.edu name server. Of course, the com and movie.edu name servers will have exactly the same data for terminator, so this won't be a problem, right? Yeah, and it never rains in southern California, either.

A 4.9 or 8 name server is more intelligent. Like a 4.8.3 name server, it still considers your zone data beyond any doubt. But a 4.9 name server distinguishes among the different data from remote name servers. Here is the hierarchy of remote data credibility from most credible to least:

auth

These records are data from authoritative answers - the answer section of a response packet with the authoritative answer bit set.

answer

These records are data from nonauthoritative, or cached, answers - the answer section of a response packet without the authoritative answer bit set.

addtnl

These records are data from the rest of the response packet - the authority and additional. The authority section of the response contains NS records that delegate a domain to an authoritative name server. The additional section contains address records that may complete information in other sections (i.e., address records that go with NS records in the authority section).

There is one exception to this rule: when the server is priming its root name server cache, the records that would be at credibility addtnl are bumped up to answer credibility, to make them harder to change accidentally. Notice in the dump that the address records for root name servers are at credibility answer, but the address records for the purdue.edu name servers are at credibility addtnl.

In the situation just described, a 4.9 or 8 name server would not replace the authoritative data (credibility = auth) for terminator.movie.edu with the delegation data (credibility = addtnl) because the authoritative answer has higher credibility.

13.2.3 Logging Queries

BIND version 4.9 added a feature, query logging, which can be used to help diagnose certain problems. When query logging is turned on, a running name server will log every query with syslog. This feature could help you find resolver configuration errors, because you can check that the name you think is being looked up really is the name being looked up.

First you must make sure that LOG_INFO messages are being logged by syslog for the facility daemon. Next you need to turn on query logging. This can be done three ways: for BIND 4.9, set options query-log in your name server boot file; for BIND 4.9 or BIND 8, start the name server with -q on the command line, or send a WINCH signal to a running name server. You'll start seeing syslog messages like this:

Feb 20 21:43:25 terminator named[3830]:
                     XX /192.253.253.2/carrie.movie.edu/A
Feb 20 21:43:32 terminator named[3830]:
                     XX /192.253.253.2/4.253.253.192.in-addr.arpa/PTR

These messages include the IP address of the host that made the query and the query itself. Inverse queries will have a dash before the query type (i.e., an inverse query for an address record would be logged as "-A" instead of just "A"). After enough queries have been logged, you can turn off query logging by sending a WINCH signal to your name server.


Previous: 13.1 Is NIS Really Your Problem?DNS & BINDNext: 13.3 Potential Problem List
13.1 Is NIS Really Your Problem?Book Index13.3 Potential Problem List