Thoughts on codec-dns's API

19 views

Skip to first unread message

Tim Boudreau

unread,

Feb 14, 2017, 1:03:00 AM2/14/17

to Netty discussions

A year and change ago I wrote some patches to add support for writing DNS servers to codec-dns. The patches were on the now-deprecated "master" branch, and the pull request got closed when the "master" branch was. I've got a work-in-progress version for 4.1 (alas, without realizing which branch I was on, I forked off the updated old "master" branch again, so I've got more work to do to *actually* adapt it to 4.1 this time).

I'm finding that there are some things I think could be improved over the existing API, and some choices in what's there that I think there are better options for. So I'm wondering if a pretty high-impact set of patches is likely to be accepted.

Here are the things I'm aiming to improve - you can see the work-in-progress against the old master branch here - https://github.com/timboudreau/netty/tree/dns-server-ii

1. Some memory footprint improvements for records and packets

- Use CharSequence for DNS names, so AsciiString or another byte-array backed string can be used (but unicode is an option - see below)

- Store the binary header flags as a Map implementation that reads/writes bits from a short, rather than 4 boolean fields on DnsResponse (could have used EnumMap, but it was kind of a zen little project one evening to give it a micro-footprint - someone else was nuts enough to make all the lists in AbstractDnsMessage type Object and flip between being a single record and list of them - that's no crazier :-))

2. Unify code that reads and writes DNS names, implement writing pointer compression and punycode decoding as NameCodec, and add it as a parameter to decoding methods, so it can cache offsets to previously written strings. It's implemented as AutoCloseable, so you can either clear it's state in a try-with-resources loop or if you need to write to one record from multiple threads (not likely, but conceivable), its state is not tied to the current thread. So encoding looks something like

try (NameCodec nameCodec = NameCodec.get(COMPRESSION, PUNYCODE)) {

for (DnsSection sect : sections) {

int max = msg.count(sect);

for (int i = 0; i < max; i++) {

DnsRecord record = msg.recordAt(sect, i);

encoder.encodeRecord(nameCodec, record, into);

}

and decoding looks similar.

It seems like there are a bunch of use cases for DNS code - writing a name server, writing a caching name server that also makes DNS requests and transforms or caches records, writing a DNS client. For names, these don't have the same requirements. For example, if I'm writing a simple cache, there is zero reason to store DNS names as 16-bit unicode Java chars - the smaller punycode wire representation is fine. On the other hand, a DNS client probably needs that, and anything with a GUI definitely does.

The current code is String-based and uses java.net.IDN for punycode -> unicode translation. If we switch to using CharSequence, we can make it optional. Similarly, depending on what you're doing, outputing the trailing dot that makes a domain name fully qualified may not be the right thing in all cases. The way I'm approaching it is simply to have a NameCodec.Factory you pass into your handler (since pointer encoding is stateful, you need a way to get multiple instances). And you pass some enum elements describing features you want (punycode, name compression on write, trailing dots).

Another reason for this approach is the cruft I can already see accumulating - multiply this bit of cruft below * the number of RDATA types that involve names and things will get unmaintainable quickly:

// DNS message compression means that domain names may contain "pointers" to other positions in the packet

// to build a full message. This means the indexes are meaningful and we need the ability to reference the

// indexes un-obstructed, and thus we cannot use a slice here.

// See https://www.ietf.org/rfc/rfc1035 [4.1.4. Message compression]

if (type == DnsRecordType.PTR) { // XXX deleteme

return new DefaultDnsPtrRecord(

name.toString(), DnsClass.valueOf(dnsClass), timeToLive,

forReadingNames.readName(in.duplicate().setIndex(offset, offset + length)).toString());

}

4. Optional paramterized DNS records - i.e.

interface TypedDnsRecord<T> extends DnsRecord {

T content();

}

that will allow *pluggable* codecs for DnsRecordType+Class pairs (and a nice builder API for that registry) - all you do is use a different DnsRecordDecoder/Encoder. So if you get an A record, you can have a TypedDnsRecord<Ipv4Address> if you want. Or plug in your own codec if you want it to be a TypedDnsRecord<Integer>. Anything there isn't a codec for gets DnsRawRecord just as now.

I see on the 4.1 branch there is the beginning of attempts to implement various record types, e.g. DefaultDnsOptEcsRecord, DefaultDnsPtrRecord, etc. Things will be much more maintainable (not to mention less boilerplate code, smaller memory footprint and less duplicate code) if the payload and record are separate concerns. The record header is the same, code-wise, no matter what the payload is - so duplicating a bunch of constructor and setup code to have subclasses of DnsRecord for every type of RDATA out there is very noisy, code-wise.

But more importantly, there are a number of record types with the same payload - a PTR, NS, CNAME and DNAME are all either TypedDnsRecord<CharSequence> ... or they are DnsPtrRecord, DefaultDnsPtrRecord, DnsNsRecord, DefaultDnsNsRecord, DnsCnameRecord, DefaultDnsCnameRecord, ... this seems like no contest.

In the bigger picture, pluggable record codecs/payloads means there's a simple way for anybody using this API to encode/decode additional record types, and the code you write to do that is really focused on just deserializing a Java object from a ByteBuf. E.g.

class MxDecoder extends DnsRecordCodec<MailExchanger> {

MxDecoder() {

super(MailExchanger.class);

}

@Override

public MailExchanger read(ByteBuf buf, NameCodec forReadingNames) throws DnsDecoderException, UnmappableCharacterException {

int pref = buf.readShort();

CharSequence mx = forReadingNames.readName(buf);

return new MailExchanger(pref, mx);

}

@Override

public void write(MailExchanger value, NameCodec names, ByteBuf into) throws IOException {

into.writeShort(value.pref());

names.writeName(value.mx(), into);

}

I think for long-term maintainability, this is the way to go, and has some benefits for anyone using the API as well.

(fwiw, I've got a nice little AbstractDnsClient that takes a callback you can parameterize and filter with - so you can do

client.query("foo.com", A, (DnsRecordType type, long timeToLive, Ipv4Address answer, int item, int ofTotal) -> {

// here's your answer...

}

so you can parameterize on the type your record codec decodes w/ no casts - but that's probably a separate patch).

5. I'm not at all sure that having DatagramDnsQuery and DatagramDnsResponse implement AddressedEnvelope and carry around their sender and recipient is right. Any time you're caching or forwarding messages or otherwise reusing them, you wind up having to write a bunch of extra code to copy between instances and datagram- and non-datagram types. A DNS message categorically does not need to know where it's going, and it seems like one that does looks convenient in the simplest case but causes problems as soon as you do something interesting.

6. The distinction between DnsQuery and DnsResponse is kind of artificial, and having separate decoder and encoder classes that do almost exactly the same thing seems silly. Once you remove the overloads to alter the return type of some methods (I submitted an earlier patch to paramterize those so the base class takes care of it), DnsQuery is just a marker interface over DnsMessage. And having a marker interface is usually a good indication that something isn't right. In fact, the packets have the same header modulo one bit. So just having DnsMessage and DnsMessageDecoder/Encoder seems like the right thing. But there is an argument to be made that it's more obvious how to use the API with them separated for that reason (but you can solve API usability other ways).

Anyway, this is a lot of stuff - the question is, is it too large a set of changes to be likely to be accepted?

Here's the diff against master_deprecated, which compiles and works. Note the TypedDnsRecord stuff is not in there (nor is punycode support) - I implemented that separately, so that it can be bolted onto the existing API if not accepted (I learned my lesson with the previous pull request :-)). https://github.com/netty/netty/compare/master_deprecated...timboudreau:dns-server-ii

Thanks,

Tim

Reply all

Reply to author

Forward

0 new messages