mdnconv


Overview

mdnconv is a code set (encoding) conversion tool for the named.conf configuration and zone master files used with name servers.

mdnconv reads input from the files specified as arguments or from standard input, performs the encoding conversion and normalization specified by the mDNkit configuration file (mdn.conf) or by options, and writes the result to standard output.

For detailed information about usage, refer to Creating named.conf, a Zone Master File, in the User's Guide.

In addition, mdnconv explicitly ignores the environment variable MDN_DISABLE, so domain names are converted regardless of whether MDN_DISABLE is set.


Startup

The command line format used to start up mdnconv is as follows.

% mdnconv [Option...] [File name...]


Option

Many options of mdnconv have the same configuration functions as the entries in the mDNkit configuration file (mdn.conf).

When an entry in the configuration file and an option are both specified, both specifications are valid if the configuration item permits more than one definition (for example, the nameprep-map entry and its corresponding -map option). More precisely, the definitions in the configuration file are applied first, and the definitions given by options are then added.

When the configuration item allows only one definition (for example, idn-encoding and its corresponding -out option) and both are specified, the option takes precedence.
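
The two rules above can be illustrated with a small sketch. This is only a model of the described behavior, not mDNkit code: the function name merge_settings, the scheme names map-a and map-b, and the exact set of multi-valued entries (assumed here from the options documented as repeatable) are all hypothetical.

```python
# Hypothetical model of how mdn.conf entries and command-line options combine.
# None of these names are mDNkit APIs.

# Assumption: entries whose options may be given more than once are multi-valued.
MULTI_VALUED = {
    "nameprep-map", "nameprep-normalize", "nameprep-prohibit",
    "nameprep-unassigned", "delimiter-map", "local-map",
}


def merge_settings(conf_entries, option_entries):
    """Apply configuration-file entries first, then command-line options.

    Both arguments are lists of (entry_name, value) pairs in the order given.
    """
    settings = {}
    for name, value in list(conf_entries) + list(option_entries):
        if name in MULTI_VALUED:
            # Multi-valued item: option values are added after the
            # values that came from the configuration file.
            settings.setdefault(name, []).append(value)
        else:
            # Single-valued item: the later definition (the option) wins.
            settings[name] = value
    return settings


conf = [("idn-encoding", "RACE"), ("nameprep-map", "map-a")]
opts = [("idn-encoding", "Punycode"), ("nameprep-map", "map-b")]
print(merge_settings(conf, opts))
# {'idn-encoding': 'Punycode', 'nameprep-map': ['map-a', 'map-b']}
```

Here the single-valued idn-encoding ends up with the option's value, while nameprep-map keeps the configuration file's scheme and gains the one given on the command line.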

mdnconv recognizes the following options. Frequently used options also have a one-character short form.

-in <in-code>
-i <in-code>
Specifies the encoding <in-code> of the input text. When this option is not specified, the default is the local encoding of the application. However, when the -reverse option (described later) is specified, the default is the value of the idn-encoding entry in the mDNkit configuration file.
-out <out-code>
-o <out-code>
Specifies the encoding <out-code> of the output text. When this option is not specified, the default is the value of the idn-encoding entry in the mDNkit configuration file. However, when the -reverse option (described later) is specified, the default is the local encoding of the application.
-conf <path>
-c <path>
Specifies the path name of the mDNkit configuration file (mdn.conf). When neither -conf nor -noconf is specified, the default configuration file is loaded.
-noconf
-C
Prevents loading of the mDNkit configuration file (mdn.conf).
-reverse
-r
Performs reverse conversion. Normally mdnconv reads text in the local encoding, normalizes it, converts it to the IDN encoding, and outputs the result. When this option is specified, mdnconv instead reads text in the IDN encoding, converts it to the local encoding, and outputs the result.
-nameprep <version>
-n <version>
Specifies the version of NAMEPREP. This corresponds to nameprep entry of the mDNkit configuration file.
-nonameprep
-N
Does not perform NAMEPREP processing. This option also has the effect of -nounassigncheck.
-map <scheme>
Specifies the mapping scheme of NAMEPREP. This corresponds to nameprep-map entry of the mDNkit configuration file. This option can be specified more than once.
-normalize <scheme>
Specifies the normalization scheme of NAMEPREP. This corresponds to the nameprep-normalize entry of the mDNkit configuration file. This option can be specified more than once.
-prohibit <set>
Specifies the prohibited characters for the prohibited character check in NAMEPREP. This corresponds to the nameprep-prohibit entry of the mDNkit configuration file. This option can be specified more than once.
-unassigned <set>
Specifies the unassigned characters for the unassigned character check in NAMEPREP. This corresponds to the nameprep-unassigned entry of the mDNkit configuration file. This option can be specified more than once, but it is ignored when -nonameprep or -nounassigncheck is specified.
-nounassigncheck
-U
Does not perform the unassigned character check of NAMEPREP.
-delimiter <codepoint>
Specifies a character, other than period (`.'), that is treated as a delimiter in domain names. This corresponds to the delimiter-map entry of the mDNkit configuration file. This option can be specified more than once, but it is ignored when the -reverse option is specified or when -delimitermap is not specified.
-localmap <map>
Specifies the local mapping scheme that is applied in addition to NAMEPREP. This corresponds to the local-map entry of the mDNkit configuration file. This option can be specified more than once, but it is ignored when the -reverse option or -nolocalmap is specified.
-nolocalmap
-L
Does not perform local mapping. When the -reverse option is specified, this option is ignored.
-delimitermap
-d
Maps delimiter characters other than period (`.') to period. If this option is not specified, this mapping is ordinarily not performed. When the -reverse option is specified, this option is ignored.
-whole
-w
Performs normalization and conversion on the entire input text. When this option is not specified, only the parts recognized as domain names containing non-ASCII characters are converted.
-alias <path>
-a <path>
Defines aliases for encoding names. This corresponds to the encoding-alias-file entry of the mDNkit configuration file.
-flush
Flushes the output after each line. When writing to a pipe or a file, mdnconv usually writes several lines of data at once. When this option is specified, conversion results are written line by line. This option does not usually need to be specified, but it is useful when another program uses mdnconv as a filter.
-version
-v
Displays the version information and ends execution. Displays both the mdnconv and library (libmdn) versions. Both are usually the same but they may be different when a shared library is used.
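
As a rough illustration of the -in, -out and -reverse descriptions above, the following sketch shows how the default encodings swap in reverse mode. The function default_encodings and its parameter names are hypothetical and are not part of mDNkit; the encoding names in the usage lines are only sample values.

```python
def default_encodings(reverse, local_encoding, idn_encoding,
                      in_code=None, out_code=None):
    """Return (input_encoding, output_encoding) following the option text.

    local_encoding: the local encoding of the application.
    idn_encoding:   the idn-encoding entry of mdn.conf.
    in_code/out_code: values of -in/-out, or None when not given.
    """
    if reverse:
        # -reverse: read IDN-encoded text, write local-encoding text.
        default_in, default_out = idn_encoding, local_encoding
    else:
        # Forward: read local-encoding text, write IDN-encoded text.
        default_in, default_out = local_encoding, idn_encoding
    # Explicit -in/-out values override the defaults.
    return (in_code or default_in, out_code or default_out)


print(default_encodings(False, "EUC-JP", "Punycode"))
print(default_encodings(True, "EUC-JP", "Punycode"))
print(default_encodings(False, "EUC-JP", "Punycode", out_code="RACE"))
```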

Details of conversion (forward direction) processing

In the forward conversion of mdnconv (the behavior when the -reverse option is not specified), the following processing is performed line by line on the input data.

  1. Reads one line of text from a file or standard input.
  2. Removes the line-end code (newline) from the end of the line. This keeps the line-end code from being converted along with the text: when mdnconv is run with the -whole option and the output encoding is an ASCII-compatible encoding such as Punycode or RACE, the line-end code would otherwise also be encoded in Punycode or RACE.
  3. Converts the line to UTF-8. Normally the entire line is converted from the input encoding to UTF-8. However, when the input encoding is an ASCII-compatible encoding (ACE) such as RACE and the -whole option is not specified, the following special processing is performed instead, because converting the entire line would fail with an error whenever the line contains characters such as spaces.
    1. Extracts from the line each substring that can be interpreted as a valid ASCII domain name, that is, a string consisting only of alphanumeric characters, hyphens and periods, whose first character is alphanumeric and whose last character is not a hyphen.
    2. Converts each substring from the input ACE encoding to UTF-8.
    3. Replaces the substring with the conversion result when conversion succeeds. When it fails (for example, when the substring lacks the prefix or suffix specific to the ACE), the substring is left unchanged.
  4. Checks whether the conversion result is correctly encoded as UTF-8. In principle this check is unnecessary, but it is performed here so that a bug in the code conversion implementation can be detected at an early stage.
  5. Extracts the multilingual domain name parts from the line converted to UTF-8. When the -whole option is specified, the whole line is extracted as one domain name. When it is not specified, each substring that corresponds to a multilingual domain name is extracted individually.
  6. For each extracted multilingual domain name, converts the configured delimiters to period (`.'). However, when -delimitermap is not specified, this step is skipped.
  7. For each extracted multilingual domain name, performs local mapping. However, when -nolocalmap is specified, this step is skipped.
  8. For each extracted multilingual domain name, performs normalization by NAMEPREP (mapping, normalization, prohibited character check, unassigned code point check). If a prohibited character or an unassigned code point is found, an error is logged and processing stops. However, when -nonameprep is specified, none of these normalization steps is performed. When -nounassigncheck is specified, only the unassigned code point check is skipped.
  9. For each extracted multilingual domain name, checks again whether the result is correctly encoded as UTF-8. In principle this check is unnecessary, but it is performed so that a bug in the normalization implementation can be detected.
  10. Puts the processed multilingual domain names (in UTF-8) back into their original positions in the line.
  11. Converts the whole line from UTF-8 to the output encoding specified by -out.
  12. Adds the line-end code (linefeed) to the end of the line.
  13. Writes the contents of the line to standard output as the conversion result.
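
The forward steps above can be sketched in a few lines of Python. This is a simplified illustration under several assumptions, not the mdnconv implementation: the output ACE is assumed to be Punycode with the "xn--" prefix, NAMEPREP is replaced by a stand-in (NFKC plus lowercasing, without the prohibited and unassigned checks), local mapping is omitted, and the rule used to find multilingual domain names in a line (the CANDIDATE pattern) is a guess. All function names are hypothetical.

```python
import re
import unicodedata

# Assumed shape of a domain name inside a line: a run of ASCII letters,
# digits, hyphens, periods, or any non-ASCII characters.
CANDIDATE = re.compile(r"(?:[0-9A-Za-z.-]|[^\x00-\x7f])+")
ACE_PREFIX = "xn--"  # assumption: Punycode output with this prefix


def nameprep_stand_in(name):
    # Stand-in only: real NAMEPREP also runs the prohibited and
    # unassigned code point checks (step 8).
    return unicodedata.normalize("NFKC", name).lower()


def to_ace(name):
    # Encode each non-ASCII label; ASCII labels are left as they are.
    labels = []
    for label in name.split("."):
        if not label.isascii():
            label = ACE_PREFIX + label.encode("punycode").decode("ascii")
        labels.append(label)
    return ".".join(labels)


def convert_line(raw, in_code="utf-8", delimiters=(), whole=False):
    # Steps 1-4: take one line, drop the line-end code, decode the input.
    body = raw.rstrip(b"\r\n").decode(in_code)

    def convert_name(name):
        for delimiter in delimiters:            # step 6 (with -delimitermap)
            name = name.replace(delimiter, ".")
        return to_ace(nameprep_stand_in(name))  # steps 8 and 11

    def convert_match(match):
        name = match.group()
        # Only names that contain non-ASCII characters are converted.
        return name if name.isascii() else convert_name(name)

    if whole:                                   # step 5 with -whole
        result = convert_name(body)
    else:                                       # steps 5 and 10
        result = CANDIDATE.sub(convert_match, body)
    return (result + "\n").encode("ascii")      # steps 12 and 13


print(convert_line("www.日本語.jp IN A 192.0.2.1\n".encode("utf-8")))
print(convert_line("日本語。jp\n".encode("utf-8"), delimiters=("\u3002",)))
```

In this sketch only the extracted names are encoded, which gives the same result as converting the whole line when the output encoding is an ACE, because the remaining text is plain ASCII.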

Details of reverse conversion processing

In the reverse conversion of mdnconv (the behavior when the -reverse option is specified), the following processing is performed line by line on the input data.

  1. Reads one line of text from a file or standard input.
  2. Removes the line-end code (linefeed) from the end of the line.
  3. Extracts the parts presumed to be multilingual domain names. When the -whole option is specified, the whole line is extracted as one domain name. Otherwise, when the input encoding is an ASCII-compatible encoding (ACE) such as RACE, mdnconv searches the line for every substring that can be interpreted as a valid ASCII domain name (a string consisting of alphanumeric characters, hyphens and periods, whose first character is alphanumeric and whose last character is not a hyphen). It then tries to convert each substring found from the input ACE encoding to UTF-8, and extracts the substring as a multilingual domain name when the conversion succeeds. This processing allows multilingual domain names to be extracted correctly even when ACE-encoded strings appear in the middle of a line.
  4. Converts each extracted multilingual domain name to UTF-8.
  5. For each extracted multilingual domain name, performs normalization by NAMEPREP (mapping, normalization, prohibited character check, unassigned code point check). If a name turns out not to be normalized and the input encoding is an ACE, the name is reverted to the original input string; if the input encoding is not an ACE, processing proceeds to the next step as is. However, when -nonameprep is specified, none of these normalization steps is performed. When -nounassigncheck is specified, only the unassigned code point check is skipped.
  6. Puts the processed multilingual domain names (in UTF-8) back into their original positions in the line.
  7. At this point, checks whether the conversion result is correctly encoded as UTF-8. In principle this check is unnecessary, but it is performed here in case there is a bug in the normalization.
  8. Converts the line from UTF-8 to the output encoding specified by -out.
  9. Adds the line-end code (linefeed) to the end of the line.
  10. Writes the contents of the line to standard output as the conversion result.
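
The reverse steps above can be sketched in the same way. Again this is only an illustration under assumptions, not the mdnconv implementation: the input ACE is assumed to be Punycode with the "xn--" prefix, and the NAMEPREP check of step 5 is replaced by a stand-in (NFKC plus lowercasing). The pattern ASCII_NAME follows the description of a valid ASCII domain name in step 3; the function names are hypothetical.

```python
import re
import unicodedata

# Step 3: alphanumerics, hyphens and periods; the first character is
# alphanumeric and the last character is not a hyphen.
ASCII_NAME = re.compile(r"[0-9A-Za-z](?:[0-9A-Za-z.-]*[0-9A-Za-z.])?")
ACE_PREFIX = "xn--"  # assumption: Punycode input with this prefix


def from_ace(name):
    # Decode every label that carries the ACE prefix; a label that cannot
    # be decoded is left unchanged.
    labels = []
    for label in name.split("."):
        if label.lower().startswith(ACE_PREFIX):
            try:
                payload = label[len(ACE_PREFIX):].encode("ascii")
                label = payload.decode("punycode")
            except ValueError:  # UnicodeError is a subclass of ValueError
                pass
        labels.append(label)
    return ".".join(labels)


def is_normalized(name):
    # Stand-in for the NAMEPREP check of step 5.
    return unicodedata.normalize("NFKC", name).lower() == name


def reverse_line(raw, out_code="utf-8"):
    # Steps 1-2: take one line and drop the line-end code.
    body = raw.rstrip(b"\r\n").decode("ascii")

    def restore(match):
        original = match.group()
        decoded = from_ace(original)     # steps 3-4
        # Step 5: a name that is not normalized reverts to the input string.
        return decoded if is_normalized(decoded) else original

    line = ASCII_NAME.sub(restore, body)   # step 6
    return (line + "\n").encode(out_code)  # steps 7-10


ace = ACE_PREFIX + "日本語".encode("punycode").decode("ascii")
record = ("www." + ace + ".jp IN A 192.0.2.1\n").encode("ascii")
print(reverse_line(record).decode("utf-8"))
```

Encoding the finished line to the output encoding also serves as the validity check of step 7, since a malformed string would fail to encode.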