Organizations that manage some Top Level Domains (TLDs) have published tables of the characters they accept within the domain. The reasons may be to reduce the complexity that comes from using the full Unicode range, and to protect themselves from future (backwards incompatible) changes in the IDN or Unicode specifications. Libidn implements an infrastructure for defining such tables and checking strings against them. Libidn also ships tables for those TLDs that have given us permission to use them. Because these tables are even less static than the Unicode or StringPrep tables, they are likely to be updated from time to time (even in backwards incompatible ways). The Libidn interface provides a “version” field for each TLD table, which can be compared for equality to guarantee the same operation over time.
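The equality comparison on the “version” field can be illustrated with a minimal sketch. The structure and function names here (version_tagged_table, same_table_version) are hypothetical stand-ins invented for illustration, not Libidn declarations; the sketch only assumes that a table carries a name and a version label as strings.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a TLD table: only the two fields the
   sketch needs.  This is not the Libidn Tld_table declaration. */
struct version_tagged_table
{
  const char *name;     /* TLD name, e.g. "no" */
  const char *version;  /* version label published with the table */
};

/* Return 1 if the table carries exactly the version label that the
   caller recorded earlier, 0 otherwise (also when data is missing).
   Only equality is tested; no ordering of versions is assumed. */
int
same_table_version (const struct version_tagged_table *t,
                    const char *expected)
{
  if (t == NULL || t->version == NULL || expected == NULL)
    return 0;
  return strcmp (t->version, expected) == 0;
}
```

An application that stores the label next to data it has already validated can use such a comparison to decide whether that data needs to be checked again after a table update.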
From a design point of view, you can regard the TLD tables for IDN as the “localization” step that comes after the “internationalization” step provided by the IETF standards.
The TLD functionality relies on up-to-date tables. The latest version of Libidn aims to provide these, but tables with unclear copying conditions, or generally experimental tables, are not included. Some such tables can be found at https://github.com/gnuthor/tldchk.
tld.h
To use the functions explained in this chapter, you need to include the file tld.h using:
#include <tld.h>
Function: int tld_check_4t (const uint32_t * in, size_t inlen, size_t * errpos, const Tld_table * tld)
in: Array of Unicode code points to process. Does not need to be zero terminated.
inlen: Number of Unicode code points.
errpos: Position of offending character is returned here.
tld: A Tld_table data structure representing the restrictions for which the input should be tested.
Test each of the code points in in for whether or not they are allowed by the data structure in tld, and return the position of the first character for which this is not the case in errpos.
Return value: Returns the Tld_rc value TLD_SUCCESS if all code points are valid or when tld is null, TLD_INVALID if a character is not allowed, or additional error codes on general failure conditions.
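The check described above can be sketched as a lookup of each code point in a sorted list of allowed ranges. This is a simplified illustration, not Libidn's implementation: the types (sketch_range, sketch_ranges), the constants and the function sketch_check are hypothetical, the ranges are assumed to be sorted and non-overlapping, and the sketch does not try to reproduce any characters that the library might accept regardless of the table.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not Libidn's types. */
struct sketch_range
{
  uint32_t start;  /* first allowed code point of the range */
  uint32_t end;    /* last allowed code point (inclusive) */
};

struct sketch_ranges
{
  size_t nvalid;                     /* number of ranges */
  const struct sketch_range *valid;  /* sorted, non-overlapping */
};

enum { SKETCH_SUCCESS = 0, SKETCH_INVALID = 1 };

/* Binary search: return 1 if cp lies inside one of the ranges. */
static int
sketch_allowed (uint32_t cp, const struct sketch_ranges *t)
{
  size_t lo = 0, hi = t->nvalid;

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;

      if (cp < t->valid[mid].start)
        hi = mid;
      else if (cp > t->valid[mid].end)
        lo = mid + 1;
      else
        return 1;
    }
  return 0;
}

/* Check inlen code points; on failure, store the index of the first
   offending code point in *errpos (if errpos is not NULL). */
int
sketch_check (const uint32_t *in, size_t inlen, size_t *errpos,
              const struct sketch_ranges *t)
{
  size_t i;

  if (t == NULL)  /* no restrictions known: everything is accepted */
    return SKETCH_SUCCESS;

  for (i = 0; i < inlen; i++)
    if (!sketch_allowed (in[i], t))
      {
        if (errpos != NULL)
          *errpos = i;
        return SKETCH_INVALID;
      }
  return SKETCH_SUCCESS;
}
```

Note that errpos is left untouched when the check succeeds, so a caller should read it only after a failure.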
Function: int tld_check_4tz (const uint32_t * in, size_t * errpos, const Tld_table * tld)
in: Zero terminated array of Unicode code points to process.
errpos: Position of offending character is returned here.
tld: A Tld_table data structure representing the restrictions for which the input should be tested.
Test each of the code points in in for whether or not they are allowed by the data structure in tld, and return the position of the first character for which this is not the case in errpos.
Return value: Returns the Tld_rc value TLD_SUCCESS if all code points are valid or when tld is null, TLD_INVALID if a character is not allowed, or additional error codes on general failure conditions.
Function: int tld_get_4 (const uint32_t * in, size_t inlen, char ** out)
in: Array of Unicode code points to process. Does not need to be zero terminated.
inlen: Number of Unicode code points.
out: Zero terminated ASCII result string pointer.
Isolate the top-level domain of in and return it as an ASCII string in out.
Return value: Return TLD_SUCCESS on success, or the corresponding Tld_rc error code otherwise.
Function: int tld_get_4z (const uint32_t * in, char ** out)
in: Zero terminated array of Unicode code points to process.
out: Zero terminated ASCII result string pointer.
Isolate the top-level domain of in and return it as an ASCII string in out.
Return value: Return TLD_SUCCESS on success, or the corresponding Tld_rc error code otherwise.
Function: int tld_get_z (const char * in, char ** out)
in: Zero terminated character array to process.
out: Zero terminated ASCII result string pointer.
Isolate the top-level domain of in and return it as an ASCII string in out. The input string in may be UTF-8, ISO-8859-1 or any ASCII compatible character encoding.
Return value: Return TLD_SUCCESS on success, or the corresponding Tld_rc error code otherwise.
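To make "isolate the top-level domain" concrete, here is a minimal sketch under stated assumptions: the TLD is taken to be the trailing run of ASCII letters that directly follows a dot, the result is lowercased, and the output string is allocated with malloc so the caller releases it with free. The function name sketch_get_tld and the SKETCH_ constants are hypothetical; the exact rules the library applies may differ.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical return codes for this sketch only. */
enum
{
  SKETCH_SUCCESS = 0,
  SKETCH_NODATA,
  SKETCH_MALLOC_ERROR,
  SKETCH_NO_TLD
};

/* The label separators recognized by IDNA (RFC 3490). */
static int
sketch_is_dot (uint32_t c)
{
  return c == 0x002E || c == 0x3002 || c == 0xFF0E || c == 0xFF61;
}

static int
sketch_is_ascii_letter (uint32_t c)
{
  return (c >= 0x41 && c <= 0x5A) || (c >= 0x61 && c <= 0x7A);
}

/* Store a newly allocated, lowercased ASCII copy of the TLD in *out.
   The caller owns the string and must free it. */
int
sketch_get_tld (const uint32_t *in, size_t inlen, char **out)
{
  size_t start, len, i;
  char *s;

  if (in == NULL || inlen == 0 || out == NULL)
    return SKETCH_NODATA;
  *out = NULL;

  /* Walk backwards over the trailing ASCII letters. */
  start = inlen;
  while (start > 0 && sketch_is_ascii_letter (in[start - 1]))
    start--;
  len = inlen - start;

  /* Require at least one letter, preceded by a dot. */
  if (len == 0 || start == 0 || !sketch_is_dot (in[start - 1]))
    return SKETCH_NO_TLD;

  s = malloc (len + 1);
  if (s == NULL)
    return SKETCH_MALLOC_ERROR;
  for (i = 0; i < len; i++)
    {
      uint32_t c = in[start + i];
      s[i] = (char) (c >= 0x41 && c <= 0x5A ? c + 0x20 : c);
    }
  s[len] = '\0';
  *out = s;
  return SKETCH_SUCCESS;
}
```

With the input "example.Org" this sketch yields "org"; an input without any dot, such as "local", yields the no-TLD code and leaves *out set to NULL.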
Function: const Tld_table * tld_get_table (const char * tld, const Tld_table ** tables)
tld: TLD name (e.g. "com") as zero terminated ASCII byte string.
tables: Zero terminated array of Tld_table info-structures for TLDs.
Get the TLD table for a named TLD by searching through the given TLD table array.
Return value: Return the structure corresponding to TLD tld by going through tables, or return NULL if no such structure is found.
Function: const Tld_table * tld_default_table (const char * tld, const Tld_table ** overrides)
tld: TLD name (e.g. "com") as zero terminated ASCII byte string.
overrides: Additional zero terminated array of Tld_table info-structures for TLDs, or NULL to only use library default tables.
Get the TLD table for a named TLD, using the internal defaults, possibly overridden by the (optional) supplied tables.
Return value: Return the structure corresponding to TLD tld, first looking through overrides and then through the built-in list, or NULL if no such structure is found.
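The lookup order described for the two table functions (caller-supplied tables first, then the built-in list, first match wins) can be sketched as follows. Everything here is a hypothetical stand-in: the structure sketch_named_table, the two functions, and in particular the explicit builtin parameter, which only exists so that the sketch is self-contained; the library keeps its built-in list internally.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a table entry; not Libidn's Tld_table. */
struct sketch_named_table
{
  const char *name;     /* TLD name, e.g. "no" */
  const char *version;  /* version label of the table */
};

/* Search a NULL-terminated array of table pointers.  The first entry
   whose name matches is returned, so earlier entries shadow later
   ones with the same name. */
const struct sketch_named_table *
sketch_get_table (const char *tld,
                  const struct sketch_named_table **tables)
{
  size_t i;

  if (tld == NULL || tables == NULL)
    return NULL;
  for (i = 0; tables[i] != NULL; i++)
    if (strcmp (tables[i]->name, tld) == 0)
      return tables[i];
  return NULL;
}

/* Look in the overrides first, then fall back to the built-in list.
   Passing NULL for overrides means "built-in tables only". */
const struct sketch_named_table *
sketch_default_table (const char *tld,
                      const struct sketch_named_table **overrides,
                      const struct sketch_named_table **builtin)
{
  const struct sketch_named_table *t = sketch_get_table (tld, overrides);

  return t != NULL ? t : sketch_get_table (tld, builtin);
}
```

Because the overrides are consulted first, a caller can replace the rules for a single TLD without affecting how any other TLD is resolved.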
Function: int tld_check_4 (const uint32_t * in, size_t inlen, size_t * errpos, const Tld_table ** overrides)
in: Array of Unicode code points to process. Does not need to be zero terminated.
inlen: Number of Unicode code points.
errpos: Position of offending character is returned here.
overrides: A Tld_table array of additional domain restriction structures that complement and supersede the built-in information.
Test each of the code points in in for whether or not they are allowed by the information in overrides or by the built-in TLD restriction data. When data for the same TLD is available both internally and in overrides, the information in overrides takes precedence. If several entries for a specific TLD are found, the first one is used. If overrides is NULL, only the built-in information is used. The position of the first offending character is returned in errpos.
Return value: Returns the Tld_rc value TLD_SUCCESS if all code points are valid or when no table is found for the TLD, TLD_INVALID if a character is not allowed, or additional error codes on general failure conditions.
Function: int tld_check_4z (const uint32_t * in, size_t * errpos, const Tld_table ** overrides)
in: Zero-terminated array of Unicode code points to process.
errpos: Position of offending character is returned here.
overrides: A Tld_table array of additional domain restriction structures that complement and supersede the built-in information.
Test each of the code points in in for whether or not they are allowed by the information in overrides or by the built-in TLD restriction data. When data for the same TLD is available both internally and in overrides, the information in overrides takes precedence. If several entries for a specific TLD are found, the first one is used. If overrides is NULL, only the built-in information is used. The position of the first offending character is returned in errpos.
Return value: Returns the Tld_rc value TLD_SUCCESS if all code points are valid or when no table is found for the TLD, TLD_INVALID if a character is not allowed, or additional error codes on general failure conditions.
Function: int tld_check_8z (const char * in, size_t * errpos, const Tld_table ** overrides)
in: Zero-terminated UTF-8 string to process.
errpos: Position of offending character is returned here.
overrides: A Tld_table array of additional domain restriction structures that complement and supersede the built-in information.
Test each of the characters in in for whether or not they are allowed by the information in overrides or by the built-in TLD restriction data. When data for the same TLD is available both internally and in overrides, the information in overrides takes precedence. If several entries for a specific TLD are found, the first one is used. If overrides is NULL, only the built-in information is used. The position of the first offending character is returned in errpos. Note that the error position refers to the decoded character offset rather than the byte position in the string.
Return value: Returns the Tld_rc value TLD_SUCCESS if all characters are valid or when no table is found for the TLD, TLD_INVALID if a character is not allowed, or additional error codes on general failure conditions.
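Because the error position counts decoded characters rather than bytes, a caller that wants to point at the offending character inside the original UTF-8 buffer has to translate the offset. A minimal sketch of that translation, assuming the input is valid UTF-8; the helper name sketch_utf8_byte_offset is hypothetical and not part of Libidn:

```c
#include <stddef.h>

/* Return the byte offset at which character number charpos (counted
   from 0) starts in the zero-terminated UTF-8 string s.  If the
   string has fewer characters, the offset of the terminating zero
   byte is returned.  Assumes s is valid UTF-8. */
size_t
sketch_utf8_byte_offset (const char *s, size_t charpos)
{
  size_t i = 0;

  while (s[i] != '\0' && charpos > 0)
    {
      /* Step over the lead byte of the current character ... */
      i++;
      /* ... and over its continuation bytes (bit pattern 10xxxxxx). */
      while (((unsigned char) s[i] & 0xC0) == 0x80)
        i++;
      charpos--;
    }
  return i;
}
```

For the UTF-8 string "blåbær.no", where "å" and "æ" each occupy two bytes, character offset 4 (the "æ") corresponds to byte offset 5.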
Function: int tld_check_lz (const char * in, size_t * errpos, const Tld_table ** overrides)
in: Zero-terminated string in the current locale's encoding to process.
errpos: Position of offending character is returned here.
overrides: A Tld_table array of additional domain restriction structures that complement and supersede the built-in information.
Test each of the characters in in for whether or not they are allowed by the information in overrides or by the built-in TLD restriction data. When data for the same TLD is available both internally and in overrides, the information in overrides takes precedence. If several entries for a specific TLD are found, the first one is used. If overrides is NULL, only the built-in information is used. The position of the first offending character is returned in errpos. Note that the error position refers to the decoded character offset rather than the byte position in the string.
Return value: Returns the Tld_rc value TLD_SUCCESS if all characters are valid or when no table is found for the TLD, TLD_INVALID if a character is not allowed, or additional error codes on general failure conditions.
Function: const char * tld_strerror (Tld_rc rc)
rc: TLD return code.
Convert a return code integer to a text string. This string can be used to output a diagnostic message to the user.
TLD_SUCCESS: Successful operation. This value is guaranteed to always be zero, the remaining ones are only guaranteed to hold non-zero values, for logical comparison purposes.
TLD_INVALID: Invalid character found.
TLD_NODATA: No input data was provided.
TLD_MALLOC_ERROR: Error during memory allocation.
TLD_ICONV_ERROR: Character encoding conversion error.
TLD_NO_TLD: No top-level domain found in domain string.
Return value: Returns a pointer to a statically allocated string containing a description of the error with the return code rc.
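The contract described above (success is zero, every other code is merely non-zero, and the description is a static string that must not be freed) can be sketched with a self-contained stand-in. The enum sketch_rc and the function sketch_strerror are hypothetical, and the message texts are taken from the descriptions in this section rather than from the library's actual output.

```c
#include <stddef.h>

/* Hypothetical return codes mirroring the list above.  Only the zero
   value of the success code is relied upon; the other numeric values
   are arbitrary in this sketch. */
enum sketch_rc
{
  SKETCH_SUCCESS = 0,
  SKETCH_INVALID,
  SKETCH_NODATA,
  SKETCH_MALLOC_ERROR,
  SKETCH_ICONV_ERROR,
  SKETCH_NO_TLD
};

/* Return a pointer to a string literal describing rc.  The string has
   static storage duration, so the caller must not free or modify it. */
const char *
sketch_strerror (enum sketch_rc rc)
{
  switch (rc)
    {
    case SKETCH_SUCCESS:
      return "Successful operation";
    case SKETCH_INVALID:
      return "Invalid character found";
    case SKETCH_NODATA:
      return "No input data was provided";
    case SKETCH_MALLOC_ERROR:
      return "Error during memory allocation";
    case SKETCH_ICONV_ERROR:
      return "Character encoding conversion error";
    case SKETCH_NO_TLD:
      return "No top-level domain found in domain string";
    default:
      return "Unknown error";
    }
}
```

Callers are better off testing a result against the named success constant than against any specific non-zero number, since only the zero value is guaranteed.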