java.net

public final class IDN

extends Object
Imports
java.io.InputStream, java.io.IOException, java.security.AccessController, java.security.PrivilegedAction, jdk.internal.icu.impl.Punycode, jdk.internal.icu.text.StringPrep, jdk.internal.icu.text.UCharacterIterator

Provides methods to convert internationalized domain names (IDNs) between a normal Unicode representation and an ASCII Compatible Encoding (ACE) representation. Internationalized domain names can use characters from the entire range of Unicode, while traditional domain names are restricted to ASCII characters. ACE is an encoding of Unicode strings that uses only ASCII characters and can be used with software (such as the Domain Name System) that only understands traditional domain names.

Internationalized domain names are defined in RFC 3490, which specifies two operations: ToASCII and ToUnicode. Both operations use the Nameprep algorithm, which is a profile of Stringprep, and the Punycode algorithm to convert a domain name string back and forth.
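As a minimal sketch of the two operations, the following round-trips a name through the public toASCII and toUnicode methods. The class name IdnRoundTrip, the helper names encode and decode, and the sample domain are illustrative assumptions, not part of this API.

```java
import java.net.IDN;

public class IdnRoundTrip {
    // Converts a Unicode domain name to its ACE form using the default flags.
    static String encode(String unicodeName) {
        return IDN.toASCII(unicodeName);
    }

    // Converts an ACE domain name back to Unicode using the default flags.
    static String decode(String aceName) {
        return IDN.toUnicode(aceName);
    }

    public static void main(String[] args) {
        // Sample name whose first label contains U+00FC (u with diaeresis).
        String unicodeName = "b\u00FCcher.example";
        String ace = encode(unicodeName);
        System.out.println(ace);                             // xn--bcher-kva.example
        System.out.println(decode(ace).equals(unicodeName)); // true
    }
}
```

Only the non-ASCII label receives the ACE prefix; the all-ASCII label "example" passes through unchanged.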

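The behavior of the conversion process can be adjusted by the following flags:

If the ALLOW_UNASSIGNED flag is used, the domain name string to be converted can contain code points that are unassigned in Unicode 3.2, which is the Unicode version on which IDN conversion is based. If the flag is not used, the presence of such unassigned code points is treated as an error.

If the USE_STD3_ASCII_RULES flag is used, ASCII strings are checked against RFC 1122 and RFC 1123. It is an error if they do not meet the requirements.

These flags can be logically OR'ed together.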
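The sketch below shows one way the flags might be combined and what effect USE_STD3_ASCII_RULES has on a name containing an underscore. The class IdnFlagsDemo, the helper accepts, and the sample host name are hypothetical names chosen for illustration.

```java
import java.net.IDN;

public class IdnFlagsDemo {
    // Returns true if ToASCII succeeds for the name under the given flags.
    static boolean accepts(String name, int flags) {
        try {
            IDN.toASCII(name, flags);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The underscore is not a letter, digit, or hyphen.
        String name = "my_host.example";
        // Flags are plain int bit masks, so they combine with bitwise OR.
        int both = IDN.ALLOW_UNASSIGNED | IDN.USE_STD3_ASCII_RULES;

        System.out.println(accepts(name, 0));                        // true
        System.out.println(accepts(name, IDN.USE_STD3_ASCII_RULES)); // false
        System.out.println(accepts(name, both));                     // false
    }
}
```

With no flags the ASCII label is accepted as is; the STD-3 check rejects it whether or not ALLOW_UNASSIGNED is also set.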

Security considerations are important with respect to internationalized domain name support. For example, English domain names may be homographed, that is, maliciously misspelled by substituting visually similar non-Latin letters. Unicode Technical Report #36 discusses security issues of IDN support as well as possible solutions. Applications are responsible for taking adequate security measures when using internationalized domain names.
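One possible application-side measure is to flag labels that mix scripts. The sketch below is a naive heuristic, not something this class provides and not a complete implementation of the recommendations in Unicode Technical Report #36. The class MixedScriptCheck and its methods are hypothetical; it relies only on the standard Character.UnicodeScript API.

```java
import java.lang.Character.UnicodeScript;
import java.util.EnumSet;
import java.util.Set;

public class MixedScriptCheck {
    // Collects the scripts used in a label, skipping script-neutral code points
    // such as digits and hyphens (COMMON) and combining marks (INHERITED).
    static Set<UnicodeScript> scriptsOf(String label) {
        Set<UnicodeScript> scripts = EnumSet.noneOf(UnicodeScript.class);
        label.codePoints().forEach(cp -> {
            UnicodeScript script = UnicodeScript.of(cp);
            if (script != UnicodeScript.COMMON && script != UnicodeScript.INHERITED) {
                scripts.add(script);
            }
        });
        return scripts;
    }

    // Naive rule: a label drawing on more than one script is treated as suspicious.
    static boolean isMixedScript(String label) {
        return scriptsOf(label).size() > 1;
    }

    public static void main(String[] args) {
        // The second letter is CYRILLIC SMALL LETTER A, which resembles Latin "a".
        String spoof = "p\u0430ypal";
        System.out.println(isMixedScript("paypal")); // false
        System.out.println(isMixedScript(spoof));    // true
    }
}
```

A rule this simple also flags legitimate names, for example Japanese labels that combine Han, Hiragana, and Katakana, and it misses spoofs written entirely in one non-Latin script, so it is at best a starting point.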

Author
Edward Wang
Since
1.6
External Specification
https://www.rfc-editor.org/info/rfc1122, https://www.rfc-editor.org/info/rfc1123, https://www.rfc-editor.org/info/rfc3454, https://www.rfc-editor.org/info/rfc3490, https://www.rfc-editor.org/info/rfc3491, https://www.rfc-editor.org/info/rfc3492, https://www.unicode.org/reports/tr36

Field Summary

Modifier and Type, Field and Description
private static final String ACE_PREFIX
private static final int ACE_PREFIX_LENGTH
public static final int ALLOW_UNASSIGNED

Flag to allow processing of unassigned code points

private static final int MAX_LABEL_LENGTH
private static final StringPrep namePrep
public static final int USE_STD3_ASCII_RULES

Flag to turn on the check against STD-3 ASCII rules

Constructor Summary

Access, Constructor and Description
private IDN()

Method Summary

Modifier and Type, Method and Description
private static boolean isAllASCII(String input)
private static boolean isLabelSeparator(char c)
private static boolean isNonLDHAsciiCodePoint(int ch)
private static boolean isRootLabel(String s)
private static int searchDots(String s, int start)
private static boolean startsWithACEPrefix(StringBuffer input)
public static String toASCII(String input, int flag)

Translates a string from Unicode to ASCII Compatible Encoding (ACE), as defined by the ToASCII operation of RFC 3490.

public static String toASCII(String input)

Translates a string from Unicode to ASCII Compatible Encoding (ACE), as defined by the ToASCII operation of RFC 3490.

private static String toASCIIInternal(String label, int flag)
private static char toASCIILower(char ch)
private static StringBuffer toASCIILower(StringBuffer input)
public static String toUnicode(String input, int flag)

Translates a string from ASCII Compatible Encoding (ACE) to Unicode, as defined by the ToUnicode operation of RFC 3490.

public static String toUnicode(String input)

Translates a string from ASCII Compatible Encoding (ACE) to Unicode, as defined by the ToUnicode operation of RFC 3490.

private static String toUnicodeInternal(String label, int flag)

Inherited from java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Detail

ACE_PREFIX
private static final String ACE_PREFIX
ACE_PREFIX_LENGTH
private static final int ACE_PREFIX_LENGTH
ALLOW_UNASSIGNED
public static final int ALLOW_UNASSIGNED

Flag to allow processing of unassigned code points

MAX_LABEL_LENGTH
private static final int MAX_LABEL_LENGTH
namePrep
private static final StringPrep namePrep
USE_STD3_ASCII_RULES
public static final int USE_STD3_ASCII_RULES

Flag to turn on the check against STD-3 ASCII rules

Constructor Detail

IDN
private IDN()

Method Detail

isAllASCII
private static boolean isAllASCII(String input)
isLabelSeparator
private static boolean isLabelSeparator(char c)
isNonLDHAsciiCodePoint
private static boolean isNonLDHAsciiCodePoint(int ch)
isRootLabel
private static boolean isRootLabel(String s)
searchDots
private static int searchDots(String s, int start)
startsWithACEPrefix
private static boolean startsWithACEPrefix(StringBuffer input)
toASCII
public static String toASCII(String input, int flag)

Translates a string from Unicode to ASCII Compatible Encoding (ACE), as defined by the ToASCII operation of RFC 3490.

The ToASCII operation can fail: it fails if any of its steps fails. If the operation fails, an IllegalArgumentException is thrown. In this case, the input string should not be used in an internationalized domain name.
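A caller might wrap the conversion so that a failure is reported without propagating the exception. The following is a sketch under that assumption; the class IdnFailureHandling and the helpers tryToAscii and repeat are hypothetical. It triggers the failure with a label longer than the 63 characters allowed for a single label.

```java
import java.net.IDN;
import java.util.Optional;

public class IdnFailureHandling {
    // Returns the ACE form, or an empty Optional if ToASCII rejects the name.
    static Optional<String> tryToAscii(String name) {
        try {
            return Optional.of(IDN.toASCII(name));
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
    }

    // Builds a string consisting of count copies of c.
    static String repeat(char c, int count) {
        StringBuilder sb = new StringBuilder(count);
        for (int i = 0; i < count; i++) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String fits = repeat('a', 63) + ".example";    // label at the length limit
        String tooLong = repeat('a', 64) + ".example"; // label one character over

        System.out.println(tryToAscii(fits).isPresent());    // true
        System.out.println(tryToAscii(tooLong).isPresent()); // false
    }
}
```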

A label is an individual part of a domain name. The original ToASCII operation, as defined in RFC 3490, operates only on a single label. This method can handle both a single label and an entire domain name, by assuming that labels in a domain name are always separated by dots. The following characters are recognized as dots: \u002E (full stop), \u3002 (ideographic full stop), \uFF0E (fullwidth full stop), and \uFF61 (halfwidth ideographic full stop). If dots are used as label separators, this method also changes all of them to \u002E (full stop) in the translated output string.
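The separator handling described above can be illustrated with a short sketch. The class IdnDotNormalization and the helper normalize are hypothetical names, and the sample host names are made up for the example.

```java
import java.net.IDN;

public class IdnDotNormalization {
    // Runs ToASCII with default flags; every recognized separator becomes a full stop.
    static String normalize(String name) {
        return IDN.toASCII(name);
    }

    public static void main(String[] args) {
        // Labels separated by an ideographic full stop and a fullwidth full stop.
        String mixedDots = "www\u3002example\uFF0Ecom";
        System.out.println(normalize(mixedDots)); // www.example.com

        // Labels separated by a halfwidth ideographic full stop.
        System.out.println(normalize("a\uFF61b")); // a.b
    }
}
```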

Parameters
input:String

the string to be processed

flag:int

process flag; can be 0 or any logical OR of possible flags

Returns:String

the translated String

Exceptions
IllegalArgumentException:
if the input string doesn't conform to RFC 3490 specification
External Specification
https://www.rfc-editor.org/info/rfc3490
toASCII
public static String toASCII(String input)

Translates a string from Unicode to ASCII Compatible Encoding (ACE), as defined by the ToASCII operation of RFC 3490.

This convenience method works as if by invoking the two-argument counterpart as follows:

toASCII(input, 0);
Parameters
input:String

the string to be processed

Returns:String

the translated String

Exceptions
IllegalArgumentException:
if the input string doesn't conform to RFC 3490 specification
External Specification
https://www.rfc-editor.org/info/rfc3490
toASCIIInternal
private static String toASCIIInternal(String label, int flag)
toASCIILower
private static char toASCIILower(char ch)
toASCIILower
private static StringBuffer toASCIILower(StringBuffer input)
toUnicode
public static String toUnicode(String input, int flag)

Translates a string from ASCII Compatible Encoding (ACE) to Unicode, as defined by the ToUnicode operation of RFC 3490.

ToUnicode never fails. In case of any error, the input string is returned unmodified.
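Because no exception is thrown, a caller that needs to know whether decoding took place has to compare the result with the input. The sketch below assumes that approach; the class IdnToUnicodeFallback, the helper display, and the malformed sample label are hypothetical.

```java
import java.net.IDN;

public class IdnToUnicodeFallback {
    // Runs ToUnicode with default flags; no exception handling is needed.
    static String display(String name) {
        return IDN.toUnicode(name);
    }

    public static void main(String[] args) {
        // "!!!" is not valid Punycode, so the name cannot be decoded.
        String malformed = "xn--!!!.example";
        System.out.println(display(malformed).equals(malformed)); // true

        // A well-formed ACE label decodes to its Unicode form.
        String wellFormed = "xn--bcher-kva.example";
        System.out.println(display(wellFormed).equals(wellFormed)); // false
    }
}
```

An unchanged result does not by itself indicate an error, since a plain ASCII name without the ACE prefix is also returned as is.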

A label is an individual part of a domain name. The original ToUnicode operation, as defined in RFC 3490, operates only on a single label. This method can handle both a single label and an entire domain name, by assuming that labels in a domain name are always separated by dots. The following characters are recognized as dots: \u002E (full stop), \u3002 (ideographic full stop), \uFF0E (fullwidth full stop), and \uFF61 (halfwidth ideographic full stop).

Parameters
input:String

the string to be processed

flag:int

process flag; can be 0 or any logical OR of possible flags

Returns:String

the translated String

External Specification
https://www.rfc-editor.org/info/rfc3490
toUnicode
public static String toUnicode(String input)

Translates a string from ASCII Compatible Encoding (ACE) to Unicode, as defined by the ToUnicode operation of RFC 3490.

This convenience method works as if by invoking the two-argument counterpart as follows:

toUnicode(input, 0);
Parameters
input:String

the string to be processed

Returns:String

the translated String

External Specification
https://www.rfc-editor.org/info/rfc3490
toUnicodeInternal
private static String toUnicodeInternal(String label, int flag)