Internationalized domain names (IDNs) are defined in RFC 3490, which specifies two operations: ToASCII and ToUnicode. Both operations use the Nameprep algorithm, which is a profile of Stringprep, together with the Punycode algorithm to convert domain name strings back and forth between Unicode and ASCII Compatible Encoding (ACE).
The behavior of this conversion process can be adjusted with two flags, ALLOW_UNASSIGNED and USE_STD3_ASCII_RULES, both listed in the field summary.
Security is an important consideration for internationalized domain name support. For example, English domain names may be homographed, that is, maliciously misspelled by substituting non-Latin letters that look like Latin ones. Unicode Technical Report #36 discusses the security issues of IDN support as well as possible solutions. Applications are responsible for taking adequate security measures when using internationalized domain names.
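To make the homograph risk concrete, the following sketch flags labels whose letters come from more than one Unicode script. It is only a crude heuristic under stated assumptions, not the set of checks that Unicode Technical Report #36 recommends: the class `MixedScriptCheck` and its methods are hypothetical names introduced here, and the code assumes Java 8 or later.

```java
import java.lang.Character.UnicodeScript;
import java.util.EnumSet;
import java.util.Set;

// Hypothetical helper, not part of java.net.IDN.
final class MixedScriptCheck {

    /** Collects the Unicode scripts of all letters in the label. */
    static Set<UnicodeScript> scriptsOf(String label) {
        Set<UnicodeScript> scripts = EnumSet.noneOf(UnicodeScript.class);
        label.codePoints()
             .filter(Character::isLetter)
             .forEach(cp -> scripts.add(UnicodeScript.of(cp)));
        return scripts;
    }

    /** Returns true if the letters of the label span more than one script. */
    static boolean looksMixedScript(String label) {
        return scriptsOf(label).size() > 1;
    }

    public static void main(String[] args) {
        String latin = "example";
        // Same appearance, but the first letter is U+0435 CYRILLIC SMALL LETTER IE
        String spoof = "\u0435xample";

        System.out.println(looksMixedScript(latin)); // false
        System.out.println(looksMixedScript(spoof)); // true
    }
}
```

Some legitimate names mix scripts (Japanese names commonly combine Han, Hiragana, and Katakana), so a check like this can only serve as one signal among several.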
Modifier and Type | Field and Description |
---|---|
private static final String | ACE_PREFIX |
private static final int | ACE_PREFIX_LENGTH |
public static final int | ALLOW_UNASSIGNED: Flag to allow processing of unassigned code points |
private static final int | MAX_LABEL_LENGTH |
private static final StringPrep | namePrep |
public static final int | USE_STD3_ASCII_RULES: Flag to turn on the check against STD-3 ASCII rules |
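The two public flags are `int` bit flags, so they can be combined with a bitwise OR and passed to the two-argument methods. The sketch below shows the effect of USE_STD3_ASCII_RULES on a name containing an underscore, which is not a letter, digit, or hyphen. The class `IdnFlagsDemo` and its `accepts` helper are hypothetical names used for illustration.

```java
import java.net.IDN;

// Hypothetical demo class, not part of the JDK.
final class IdnFlagsDemo {

    /** Returns true if IDN.toASCII accepts the input under the given flags. */
    static boolean accepts(String input, int flags) {
        try {
            IDN.toASCII(input, flags);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "my_host.example";

        // Without flags, an all-ASCII label is only checked for length
        System.out.println(accepts(name, 0)); // true

        // With the STD-3 check, the underscore is rejected
        System.out.println(accepts(name, IDN.USE_STD3_ASCII_RULES)); // false

        // Flags can be combined with bitwise OR
        int both = IDN.ALLOW_UNASSIGNED | IDN.USE_STD3_ASCII_RULES;
        System.out.println(accepts("host-1.example", both)); // true
    }
}
```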
Access | Constructor and Description |
---|---|
private | IDN() |
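Modifier and Type | Method and Description |
---|---|
private static boolean | isAllASCII(String input) |
private static boolean | isLabelSeparator(char c) |
private static boolean | isNonLDHAsciiCodePoint(int ch) |
private static boolean | isRootLabel(String s) |
private static int | searchDots(String s, int start) |
private static boolean | startsWithACEPrefix(StringBuffer input) |
public static String | toASCII(String input, int flag): Translates a string from Unicode to ASCII Compatible Encoding (ACE), as defined by the ToASCII operation of RFC 3490. |
public static String | toASCII(String input): Translates a string from Unicode to ASCII Compatible Encoding (ACE), as defined by the ToASCII operation of RFC 3490. |
private static String | toASCIIInternal(String label, int flag) |
private static char | toASCIILower(char ch) |
private static StringBuffer | toASCIILower(StringBuffer input) |
public static String | toUnicode(String input, int flag): Translates a string from ASCII Compatible Encoding (ACE) to Unicode, as defined by the ToUnicode operation of RFC 3490. |
public static String | toUnicode(String input): Translates a string from ASCII Compatible Encoding (ACE) to Unicode, as defined by the ToUnicode operation of RFC 3490. |
private static String | toUnicodeInternal(String label, int flag) |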
ACE_PREFIX |
---|
private static final String ACE_PREFIX |

ACE_PREFIX_LENGTH |
---|
private static final int ACE_PREFIX_LENGTH |

ALLOW_UNASSIGNED |
---|
public static final int ALLOW_UNASSIGNED |

Flag to allow processing of unassigned code points.

MAX_LABEL_LENGTH |
---|
private static final int MAX_LABEL_LENGTH |

namePrep |
---|
private static final StringPrep namePrep |

USE_STD3_ASCII_RULES |
---|
public static final int USE_STD3_ASCII_RULES |

Flag to turn on the check against STD-3 ASCII rules.

IDN |
---|
private IDN() |
isAllASCII |
---|
private static boolean isAllASCII(String input) |

isLabelSeparator |
---|
private static boolean isLabelSeparator(char c) |

isNonLDHAsciiCodePoint |
---|
private static boolean isNonLDHAsciiCodePoint(int ch) |

isRootLabel |
---|
private static boolean isRootLabel(String s) |

searchDots |
---|
private static int searchDots(String s, int start) |

startsWithACEPrefix |
---|
private static boolean startsWithACEPrefix(StringBuffer input) |
toASCII |
---|
public static String toASCII(String input, int flag) |

Translates a string from Unicode to ASCII Compatible Encoding (ACE), as defined by the ToASCII operation of RFC 3490.

The ToASCII operation fails if any of its steps fails. On failure, an IllegalArgumentException is thrown, and the input string should not be used in an internationalized domain name.

A label is an individual part of a domain name. The original ToASCII operation, as defined in RFC 3490, operates on a single label only. This method handles both a single label and an entire domain name by assuming that the labels in a domain name are always separated by dots. The following characters are recognized as dots: \u002E (full stop), \u3002 (ideographic full stop), \uFF0E (fullwidth full stop), and \uFF61 (halfwidth ideographic full stop). If dots are used as label separators, this method also changes all of them to \u002E (full stop) in the translated output string.
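As an illustration of the behavior described for the two-argument `toASCII`, the sketch below encodes a name with a non-ASCII label, shows that an ideographic full stop is normalized to a plain full stop, and triggers a failure with an over-long label. The class `ToAsciiDemo` and its helpers are hypothetical names. The 63-character label limit is the limit from RFC 3490 and is assumed here to be what the private MAX_LABEL_LENGTH field holds.

```java
import java.net.IDN;

// Hypothetical demo class, not part of the JDK.
final class ToAsciiDemo {

    /** Encodes a name with no optional behavior enabled (flag 0). */
    static String encode(String name) {
        return IDN.toASCII(name, 0);
    }

    /** Returns true if ToASCII rejects the input with IllegalArgumentException. */
    static boolean isRejected(String name) {
        try {
            IDN.toASCII(name, 0);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // "bucher" with u-umlaut (U+00FC), escaped to avoid source-encoding issues
        String name = "b\u00FCcher.example";
        System.out.println(encode(name)); // xn--bcher-kva.example

        // U+3002 (ideographic full stop) is accepted as a separator
        // and replaced by U+002E in the output
        String ideographic = "b\u00FCcher\u3002example";
        System.out.println(encode(ideographic)); // xn--bcher-kva.example

        // A 64-character label exceeds the 63-character limit
        StringBuilder longLabel = new StringBuilder();
        for (int i = 0; i < 64; i++) {
            longLabel.append('a');
        }
        System.out.println(isRejected(longLabel + ".example")); // true
    }
}
```

Because failure is reported by an unchecked exception, callers that accept untrusted input will usually want a wrapper like `isRejected` rather than letting the exception propagate.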
toASCII |
---|
public static String toASCII(String input) |

Translates a string from Unicode to ASCII Compatible Encoding (ACE), as defined by the ToASCII operation of RFC 3490.

This convenience method works as if by invoking the two-argument counterpart as `toASCII(input, 0)`.
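The equivalence between the one-argument and two-argument forms can be checked directly. This sketch assumes the elided call is `toASCII(input, 0)`, as the JDK documentation for this method states. `ToAsciiDefaultDemo` is a hypothetical name.

```java
import java.net.IDN;

// Hypothetical demo class, not part of the JDK.
final class ToAsciiDefaultDemo {

    /** Compares the convenience method with the explicit flag-0 call. */
    static boolean sameAsZeroFlag(String input) {
        return IDN.toASCII(input).equals(IDN.toASCII(input, 0));
    }

    public static void main(String[] args) {
        // "munchen" with u-umlaut (U+00FC)
        String name = "m\u00FCnchen.example";
        System.out.println(IDN.toASCII(name));     // xn--mnchen-3ya.example
        System.out.println(sameAsZeroFlag(name));  // true
    }
}
```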
toASCIIInternal |
---|
private static String toASCIIInternal(String label, int flag) |

toASCIILower |
---|
private static char toASCIILower(char ch) |

toASCIILower |
---|
private static StringBuffer toASCIILower(StringBuffer input) |
toUnicode |
---|
public static String toUnicode(String input, int flag) |

Translates a string from ASCII Compatible Encoding (ACE) to Unicode, as defined by the ToUnicode operation of RFC 3490.

ToUnicode never fails. If any error occurs, the input string is returned unmodified.

A label is an individual part of a domain name. The original ToUnicode operation, as defined in RFC 3490, operates on a single label only. This method handles both a single label and an entire domain name by assuming that the labels in a domain name are always separated by dots. The following characters are recognized as dots: \u002E (full stop), \u3002 (ideographic full stop), \uFF0E (fullwidth full stop), and \uFF61 (halfwidth ideographic full stop).
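A short sketch of the two-argument `toUnicode`: an ACE label is decoded back to Unicode, while a name with nothing to decode is returned as given. `ToUnicodeDemo` and `decode` are hypothetical names used for illustration.

```java
import java.net.IDN;

// Hypothetical demo class, not part of the JDK.
final class ToUnicodeDemo {

    /** Decodes a name with no optional behavior enabled (flag 0). */
    static String decode(String name) {
        return IDN.toUnicode(name, 0);
    }

    public static void main(String[] args) {
        // The ACE label decodes to "bucher" with u-umlaut (U+00FC)
        String decoded = decode("xn--bcher-kva.example");
        System.out.println(decoded.equals("b\u00FCcher.example")); // true

        // A plain ASCII name has no ACE labels and comes back unchanged
        System.out.println(decode("example.com")); // example.com
    }
}
```

Since the method reports no errors, a caller that needs to know whether decoding actually happened has to compare the result with the input.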
toUnicode |
---|
public static String toUnicode(String input) |

Translates a string from ASCII Compatible Encoding (ACE) to Unicode, as defined by the ToUnicode operation of RFC 3490.

This convenience method works as if by invoking the two-argument counterpart as `toUnicode(input, 0)`.
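The one-argument forms are convenient for a round trip between the two representations. The sketch below, with the hypothetical class `RoundTripDemo`, also shows that the round trip is not always lossless: Nameprep maps non-ASCII uppercase letters to lowercase during ToASCII, so the original casing is not restored.

```java
import java.net.IDN;

// Hypothetical demo class, not part of the JDK.
final class RoundTripDemo {

    /** Returns true if encoding and then decoding yields the original name. */
    static boolean roundTrips(String unicodeName) {
        String ace = IDN.toASCII(unicodeName);
        return IDN.toUnicode(ace).equals(unicodeName);
    }

    public static void main(String[] args) {
        // Lowercase "bucher" with u-umlaut (U+00FC) survives the round trip
        System.out.println(roundTrips("b\u00FCcher.example")); // true

        // Uppercase U-umlaut (U+00DC) is folded to lowercase by Nameprep,
        // so the decoded name differs from the input
        System.out.println(roundTrips("B\u00DCCHER.example")); // false
    }
}
```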
toUnicodeInternal |
---|
private static String toUnicodeInternal(String label, int flag) |