- 09 Feb, 2022 1 commit
-
-
Jason Rhinelander authored
-
- 23 Jan, 2020 1 commit
-
-
Jason Rhinelander authored
-
- 29 Mar, 2019 1 commit
-
-
Jason Rhinelander authored
-
- 26 Sep, 2018 1 commit
-
-
Jason Rhinelander authored
-
- 23 Aug, 2015 4 commits
-
-
Jason Rhinelander authored
-
Jason Rhinelander authored
The full database (including Unihan) can be loaded by specifying the -H option.
-
Jason Rhinelander authored
-
Jason Rhinelander authored
- No longer depends on Perl's Unicode::UCD for information: instead a new script builds a unicode.data file from the UCD XML file.
- Character aliases and abbreviations are now supported for searching and are displayed
- Searching now accepts multiple search terms (or regexes), all of which must match (any of which if using new --or argument)
- Control character abbreviations now come out of the unicode data instead of being hard coded.
- Added scripts to download and extract the data from the latest unicode specification.
- Various character attributes are now reported
- The new data file (unicode.data.gz) built by extract.pl is designed to work compressed, reducing the storage space required.
- Added the unicode version in which a character was added
- Small cosmetic tweaks
- Rewrote --help output
- Fixed bugs:
  - UTF-16 display was flat out wrong for 4 byte UTF-16 characters
  - --details didn't do anything when searching
  - searching never found control characters
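The UTF-16 bug fixed in this commit concerns code points above U+FFFF, which UTF-16 encodes as a surrogate pair (two 16-bit units, 4 bytes). The script itself is Perl and its code is not shown here, so the following is only a minimal Python sketch of the standard surrogate-pair calculation; the function names `utf16_code_units` and `format_utf16` are hypothetical and not taken from the script.

```python
def utf16_code_units(code_point: int) -> list[int]:
    """Return the UTF-16 code units (one or two) for a Unicode code point."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point out of Unicode range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded")
    if code_point < 0x10000:
        # Basic Multilingual Plane: a single 16-bit unit.
        return [code_point]
    # Supplementary planes: split the 20-bit offset into two 10-bit halves.
    offset = code_point - 0x10000
    high = 0xD800 | (offset >> 10)    # high (lead) surrogate
    low = 0xDC00 | (offset & 0x3FF)   # low (trail) surrogate
    return [high, low]


def format_utf16(code_point: int) -> str:
    """Format the code units as space-separated 4-digit hex values."""
    return " ".join(f"{unit:04X}" for unit in utf16_code_units(code_point))


if __name__ == "__main__":
    print(format_utf16(0x0041))   # 0041
    print(format_utf16(0x1F600))  # D83D DE00
```

The result can be cross-checked against Python's own codec, for example `"\U0001F600".encode("utf-16-be").hex()`.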
-
- 22 Aug, 2015 2 commits
-
-
Jason Rhinelander authored
The previous commit removed binary literals, but didn't properly update the code for literal character mode.
-
Jason Rhinelander authored
"0b1" is a valid hex value, and was interpreted as such, so the binary input value regular expression subgroup never matched.
-
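The 22 Aug, 2015 message about "0b1" describes an ambiguity: every character of "0b1" is a hex digit, so a hex pattern tried first consumes it and a binary alternative never gets a chance. The sketch below reproduces that effect in Python; the patterns `HEX` and `BIN` and the helper `classify` are hypothetical stand-ins, not the script's actual regular expression.

```python
import re

# Hypothetical patterns for illustration only.
HEX = r"(?P<hex>(?:0x|U\+)?[0-9a-fA-F]+)"
BIN = r"(?P<bin>0b[01]+)"


def classify(arg: str, pattern: str):
    """Return the name of the alternative that matched the whole argument."""
    m = re.fullmatch(pattern, arg)
    return m.lastgroup if m else None


hex_first = f"{HEX}|{BIN}"  # hex alternative is tried before binary
bin_first = f"{BIN}|{HEX}"  # binary alternative is tried before hex

if __name__ == "__main__":
    print(classify("0b1", hex_first))  # hex
    print(classify("0b1", bin_first))  # bin
    # The two readings give different numbers:
    print(int("0b1", 16))  # 177
    print(int("0b1", 2))   # 1
```

Reordering the alternatives makes the binary branch reachable, but "0b1" then can no longer be entered as a hex value. The following commit message indicates that binary literals were removed instead.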
- 30 Jul, 2015 1 commit
-
-
Jason Rhinelander authored
-
- 27 Nov, 2011 1 commit
-
-
Jason Rhinelander authored
Perl 5.14 no longer installs UnicodeData.txt, so include a copy (from Unicode 6.0.0) alongside the script and, using FindBin, load it instead of Perl's internal version.
-
- 09 Oct, 2011 3 commits
-
-
Jason Rhinelander authored
Added -d for --details
Added -l and --list as aliases for --nodetails
Made -X (and --HEX) actually work (it was mentioned in --help, but not handled in the code)
Added the long option names for format aliases to --help output
-
Jason Rhinelander authored
-
Jason Rhinelander authored
- Added decomposition, when it exists, including looking up the decomposed character names, and printing out the decomposed version (which should be visually indistinguishable, but can be copied)
- Added unicode version to --help output
- Allowed a single character to be prefixed with _ so that _e can be used to show U+0065 instead of U+000E
- Changed the regexp matching a single character to \X instead of . so that it matches a single grapheme cluster instead of just a single codepoint, thus allowing decomposed character encodings to work properly
- Print out an error if you try to do x-y where one of x or y is a multi-codepoint composition (i.e. a grapheme cluster), since a range makes no sense in that case.
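To illustrate the decomposition feature described in this commit, here is a minimal Python sketch using the standard `unicodedata` module. It is an assumption-laden stand-in: the script is written in Perl, the helper name `describe_decomposition` is hypothetical, and NFD performs the full canonical decomposition, which may differ from what the script actually prints.

```python
import unicodedata


def describe_decomposition(ch: str):
    """Return (code point, name) pairs for the canonical decomposition of ch,
    or None when the character does not decompose."""
    decomposed = unicodedata.normalize("NFD", ch)
    if decomposed == ch:
        return None
    return [
        (f"U+{ord(c):04X}", unicodedata.name(c, "<unnamed>"))
        for c in decomposed
    ]


if __name__ == "__main__":
    # U+00E9 decomposes into a base letter plus a combining accent.
    for code, name in describe_decomposition("\u00e9"):
        print(code, name)
    # U+0065 LATIN SMALL LETTER E
    # U+0301 COMBINING ACUTE ACCENT
    print(describe_decomposition("e"))  # None
```

The decomposed form is two code points that render as one visible character, which is why matching a whole grapheme cluster rather than a single code point matters when such input is given as a literal character.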
-
- 18 Apr, 2009 1 commit
-
-
Jason Rhinelander authored
[[:xdigit:]] doesn't work properly under 'i' in a regex (perlbug #64838); changed to 0-9a-fA-F which does work.
-
- 25 Oct, 2008 1 commit
-
-
Jason Rhinelander authored
-
- 17 Jul, 2008 1 commit
-
-
Jason Rhinelander authored
-
- 10 Jun, 2008 2 commits
-
-
Jason Rhinelander authored
-
Jason Rhinelander authored
-
- 03 Jun, 2008 1 commit
-
-
Jason Rhinelander authored
Added support for spaces (i.e. shell-escaped) in addition to underscores
-
- 29 Jun, 2007 3 commits
-
-
Jason Rhinelander authored
Added the shortened special characters again to make the list mode show up nicely; changed 0-255 default to 0-ff (since arguments are now hex)
-
Jason Rhinelander authored
-
Jason Rhinelander authored
-