Java Swing - JTextField/JTextArea unable to paste supplemental unicode characters

Question

I have searched Stack Overflow and Google exhaustively, but so far I have been unable to find anyone else with a similar problem.

In a sample Java Swing test program, I create a plain JTextField so that I can try to paste characters into it from a web page (http://isthisthingon.org/unicode/). When I test with '㓿' (code point 13567, U+34FF), the character pastes correctly. This character is the last listed character in the CJK Ideograph Extension A block. However, when I move to the next related block, CJK Ideograph Extension B, copying and pasting the character '𠀀' (code point 131072, U+20000) fails. It does not render a box or any sort of glyph; it behaves as if there were nothing in the system clipboard at all.

I realize that CJK Ideograph Extension B is a set of characters that are considered "supplementary" and need two 16-bit code units (a surrogate pair) instead of one when Java encodes them internally as UTF-16. Further testing shows that I am able to display the supplementary characters if I hard-code the text into a display area.
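To make the UTF-16 point concrete, here is a minimal sketch comparing the two characters from the test above. The class name `SurrogatePairDemo` and the helper `fromCodePoint` are made up for illustration; the code only uses standard `String` and `Character` methods.

```java
// Hypothetical demo class; not part of the original test program.
class SurrogatePairDemo {
    // Build a String from a single code point (one or two UTF-16 code units).
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String bmp = fromCodePoint(0x34FF);   // Extension A, fits in one char
        String supp = fromCodePoint(0x20000); // Extension B, needs a surrogate pair

        System.out.println(bmp.length());                           // 1
        System.out.println(supp.length());                          // 2
        System.out.println(supp.codePointCount(0, supp.length()));  // 1
        System.out.println(Integer.toHexString(supp.charAt(0)));    // d840
        System.out.println(Integer.toHexString(supp.charAt(1)));    // dc00
    }
}
```

Any code that treats one `char` as one character will see the second string as two separate, individually meaningless values.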

This was tested using Windows 7 and Java 6.

I understand that support for supplementary Unicode characters was added in Java 5. However, I am wondering why (or if) the cut-and-paste functionality in Swing still does not allow me to paste these characters. Is there something additional I need to do to tell Java to handle these characters when using the JTextField or JTextArea classes? Is there a way yet for Java's Swing libraries to paste these characters into a text field?
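One way to narrow this down is to read the system clipboard directly and print the code points Java receives, bypassing the text components. The following is only a diagnostic sketch, not a fix. The class name `ClipboardProbe` and the helper `describe` are made up for illustration, and `main` assumes it runs on a desktop session with a system clipboard available.

```java
import java.awt.Toolkit;
import java.awt.datatransfer.Clipboard;
import java.awt.datatransfer.DataFlavor;

// Hypothetical diagnostic class; not part of the original test program.
class ClipboardProbe {
    // Render each code point of s as U+XXXX, walking by code point rather than by char.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // Requires a non-headless environment.
        Clipboard clipboard = Toolkit.getDefaultToolkit().getSystemClipboard();
        if (clipboard.isDataFlavorAvailable(DataFlavor.stringFlavor)) {
            String text = (String) clipboard.getData(DataFlavor.stringFlavor);
            System.out.println("Clipboard contains: " + describe(text));
        } else {
            System.out.println("No string data on the clipboard.");
        }
    }
}
```

If the copied character does not show up as U+20000 here, the character is being lost before it ever reaches JTextField or JTextArea.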

Thank you for your time!

No sooner did I post this than I may have found my answer. This has been a long-standing bug in the JDK: bugs.sun.com/bugdatabase/view_bug.do?bug_id=6877495.
– Locriansax, Commented Aug 11, 2011 at 15:53
Unicode has had more characters than fit in a 16-bit integer for most of its lifetime! I can’t believe that Java is still screwed up with this. But yesterday I found yet another UCS-2 bug in the Java String class, one that’s been there forever. This is ridiculous. The whole UTF-16 thing is a horrible curse, and Java will never be free of the countless bugs it causes. They are simply everywhere and it is maddening. People just can’t get things right.
– tchrist, Commented Aug 12, 2011 at 1:18
@tchrist - what was the bug that you found in the String class? If it was submitted as an official bug, could you post the link too? I've been doing a lot of work with i18n stuff here at work, and the more I know about Java's quirks with respect to the supplementary character set, the better!
– Locriansax, Commented Aug 12, 2011 at 15:43
@Locriansax: No, I didn’t bug report it, I mailed it to the i18n-dev OpenJDK list that I’m on. You can find that mail right here. The problem is that the code processes things by partial code points, not full ones, so it gets wrong answers. It snuck by till Unicode 3.1 showed up in March 2001, because that introduced the Deseret script, which is a case-changing script up in the astral planes. It’s been broken >10 years. I hold all char-based Java code so super highly suspect that it’s guilty till proven innocent. Safe assumption.
– tchrist, Commented Aug 12, 2011 at 18:49
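As an illustration of the "partial code points" problem mentioned in the comment above, here is a minimal sketch using a Deseret letter. It is not the String bug that was reported to the mailing list, only an example of the same class of mistake. The class name `CaseByCodePoint` and both helper methods are made up for illustration.

```java
// Hypothetical demo class illustrating char-based vs. code-point-based case mapping.
class CaseByCodePoint {
    // Lowercases one char at a time. Each half of a surrogate pair is
    // processed alone and has no case mapping, so it is left unchanged.
    static String lowerByChar(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            sb.append(Character.toLowerCase(s.charAt(i)));
        }
        return sb.toString();
    }

    // Lowercases one full code point at a time.
    static String lowerByCodePoint(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            sb.appendCodePoint(Character.toLowerCase(cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+10400 DESERET CAPITAL LETTER LONG I; its lowercase form is U+10428.
        String capital = new String(Character.toChars(0x10400));

        System.out.println(Integer.toHexString(lowerByChar(capital).codePointAt(0)));      // 10400
        System.out.println(Integer.toHexString(lowerByCodePoint(capital).codePointAt(0))); // 10428
    }
}
```

The char-based loop silently returns the capital letter unchanged, while the code-point-based loop produces the expected lowercase letter.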

Community · Accepted Answer · 2023-11-10 19:48:45Z

No sooner did I post this than I may have found my answer. This has been a long-standing bug in the JDK (bugs.sun.com/bugdatabase/view_bug.do?bug_id=6877495).

answered Aug 12, 2011 at 15:39 by Locriansax; edited Nov 10, 2023 at 19:48 by CommunityBot
