How to use Unicode in Word X on the Macintosh
This article provides help for people who need to use Unicode in Word X on the Macintosh. This article requires some technical skills. This is as good a time as any to get started learning some fancy, advanced techniques with Word X.
This proposed solution is far from ideal, so let's set some expectations up front:
This solution enables "some" users to use "some" characters to get by for now. If it doesn't suit you, it's time to upgrade to Word 2004.
The Ground Rules
This solution relies on several facts:
- Word stores Unicode internally. If you place a Unicode character into a Word document, it will be stored and output normally from Word.
- Internet Explorer and TextEdit can both handle Unicode. Internet Explorer can display it, but may not be able to print it. TextEdit can both display and print it.
- The fonts supplied by Apple and by Microsoft are Unicode fonts internally. Most of the characters you need will be in one of your fonts, particularly if you install the very large 40 MB foreign-language updates for OS X.
My solution consists of three macros. One enables you to enter any Unicode character you like into a Word document by typing its hexadecimal code into the text. Another enables you to select any character and return its Unicode value. The last one enables you to list all of the characters defined in each of your fonts so that you can see what those codes are.
Click here for a template containing the compiled macros.
Look here for instructions on installing a template.
Look here for instructions on installing macros.
If you use this solution, you will be left with a document containing characters you cannot see. Current Carbon applications will not display any character that is not in the Macintosh Character Set. Word displays each of those characters as an underscore.
Viewing Documents containing Unicode
To view the document, you will have to save it as a Web Page and open it in Internet Explorer. Since this shatters your document's layout, it is useful only for confirming that you have the correct character in the correct place.
Printing Documents containing Unicode
To print the document, you will have to save it as RTF and then print it from TextEdit. As far as I know, you cannot get Unicode characters into PDF yet, because the PDF writer is a Carbon application. Of course, if you email the document to work, you can print it from your PC.
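If you find yourself making these two round trips often, a macro can produce both copies in one go. What follows is only a sketch: the name SaveForViewingAndPrinting is made up for illustration, it is not part of the template, and I am assuming that SaveAs and WebOptions.Encoding behave here the way they do in the font-listing macro later in this article. It saves an RTF copy for TextEdit and a UTF-8 web page for Internet Explorer, then saves again as a Word document so you carry on working in the original.

```vba
Sub SaveForViewingAndPrinting()
' Hypothetical helper, not part of the original template.
' Saves an RTF copy (to print from TextEdit) and a UTF-8 web page
' (to view in Internet Explorer), then re-saves as a Word document.
    Dim baseName As String

    baseName = ActiveDocument.FullName
    ' Drop a trailing ".doc" so that each copy gets its own extension
    If LCase(Right(baseName, 4)) = ".doc" Then
        baseName = Left(baseName, Len(baseName) - 4)
    End If

    ' Copy for printing from TextEdit
    ActiveDocument.SaveAs FileName:=baseName & ".rtf", FileFormat:=wdFormatRTF

    ' Copy for viewing in Internet Explorer; UTF-8 keeps the Unicode intact
    ActiveDocument.WebOptions.Encoding = msoEncodingUTF8
    ActiveDocument.SaveAs FileName:=baseName & ".htm", FileFormat:=wdFormatHTML

    ' Go back to the Word document so later edits land in the original
    ActiveDocument.SaveAs FileName:=baseName & ".doc", FileFormat:=wdFormatDocument
End Sub
```

Each SaveAs renames the document in the window, which is why the last line saves it as a Word document again. If the document has never been saved, the three files go to Word's default folder under the temporary document name.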
Entering Unicode Characters
The first problem is how to enter characters. The Unicode Hexadecimal keyboard required to type special characters is not available in Word (it is enabled only in genuine Unicode applications).
- You can insert such characters in TextEdit (where the keyboard is enabled).
- Hold down the Option key and enter the four-digit Unicode character number in hexadecimal.
- You can then copy the character from TextEdit to Word.
- If you have one or more of the Asian (Chinese, Japanese, Korean) languages installed, you will find an Asia Text Extras folder in your Utilities folder.
- The TrueType Font Editor installed there is a better way to find characters in a font than Keycaps. If you do not have this, see Making Word Show You the Whole Font below.
- Keycaps will not reveal the characters in a font unless you use its Font menu to choose the font you are interested in.
- Keycaps cannot show you some characters unless you have the correct keyboard enabled (because there are no key combinations on the current keyboard that will produce the character). I don't know how you are supposed to find out which keyboard you need if you cannot find the character.
Making Word Show You the Whole Font
This is nasty but it works:
- Type a whole word including the following space. It must not be a spelling error.
- Double-click to select the word.
- Do not move your insertion point.
- Format the word with the font you are interested in.
- If you have done it right, the Insert>Symbol dialog in Word will now show the name of the font instead of Normal Text. If the name of the font does not appear, try again; this won't work until you get the name of the font to stick. Hint: Do not click in the word after you have selected it; if you do, Word cancels the font formatting. It's a bug.
- If the font is a Unicode font, all characters will now appear in Insert>Symbol.
- If the font has multiple subsets, the Unicode scrollbars will appear.
- You can insert the character using Insert>Symbol. It will replace the highlighted word.
Macro to Insert Unicode Characters
Here is a macro that will insert any Unicode character you like in a Word v.X document. To use it, type the character's four-digit hexadecimal number in the document and run the macro.
- If the character you have chosen is in the Macintosh character set, AND it is in the font you are using, you will see the character displaying normally. If the character is not in the Macintosh character set, Word cannot display the character. It will show it as an underscore.
- To check your result, save the document as a web page. If you double-click the HTML file Word puts out, Microsoft Internet Explorer will display the character correctly if it exists in the font you used.
- When you save the document as a web page, make sure that you choose an Encoding of UTF-8 (eight-bit Unicode). If you save in any other encoding, Explorer may not be able to decode the character.
Macro Code
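Sub InsertUnicode()
' InsertUnicode Macro
' Macro written 26 Jul 2002 by John McGhie
' Converts typed text into Unicode
    Dim CharNum As Long
    ' Select the word immediately before the insertion point
    Selection.Collapse
    Selection.MoveStart Unit:=wdWord, Count:=-1
    If Selection.Text <> "" Then
        CharNum = Val("&H" & Selection.Text)
        ' Val reads the four-digit codes 8000 to FFFF as negative numbers
        If CharNum < 0 And CharNum >= -32768 Then CharNum = CharNum + 65536
        If CharNum < 0 Or CharNum > 65535 Then
            MsgBox "Sorry, there is no such character in Unicode. " _
                & "The character code must be four digits in hexadecimal."
        Else
            ' Typing replaces the selected code with the character
            Selection.TypeText Text:=ChrW(CharNum)
        End If
    End If
End Sub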
When we send Macros over the Internet, various things can happen to the lines and cause errors. To see if you got any, go to Debug and choose Compile Normal.
If a warning dialog pops up telling you about a "Compile Error" this usually means that one of the lines has wrapped on its way to you. The line after the error will turn red. Usually all you have to do is use the Delete key to join that line onto the end of the one above.
When you have no errors, click Save, then click the blue W button to come back to the Word user interface. Assign this macro to a keystroke: Word 2002 for Windows has this command built-in; the default keystroke for it is Alt + x. Since the Mac won't let you use Option (it's reserved for the operating system) and Alt is a bit of a stretch, try Ctrl + x; that's available.
Look here for How to Assign a Macro to a Keystroke
Hold down your Shift key and choose File>Save All. Save All does not appear unless you do hold down the shift key. You wouldn't want to lose all this work in a crash. Now would you?
How to Insert a Unicode Character
To use the macro, simply type the hexadecimal character code into the document where you want the character to appear, then hit the keystroke you assigned.
The macro will convert the character code to a Unicode character.
This macro works in hexadecimal because most font utilities give the character codes in hex. If yours works in decimal, go back to the Macro editor, open the code window and change the line that reads
CharNum = Val("&H" & Selection.Text)
to read just
CharNum = Val(Selection.Text)
That removes the conversion from hexadecimal. To be elegant, you may want to remove the words "in hexadecimal" from the end of the MsgBox line.
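One weakness of Val is that it never complains: it simply stops reading at the first character it does not understand, so a typo such as 2OAC (letter O instead of zero) is quietly read as 2. If that worries you, here is a sketch of a stricter conversion that handles both hexadecimal and decimal. The names ParseCharCode and InsertUnicodeStrict are made up for this illustration and are not in the template; treat the code as a starting point rather than a finished tool.

```vba
Function ParseCharCode(ByVal codeText As String, ByVal useHex As Boolean) As Long
' Hypothetical helper, not part of the original template.
' Returns the character number, or -1 if codeText is not a valid code.
    Dim digits As String
    Dim radix As Long
    Dim digitPos As Long
    Dim result As Long
    Dim i As Long

    ParseCharCode = -1
    If useHex Then digits = "0123456789ABCDEF" Else digits = "0123456789"
    radix = Len(digits)                  ' 16 for hex, 10 for decimal
    codeText = UCase(Trim(codeText))
    ' Reject empty text and anything too long to be a 16-bit code
    If Len(codeText) = 0 Or Len(codeText) > 5 Then Exit Function

    For i = 1 To Len(codeText)
        digitPos = InStr(digits, Mid(codeText, i, 1))
        If digitPos = 0 Then Exit Function   ' Not a digit in this number system
        result = result * radix + (digitPos - 1)
    Next i
    If result <= 65535 Then ParseCharCode = result
End Function

Sub InsertUnicodeStrict()
' Hypothetical variant of InsertUnicode that uses ParseCharCode.
    Dim CharNum As Long
    Selection.Collapse
    Selection.MoveStart Unit:=wdWord, Count:=-1
    CharNum = ParseCharCode(Selection.Text, True)   ' False for decimal codes
    If CharNum < 0 Then
        MsgBox "Sorry, that is not a valid character code."
    Else
        Selection.TypeText Text:=ChrW(CharNum)
    End If
End Sub
```

Because the function builds the number one digit at a time, any stray character makes it return -1 instead of a partial value, and the macro then leaves your text alone.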
How to find the Unicode Value of a Character
Select the character you want to identify, then run the following macro.
Macro Code
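Sub ShowCharacterCode()
'
' Charcode Macro
' Macro recorded 8/06/00 by John McGhie
'
    Dim charCode As Long
    charCode = AscW(Selection.Text)
    ' AscW returns codes above 7FFF as negative numbers
    If charCode < 0 Then charCode = charCode + 65536
    MsgBox "Hex: " & Hex(charCode) & "   Decimal: " & charCode
End Sub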
How to Find Unicode Characters in your Fonts
The following macro enables you to list all of the characters in a font. You need it because most Macintosh fonts contain characters that Word cannot display.
Apple and Microsoft fonts supplied with OS X and Office v.X are actually Unicode fonts. However, in common with most other Carbon applications, Word can display only the characters in the Macintosh Character Set. Most fonts contain many more characters you cannot see, but you can use them if you can get their character numbers.
Windows fonts typically contain five times more characters than Macintosh fonts.
Macintosh OS X can use Windows fonts of kind OTF (OpenType Font) and TTF (TrueType Font). PostScript (Type 1) fonts made for Windows will not work on the Mac. Simply drag compatible fonts to your Fonts folder. It doesn't matter which of your fonts folders you use; you may wish to use the Fonts folder you will find in your Microsoft Office X/Office Folder to avoid the possibility of interfering with other applications.
The following macro produces a listing of the font you choose. It places every 16-bit character code from 0020 to FFFF into a document, 16 to a line (the first 32 codes are non-printing control codes, so the macro skips them). When you display or print the result, the character will appear at each position for which the selected font has a character defined. You will get a question mark or a hollow box where there is no character in the font.
Notes:
- When installing this macro, you can place it in the Unicode module you created for the first macro.
- There is no need to assign either a keystroke or a button to this macro; you will not want to run it that often (trust me, you won't!).
- The performance of this macro is critically dependent upon how much memory is available to Word while it is running. For best performance, quit all other applications before running the macro. On a TiBook G4 667 with 1 GB of memory under OS X, this macro runs for about 3 minutes 50 seconds if there is nothing else running. If Word cannot get enough memory, the macro will run for about half an hour.
- You can run the macro in Word on a PC if you want to. The macro will run for about 1 minute 50 seconds on the PC. The code demonstrates how to do a Conditional Compile in VBA so a macro will run on both platforms.
- If you do run the macro on the PC, the macro has an extra function. On the PC, if a character is not available in a font, the PC will search your fonts and insert the character from whichever contains it. On the PC, the macro turns the characters where this has happened red. Characters that actually are in the font you nominate are colored blue. This function has no purpose (and it slows things down) on the Mac. On the Mac, the character is either present in the nominated font, or it does not appear at all.
- This macro is savagely CPU- and memory-intensive. Word on the Mac is inclined to crash sometimes when running it. If it does, restart Word and try again. It is best to Quit and re-start Word before running the macro a second time; it crashes less often that way.
- While this macro is running, expect Word's windows to look strange or go blank. It is thinking very hard and does not have time to update the screen!
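If you have not met a Conditional Compile before, here is the idea in its smallest form. This is a sketch for illustration only (the name ShowPlatform is made up and is not in the template). Mac is a constant that VBA defines for you, and lines beginning with # are decided when the macro is compiled, so each platform compiles only its own branch.

```vba
Sub ShowPlatform()
' Hypothetical example, not part of the original template.
' Only one of the two MsgBox lines is compiled on any given platform.
    #If Mac Then
        MsgBox "Running in Word on the Macintosh."
    #Else
        MsgBox "Running in Word on Windows."
    #End If
End Sub
```

This is why a macro can mention a command that exists on only one platform: the other platform never compiles that line, so it never objects to it.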
List Unicode Font Macro Code
Macro Code
Sub ListUnicodeFont()
' Macro written 28 July 2002 by John McGhie
' Prints entire character set of a Unicode font
    Dim theFont As String
    Dim fontDoc As Document
    Dim tabNumber As Integer
    Dim charNumber As Long
    Dim x As Long    ' Character code currently being listed

    charNumber = MsgBox("Choose just the font name from the following Dialog" _
        & " box, then wait...", vbOKCancel + vbInformation)
    If charNumber <> vbOK Then Exit Sub

    With Dialogs(wdDialogFormatFont)
        .Display
        theFont = .Font
    End With

    Set fontDoc = Application.Documents.Add
    fontDoc.Activate
    fontDoc.ActiveWindow.View.Type = wdNormalView
    StatusBar = "Please wait..."

    ' Heading, then a header row numbering the 16 columns 0 to F
    Selection.TypeText "Character listing for " & theFont
    Selection.Paragraphs(1).Format.Style = wdStyleHeading1
    Selection.TypeParagraph
    Selection.TypeParagraph
    With Selection.Paragraphs(1).TabStops
        For tabNumber = 3 To 17
            .Add Position:=(tabNumber * 22), Alignment:=wdAlignTabRight
        Next tabNumber
    End With
    Selection.Font.Name = "Arial"
    Selection.TypeText "Number"
    For tabNumber = 0 To 15
        Selection.TypeText Text:=vbTab & Hex(tabNumber)
    Next tabNumber

    ' One row per 16 character codes, starting at the space character (hex 20)
    x = 32
    While x < 65532
        Selection.TypeParagraph
        Selection.Font.Name = "Arial"
        Selection.TypeText Text:=Hex(x)
        StatusBar = "Character number " & Hex(x)
        For tabNumber = 1 To 16
            Selection.Font.Name = theFont
            Selection.TypeText Text:=vbTab & ChrW(x)
            x = x + 1
        Next tabNumber
    Wend

    ' The PC substitutes the closest available font if
    ' the character is not available in the nominated font.
    ' The following routine marks characters Blue if they
    ' are from the requested font, and dark red if they
    ' have been substituted. This routine is not necessary
    ' on the Mac, which doesn't have the function.
    #If Win32 Then
        Selection.Find.ClearFormatting
        Selection.Find.Replacement.ClearFormatting
        With Selection.Find
            .Text = ""
            .Font.Name = theFont
            .Replacement.Text = ""
            .Replacement.Font.Color = wdColorBlue
            .Forward = True
            .Wrap = wdFindContinue
            .Format = True
            .MatchCase = False
            .MatchWholeWord = False
            .MatchWildcards = False
            .MatchSoundsLike = False
            .MatchAllWordForms = False
        End With
        Selection.Find.Execute Replace:=wdReplaceAll
        Selection.HomeKey Unit:=wdStory
        Selection.Find.ClearFormatting
        Selection.Find.Replacement.ClearFormatting
        With Selection.Find
            .Font.Color = wdColorAutomatic
            .Replacement.Font.Color = wdColorDarkRed
        End With
        Selection.Find.Execute Replace:=wdReplaceAll
    #End If

    fontDoc.SaveAs FileName:=theFont & ".doc"
    fontDoc.WebOptions.Encoding = msoEncodingUTF8
    ' The Mac has different HTML Options in the Save As
    #If Mac Then
        fontDoc.SaveAs FileName:=theFont & ".htm", _
            FileFormat:=wdFormatHTML, _
            HTMLDisplayOnlyOutput:=True
    #Else
        fontDoc.SaveAs FileName:=theFont & ".htm", _
            FileFormat:=wdFormatFilteredHTML, _
            Encoding:=msoEncodingUTF8
    #End If
    ActiveWindow.View.Type = wdWebView
End Sub