How do I generate an index in Word?
Article contributed by John McGhie
The Microsoft Word Help suggests that you can automatically generate an index. Sorry, but you can't (the "result" looks like an index, but the reader can't use it). You can automatically mark index entries: however, the amount of work required to edit the result into a useable index is usually double the effort required to manually mark the index entries one-by-one.
Instead of automatically generating something that is not useable, the reader would far prefer you to express the document electronically and provide a free text search. A free text search serves the reader's needs far better than a badly-constructed index, and the search engines available these days are smart enough to look for what the reader “wanted” rather than what he or she “asked for”.
Making an Index
An experienced technical writer wrote this article. As a technical writer, I produce long documents running to thousands of pages of technical material. Indexes are part of my game. I can't tell you how to produce one automatically, but I can tell you how to produce one easily!
Before 1990-ish, Indexing was a profession of its own; in addition to an Author and an Editor, a large book had an Indexer. Even today, if you are making a book such as a medical encyclopedia that is going to remain in print for many years, it is simply stupid not to use a professional indexer. Really good indexes are an even mix of science and art form, and the quality improvement a professional makes is well worth paying for. Of course, few of us these days work on publications that are going to last long enough to justify this effort. And even fewer of us have the time to produce such an index. If you do have the time, obtain a copy of “Indexing, The Art of” by G. Norman Knight (Allen & Unwin, ISBN 0-04-029002-6). Norman Knight is a former President of The Society of Indexers, and his book is simple and charming. Reading it, you will soon realize that indexing is not difficult; it simply takes attention to detail and patience.
Planning the Job
Word has one of the nicest and most powerful index generators around built right in, so you have all the tools you are going to need. You need to allow a week per 500 pages to generate an index in a technical book. Technical publications are fairly “information dense”. Scholarly monographs and the like are usually quicker to index.
Types of Index
In the old days (say, 1995 or thereabouts!) indexes were all produced by the “shoebox” method. Indexers literally used a shoebox into which they inserted index cards: three-inch by five-inch cards upon which they wrote the index term and its page number. The Indexer would sit with a large pile of “galley proofs”, single-page images as they were returned from the typesetter, and go through each one line-by-line seeking and recording the index terms. At the finish, the Indexer typed the index out with its page numbers and sent it off to the typesetter for publication. There is a software tool specially built for indexing that emulates this process exactly. I tell you this simply because, in certain circumstances, this method is still the best today. If your document is going to be published from a different computer to the one it is being created on, and that machine cannot interpret Microsoft Word XE tags, and you do not know what the page numbers are yet because the other machine is going to do the pagination, then use the shoebox method!
Word will do two forms of index: The Concordance Index and the Mark-up Index. It will also do something half-way in-between, using its “Mark All” command.
Mark-up Indexes
A Mark-up index is the method I recommend. It's quick, accurate, easy to understand, and easy to correct. With a little care in the planning, it normally results in a very useable index.
As the term implies, you produce a mark-up index by embedding mark-up “tags” in the Word document. Word automatically looks up the page numbers at Print time and generates and formats the index for you. Study the help topic “Create an index” and all its sub-topics. This is the way I recommend. It's the way that all good writers create an index these days. Mark by mark, page by page! It is explained in detail below.
Concordance Indexes
I implore you not to waste your time with a Concordance Index for most publications. It results in a huge pile of rubbish that is of very little use to the reader. And it takes nearly as long to make as it does to generate an index properly. The Concordance Index is a hangover from the past when people were desperately hoping to produce an “automatic index” to reduce the labor. Every major word-processor will do them, and no professional writer or editor would, these days, permit one.
To make a Concordance index you make up a table of all the terms you want Word to find in one column, and the index entry you want to see for each term in the other. For more information, see “Create a concordance file” in the Word help file. But the end result is that you have every term indexed at EVERY place it occurs. Most of the mentions of a term in a book are simply passing references: what the reader wants to see in the index is only one page number; the one that contains the main topic for the term. If you send them on a wild goose chase to 20 other places first, they will think most unkindly of you.
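To see why this goes wrong, here is a rough Python sketch of what a concordance pass amounts to. It is not how Word implements it; the function name, the sample pages, and the whole-word matching rule are my own assumptions for illustration. The point is only that every page containing the term gets a page number, whether or not the reader would learn anything there.

```python
import re
from collections import defaultdict


def concordance_index(pages, concordance):
    """Simulate a concordance pass: mark EVERY page on which a term occurs.

    pages       -- {page number: page text} (hypothetical sample data)
    concordance -- {term to search for: index entry to show}
    """
    index = defaultdict(list)
    for page_number in sorted(pages):
        text = pages[page_number]
        for term, entry in concordance.items():
            # Whole-word, case-sensitive match: a simplifying assumption.
            if re.search(r"\b" + re.escape(term) + r"\b", text):
                if page_number not in index[entry]:
                    index[entry].append(page_number)
    return dict(index)


pages = {
    1: "Follow the procedure below to install the product.",
    5: "If you completed the install, restart the server.",
    9: "After install, the log file is created.",
}
concordance = {"install": "Installation procedure"}

result = concordance_index(pages, concordance)
print(result)
```

Only page 1 tells the reader how to install anything, yet the entry collects all three pages. Multiply that by a few hundred terms and you have the pile of rubbish described above.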
The concordance mechanism does have its place: It can often be used to good effect in Reference Books such as Programming Reference Manuals, where each command or function is referred to only in a small section of the text, then rarely mentioned anywhere else in the book.
For the truly adventurous...
Technical writers and other folk who publish seriously-huge documents in HTML may want to spend a little time learning about Concordance Indexes. In conjunction with VBA, a concordance index is a great way to automatically generate hyperlinks in your document. You tag every mention of each term with the concordance indexing mechanism, then use VBA to change the tags into hyperlink tags.
Indexing Made Easy
Here are some worthwhile hints I can give you so you do not go mad during the process:
1. Print a copy of the book and go through it with a highlighter, marking the items you would like to see in the index. If you are not the subject-matter expert, get someone who is expert in the subject to do this for you (the process is massively easier if you understand the subject well). Mark only places where the reader will get information about each item. For example, if you want to include “installation procedure”, you would mark “Follow the procedure below to install...” in Chapter 1; you would not mark “if you completed the installation procedure...” in Chapter 5. The first is what the reader would expect to see when he looks up 'Installation Procedure'. The second might cause the reader to come and look you up {grin}.
2. Make some design decisions before you start putting codes in the file. The most important are:
3. Now run through and tag the entries you have highlighted, according to the instructions in the help topic “Mark index entries”. Unfortunately, if you have made a few indexes, you will know how to do this, and if you haven't, your first attempt will contain errors. Sorry: I had to go through this too {grin}. I will give you a hint that will save you a bit of time (quite a lot, actually...): do not put in the subentries at this stage. By that I mean tag each item as a main term. If the entry does belong as a subentry, you will find that you can add the main term to the tag more simply on your second pass.
A Word About Tagging:
- Word's index tags are both case-sensitive and "space-sensitive". "Installing" and "installing" are not the same thing: each will appear under its own heading. "Administration" and " Administration" are not the same thing: one will sort right at the top of the index. See? When you are debugging "entries out of sequence" you sometimes have to look extremely closely to ensure that the tags really do match exactly.
- To enter an index tag in a heading, ensure that your headings are formatted by styles, and do not apply any formatting overrides to the heading. If you apply direct formatting to the headings that contain index tags, the direct formatting will be copied through to your Index.
- A colon : and a semicolon ; are not the same thing! You use colons to divide the levels of sub-entry in your index tags. When you are in a hurry, it is too easy to type the un-shifted character (the semi-colon) instead of the shifted character (the full colon) in the tag. If you do, you will get some very weird errors in your generated index. There's no easy way to find these, but the semi-colon will appear in the index. If you have strange things happening (items that do not appear under their correct entries or sub-entries) try searching your generated index for semi-colons. If you find any, at least you know "what" is wrong: finding the tag that produced the problem is a real chore (it will not be on the page in the index...). Try this: Reveal your hidden text (so you can see your XE tags) then search for a semi-colon with the font format hidden text. If you find any, chances are they are in your bad index tags.
4. Now generate the index. Ignore the formatting at this stage; just print it. Leave it as a single column for ease of reference. If you have a big screen, you can open a second window into the document and look at the index that way (see the Window menu) but for most, it's easier to print the first result.
5. Now sit down with a colored pen or pencil (you can't see blue or black against black type...) and edit the index.
6. Go through and edit the tags in the file to implement the changes you have identified. You can find index tags easily by using the Browse buttons on your vertical scroll bar (see “Browse to the next or previous page, table, or other item” in the help).
In later versions of Word (2002 and above) you can use Ctrl + G to bring up the "Go To" dialog. Set "Go to what?" to "Field". Set the Enter field name box to "XE". Click Next, then Close. Your "Previous" and "Next" browse buttons (at the extreme bottom right corner of the Word window, under the vertical scroll bar) will now go to the next or previous index entry field on each click, until you change to something else.
If you use Find, or Browse by Find, you can specify ^d XE as your Find string to find only index tags. If you know exactly what the text of the tag is, you can use ^d XE "tag text string" to find exactly that tag. However, this requires you to work out exactly what the tag content will be, and that's not easy three levels down in an Index. So I prefer to use Ctrl + G, Page Number (from the index), then Ctrl + F, ^d (to find the next XE tag). Then keep hitting Browse Next to find the tag you want.
7. Now regenerate your index. (Click in it and press F9). You can now change it to double-column if you wish. You format an index by using Format>Style to change the styles Index 1 through Index 9. Each style controls the formatting of one level of entry.
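The tagging pitfalls described in step 3 (case, stray spaces, and semicolons typed where colons belong) are mechanical enough to check with a script. Here is a minimal Python sketch, assuming you have already pulled the text of each XE tag out of the document by some means (not shown). The function name and the sample entries are hypothetical, and nothing here talks to Word itself.

```python
def lint_index_entries(entries):
    """Return (entry, problem) pairs for common index-tagging mistakes.

    entries -- list of XE tag texts, e.g. "Backup:scheduling"
               (assumed to have been extracted from the document already)
    """
    problems = []
    seen = {}  # case- and space-insensitive key -> first spelling seen
    for entry in entries:
        # A semicolon usually means the Shift key was missed on a colon.
        if ";" in entry:
            problems.append((entry, "semicolon found; levels are divided by colons"))
        # Leading or trailing spaces in any level change where it sorts.
        for level in entry.split(":"):
            if level != level.strip():
                problems.append((entry, "leading or trailing space in a level"))
                break
        # Entries that differ only by case or outer spacing will not merge.
        key = entry.strip().lower()
        first = seen.setdefault(key, entry)
        if first != entry:
            problems.append((entry, "differs from %r only by case or spacing" % first))
    return problems


entries = [
    "Installing",
    "installing",
    " Administration",
    "Backup;scheduling",
    "Backup:scheduling",
]

problems = lint_index_entries(entries)
for entry, problem in problems:
    print(repr(entry), "->", problem)
```

Run against the sample list, this flags "installing" (a case clash with "Installing"), " Administration" (the leading space), and "Backup;scheduling" (the semicolon). It only tells you which tag text is suspect; you still have to find the tag in the document using the methods in step 6.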
Page Number Conflation
Page number conflation is where only the first and last page numbers appear for a topic. In the index you see 88 - 95 instead of 88, 89, 90...
I am very tempted to say "don't bother"! Tag the first instance of each term. If your reader does not have the brains to see that the information on a topic continues for several pages, they should be kept away from your book in case they hurt themselves... However, if you absolutely must conflate, this is the way to do it:
- Place a bookmark around all of the pages you want to conflate.
- Then place the name of the bookmark in the XE tag, and Word will generate a conflated page reference for you.
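In the XE field itself, the bookmark name goes after the \r switch. As an illustration of what the finished tag text looks like, here is a small Python sketch that assembles it. The helper name and the bookmark name "InstallSteps" are hypothetical; the script only builds the text you would see between the field braces, it does not insert anything into a document.

```python
def xe_field(levels, bookmark=None):
    """Build the text of an XE (index entry) field.

    levels   -- entry levels, main term first; joined with colons
    bookmark -- optional bookmark name spanning the pages to conflate
                (hypothetical name supplied by the caller)
    """
    entry = ":".join(level.strip() for level in levels)
    code = 'XE "%s"' % entry
    if bookmark:
        # \r tells Word to use the bookmark's page range as the page number.
        code += " \\r %s" % bookmark
    return code


print(xe_field(["Installation", "procedure"], bookmark="InstallSteps"))
print(xe_field(["Installing"]))
```

The first call produces a tag that should come out in the index as a page range covering the bookmark; the second is an ordinary single-page entry.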
See! It isn't that hard
There! That's the way I do it. If you trust me and do it that way, you will find out why I do it that way. If you don't trust me and do it another way, you will find out why much sooner {grin}.