stds-802-16: new search feature on 802.16 web site




[Notice: It is the policy of 802.16 to treat messages posted here as non-confidential.]

Dear 802.16 Aficionado,

The 802.16 web site now incorporates an important new tool: an advanced, full-text search procedure. The power is provided by the Ultraseek search engine which runs on the IEEE web server and makes use of a local site index. It is fast and accurate.

One really important feature of Ultraseek is that it reads text inside Acrobat PDF and Microsoft Office files as well as in HTML files. Now that we have well over 200 files on the site, with well over 10 MB of data (plus over 200 archived reflector messages), this search procedure will give you a lot of power.

You will appreciate Ultraseek's detailed, but simple to use, search request form and its ability to refine searches by searching the current search results.

Here is a brief explanation to help you find what you are looking for.

OUTPUT:

The Ultraseek output provides the "Title" and "Summary" of each located file.

For HTML files, "Title" is taken from the HTML <TITLE> tag. I have always been pretty careful about making the <TITLE> tag appropriate. Up until now, I haven't paid any attention to the file tags in PDF and DOC files (these can be set within the application). However, in order to improve the value of the search routine, I have recently updated all of the PDF and DOC files on the site, changing their file "Title" parameter to be the title of the document. I think you'll find that the title now gives you a pretty good idea about the content of all your search results.

Regarding the "Summary" tag, there doesn't seem to be a way to specify it in PDF or DOC files, so Ultraseek, by default, just uses the first few lines. That's not so bad; in the current PDF document format, you will see the document's number and date. In HTML files, the summary comes from the meta "Description" tag. In the 802.16 HTML files, the "Description" tag is either absent or generic, so it won't be much help. I don't plan to fix that right now.
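
As a rough illustration of where these two fields live in an HTML file, here is a minimal Python sketch that pulls out the <TITLE> text and the meta "Description" content. It is not how Ultraseek itself works internally; the class and function names are hypothetical and only the Python standard library is assumed.

```python
from html.parser import HTMLParser


class TitleAndDescription(HTMLParser):
    """Collect the <TITLE> text and the meta "Description" content."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None  # stays None if the tag is absent
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling us.
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            if (attr.get("name") or "").lower() == "description":
                self.description = attr.get("content")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text may arrive in several pieces, so accumulate it.
        if self._in_title:
            self.title += data


def title_and_description(html_text):
    """Return (title, description) for one HTML page (hypothetical helper)."""
    parser = TitleAndDescription()
    parser.feed(html_text)
    parser.close()
    return parser.title.strip(), parser.description
```

A page with no meta "Description" tag yields None for the second value, which corresponds to the "absent" case mentioned above.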

INPUT:

The file tags are useful not only in the output but also in the input. On the PDF and DOC files, I have updated the "Subject" to be the document number and the "Author" to be the submitter. The way to search on these parameters is to use a non-obvious Ultraseek convention [documented in the Ultraseek help]. Namely, search in the body for "tag:text", where "tag" is the parameter (such as "author", "subject", or any HTML meta tag) and "text" is any text in it. For example, to find document 802.16sc-99/36, you could search for:

	subject:99/36 

in the body of the file. Or, to find submissions by Brian Petry, you could look for:

	author:Brian

in the body. Don't leave a space after the colon.
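
The "tag:text" convention above can be sketched as a small Python helper that builds the search string and guards against the stray space after the colon. This is only an illustration: the function name is hypothetical, it does not call any Ultraseek interface, and how Ultraseek treats multi-word text is not covered here.

```python
def field_query(tag, text):
    """Build a "tag:text" search string (hypothetical helper).

    Surrounding whitespace is stripped from both parts so the result
    never has a space directly after the colon.
    """
    tag = tag.strip()
    text = text.strip()
    if not tag or not text:
        raise ValueError("both tag and text are required")
    if ":" in tag or any(ch.isspace() for ch in tag):
        raise ValueError("tag must be a single word without a colon")
    return f"{tag}:{text}"


# The two searches from the examples above.
print(field_query("subject", "99/36"))
print(field_query("author", " Brian"))
```

The resulting strings are what you would type into the search form.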

To make things easier, I have set up custom links that automatically set up a form for searching by document number or document author. These links are all near the top of the 802.16 main page.


For the moment, I think we are in pretty good shape on searching. The main thing is to make sure that future PDF and Office files have the "Title", "Subject", and "Author" tags set correctly. In Acrobat, these parameters are set using Document Info/General under the File menu. For Microsoft Office files, you set them with Properties/Summary under the File menu.

Happy Hunting!

Roger


Dr. Roger B. Marks  <mailto:marks@nist.gov>           
Chair, IEEE 802.16 Working Group on Broadband Wireless Access
National Wireless Electronic Systems Testbed (N-WEST) <http://nwest.nist.gov>
National Institute of Standards and Technology/Boulder, CO
phone: 1-303-497-3037  fax: 1-303-497-7828