Application Settings - Search Engine

Applies to HTML Viewer and IE Browser publications.

About the built-in search engine

Publications or ebooks created with HTML Executable come with a built-in search engine that lets end users search the entire publication for specific words or expressions. When compiling your publication, HTML Executable parses all HTML pages and PDF documents and collects keywords from them. These keywords are then indexed, and the resulting index is stored in the publication's data. Because the keywords are indexed in advance, a search query completes in seconds.

In the compilation log, HTML Executable reports the number of pages and unique words that were found during indexing.
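
As an illustration of why an indexed search is fast, the sketch below builds a tiny inverted index (keyword to pages) in Python and reports the page and unique-word counts. It is not HTML Executable's actual implementation: the function names, the sample pages and the simple `\w+` word splitting are all assumptions made for this example.

```python
import re
from collections import defaultdict

def build_index(pages):
    """Map each lowercase keyword to the set of page names that contain it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(name)
    return index

def search(index, query):
    """Return the pages that contain every word of the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

# Hypothetical page contents, already stripped of HTML tags.
pages = {
    "intro.htm": "Red apple pie",
    "colors.htm": "Red and green paint",
}
index = build_index(pages)
print(f"{len(pages)} pages, {len(index)} unique words indexed")
```

A query then only needs a few dictionary lookups and set intersections instead of rescanning every page, for example `search(index, "red apple")`.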

End users can access the search panel by clicking the "Search" button or by selecting "Navigate|Show Search" in the application. You can also use HEScript commands to integrate the search engine into your HTML pages: see StartFullSearch.

Configuring the search engine

Enabling the search engine can result in a larger publication file (the increase depends on the number of HTML pages and PDF documents you compile). If you do not want to include a search engine in your publication, turn on the "Disable the search engine" option.

PDF documents can be indexed too, provided that the built-in PDF viewer is activated.

When a search is complete, the publication lists the results. Each result displays the page's title and URL, which end users can click to open the page. If you prefer to keep your URLs secret, you can hide page URLs from the search results. In this case, a "(click)" URL is displayed instead.

Search Results

Some keywords may be automatically excluded from the index, so searching for them returns no results. In addition to some common words, you may add your own sensitive keywords to the exclusion list: press Add and specify the keyword to add. Conversely, you can remove keywords from the exclusion list by selecting them and clicking Remove. Keyword exclusion lists can be imported from and exported to XML files using the XML Tools button, so you can edit them manually in any XML editor.

Finally, if your compiled website uses frames, you may need to specify the frame in which a page opened from the search results should be displayed. Use the SearchFrameTarget property to set it. This applies only to IE browser publications.

Support for Unicode

The search engine is Unicode-enabled. When parsing HTML pages, HTML Executable takes into account the encoding format and the charset defined in the HTML documents. All keywords are natively converted and stored in UTF-8 format.

The "Use default word delimiters based on Unicode character categories" option should never be turned off. Otherwise, the search engine will use the word delimiters defined in the Environment Options.

The "Automatically split CJK characters" (Chinese, Japanese and Korean) option allows an improved search for East Asian languages.
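
To make these two options more concrete, here is a minimal Python sketch of word splitting based on Unicode character categories, with optional per-character splitting of CJK text. It only illustrates the general idea. The set of categories treated as word characters (letters, marks, numbers) and the code point ranges treated as CJK are assumptions for this example and may differ from what HTML Executable uses.

```python
import unicodedata

def is_cjk(ch):
    """Rough CJK test based on a few common code point ranges (assumption)."""
    cp = ord(ch)
    return (0x4E00 <= cp <= 0x9FFF      # CJK Unified Ideographs
            or 0x3040 <= cp <= 0x30FF   # Hiragana and Katakana
            or 0xAC00 <= cp <= 0xD7AF)  # Hangul syllables

def tokenize(text, split_cjk=True):
    """Split text into keywords; any non letter/mark/number is a delimiter."""
    tokens, current = [], []
    for ch in text:
        is_word_char = unicodedata.category(ch)[0] in ("L", "M", "N")
        if is_word_char and not (split_cjk and is_cjk(ch)):
            current.append(ch)
            continue
        # A delimiter or a CJK character ends the keyword being built.
        if current:
            tokens.append("".join(current))
            current = []
        if is_word_char:
            # A CJK character becomes a keyword of its own.
            tokens.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens

print(tokenize("red apple, café!"))
print(tokenize("日本語 text"))
print(tokenize("日本語 text", split_cjk=False))
```

Because languages such as Chinese and Japanese do not separate words with spaces, indexing each character individually lets a query match a character sequence anywhere inside a longer run of text.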

About searches

The search engine supports queries containing logical operators, as major Web search engines do: + (AND), - (NOT), OR, the wildcards * and ?, and double quotes.

In particular, "?" substitutes for exactly one character, whereas the asterisk "*" substitutes for zero or more characters in a keyword.

Examples:

  • red apple returns pages that contain both red and apple.

  • "red apple" returns pages that contain the exact expression "red apple".

  • red OR apple returns pages that contain red as well as pages that contain apple.

  • red -apple returns pages that contain red but do not contain apple.

  • app* returns pages that contain any word beginning with app. A wildcard can also be placed between characters, as in char*s. You may use up to 3 wildcards anywhere in your query.
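
The wildcard rules above ("?" for exactly one character, "*" for zero or more) can be sketched in Python by translating a pattern into a regular expression. This is only an illustration of the matching semantics, not the engine's own code. The helper names and the sample word list are hypothetical, and case-insensitive matching is an assumption.

```python
import re

def wildcard_to_regex(pattern):
    """Translate * and ? into a regular expression; other characters are literal."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # zero or more characters
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)

def matching_words(pattern, words):
    """Return the words that match the whole wildcard pattern."""
    regex = wildcard_to_regex(pattern)
    return [w for w in words if regex.fullmatch(w)]

# Hypothetical list of indexed keywords.
words = ["app", "apple", "apply", "chars", "charts", "chair", "red"]

print(matching_words("app*", words))    # words beginning with app
print(matching_words("char*s", words))  # wildcard between characters
print(matching_words("appl?", words))   # exactly one trailing character
```

Note that "app*" also matches "app" itself, since the asterisk may stand for zero characters, while "app?" would not.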

When a page from a search result is opened, the keywords that were searched for may be highlighted. You can modify the text style for highlighted words. Keywords are highlighted in PDF documents too.

The HTML Viewer engine only supports highlighting one keyword.

Customizing the display of search results

It is possible to customize how search results are formatted: go to the Application Behavior => Language page and, under Resource Strings, modify these four resource strings:

  • SSearchResHTMLTableStart: HTML tags that start the HTML table which will contain the search results.

  • SSearchResHTMLCellFormat: HTML tags that define a single table cell and its contents. The four %s parameters are required: do not remove them (enclose a parameter in an HTML comment <!-- --> if you do not want it to be visible). The 1st %s parameter is replaced by the result's index, the 2nd by the page's title, the 3rd by the URL that displays the page when clicked, and the 4th by either the filename of the page or the "click" word (see above). If you want to use the percent symbol (for example 3%), write it twice: 3%%, so that the formatting routine does not mistake it for a parameter such as %s.

  • SSearchResHTMLCellFormatAltern: same as SSearchResHTMLCellFormat, but used for every other search result. This lets you alternate backgrounds between search results (see the screenshot above).

  • SSearchResHTMLTableEnd: HTML tags that end the HTML table.
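
The sketch below shows how the four %s parameters and the doubled percent sign behave. It uses Python's % operator, which follows the same %s and %% convention, purely for illustration: HTML Executable applies its own formatting routine, and the HTML templates and sample values here are hypothetical, not the default resource strings.

```python
# Hypothetical cell template: index, title, URL, then filename or "(click)".
# The literal percent sign in width="5%" must be doubled.
cell_format = (
    '<tr><td width="5%%">%s.</td>'
    '<td><b>%s</b><br><a href="%s">%s</a></td></tr>'
)
row = cell_format % (1, "Welcome", "intro.htm", "(click)")
print(row)

# Same idea, but the index is kept inside an HTML comment so that it stays
# in the template without being visible to the end user.
hidden_index_format = (
    '<tr><td><!-- %s --><b>%s</b><br><a href="%s">%s</a></td></tr>'
)
row2 = hidden_index_format % (2, "Colors", "colors.htm", "colors.htm")
print(row2)
```

Removing one of the four %s placeholders would leave the routine with more values than parameters, which is why a parameter you do not want to show is hidden in a comment rather than deleted.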

Customizing the search query

You can programmatically modify the query of the end user with the UserMain.OnModifySearchRequest HEScript event.

