TheSage documentation

Documentation is always a work in progress as it is only useful when users are actually able to benefit from it.

We have compiled here the information that we believe will be of practical use. If you find that something merits being included or discussed in more detail, please use the contact form and we will adjust this document as appropriate.

A Spanish translation of this documentation can be found here.

What is TheSage?

It is best to think of TheSage as two different systems: a knowledge database and a multi-tool interface.

The knowledge database consists of a tightly integrated English dictionary and thesaurus. TheSage's index contains roughly 255,000 words and its dictionary approximately 340,000 senses, 110,000 etymologies, 100,000 examples of use, and 240,000 phonetic transcriptions. TheSage's thesaurus contains approximately 2.1 million relationships between words and definitions, from synonyms and antonyms to hypernyms, hyponyms, meronyms, holonyms, etc. As a corpus, TheSage comprises approximately 18.2 million words.

The interface allows the user to extract and collect information from the knowledge database in a variety of ways. This is accomplished by means of the tools shown in the left Navigation panel.

Intended audience

TheSage has been designed to be used by language researchers and pedagogues seeking to carry out certain kinds of linguistic analyses. Please read this documentation carefully and contact us if you have any questions.

Despite the complexity of TheSage, using the dictionary is extremely simple. Casual users only need to type a word and press the Enter key. If the word exists, TheSage will display its senses, thesaurus, pronunciation, and other information. If the word does not exist, TheSage will provide suggestions and alternative spellings.

Journalists, translators, etc., might want to know that our approach is that of descriptive linguistics. Therefore, TheSage is not to be used to determine what the "correct" or "incorrect" use of the English language is. We are satisfied by describing how English speakers use the language.

Dictionary

To use the dictionary, first select the Dictionary tool on the left Navigation panel, then type a word in the text box, and press the Enter key (or click on the Enter button).

By default, a drop down list will automatically provide suggestions as you type. This feature can be disabled via the Dictionary page in the Options dialog.

Dictionary entries can be displayed using three different views: e-paper view, classic view, and retro view. Click on the more... button to decide which one to use.

The e-paper view is the simplest of all views. It is meant to resemble the structure used in a printed dictionary (if space considerations were not a concern).

Additionally, the e-paper view takes advantage of the electronic medium to enhance traditional dictionary entries with color coding and a host of options available through selection and a right-click context menu.

The classic view will be familiar to TheSage users. Dictionary entries are organized into an intuitive tree structure, rendering specific information quickly and conveniently accessible. This is the default view.

At the top of the tree is the lemma followed by its phonetic transcription. The dark blue nodes expand into parts of speech (noun, verb, adjective, adverb, etc.) and serve to group definitions accordingly. The light blue nodes expand into definitions. These are numbered and their background color changes when selected to help isolate them visually. The red nodes expand into thesauri and contain entries related to the lemma according to the following relationships: synonym, antonym, hypernym, hyponym, attribute, entailment, cause, similar, derived, holonym, meronym, category, region, usage, and other (under 'see also').

A thesaurus found under a particular definition only applies to that definition. A thesaurus found under a given part of speech is the collection of all thesauri for definitions under such part of speech. Finally, thesauri under the lemma are all thesauri available for this entry. The green nodes correspond to examples of use and can be found as branches of the lemma, parts of speech, and definitions (the same grouping logic applies). The yellow-gold nodes correspond to etymologies.
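The grouping logic described above can be pictured with a small Python sketch. It is only an illustration, not TheSage's internal data model: the class names, the merge helper, and the sample entry for "fast" are all made up for the example.

```python
from dataclasses import dataclass, field

# Hypothetical representation: relationship name -> set of related words.
Thesaurus = dict[str, set[str]]


def merge(thesauri) -> Thesaurus:
    """Combine several thesauri into a new one without modifying the inputs."""
    merged: Thesaurus = {}
    for thesaurus in thesauri:
        for relationship, words in thesaurus.items():
            merged.setdefault(relationship, set()).update(words)
    return merged


@dataclass
class Definition:
    text: str
    thesaurus: Thesaurus = field(default_factory=dict)  # applies to this definition only


@dataclass
class PartOfSpeech:
    name: str
    definitions: list[Definition] = field(default_factory=list)

    def thesaurus(self) -> Thesaurus:
        # Collection of all thesauri of the definitions under this part of speech.
        return merge(d.thesaurus for d in self.definitions)


@dataclass
class Entry:
    lemma: str
    parts: list[PartOfSpeech] = field(default_factory=list)

    def thesaurus(self) -> Thesaurus:
        # Everything available for the entry as a whole.
        return merge(p.thesaurus() for p in self.parts)


# Made-up sample data, for illustration only.
adjective = PartOfSpeech("adjective", [
    Definition("moving quickly", {"synonym": {"quick"}, "antonym": {"slow"}}),
    Definition("firmly fixed", {"synonym": {"secure"}}),
])
adverb = PartOfSpeech("adverb", [
    Definition("at high speed", {"synonym": {"quickly"}}),
])
entry = Entry("fast", [adjective, adverb])

print(adjective.thesaurus())
print(entry.thesaurus())
```

The same bottom-up merging would apply to examples of use, which the tree also groups under definitions, parts of speech, and the lemma.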

The retro view is a love letter to Windows 95 diehards (you know who you are). Thesauri are divided into lists, lists, and more lists arranged in tabs below the main dictionary entry. The first tab is reserved for the etymological information. As with other views, additional features are accessible through selection and mouse right-click.

TheSage comes with a speller assistant. When a word is not found in the dictionary, TheSage presents the closest entry in the dictionary together with a selection of suggestions based on common misspelling and mispronunciation errors. Searching online is also a click away.

The more... button makes it possible to toggle the display of etymological information (shortened by default) as well as to enlarge, reduce, and change the default font.

Additional settings are available through the Options dialog. Please note that certain features, such as color customization or the ability to control the actions that correspond to mouse double-clicks, can only be accessed through the Advanced options.

Wildcard search

To use the wildcard search, first select the Wildcard tool on the left Navigation panel, then type a pattern in the text box and press the Enter key (or click on the Enter button).

In a standard search (the default), patterns use the special characters asterisk (*) and question mark (?). The asterisk matches any sequence of characters (including none) and the question mark matches any single character. Note that a character can be a letter, a digit, or a symbol like a dash or a space.

The following two special characters can also be used in standard search: the hashtag (#) and the dollar sign ($). The hashtag matches a single vowel and the dollar sign matches a single consonant.
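As a rough illustration of how the four special characters behave, the following Python sketch translates a standard pattern into a regular expression and matches it against whole entries. It is not TheSage's implementation: the function names, the sample word list, and the choice of a, e, i, o, u as the vowel set (with every other letter, including y, treated as a consonant) are assumptions made for the example.

```python
import re

# Assumption: TheSage's exact vowel and consonant sets are not documented here.
VOWELS = "aeiou"
CONSONANTS = "bcdfghjklmnpqrstvwxyz"


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a standard wildcard pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")               # any sequence of characters, including none
        elif ch == "?":
            parts.append(".")                # any single character
        elif ch == "#":
            parts.append(f"[{VOWELS}]")      # a single vowel
        elif ch == "$":
            parts.append(f"[{CONSONANTS}]")  # a single consonant
        else:
            parts.append(re.escape(ch))      # everything else matches literally
    return re.compile("".join(parts), re.IGNORECASE)


def wildcard_search(pattern: str, words: list[str]) -> list[str]:
    """Return the words that match the pattern from start to end."""
    rx = wildcard_to_regex(pattern)
    return [w for w in words if rx.fullmatch(w)]


# Tiny stand-in for TheSage's index.
SAMPLE = ["absorb", "orb", "sorb", "orbit", "bat", "bet", "bit", "bot", "but",
          "boat", "bald", "bend", "bead", "tend", "toad", "ted"]

for pattern in ("*orb", "b#t", "b#$d", "t??d"):
    print(pattern, wildcard_search(pattern, SAMPLE))
```

Because the whole entry must match, a pattern such as b#t only returns three-letter words, which is the behavior the examples in this section describe.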

Example 1 - The pattern *orb will return all entries in TheSage's index that end with the string of letters orb.

Results: absorb, adsorb, chemisorb, desorb, orb, reabsorb, resorb, and sorb.

Example 2 - The pattern b#t will return all entries in TheSage's index that are three letters long, the first of which is b, the second is a vowel, and the third is t.

Results: bat, bet, bit, bot, and but.

Example 3 - The pattern b#$d will return all entries in TheSage's index that are four letters long, the first of which is b, the second is a vowel, the third is a consonant, and the fourth is d.

Results: bald, band, bard, bawd, bend, bind, bird, bold, and bond.

Example 4 - The pattern t??d will return all entries in TheSage's index that are four letters long, start with t and finish in d.

Results: (screenshot below)

As can be seen above, the selection of one or several results and a right-click brings up a context menu that gives access to additional features that should be becoming familiar by now. Moving content around TheSage is easy.

Also, note that a double-click on any one result will perform a dictionary lookup. If the Quick viewer is shown, it will display the dictionary entry. If it is not, TheSage will route the request to the Dictionary tool and the full entry will be displayed there.

The full .NET Framework implementation of Regex can be enabled via the more... button. Note that this implementation has been expanded to include the special characters $v (vowels) and $c (consonants) as well as [^$v] and [^$c] to negate each class and, for example, [$v-[ae]] will match all vowels except a and e. If you are not familiar with Regex, you can find a good, simple tutorial here. Please note that the meanings of the hashtag (#) and the dollar sign ($) used in standard search do not apply here.

Example 1 - The regex pattern ^mo[$v-[ae]] will return words that start with mo followed by any vowel except a and e.

Results: moieties, moiety, moil, moiled, moiling, moils... (too many to list)

Example 2 - The regex pattern [$v-[ou]][^$c][$c-[lns]]$ will return words that end in any consonant except l, n, or s, preceded by any character except a consonant, itself preceded by any vowel except o and u.

Results: (screenshot below)
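To make the $v and $c extensions more concrete, here is a minimal Python sketch that rewrites them into ordinary character sets before matching. Python's re module has no .NET-style class subtraction, so the sketch expands [$v-[ae]] by hand. The expand and regex_search functions, the vowel set (a, e, i, o, u), and the sample words are assumptions for illustration; TheSage's actual preprocessing is not documented here.

```python
import re

# Assumption: vowels are a, e, i, o, u; every other letter is a consonant.
CLASSES = {"v": "aeiou", "c": "bcdfghjklmnpqrstvwxyz"}


def expand(pattern: str) -> str:
    """Rewrite $v / $c constructs into plain character sets.

    Only the forms named in the documentation are handled:
    [$v-[ae]], [^$v], [$v], and a bare $v (same for $c).
    """
    def subtract(m: re.Match) -> str:
        base, excluded = CLASSES[m.group(1)], m.group(2)
        return "[" + "".join(ch for ch in base if ch not in excluded) + "]"

    # 1. Class subtraction: [$v-[ae]] -> [iou]
    pattern = re.sub(r"\[\$([vc])-\[([^\]]+)\]\]", subtract, pattern)
    # 2. Whole or negated class: [$v] -> [aeiou], [^$v] -> [^aeiou]
    pattern = re.sub(r"\[(\^?)\$([vc])\]",
                     lambda m: "[" + m.group(1) + CLASSES[m.group(2)] + "]",
                     pattern)
    # 3. Bare token: $v -> [aeiou]; a lone $ stays an end-of-string anchor.
    pattern = re.sub(r"\$([vc])", lambda m: "[" + CLASSES[m.group(1)] + "]", pattern)
    return pattern


def regex_search(pattern: str, words: list[str]) -> list[str]:
    """Return the words in which the expanded pattern is found."""
    rx = re.compile(expand(pattern), re.IGNORECASE)
    return [w for w in words if rx.search(w)]


print(expand("^mo[$v-[ae]]"))
print(regex_search("^mo[$v-[ae]]", ["moiety", "moat", "moon", "mouse", "mop", "moet"]))
print(regex_search("[$v-[ou]][^$c][$c-[lns]]$", ["treat", "boat", "real", "head"]))
```

Unlike the standard search, a regex pattern matches anywhere in the word unless it is anchored with ^ or $, which is why both examples in this section use anchors.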

Additional settings are available via the more... button as well as through the Options dialog. Please note that certain features, such as lifting result limitations, can only be accessed through the Advanced options.

Anagram search

To use the anagram search, first select the Anagram tool on the left Navigation panel, then type a series of letters in the text box and press the Enter key (or click on the Enter button).

In addition to right-click context menus that give access to additional features, it is also possible to drag and drop words from tool to tool. In the screenshot below, the selected words are being copied to the Concordancer work pad.

The more... button makes it possible to request exact length anagrams. Additional settings are available through the Options dialog. Again, please note that certain features, such as lifting result limitations, can only be accessed through the Advanced options.
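The difference between the default search and the exact length option can be sketched in Python as follows. This is a guess at the behavior rather than a description of TheSage's algorithm: it assumes that, by default, any word that can be spelled with a subset of the letters entered counts as a result, and that the exact length option keeps only words using every letter. The anagrams function and the word list are hypothetical.

```python
from collections import Counter


def anagrams(letters: str, words: list[str], exact_length: bool = False) -> list[str]:
    """Return the words that can be spelled with the given letters.

    Each letter may be used at most as many times as it was entered.
    With exact_length=True, only words that use every letter are kept.
    """
    available = Counter(letters.lower())
    results = []
    for word in words:
        if exact_length and len(word) != len(letters):
            continue
        needed = Counter(word.lower())
        if all(available[ch] >= count for ch, count in needed.items()):
            results.append(word)
    return results


# Tiny stand-in for TheSage's index.
SAMPLE = ["silent", "enlist", "tinsel", "list", "nest", "listens", "tile"]

print(anagrams("listen", SAMPLE))
print(anagrams("listen", SAMPLE, exact_length=True))
```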

Thesaurus search

The Thesaurus tool implements two ways to access information: list mode and cloud mode.

In list mode, TheSage takes an individual word and returns its thesaurus. To use it, first select the Thesaurus tool on the left Navigation panel, then choose List mode via the more... button, type a word in the text box and press the Enter key (or click on the Enter button).

By means of the Relationships panel on the right side of the screen, the user can create a custom thesaurus so that, for example, only hypernyms and hyponyms are searched for.

In cloud mode, TheSage takes multiple words or seeds and returns a range of related words graded according to their relatedness to the seeds.

As before, the Relationships panel allows the user to narrow or widen the scope of the search. TheSage uses various font sizes and colors to convey degrees of connectedness. These settings can be tweaked via the Options dialog.

As with other tools that return a list of words, double-clicks and right-clicks provide access to additional features. Explore.

Etymologer

Advanced users might want to search for etymologies that contain specific keywords and senses. To do so, first select the Etymologer tool on the left Navigation panel, then type a word in the text box and press the Enter key (or click on the Enter button).

Please note that this tool matches entire words and that wildcards are meaningless here.

The Keyword search makes it possible to retrieve those etymologies that contain a given term in another, perhaps ancient, language (e.g., Portuguese, Proto-Italic, Old French). Keywords can be easily recognized because they are displayed in blue and bold font by default.

Example 1 - Looking up the Latin word breviō will return all words whose etymologies contain this keyword.

Example 2 - Looking up the Ancient Greek word βραχίων will return all words whose etymologies contain this keyword.

The Sense search makes it possible to retrieve those etymologies that contain a given meaning of a term in another, perhaps ancient, language (e.g., Portuguese, Proto-Italic, Old French). Meanings can be easily recognized because they are displayed within parentheses and double quotes.

Example 1 - Looking up the word repute will return all words whose etymologies contain this sense as the meaning of any of the keywords used.

Example 2 - Looking up the word safekeeping will return all words whose etymologies contain this sense as the meaning of any of the keywords used.

Example 3 - Looking up the word sweet will return all words whose etymologies contain this sense as the meaning of any of the keywords used.

Etymologies often contain words in languages that might require fonts other than the default one in a user's system. The more... button can be used to select the most suitable font for displaying etymologies.

Etymologies contain a complex amalgamation of information. Please use the Etymologer page in the Options dialog to specify which content to display and which to hide.

Phonetic search

Advanced users might want to search for phonetic transcriptions that contain specific patterns of phonemes. To do so, first select the Phonetic tool on the left Navigation panel, then type a phonetic pattern in the text box and press the Enter key (or click on the Enter button).

The phonetic and the wildcard search share many characteristics regarding the fundamentals of pattern construction. However, it is important to note that phonetic searches require phonetic symbols (rather than letters) to construct patterns.

It is not possible to type phonetic characters directly from most keyboards. TheSage adopts the SAMPA convention of key binding so that, for example, typing E will display ɛ, typing @ will display ə, typing { will display æ, typing D will display ð, typing N will display ŋ, and so on. In order to ease recall, please right-click on the text box and a context menu will list all the vowel and consonant switches and insert them on request. Please visit the SAMPA page for more information.
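The key bindings above amount to a character-by-character substitution, which the following Python sketch illustrates. The first five mappings are the ones listed in this section; the remaining ones come from the general SAMPA convention and are an assumption, since TheSage's full table is only available from its context menu. The function name is hypothetical.

```python
# SAMPA key -> phonetic symbol.
SAMPA_TO_IPA = {
    # Bindings stated in this documentation.
    "E": "ɛ",
    "@": "ə",
    "{": "æ",
    "D": "ð",
    "N": "ŋ",
    # Assumption: standard SAMPA values, not confirmed for TheSage.
    "S": "ʃ",
    "T": "θ",
    "Z": "ʒ",
    "I": "ɪ",
    "U": "ʊ",
}

_TABLE = str.maketrans(SAMPA_TO_IPA)


def sampa_to_ipa(typed: str) -> str:
    """Convert typed SAMPA keys to phonetic symbols; other characters pass through."""
    return typed.translate(_TABLE)


print(sampa_to_ipa("@*rt"))   # the pattern used in Example 1 of this section
print(sampa_to_ipa("b?{?"))   # the pattern used in Example 4 of this section
```

Wildcards such as * and ? are left untouched, so a pattern can be typed entirely from a standard keyboard.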

In a standard search (the default), patterns use the special characters asterisk (*) and question mark (?). The asterisk matches any sequence of phonetic characters (including none) and the question mark matches any single phonetic character. Note that, in this context, a phonetic character can be a phonetic symbol but also a dash or a space in the case of multiword entries.

The following two special characters can also be used in standard search: the hashtag (#) and the dollar sign ($). The hashtag matches a single vowel sound and the dollar sign matches a single consonant sound.

Example 1 - The phonetic pattern ə*rt will return all words in TheSage's index whose phonetic transcriptions start with a schwa and finish with the phonemes /r/ and /t/.

Results: abort, alert, amort, apart... (too many to list)

Example 2 - The phonetic pattern h#p will return all words in TheSage's index whose phonetic transcriptions are three phonemes in length, the first of which is /h/, the second is a vowel sound, and the third is /p/.

Results: hap, heap, hep, hip, hoop, and hop.

Example 3 - The phonetic pattern b$#d will return all words in TheSage's index whose phonetic transcriptions are four phonemes in length, the first of which is /b/, the second is a consonant sound, the third is a vowel sound, and the fourth is /d/.

Results: bled, bleed, blood, brad, bread, bred, breed, broad, and brood.

Example 4 - The phonetic pattern b?æ? will return all words in TheSage's index whose phonetic transcriptions are four phonemes in length, start with /b/, followed by any phoneme (or symbol like space or dash), followed by /æ/, and finishing with any phoneme.

Results: (screenshot below)

As with the Wildcard tool, the full .NET Framework implementation of Regex can be enabled via the more... button. Note that this implementation has also been expanded to include the special characters $v (vowels) and $c (consonants) as well as other particulars (e.g., class negation, exclusion). Naturally, vowel and consonant orthographic classes here correspond to vowel and consonant phoneme classes. Please note that the meanings of the hashtag (#) and the dollar sign ($) used in standard search do not apply here.

Example 1 - The phonetic regex pattern æ[ʃt]$ will return words whose phonetic transcriptions finish with /æ/ followed by either /ʃ/ or /t/.

Example 2 - The phonetic regex pattern ^gʊ[$c-[bk]] will return words whose phonetic transcriptions start with /gʊ/ and are followed by any consonant sound except /b/ and /k/.

Because phonetic symbols are entered through SAMPA, copying to and pasting from the clipboard requires some care. By default, SAMPA will be copied to and pasted from the clipboard. This feature can be changed through the Phonetic page in the Options dialog so that Unicode characters can be used directly.

Rhyme assistant

To search for rhymes, first select the Rhyme tool on the left Navigation panel, then type a word in the text box and press the Enter key (or click on the Enter button).

Please note that this tool matches entire words and that wildcards are meaningless here.

Single full rhyme: TheSage will return those words whose phonetic transcriptions match the last syllable of the word entered. This is the default search.

Double full rhyme: TheSage will return those words whose phonetic transcriptions match the last two syllables of the word entered.

Triple full rhyme: TheSage will return those words whose phonetic transcriptions match the last three syllables of the word entered (e.g., deliberative will yield accelerative, agglomerative, alliterative, alterative, ameliorative, among others).

Assonant rhymes: TheSage will match the vowel nuclei of the last syllable (single), of the last two syllables (double), and of the last three syllables (triple) of the word entered.

Consonant rhymes: TheSage will match the consonant onsets and codas of the last syllable (single), of the last two syllables (double), and of the last three syllables (triple) of the word entered.
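The three kinds of rhyme can be contrasted with a short Python sketch. It assumes that each transcription is already split into syllables, each with an onset, a vowel nucleus, and a coda, and that a full rhyme ignores the onset of the first syllable compared (so that cat and bat rhyme); neither assumption is spelled out in this documentation. All names and the hand-made transcriptions are hypothetical.

```python
# A syllable is (onset, nucleus, coda); a word is a list of syllables.
Syllable = tuple[str, str, str]


def _tails(a: list[Syllable], b: list[Syllable], n: int):
    """Return the last n syllables of both words, or None if either is too short."""
    if len(a) < n or len(b) < n:
        return None
    return a[-n:], b[-n:]


def full_rhyme(a: list[Syllable], b: list[Syllable], n: int = 1) -> bool:
    tails = _tails(a, b, n)
    if tails is None:
        return False
    ta, tb = tails
    # Assumption: the onset of the first compared syllable may differ.
    return ta[0][1:] == tb[0][1:] and ta[1:] == tb[1:]


def assonant_rhyme(a: list[Syllable], b: list[Syllable], n: int = 1) -> bool:
    tails = _tails(a, b, n)
    if tails is None:
        return False
    ta, tb = tails
    return [s[1] for s in ta] == [s[1] for s in tb]   # vowel nuclei only


def consonant_rhyme(a: list[Syllable], b: list[Syllable], n: int = 1) -> bool:
    tails = _tails(a, b, n)
    if tails is None:
        return False
    ta, tb = tails
    return [(s[0], s[2]) for s in ta] == [(s[0], s[2]) for s in tb]  # onsets and codas


# Hand-made transcriptions, for illustration only.
cat = [("k", "æ", "t")]
bat = [("b", "æ", "t")]
bad = [("b", "æ", "d")]
kit = [("k", "ɪ", "t")]
meeting = [("m", "i", ""), ("t", "ɪ", "ŋ")]
beating = [("b", "i", ""), ("t", "ɪ", "ŋ")]

print(full_rhyme(cat, bat), assonant_rhyme(cat, bad), consonant_rhyme(cat, kit))
print(full_rhyme(meeting, beating, n=2))
```

Passing n=1, n=2, or n=3 corresponds to the single, double, and triple variants of each rhyme type.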

As shown above, the more... button makes it possible to request the search to match the number of syllables of the word entered. Additional settings are available through the Options dialog. Please note that certain features, such as lifting result limitations, can only be accessed through the Advanced options.

Concordancer

To use the concordancer search (also called reverse search), first select the Concordancer tool on the left Navigation panel, then type a word in the text box and press the Enter key (or click on the Enter button).

Please note that this tool matches entire words and that wildcards are meaningless here.

In a Definitions search, TheSage will return those definitions that contain the word entered. Importantly, results are displayed in the form of concordances thereby showing the contexts in which the word is used. This is the default search.

In an Examples search, TheSage will return those examples that contain the word entered.

In both searches, the left-most column displays the lemma(s) that contain the definitions or examples returned.

The more... button makes it possible to request the search to be case sensitive. Additional settings are available through the Options dialog. Please note that certain features, such as lifting result limitations, can only be accessed through the Advanced options.
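The whole-word matching, the concordance layout, and the case-sensitivity option described above can be sketched in Python as follows. This is an illustration only: the concordance function, its width parameter, and the sample definitions are invented for the example and do not reflect TheSage's data or code.

```python
import re


def concordance(word: str, entries: list[tuple[str, str]],
                case_sensitive: bool = False, width: int = 30):
    """Return (lemma, left context, match, right context) rows.

    entries is a list of (lemma, text) pairs, where text is a definition
    or an example of use. Only entire words are matched.
    """
    flags = 0 if case_sensitive else re.IGNORECASE
    rx = re.compile(r"\b" + re.escape(word) + r"\b", flags)
    rows = []
    for lemma, text in entries:
        for m in rx.finditer(text):
            left = text[max(0, m.start() - width):m.start()]
            right = text[m.end():m.end() + width]
            rows.append((lemma, left, m.group(0), right))
    return rows


# Made-up definitions, for illustration only.
SAMPLE = [
    ("kennel", "a shelter for a dog"),
    ("pack", "a group of dogs"),
    ("Sirius", "the Dog Star"),
]

for lemma, left, match, right in concordance("dog", SAMPLE):
    print(f"{lemma:<8} {left:>30}[{match}]{right}")
```

Aligning the matched word in a fixed column, as the print loop does, is what lets the surrounding context be compared at a glance.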

Online search

Select the Online tool on the left Navigation panel, type a word in the text box and press the Enter key (or click on the Enter button).

TheSage will use its internal browser by default. This can be changed via the Advanced options.

Also by default, TheSage will look up the word entered in three websites (as shown below), each displayed in its own tab. Websites can be managed (e.g., added, removed, modified) via the more... button and the Options dialog.

Word lists

Select the Word lists tool on the left Navigation panel, type the name of a new word list in the text box and press the Enter key to create it (or click on the Enter button).

As shown throughout this document, words can be added to lists directly from most tools and work pads via context menus. It is also possible to export, import, and manipulate word lists and their contents in a variety of ways. As always, right-click context menus are your friends.

History

Select the History tool on the left Navigation panel, and the history of all tools will be displayed in descending chronological order (most recent first) and grouped by date.

The most common use of the history is to search for words or patterns that were previously used. To simplify this process, type a string of letters and press the Enter key (or click on the Enter button). Those words and patterns in the history that contain the substring will be displayed. It is also possible to filter results by tool via the drop-down on the top-right.

The more... button makes it possible to clear the history. Additional settings are available through the Options dialog. Please note that certain features, such as precise management of history size, can only be accessed through the Advanced options.

Work pads

All tools come with a work pad. At their simplest, work pads are visible clipboards where the user can easily store words for quick reference and use. By default, work pads record the local tool's history.

Words in work pads can be dragged and dropped on other tools in the left Navigation panel and copied to word lists via the context menu.

The right-click context menu provides access to a host of features (e.g., clipboard, import & export, sorting). Additional settings are available through the Options dialog. Please note that work pads can be hidden and ignored by clicking on HIDE WORK PAD at the bottom of the screen.

Opening & closing

The Options dialog makes it possible to request that TheSage be launched at Windows startup. Additional settings are available in the Opening & closing page in the Options dialog.

Fonts

In general, it is possible to enlarge and decrease the size of fonts, as well as choose different fonts, via the more... button. Nonetheless, the Options dialog has a Fonts page where all fonts can be seen and adjusted.

Accessibility

The following keyboard accelerators have been implemented.

General:
alt + q for the pattern/query textbox
alt + m for the most often used options for a tool
alt + u for the results window
alt + o for TheSage's options
alt + a for TheSage's about dialog
alt + h for TheSage's help/documentation

Tools:
alt + d for the dictionary
alt + w for the wildcard
alt + g for the anagram
alt + t for the thesaurus
alt + e for the etymologer
alt + p for the phonetic
alt + r for the rhyme
alt + c for the concordancer
alt + n for the online
alt + s for the word lists
alt + y for the history

Updates

The standard edition of TheSage provides two ways to check for updates. If TheSage is minimized to the System tray, right-click on TheSage's icon to bring up a menu where there is an entry that will check for updates.

Alternatively, please go to the Options dialog and then to the Updates page where a button allows you to check for updates.

In addition to the alternatives provided by the standard edition, the professional edition of TheSage checks automatically every few days. When updates are available, the user is notified and the update takes a fraction of a second to install.

Word capture

TheSage can look up words directly from almost any program (IE, Word, Firefox, Outlook, Thunderbird, etc.). In order to capture a word in another program, TheSage provides two methods: mouse and keyboard.

TheSage is a so-called "one-click" dictionary. By default, it is possible to look up a word directly from other programs by placing the mouse pointer over the word and right-clicking while pressing the CTRL key. The default key-click combination can be changed via the Mouse page in the Options dialog.

TheSage can also look up words directly from other programs via the keyboard. This is accomplished by selecting the relevant word in the appropriate program and pressing TheSage's hotkey. TheSage will then read the word and carry out a lookup automatically. By default, TheSage's hotkey is Ctrl + Shift + A, but users can choose whatever combination they prefer via the Hotkey page in the Options dialog.

Installation

Installing

To install TheSage, run the setup program. Once installed, TheSage can be launched from the Start menu, the Desktop, or directly from its installation folder on the hard drive.

Uninstalling

TheSage's setup includes an uninstaller utility. To run the uninstaller, navigate through the Start Menu or through the hard drive to TheSage's folder. Alternatively, it is possible to uninstall TheSage through the Programs and Features applet located in the Control Panel.

Portability

TheSage requires the .NET Framework 4. It is already installed on most machines and is usable from Windows XP onward.

Beyond this requirement, TheSage is 100% portable in and of itself. If the contents of the folder where TheSage is installed are copied elsewhere (a USB flash drive, for example), TheSage will work perfectly.

For convenience, an uninstaller-free zip file is also available for download.

Contact