Spellchecker module

Introduction

This is the core Spellchecker module. It uses Hunspell dictionaries to spell check, for instance, text areas, elements with contentEditable, and documents in designMode. It is a complete rewrite from scratch and supersedes the old adjunct Spellcheck module, which uses the system's aspell binary.

Current information about the Spellchecker module.

API documentation

For detailed information on how to use the Spellchecker module in Opera and the module's public API, please refer to the API documentation. The documentation needs to be generated by Doxygen.

Internal data structures

See the overview of the fundamental data structures for a detailed description of the internals.

Memory management

OOM policy

Out-of-memory conditions are handled by propagating OP_STATUS values. On OOM, the spell checker is disabled and any listeners are informed.
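The policy above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: Status is a stand-in for OP_STATUS, and Listener, Checker and the simulated allocation budget are hypothetical names that do not come from the module's API. Lower layers only return a status; the top-level entry point is the one place that disables the checker and notifies listeners.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for OP_STATUS (assumption: the real type is not reproduced here).
enum class Status { OK, ERR_NO_MEMORY };

// Hypothetical listener interface, notified when the checker is disabled.
struct Listener {
    virtual ~Listener() = default;
    virtual void OnSpellcheckerDisabled() = 0;
};

// Simple listener that counts notifications.
struct CountingListener : Listener {
    int notifications = 0;
    void OnSpellcheckerDisabled() override { ++notifications; }
};

class Checker {
public:
    // 'budget' simulates how many allocations succeed before OOM.
    explicit Checker(std::size_t budget) : budget_(budget) {}

    void AddListener(Listener* listener) { listeners_.push_back(listener); }
    bool IsEnabled() const { return enabled_; }

    // Top-level entry point: the only place that reacts to OOM.
    Status LoadDictionary(std::size_t word_count) {
        Status status = BuildTree(word_count);
        if (status == Status::ERR_NO_MEMORY)
            Disable();
        return status;
    }

private:
    // Lower layers never handle OOM themselves; they pass the status up.
    Status BuildTree(std::size_t word_count) {
        for (std::size_t i = 0; i < word_count; ++i) {
            Status status = AddWord(i);
            if (status != Status::OK)
                return status;
        }
        return Status::OK;
    }

    // Pretend each word needs one allocation out of the budget.
    Status AddWord(std::size_t index) {
        return index < budget_ ? Status::OK : Status::ERR_NO_MEMORY;
    }

    void Disable() {
        enabled_ = false;
        for (Listener* listener : listeners_)
            listener->OnSpellcheckerDisabled();
    }

    std::size_t budget_;
    bool enabled_ = true;
    std::vector<Listener*> listeners_;
};
```

With a budget of 3, loading 2 words returns Status::OK and leaves the checker enabled, while loading 5 words returns Status::ERR_NO_MEMORY, disables the checker and notifies each registered listener once.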

Heap memory usage

All dictionaries are transformed to our internal format and stored on the heap. Some complex dictionaries (e.g., hu_HU) can take a considerable amount of heap memory (>100 MB).

Stack memory usage

Some recursive methods could in theory exhaust the stack, but none are known to have caused problems. The module also uses some stack-allocated arrays as temporary buffers.

Memory ownership

Most heap memory (the dictionary radix tree) is owned by the global OpSpellcheckerManager. Each SpellCheckerSession owns some memory of its own, and the user of a SpellCheckerSession is responsible for deleting the session appropriately.

Temporary buffers

No global temporary buffers are used.

Memory tuning

Some minor tuning can be done. See the tweaks in the module.tweaks file.

Tests and coverage

The selftests provide sufficient coverage.

Design choices

The spell checker calculates all possible words and stores them in a radix tree to speed up checking.
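As a rough illustration of this trade-off, the sketch below expands a stem with its suffixes at load time and stores every resulting word, so that a lookup is a single walk over the characters of the word. It is not the module's data structure: it uses a plain, uncompressed trie instead of a radix tree for brevity, and WordTrie, TrieNode and AddStemWithSuffixes are hypothetical names. Real Hunspell affix rules also carry conditions and strip strings, which are ignored here.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// One node per character; a radix tree would merge single-child chains.
struct TrieNode {
    std::map<char, std::unique_ptr<TrieNode>> children;
    bool is_word = false;
};

class WordTrie {
public:
    // Adds 'word', creating any missing nodes along its path.
    void Insert(const std::string& word) {
        TrieNode* node = &root_;
        for (char c : word) {
            std::unique_ptr<TrieNode>& child = node->children[c];
            if (!child)
                child = std::make_unique<TrieNode>();
            node = child.get();
        }
        node->is_word = true;
    }

    // Lookup cost depends only on the length of 'word'.
    bool Contains(const std::string& word) const {
        const TrieNode* node = &root_;
        for (char c : word) {
            auto it = node->children.find(c);
            if (it == node->children.end())
                return false;
            node = it->second.get();
        }
        return node->is_word;
    }

private:
    TrieNode root_;
};

// Eager expansion: every stem + suffix combination is stored up front.
// Suffixes are simply appended (a simplification of real affix rules).
void AddStemWithSuffixes(WordTrie& trie, const std::string& stem,
                         const std::vector<std::string>& suffixes) {
    trie.Insert(stem);
    for (const std::string& suffix : suffixes)
        trie.Insert(stem + suffix);
}
```

After calling AddStemWithSuffixes(trie, "walk", {"s", "ed", "ing"}), the trie contains "walk", "walks", "walked" and "walking". The cost of this design is memory: each generated form occupies nodes, which grows quickly for dictionaries with many affix combinations.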

Improvements

Because certain dictionaries use so much memory, the module should take a lazier approach to determining whether a word is correctly spelled.
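One possible shape of such a lazier check is sketched below, purely as an assumption about what the improvement might look like and not as a planned design. Only the stems and the suffix list are kept in memory, and suffixes are stripped from the candidate word at lookup time. LazyChecker is a hypothetical name, and suffixes are treated as plain appended strings, which is a simplification of Hunspell affix rules.

```cpp
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

class LazyChecker {
public:
    LazyChecker(std::unordered_set<std::string> stems,
                std::vector<std::string> suffixes)
        : stems_(std::move(stems)), suffixes_(std::move(suffixes)) {}

    // Accepts a word if it is a stem, or a stem followed by a known suffix.
    bool IsCorrect(const std::string& word) const {
        if (stems_.count(word) != 0)
            return true;
        for (const std::string& suffix : suffixes_) {
            // The word must be strictly longer than the suffix.
            if (word.size() <= suffix.size())
                continue;
            const std::size_t stem_length = word.size() - suffix.size();
            // Check that the word ends with this suffix.
            if (word.compare(stem_length, suffix.size(), suffix) != 0)
                continue;
            if (stems_.count(word.substr(0, stem_length)) != 0)
                return true;
        }
        return false;
    }

private:
    std::unordered_set<std::string> stems_;  // base words only
    std::vector<std::string> suffixes_;      // shared by all stems
};
```

Memory then scales with the number of stems plus the number of suffixes rather than with their product, at the price of trying each suffix on every lookup.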