Stdlib (standard library) module

Copyright © 1995-2012 Opera Software ASA. All rights reserved. This file is part of the Opera web browser. It may not be distributed under any circumstances.

Overview

Scope

stdlib is an implementation of all the C standard library functions used by Opera, with names prefixed by "op_", e.g. "op_memcpy".

It also contains implementations of the corresponding Unicode versions, prefixed by "uni_", and a few mixed-representation versions, also prefixed "uni_" but with different signatures.
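As an illustration of what a mixed-representation function can look like, the following is a minimal sketch of a comparison between a UTF-16 string and a single-byte string. The function name, the use of char16_t for the wide string, and the Latin-1 widening of bytes above 0x7F are all assumptions made for this example; they are not taken from the module's API.

```cpp
// Hypothetical name; not part of the stdlib module API.
// Compares a UTF-16 string with a single-byte string, one code unit at a time.
// Each byte is widened as an unsigned value, so bytes above 0x7F compare as
// U+0080..U+00FF (a Latin-1 interpretation, assumed for this sketch only).
inline int sketch_mixed_strcmp(const char16_t* wide, const char* narrow)
{
    for (;; ++wide, ++narrow) {
        const unsigned w = *wide;
        const unsigned n = static_cast<unsigned char>(*narrow);
        if (w != n)
            return w < n ? -1 : 1;  // first differing unit decides the order
        if (w == 0)
            return 0;               // both strings ended at the same position
    }
}
```

With this sketch, sketch_mixed_strcmp(u"abc", "abc") returns 0, and a wide string that is a proper extension of the narrow one compares greater.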

Finally, it contains some locally concocted Unicode utility functions.

Motivation

We want our own implementations of these functions to use on platforms where the platform implementation is not good enough or is missing.

Requirements

The implementation should meet the following requirements:

Portable

Functions should be platform-independent when possible and should use the standard porting interfaces to achieve this.

Standards-conforming

The implementation should follow the C standard closely; any deviations must explicitly be deemed unimportant, and the deviations must be documented. Deviations are strongly discouraged.

High quality

Correctness, performance and memory use must be excellent. Footprint is also important, but less important than the previous three.

Unencumbered

Third-party code is allowed, but should be avoided if possible, and should always be visible.

API documentation

The API is documented using Doxygen. Most of the API consists of functions defined by the C and C++ standards, but there are also some utility classes that are used internally and are also available to other modules.

Implementation notes

Note that the source code for some of the string functions is defined in an include file that is included multiple times into src/stdlib_string.cpp to generate various versions. See more documentation in that file.
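The general idea of generating several versions from one body of source can be sketched in a single file with a macro. Here the macro body stands in for the shared include file, and each expansion stands in for one inclusion with different settings. The macro and function names are hypothetical, and the actual mechanism in src/stdlib_string.cpp may differ in detail.

```cpp
#include <cstddef>

// Single-file stand-in for the include-file technique. The macro body plays
// the role of the shared include file; each expansion below plays the role of
// one #include with a different character type. All names are hypothetical.
#define SKETCH_DEFINE_STRLEN(NAME, CHAR_T)          \
    inline std::size_t NAME(const CHAR_T* s)        \
    {                                               \
        const CHAR_T* p = s;                        \
        while (*p)                                  \
            ++p;                                    \
        return static_cast<std::size_t>(p - s);     \
    }

SKETCH_DEFINE_STRLEN(sketch_op_strlen, char)       // single-byte version
SKETCH_DEFINE_STRLEN(sketch_uni_strlen, char16_t)  // UTF-16 version

#undef SKETCH_DEFINE_STRLEN
```

Both generated functions share one definition of the loop, so a fix to the shared body applies to every version.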

Some of the functionality does require access to information from the porting interfaces to function correctly. Thus you cannot, for instance, implement OpSystemInfo::GetTimezone using the stdlib module implementation of op_localtime(), as the latter will call the former.

If the third-party dtoa library is disabled (note: this is strongly discouraged), the stdlib module will export an additional porting interface that the platform must implement. This porting interface is defined by the class StdlibPI, defined in the file stdlib_pi.h.

Some notes and ToDos.

Memory documentation

OOM Policies

Standard C functions signal OOM by returning an error code and setting errno to ENOMEM. The stdlib module always returns an error code when it encounters OOM. It will set errno when it is available (HAVE_ERRNO is defined).

If errno is not available, OOM is therefore not signalled reliably by this module: clients cannot always tell OOM from other errors.
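The policy above can be sketched as follows. This is an illustration under stated assumptions, not the module's code: SKETCH_HAVE_ERRNO stands in for HAVE_ERRNO, and the function names and the injectable allocator are hypothetical, introduced only so that the failure path can be exercised.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdlib>
#include <cstring>

#define SKETCH_HAVE_ERRNO 1  // stand-in for the module's HAVE_ERRNO

// Allocator signature used by the sketch, so a failing one can be injected.
typedef void* (*SketchAlloc)(std::size_t);

inline void* sketch_default_alloc(std::size_t n) { return std::malloc(n); }
inline void* sketch_failing_alloc(std::size_t) { return nullptr; }

// Hypothetical string duplication following the policy described above:
// always return an error value (nullptr) on OOM, and additionally set errno
// to ENOMEM when errno is available.
inline char* sketch_strdup(const char* s, SketchAlloc alloc = sketch_default_alloc)
{
    const std::size_t n = std::strlen(s) + 1;
    void* p = alloc(n);
    if (!p) {
#if SKETCH_HAVE_ERRNO
        errno = ENOMEM;
#endif
        return nullptr;
    }
    std::memcpy(p, s, n);
    return static_cast<char*>(p);
}
```

When SKETCH_HAVE_ERRNO is 0, the caller sees only the nullptr return value, which is the situation in which OOM cannot be told apart from other failures.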

Who is handling OOM

The module frees internal storage. The client code is responsible for freeing storage it owns.

Description of flow

Most of this module consists of separate functions used separately, so there are no "flow" issues. The exceptions are the third-party number-to-string and string-to-number functions in the OpDoubleFormat interface and op_strtod(), and the third-party random number generator. These have internal state and some working storage. This memory is allocated either at startup or on demand, and deallocated when the module is destroyed at shutdown.

Heap memory usage

None of the functions use much heap memory at all.

The third-party number-to-string and string-to-number functions can use some, but maintain an internal free list, so the pressure on the memory allocator is slight.
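To show why a free list keeps allocator pressure low, here is a minimal sketch of a free list for fixed-size blocks. The class and its methods are hypothetical and are not the third-party code's actual allocator. Released blocks are kept and handed out again, so the heap is only touched when the list is empty, and the cached blocks stay allocated until they are explicitly purged.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical class; not the third-party code's real allocator.
class SketchFreeList {
    struct Node { Node* next; };

public:
    explicit SketchFreeList(std::size_t block_size)
        : block_size_(block_size < sizeof(Node) ? sizeof(Node) : block_size),
          head_(nullptr),
          heap_calls_(0) {}

    SketchFreeList(const SketchFreeList&) = delete;
    SketchFreeList& operator=(const SketchFreeList&) = delete;

    ~SketchFreeList() { Purge(); }

    // Reuse a cached block if there is one; otherwise go to the heap.
    void* Get() {
        if (head_) {
            Node* n = head_;
            head_ = n->next;
            return n;
        }
        ++heap_calls_;
        return std::malloc(block_size_);
    }

    // Keep the block for later reuse instead of freeing it.
    void Put(void* block) {
        if (!block)
            return;
        Node* n = static_cast<Node*>(block);
        n->next = head_;
        head_ = n;
    }

    // Return all cached blocks to the heap.
    void Purge() {
        while (head_) {
            Node* n = head_;
            head_ = n->next;
            std::free(n);
        }
    }

    std::size_t HeapCalls() const { return heap_calls_; }

private:
    std::size_t block_size_;
    Node* head_;
    std::size_t heap_calls_;
};
```

A Get, Put, Get sequence on this sketch performs a single heap allocation, because the second Get is served from the list.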

The bitmap structure used by op_scanf() for tracking character sets can become large if large ranges of Unicode characters are used, but this does not usually happen.
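The growth behaviour can be illustrated with a minimal sketch of a character-set bitmap that is sized by the highest code point added to it. The class is hypothetical and the real structure used by op_scanf() may be organised differently. The point is only that a set such as [a-z] needs a few bytes, while a range high in the Unicode space needs kilobytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical class; one bit per code point, sized by the highest member.
class SketchCharSet {
public:
    // Adds the inclusive range [first, last] to the set.
    void AddRange(unsigned first, unsigned last) {
        if (first > last)
            return;
        const std::size_t words_needed = last / 32 + 1;
        if (bits_.size() < words_needed)
            bits_.resize(words_needed, 0);
        for (unsigned c = first;; ++c) {
            bits_[c / 32] |= std::uint32_t(1) << (c % 32);
            if (c == last)
                break;  // test before incrementing so last == UINT_MAX is safe
        }
    }

    bool Contains(unsigned c) const {
        const std::size_t word = c / 32;
        return word < bits_.size() && ((bits_[word] >> (c % 32)) & 1u) != 0;
    }

    // Storage used by the bitmap itself, in bytes.
    std::size_t ByteSize() const { return bits_.size() * sizeof(std::uint32_t); }

private:
    std::vector<std::uint32_t> bits_;
};
```

In this sketch, a set containing only 'a' to 'z' occupies 16 bytes, while adding the range U+4E00 to U+9FFF grows the bitmap to 5120 bytes.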

Stack memory usage

No function in the module uses more than a few stack frames, and the functions generally avoid stack-allocated buffers.

Static memory usage

The third-party number-to-string and string-to-number functions and the third-party random number generator (RNG) have some global data. Not all of these data are allocated at startup. The number-to-string and string-to-number functions may use an unbounded amount of memory (in exceptional cases) and will maintain an internal free list of unused storage that won't be cleared until Opera shuts down and is therefore "static". The RNG has about 3KB of internal state.

Caching and freeing memory

The third-party number-to-string and string-to-number functions will return their internal storage if the FreeCachedData() API is called on the StdlibModule object.

Freeing memory on exit

All static data are freed on exit as long as the module's Destroy() method is called.

Temp buffers

The module uses MemoryManager::TempBufferUni for the macros and functions that convert single-byte strings to Unicode and vice versa. It does not use any of the other temp buffers.

Memory tuning

There is no way to tune the memory usage of the module.

Tests

There are no provided tests for memory use or leaks.

Coverage

The selftest suite contains base-line tests for the entire public API of the module, and more in-depth tests for some of the functionality.

Running the selftest suite and loading a fairly complex page like vg.no, followed perhaps by running the ECMAScript test suite, is expected to give reasonable coverage. This has not actually been verified.

Design choices

The third-party random number generator (the Mersenne Twister RNG) was not chosen for having the smallest internal state; it was chosen because it is generally acknowledged as being the best, and as being fast.
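For a sense of the state size involved, the sketch below uses std::mt19937 from the C++ standard library as a stand-in. It is not the third-party implementation used by the module, and the helper names are hypothetical. The 32-bit Mersenne Twister keeps 624 words of 32 bits, which is 2496 bytes of generator state before any bookkeeping fields are added.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>

// Size of the Mersenne Twister state array in bytes, computed from the
// parameters that std::mt19937 exposes (624 words of 32 bits each).
constexpr std::size_t SketchMtStateBytes()
{
    return std::mt19937::state_size * (std::mt19937::word_size / 8);
}

// Hypothetical helper: draws one 32-bit value from the generator.
inline std::uint32_t SketchNextRandom(std::mt19937& rng)
{
    return static_cast<std::uint32_t>(rng());
}
```

Two generators constructed with the same seed produce the same sequence, which is convenient for repeatable tests.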

Suggestions for improvements

OOM handling

There should be reliable OOM tracking. The most attractive solution is to implement errno ourselves if the platform does not; however, errno will still not be set by the platform functions in that case, so it is still not a perfect solution.

Other notes

References