API documentation generated by Doxygen contains all necessary information for the external APIs.
The Cache_Manager class is part of the URL_Manager class from the URL module, and is used as part of the API provided through the URL API and URL_Manager.
Central to the cache is the concept of a context: essentially a separate cache, persistent or temporary, dedicated to a specific task (a widget, a privacy tab...). Each context is identified by a context ID stored in each URL_Rep object associated with the store. Each context is managed by a Context_Manager, and all the context managers are managed by the Cache_Manager that is included in the URL_Manager class.
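The relationship between context IDs, context managers and the owning cache manager can be pictured with a minimal sketch. All type and function names below are hypothetical stand-ins rather than the module's real classes, and the sketch only shows how an ID stored in a URL selects one separate cache.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-ins; the real classes carry far more state.
using ContextIdSketch = std::uint32_t;

struct ContextManagerSketch {
    ContextIdSketch id;
    bool persistent;  // persistent or temporary, as described in the text
};

struct TaggedUrlSketch {
    std::string name;
    ContextIdSketch context_id;  // ties the URL to exactly one context
};

class ContextRegistrySketch {
public:
    // Register a new, separate cache under `id`.
    void AddContext(ContextIdSketch id, bool persistent) {
        assert(contexts_.count(id) == 0 && "context IDs must be unique");
        contexts_[id].reset(new ContextManagerSketch{id, persistent});
    }

    // Find the manager responsible for a URL through the ID stored in it.
    ContextManagerSketch* ManagerFor(const TaggedUrlSketch& url) const {
        auto it = contexts_.find(url.context_id);
        return it == contexts_.end() ? nullptr : it->second.get();
    }

    // Drop a whole context, for example when a privacy tab closes.
    bool RemoveContext(ContextIdSketch id) { return contexts_.erase(id) > 0; }

private:
    std::map<ContextIdSketch, std::unique_ptr<ContextManagerSketch>> contexts_;
};
```

In this sketch the same URL name can live in two contexts at once, because the lookup key is the context ID and not the name.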
Starting with Core 2.5, we introduced a chained cache architecture that allows Opera to have multiple levels of cache. This is achieved by chaining several Context_Managers.
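A rough illustration of chaining, under stated assumptions: each level holds its own entries and forwards a miss to the next level. The class name, the string payload and the lookup-only forwarding are all hypothetical; the text does not say which levels exist or which operations are forwarded.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical sketch: one cache level that forwards misses to the next level.
class ChainedLevelSketch {
public:
    explicit ChainedLevelSketch(ChainedLevelSketch* next = nullptr) : next_(next) {}

    void Store(const std::string& url, const std::string& body) {
        assert(!url.empty());
        entries_[url] = body;
    }

    // Look in this level first; on a miss, ask the next level in the chain.
    // Returns nullptr when no level knows the URL.
    const std::string* Lookup(const std::string& url) const {
        auto it = entries_.find(url);
        if (it != entries_.end()) return &it->second;
        return next_ != nullptr ? next_->Lookup(url) : nullptr;
    }

private:
    std::unordered_map<std::string, std::string> entries_;
    ChainedLevelSketch* next_;  // not owned; nullptr marks the end of the chain
};
```

A caller only talks to the first level; whether an answer came from that level or a later one is invisible to it.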
Additionally, the module provides the Cache_Storage API objects that URLs use for accessing downloaded documents in RAM or on disk (browser's cache or local disk, persistent or temporary). Other modules also define implementations based on this hierarchy.
For a more detailed explanation, look at the Architecture And Implementation Documentation.
The development of Core 2.3 was performance driven, with the aim of improving behavior on mobile phones and devices, which sometimes disable the cache for performance reasons. Disk activity is greatly reduced, so devices with a slow disk should get a nice boost.
Many TWEAKs have been provided to tune the memory / performance ratio; the default values are intended as a meaningful starting point.
For a more detailed explanation, look at the Performances Improvements.
As part of a task intended to improve the bandwidth management for the audio and video tag, a cache tailored to manage multimedia content has been developed.
The key requirement was managing out-of-order download of huge files, storing in the cache the segments already downloaded.
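One way to picture out-of-order segment bookkeeping is an index of downloaded byte ranges that merges neighbours as they arrive. This is a sketch of the general idea only: the class name and the half-open `[begin, end)` convention are assumptions, and it tracks coverage without storing the bytes themselves.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>

// Hypothetical sketch: tracks which byte ranges of a large file are cached.
class SegmentIndexSketch {
public:
    // Record that bytes [begin, end) were downloaded, merging with neighbours.
    void Add(std::uint64_t begin, std::uint64_t end) {
        assert(begin <= end);
        if (begin == end) return;
        // First segment that starts after `begin`.
        auto it = segments_.upper_bound(begin);
        // Step back if the previous segment touches or overlaps the new range.
        if (it != segments_.begin()) {
            auto prev = std::prev(it);
            if (prev->second >= begin) it = prev;
        }
        // Absorb every segment that touches or overlaps the growing range.
        while (it != segments_.end() && it->first <= end) {
            begin = std::min(begin, it->first);
            end = std::max(end, it->second);
            it = segments_.erase(it);
        }
        segments_[begin] = end;
    }

    // True if every byte in [begin, end) is already cached.
    bool Covers(std::uint64_t begin, std::uint64_t end) const {
        if (begin >= end) return true;
        auto it = segments_.upper_bound(begin);
        if (it == segments_.begin()) return false;
        --it;  // the only segment that can contain `begin`
        return it->second >= end;
    }

    std::size_t Count() const { return segments_.size(); }

private:
    std::map<std::uint64_t, std::uint64_t> segments_;  // start -> one past end
};
```

With such an index, a seek in a video can be answered from the cache when `Covers` is true, and otherwise triggers a download of only the missing range.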
For a more detailed explanation, look at the Multimedia Cache.
A new opera:cache page is now available that lets the user filter the content of the cache and (with a little support from outside of core) also export it. Apart from being a nice feature (with the current settings you could watch a video on YouTube and then export it), it should also be well received by people who did not like the removal of the extensions from the cache files.
TWEAK_CACHE_ADVANCED_VIEW controls this feature.
Listed here is a sample usage of the cache, which will hopefully help someone new to the cache get the big picture.
To reduce the complexity a bit, the description does not consider the chaining functionality. Just assume that if a level cannot perform an operation (or if the operation is "global enough"), the request will be passed to the next manager.
A URL is requested through g_url_api->GetURL() which will trickle down via URL_Manager::LocalGetURL() and Cache_Man::GetResolvedURL()
to Context_Manager::GetResolvedURL(). The context manager will first check if a URL_Rep already exists in the URL_Store by calling
URL_Store::GetURL_Rep(). If it can be found it means it has either already been requested or was created during start-up because it
exists in the cache.
If it cannot be found in the URL_Store a new URL_Rep will be created and returned.
No assumptions can be made as to whether the returned URL_Rep has a URL_DataStorage or not.
At this point nothing more is initiated until the resulting URL is later loaded by calling URL::LoadDocument().
LoadDocument() will ask the URL_Rep to create its URL_DataStorage (in this instance because it needed to set an attribute on the storage, but it would have been created a bit later anyway, before the loading proceeded). After that, asynchronous loading is kicked off by a call to URL_Rep::Load().
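The lookup and load steps above can be sketched as a find-or-create store whose entries get their data storage only on demand. The types here are hypothetical, heavily reduced stand-ins for URL_Store, URL_Rep and URL_DataStorage, and the sketch ignores contexts and asynchronous loading.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for URL_DataStorage.
struct DataStorageSketch {
    bool loading = false;
};

// Hypothetical stand-in for URL_Rep.
struct UrlRepSketch {
    std::string name;
    std::unique_ptr<DataStorageSketch> storage;  // may be null until loading starts

    // Create the storage on demand, then mark the load as started.
    void LoadDocument() {
        if (!storage) storage.reset(new DataStorageSketch());
        assert(storage);
        storage->loading = true;
    }
};

// Hypothetical stand-in for URL_Store.
class UrlStoreSketch {
public:
    // Return the existing entry, or create one without any storage attached.
    UrlRepSketch& GetUrl(const std::string& name) {
        auto it = reps_.find(name);
        if (it == reps_.end()) {
            it = reps_.emplace(name, UrlRepSketch()).first;
            it->second.name = name;
        }
        return it->second;
    }

    std::size_t Size() const { return reps_.size(); }

private:
    std::unordered_map<std::string, UrlRepSketch> reps_;
};
```

Requesting the same name twice yields the same entry, and callers must check `storage` before using it, which mirrors the warning that no assumption can be made about the data storage.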
The first time data is received for the URL via the socket, it will trickle up via URL_LoadHandler::ProcessReceivedData() to URL_DataStorage::ReceiveDataL(), which will create the cache storage. The Cache_Storage is allocated in URL_DataStorage::CreateNewCache(), which will create a storage of one of several types (Persistent_Storage, Session_Only_Storage, Multimedia_Storage...). The received data will be stored in the base class attribute cache_content by calling StoreData(), and after that the message MSG_URL_DATA_LOADED will be broadcast. At this point the cache item exists in memory only.
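The receive path can be reduced to a small sketch: the first chunk creates the storage, every chunk is appended in memory, and listeners are notified afterwards. The class names and the callback are hypothetical, the callback merely plays the role of MSG_URL_DATA_LOADED, and notifying on every chunk is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a Cache_Storage that only buffers in memory.
struct CacheStorageSketch {
    std::string cache_content;
    void StoreData(const char* data, std::size_t size) { cache_content.append(data, size); }
};

// Hypothetical stand-in for the receiving side of URL_DataStorage.
class ReceiverSketch {
public:
    explicit ReceiverSketch(std::function<void()> on_data_loaded)
        : on_data_loaded_(std::move(on_data_loaded)) {}

    void ReceiveData(const char* data, std::size_t size) {
        assert(data != nullptr || size == 0);
        if (size == 0) return;
        if (!storage_) storage_.reset(new CacheStorageSketch());  // first chunk only
        storage_->StoreData(data, size);
        if (on_data_loaded_) on_data_loaded_();  // stands in for the broadcast message
    }

    // Null until the first chunk has arrived.
    const CacheStorageSketch* storage() const { return storage_.get(); }

private:
    std::function<void()> on_data_loaded_;
    std::unique_ptr<CacheStorageSketch> storage_;
};
```

Nothing in this sketch touches the disk, which matches the statement that the cache item exists in memory only at this stage.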
At some later point, either when the URL is no longer used or when a cache write is forced (e.g. Cache_Manager::WriteCacheIndexesL()), URL_DataStorage::DumpSourceToDisk() will be called. The cache storage will be flushed, and if it is of type File_Storage it will be written to disk in one of three ways:
The module is moderately large.
Various features can be enabled or disabled, either through feature defines or specific defines; one example is access to files on disk.
Due to the requirements from various modules (including the url module) and platforms, it is very difficult to reduce the footprint.
Most internal module functions handle OOM locally: they signal an OOM by raising the OOM signal in the memory manager and abort the current action. If appropriate, a message is posted to the document.
However, much of the public API is now LEAVE based, and in those cases the caller must TRAP errors and handle them. Some internal functions will also LEAVE, but these are TRAPped internally.
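The caller's obligation can be illustrated with an analogy in standard C++, where an exception stands in for a LEAVE and a try/catch at the boundary stands in for a TRAP. This is not the module's LEAVE/TRAP mechanism; the status enum, the exception type and both functions are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical status codes; the real module uses its own status type.
enum class StatusSketch { Ok, OutOfMemory };

// Stand-in for a LEAVE: the exception carries the status code.
struct LeaveSketch {
    StatusSketch status;
};

// A "leaving" function: reports failure by throwing instead of returning a code.
std::string ReadEntryL(bool simulate_oom) {
    if (simulate_oom) throw LeaveSketch{StatusSketch::OutOfMemory};
    return "payload";
}

// The caller traps the leave at the API boundary and turns it into a status.
// `out` is only modified when the call succeeds.
StatusSketch ReadEntry(bool simulate_oom, std::string& out) {
    try {
        out = ReadEntryL(simulate_oom);
        return StatusSketch::Ok;
    } catch (const LeaveSketch& leave) {
        return leave.status;
    }
}
```

The point of the pattern is that a failure deep inside the call cannot be silently ignored: either the caller traps it, or it propagates further up.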
Some of the module maintenance functions are message callback based, and these functions are not able to report OOM situations directly to the documents or UI. In these cases the current operation will be terminated and error messages sent.
Much of the external API is based on direct calls, but some classes do use virtual functions. In many cases these are LEAVE based, and callers must TRAP them and handle them appropriately.
NOTE: these numbers tend to be estimates, not actual measurements.
An unloaded URL will usually consume approximately 40 bytes, plus the URL's path segment.
Loaded URL_Reps will probably, on average, use 300-400 bytes, depending on the length of the URL's name. URL_Reps that use the RAM cache will additionally store the entire document in RAM.
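To get a feel for these figures, the arithmetic can be written down as a small helper. The constants are the estimates quoted above; the function names are hypothetical, and counting one byte per path character is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Rough figures quoted in the text; they are estimates, not measurements.
constexpr std::size_t kUnloadedBaseBytes = 40;
constexpr std::size_t kLoadedLowBytes = 300;
constexpr std::size_t kLoadedHighBytes = 400;

// Unloaded URL: fixed overhead plus the path segment (assumed one byte per character).
std::size_t EstimateUnloadedBytes(const std::string& path) {
    return kUnloadedBaseBytes + path.size();
}

struct ByteRangeSketch {
    std::size_t low;
    std::size_t high;
};

// Loaded URLs: a low/high range, excluding any document held in the RAM cache.
ByteRangeSketch EstimateLoadedBytes(std::size_t loaded_count) {
    return ByteRangeSketch{loaded_count * kLoadedLowBytes, loaded_count * kLoadedHighBytes};
}
```

For example, an unloaded URL with the 11-character path `/index.html` comes to about 51 bytes, and 1000 loaded URLs to roughly 300,000 to 400,000 bytes before any RAM-cached document content is counted.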
RAM usage for the memory cache is kept to a minimum, although in desktop versions 1 MB can generally be used for such caching. In RAM-only versions the disk-cache size is used, and on limited-memory platforms memory use is kept as close to 0 as possible.
Usually, large objects are heap-allocated. In some cases sizeable objects are placed on the stack, but only for shorter periods.
In most cases stack consumption should be less than 300 bytes.
No global memory is used, except through the URL_Manager.
URLs are kept in a list, and memory used by these URLs can be freed by the following g_url_api (URL module) calls:
In addition, the size of the allocated resources is controlled by the preference PrefsCollectionNetwork::DiskCacheSize.
Memory is freed as part of the URL module's shutdown, when all URLs and independent cache contexts are destroyed.
A few functions use the g_memory_manager's tempbuffers.
There is no check for external use of these buffers, and the use of different buffers should prevent internal collisions, unless implementations also use them in calls to/from these functions.
At present there are no opportunities to tune memory use, aside from a couple of tweaks to increase memory use before flushing to disk.
Selftests, but they do not check memory usage.
Selftests, ordinary surfing.
URL_Rep, URL_DataStorage and several other classes are independent objects owned by other objects, to reduce the use of unnecessarily large objects.
In the future we will probably try to store less of the URLs known to Opera in memory, and move more of it out into a database on disk.
It may be possible to force compression of data in the cache, in particular generated memory-only cache entries.