# -*- tab-width: 4 -*-
#
# Implementation domain concepts
# 2005-09-12 / lth
#
# The implementation domain is comprised of the hardware platform, the
# operating system, system services (including network and file), the
# compiler, and the libraries.  The word "implementation" below is
# taken to mean all of these together.
#
# Implementation concepts are reflected in the SYSTEM system, in the
# document "Minimal required platform capabilities" available on the
# Wiki, in Opera's porting interface layer, and in structures in the
# code, eg, in how data are structured and packed as a consequence of
# impl.ram.small_systems.
#
# Some definitions here may seem redundant (eg, for "RAM"), but they
# provide two things: a basis for precise discussion of the other
# issues, and a location in which to note significant facts that would
# otherwise easily be forgotten.
#
# These are the concepts defined so far:
#
#   IMPL.HARDWARE
#   IMPL.RAM
#   IMPL.STATIC_MEMORY
#   IMPL.HEAP_MEMORY
#   IMPL.HEAP_ALLOCATION_FAILURE
#   IMPL.HEAP_MEMORY_FRAGMENTATION
#   IMPL.STACK_MEMORY
#   IMPL.CODE_MEMORY
#   IMPL.DISK_MEMORY
#   IMPL.FILE_SYSTEM
#   IMPL.PROGRAMMING_LANGUAGE
#   IMPL.PROGRAMMING_MODEL
#   IMPL.LIBRARIES
#   IMPL.LOADABLE_MODULES
#   IMPL.FLOATING_POINT
#   IMPL.CHARSET
#   IMPL.BATTERY
#   IMPL.RASTER_GRAPHICS_DISPLAY
#   IMPL.INPUT_DEVICE
#   IMPL.NETWORK
#   IMPL.GUI
#   IMPL.JAVA
#   IMPL.TIME
#   IMPL.INTEGRITY_AND_SECRECY


IMPL.HARDWARE

  .def
	By "hardware" we understand the physical CPU, FPU, and memory
	subsystem: the basic computer, so to speak.

  .cpu
	The CPU is distinguished by its instruction set and implementation
    technology (including clock speed).

  .word_size
	The most important aspects of the instruction set are its memory
    model and word size.  Most modern instruction sets use a 32-bit
    word size for both data and addresses and a flat memory model
    where each byte has a distinguished address.

    64-bit word sizes for addresses are becoming more common; these
    implementations still use 32-bit words for data.

    Other word sizes (be it 16-bit, 18-bit, or 36-bit) are no longer
    common in our application domain, nor are segmented memory models.

  .instruction_encoding
	Instruction encodings we encounter range from byte-encoded CISC via
	word-encoded RISC to VLIW.

  .memory
	The memory subsystem is comprised of zero or more levels of cache
	and a connection to the main memory that is usually running on a
	slower clock than the CPU.

  .large_systems
    High-end implementations have deeply pipelined superscalar CPUs
    and clock speeds of several GHz.  They have multiple levels of
    cache on-chip and more off-chip, and the largest caches hold
    several MB of data.

  .small_systems
    Low-end implementations have single-issue low-power CPUs with
    clock speeds of 100MHz or less, and may not have hardware
    floating-point support.  They have small caches (a few KB) with
    poor cache policies (write-through), and cheap memories that are
    very slow.

  .mimd
    Multicore and multi-CPU implementations are becoming increasingly
    common at the high end.

  .simd
    Some CPUs have attached SIMD extensions (eg MMX, Velocity Engine)

  .dsp
    Some CPUs have attached signal processing units

  .related
    IMPL.FLOATING_POINT, IMPL.RASTER_GRAPHICS_DISPLAY


IMPL.RAM

  .def
    The memory available to Opera for global variables, non-ROMable
    constants, non-ROMable code, heap memory, and stack is jointly
    called "RAM".

  .physical
    Every machine has some amount of physical RAM.

  .virtual
    Some machines may also have virtual RAM, allowing the active
    address space of a program to be larger than the amount of
    physical RAM on the device.

  .small_systems
	The amount of physical RAM may be small, and there may not be any
	virtual memory at all.

  .large_systems
	Some implementations will provide large amounts of physical data
	memory and nearly unbounded amounts of virtual memory.

  .varying_amount
	Opera may have to share the available memory with other
    applications, and the amount of free memory outside one program
    may be variable, even if that one program has constant memory use.

  .unmeasurable
	It may not be possible to obtain accurate information about the
	amounts of memory in an implementation.

  .related
	FEATURE_LIMITED_MEMORY


IMPL.STATIC_MEMORY

  .def
    Static memory is the memory used for global variables, non-ROMable
    constants, and non-ROMable code.

  .small_systems
    On implementations without virtual memory the amount of static
    memory used by a program directly impacts the amount of heap
    memory available to it.


IMPL.HEAP_MEMORY

  .def
    Heap memory is the memory used for dynamic storage.  Approximately
    all the data memory used by Opera is heap memory.

  .management
    Heap memory is typically managed through low-level interfaces like
    malloc() and free().  These routines do not allow the program to
    specify placement or alignment.

    There may be facilities for inspecting the state of the heap, eg,
    obtaining the amount of memory on the free list.

    There may be facilities for requesting certain object placements
    or alignments.

    State inquiry and placement/alignment facilities are nonportable
	and not universally available.

  .common
    The application and precompiled libraries usually share the heap.
    The application may allocate memory that is freed by the library,
    and vice versa.


IMPL.HEAP_ALLOCATION_FAILURE

  .def
    Heap allocation failure occurs when Opera needs to allocate a
    block of memory of size k, but no block of k bytes can be obtained
    from the heap or from the implementation.

  .platform_variations
	The implementation may or may not signal heap allocation failure
	to Opera when a request to allocate heap memory cannot be
	satisfied.

  .reliable
    Reliable allocation failure signalling means (1) that an error
    code of some kind is returned to Opera or library routines that
    Opera calls from the basic level of allocation routine (like
    malloc or new), (2) that the device does not suffer ill effects
    based on this event alone, and (3) that the libraries are able to
    either recover from this failure or to propagate the failure to
    Opera in such a way that the libraries' internal state remains
    consistent.

  .unreliable
    If allocation failure is not signalled reliably, then Opera or the
	device may crash or fail in random ways when heap memory is
	depleted or very low.

  .related
    FEATURE_LIMITED_MEMORY


IMPL.HEAP_MEMORY_FRAGMENTATION

  .def
	Fragmentation is a phenomenon that is seen in dynamically
	allocated memory where placement decisions and allocation patterns
	together make it increasingly hard, over time, to allocate blocks
	of any given size without expanding the heap.

	Conversely, in a heap of fixed size, fragmentation leads to ever
	more frequent heap allocation failures.
	
  .comment
	Fragmentation is a concept in the implementation domain because
	it is in part a result of weaknesses in the implementation, and
	because it is sometimes possible to mitigate these weaknesses in
	the application.  Thus specific architectural structures may be
	created to combat fragmentation.

  .related
	IMPL.HEAP_ALLOCATION_FAILURE


IMPL.STACK_MEMORY

  .def
    Stack memory is the memory the implementation makes available for
    the control stack of a thread or process.

  .small_systems
	The amount of available program stack may be quite small, in the
    tens of kilobytes.

  .large_systems
    At the same time, some implementations will provide multi-megabyte
    demand-allocated stacks.

  .system_controlled
    On some implementations it may be required to run with the stack
    that the implementation provides, Opera can't allocate its own
    stack.

  .stack_overflow
    The implementation usually will not signal stack allocation
    failure.  If Opera overflows the stack, it will usually crash
    either Opera or the device.

  .comment
    The restriction in .system_controlled is believed to exist on
    Symbian systems.


IMPL.CODE_MEMORY

  .def
    Code can be executed out of ROM or RAM, or a combination of the
    two; whatever memory is used is "code memory".

  .demand_paged
    On some implemantations code is demand paged and the amount of RAM
    occupied by code at any time is substantially less than the total
    code size.

  .small_systems
	The amount of memory available for Opera code may be small, on the
	order of 1.5MB in limiting cases.

    Small implemenations usually do not implement demand paging of
	code, so if code is executed out of RAM then the use of code
	memory influences the availability of data memory.

  .large_systems
    Many implementations are now so large that it is unlikely that
	code memory will ever be an issue again.

  .related
	FEATURED_LIMITED_FOOTPRINT


IMPL.DISK_MEMORY

  .def
	Many implemenations provide persistent storage in the form of a
    "disk" abstraction with a filesystem-like API through which data
    are stored and loaded.

  .capacity 
    Usually, these disks have higher capacity than fast (RAM) memory

  .performance
    Disks that are not mapped to RAM are usually substantially slower
    to access than RAM

  .implementation
	The disk abstraction can be implemented on top of various
    hardware, some of which may not be actual spinning magnetic disks
    but which may instead be networked storage or nonvolatile RAM.

  .small_systems
    Disks on small devices may be quite slow (flash memory) and may
    have limited capacity.  Both reading and writing may be blocking,
    and there may be significant latency.

  .networked_systems
    Disks on networked implementations may be quite slow, as they may
    be mapped to disks on remote machines.  Both reading and writing
    may be blocking, and there may be significant latency.


IMPL.FILE_SYSTEM

  .def
    A file system is an abstract data type providing naming and
    storage services, mapped onto some storage medium, most often a
    disk.

  .directories
    Most file systems provide nested directories.  The syntax for
    naming a file in a directory (the file's path name) varies among
    file systems.

  .roots
    Some file systems have a single root, others have multiple roots,
    often tied to physical disk drives, and different drives may have
    different characteristica (eg some may be volatile).

  .limitations
    Some file systems limit how many files that may be open at any one
    time.

    Virtually all file systems limit the characters that may appear in
    a file name, give special meaning to other characters, or limit
    the length of both the entire path name and each component.


IMPL.PROGRAMMING_LANGUAGE

  .def
	By "programming language" we understand both the core programming
    language and its supporting libraries.  Libraries for basic system
    services, networks, user interfaces, file systems, and similar are
    treaded separately.

  .effective_subset
    Every implementation provides some variant of C++, but it may be a
    subset, not the full language.

    Language features like exceptions are not supported by all
	implementations, nor do many implementations provide the standard
	C++ library (nor the standard C library, nor their header files).

  .quality
	In some cases there are language constructs that are supported,
    but they are supported poorly and may therefore be expensive.

  .related
    The Opera Coding Standards contains a definition of the subset of
    C++ that is accepted at any time.

  .std
    ISO/IEC 14882-2003 (C++)


IMPL.PROGRAMMING_MODEL

  .def
    The programming model is that offered by the programming language
    plus features of the implementation, minus restrictions in the
    implementation.

  .processes
    A "process" is an address space that is separate from other
    address spaces in a running implementation, together with one or
    more threads of control.  Usually one process cannot access memory
    or resources in another.  Some implementations support multiple
    processes, others don't.

  .shared_memory
    Shared memory is a means whereby multiple processes can access the
    same memory directly (ie, by memory reads and writes not mediated
    by the operating system), though their address spaces are otherwise
    protected from each other.

  .threads
    A "thread" is a preempted, separate thread of control in an
    address space that may have multiple such threads.  Some
    implementations directly support threads, others don't.

  .globals
    Some implementations use program linking, loading, and relocation
    techniques that preclude the use of global variables and global
    structured constants.

  .main_thread
    The "main thread" of an application is the thread that the
    implementation creates initially to start the application.

	On some implementations, the main thread has privileges to perform
    some actions vis-a-vis the implementation's libraries, often in
    the domain of drawing.

    It is unclear whether this is a dirty workaround for locking or is
    related to e.g. how shared memory is handled.

  .related
    SYSTEM_GLOBALS, SYSTEM_COMPLEX_GLOBALS, SYSTEM_DLL_NAME_LOOKUP.

  .std
    Posix standards?


IMPL.LIBRARIES

  .def
    With "libraries" we understand libraries that offer complex
    functionality, like HTTP, SSL, or UI frameworks, not the basic
    compiler libraries.

  .reasons
    Sometimes we may wish to use the implementation's libraries for
    footprint reasons.  Other times because they provide performance
    superior to what we can hope to implement (consider OpenGL) or
    functionality we must use (GUI APIs).

  .problems
    Libraries may have inferior quality or may be mismatched to Opera.


IMPL.LOADABLE_MODULES

  .def
	Some implementations allow code to be dynamically loaded and
	linked into a program at run-time.  Known as "shared objects",
	"dynamic libraries", or "DLLs", these are loaded and linked by
	implementation-specific mechanisms.

  .api
	A typical API allows a program to specify which module to load and
	to query it for the resolved addresses of exported symbols.

  .behavior
	Implementations vary as to whether they allow the application to
	specify how to resolve imports in the loaded module, and thus the
	application has little control over whether the loaded module can
	share its resources (eg a shared heap) or must be seen as an
	opaque entity.


IMPL.FLOATING_POINT

  .def
    By floating point numbers we understand IEEE standard single and
    double precision numbers.  There are other formats, but not for us.

  .implementation
    An "implementation" of floating-point arithmetic is typically some
    combination of hardware and software components, the latter
    including both libraries and the compiler.

  .nonstandard
    Some implementations do not support the full range of operations
    on floating point numbers, or they are buggy.  Typically there are
    problems with arithmetic on NaNs and infinities, and Visual C++
    bizarrely makes NaN == x be true for any x.

    Some implementations perform computations in nonstandard ways, eg
    by keeping results internally in an extended format.

    Some implementations optimize floating-point computations in a way
    such that their semantics are destroyed.

  .large_systems
    On large and high-end implementations, floating-point performance
    is usually good.  A little more data must be moved than for
    integer arithmetic, and there are sometimes fewer functional units
    and registers, but there is relatively little reason to avoid it.

  .small_systems
    On small and low-end implementations, floating-point arithmetic
    may be implemented in software.  Besides being quite slow, such
    implementations will tend to produce correct results less often
    than hardware implementations.

  .std
    IEEE floating point spec (1989)


IMPL.CHARSET

  .def
    The "charset" of an implementation is characterized by the set of
    characters the implementation can handle (typically as input or
    output) the set of encodings that the implementation can handle
    natively.

  .ascii
    ASCII is a common 7 bit character set.

  .unicode
    Unicode uses 21 bits to code all its values.  Many modern
    implementations handle Unicode or a large subset thereof, and
    primitive library functions accept Unicode in some encoding.

    ASCII is a subset of Unicode: the values defined by ASCII have the
    same meanings in Unicode, and the utf8 encoding encodes all ASCII
    values as single bytes.  Thus any ASCII text is also a utf8 text.

  .utf8
    UFT8 is a variable-length coding of Unicode that uses a single
    byte for values in the ASCII range, and multi-byte codes for
    values above that.

  .utf16
    UTF16 is a variable-length coding of Unicode that uses a single
    16-bit value for most characters, and two 16-bit values for the
    rest.

  .utf32
    UTF32 is a fixed-length coding of Unicode that uses 32 bits to
    encode all character values (leaving the high 11 bits unused).

  .std
    ASCII standard
    ISO Unicode standard


IMPL.BATTERY

  .def
    Battery-operated devices get their operating power from a battery
    that holds a limited charge.

  .life
    Battery life is usually very important.  Continually running
    computations will tend to drain batteries quickly.

  .sleep
    Devices will wish to go to sleep when applications are not running
    actively (even though loaded into memory); during sleep, the
    device may not allow the application to run.

  .death
    When the battery is drained, the device may (in the worst case)
    shutdown abruptly.  It is likely that this is a common occurence.

  .hardware_issues
    Some hardware may be constructed in such a way as to allow parts
    of it not to be switched on unless there is a need for it; for
    example, memory can be constructed in multiple banks.


IMPL.OUTPUT_DEVICE

  (Raster graphics display, printer, sound system) ...


IMPL.RASTER_GRAPHICS_DISPLAY

  .def
	Generally, all drawing (including text) in GUI windows is
	bit-mapped on a raster-graphics display: at the level of the
	display, everything is represented as collections of pixels, and
	the moment something is drawn it loses its definition as a unit.

  .small_system
     The display may be very small, smaller than 100x100 pixels.

  .large_system
     The display may be very large (several megapixels).

  .framebuffer
	The simplest display interface is a "frame buffer": an area of
	memory directly representing the pixels of the display in such a
	way that changing a value in the framebuffer makes the change
	visible on the display.

  .double_buffering
	"Double buffering" is a common technique that allows drawing to be
    performed on a non-displayed framebuffer and then written in a
    single operation to the display

  .acceleration
	Modern graphics display cards (for high-end implementations)
    provide support for higher-level operations in hardware, in the
    form of models of geometries, textures, and light sources in the
    scene to be displayed.

  .bit_depth
	The "bit depth" of the display is the number of bits available to
    represent each color (R,G,B) in each pixel.  Bit depths are highly
    variable among displays.

  .palette
    A "palette" or "color table" reduces the amount of memory needed
    to represent a raster image by representing each pixel as a
    single, small value (typically a byte or less) which is used as an
    index into a table that defines the colors represented by that
    value.
  
    [How common is palette-based hardware nowadays?]


IMPL.INPUT_DEVICE

  .def
	An input device allows the user to provide input to the computer
    system in the form of signals that are subject to device-specific
    interpretation.

  .comment
    For each device I would like to describe its primary
    characteristics, so that its possibilities and limitations are
    apparent.  What I have now is not always very good.

  .mouse
    A "mouse" device is characterized by always being present on the
    display (indicated by a graphical pointer).  It can be moved;
    movement causes signals to be sent to windows over which it moves.
    It has (usually multiple) buttons that can be pressed and
    released.  Pressing and releasing a button is a "click".

    A mouse may have a "wheel" which, when rolled, sends distinguished
    signals.  (The new Apple mouse has a wheel that works in both the
    x and y planes.)

    Events: down, up, click, doubleclick, tripleclick, move, enter, 
    leave, scroll.

    Trackballs, trackpoints, and trackpads are like a mouse; only the
    physical realization is different.

  .pen
    A "pen" device is characterized by being "placed" on the display
    (or something representing the display), moved, and then lifted
    again.  Placing and lifting a pen quickly is a "tap".  The pen may
    have a button that modifies the meaning of a pen gesture (how?).

    Events: (don't know yet) ...

  .keyboard
    A "keyboard" is a large device with keys for letters of the
    alphabet, digits, punctuation, and functions.  Some of the input
    signals are obtained by pressing a shift key in combination with
    another key.

    There are several shift keys, known variously as SHIFT, CONTROL,
    ALT, FN, COMMAND, OPTION, META, HYPER, or SUPER.  Few keyboars
    have all of these, most have only three or four.  Some shift keys
    have both "left" and "right" varieties.

    Events: usually a primary code (the number of the key pressed)
    together with bits for the state of the various shift keys, and
    often carrying the letter, digit, or punctuation signified.

  .remote
    A "remote" is a (often large) keypad device used with a TV set.

    (Don't know what characterizes this.) ...

  .keypad
    (Smaller keypads on telephones.) ...

  .paddle
    (Game controller) ...

  .ime
    (Input method editor) ...

  .microphone
    (Spoken input) ...

  .twiddler
    (A "twiddler" is a combination keyboard and mouse for one-handed
    portable operation.)  ...


IMPL.NETWORK

  .def
    Networks are the underlying transport media to which certain
    remote communication services in operating systems are mapped.

    A connection is an agreement between two systems to communicate
    using certain identifying tokens that identify data as belonging
    to that connection.

  .limitations
    Operating systems may restrict the number of simultaneous
    network connections, sometimes severely (ie, one or two).

  .server
    A server is a machine that listens for connections on known
    ports and provides actions or data in response to queries on
    the channel so created.

  .client
    A client is a machine that makes a connection to a server to
    ask for service.

  .proxy
    A proxy is a machine that behaves like a server toward the client,
    and a client toward the server: it forwards requests and replies

  .dns
    The domain name system is a network-distributed cached service that
    provides mappings from symbolic names to internet addresses.  It
    is available on a computer system as a local service or through a
    library API.
    
  .ip
    The internet protocol IP is a commonly used link-level protocol,
    the lingua franca of the public Internet and most local networks.

    IP exists in two variants, IPv4 (32-bit addresses) and IPv6
    (128-bit addresses).

  .tcp
    The Transport Control Protocol TCP is a common end-to-end lossfree
    protocol used by many programs for reliable communication.

  .dhcp
    The Dynamic Host Configuration Protocol allows a machine to obtain
    its internet address from a local server through means of a
    broadcast; the consequence of this is that a machine's address may
    vary.

    (Is there something similar for phones?  Or are phones NAT'ed?)

  .nat
    A "network address translator" sits at the boundary of the
    intranet and the internet, and translates internally used
    addresses to external addresses.  The consequence of this is that
    the address a machine sees itself as having is not an address that
    it can communicate to hosts on the internet.

    (Are phones NAT'ed?)

  .latency
    A communication from one machine to another may arrive at the
    destination much later (even in human terms) than it departed from
    the source.  This is due to physical effects, switching effects in
    the network and congestion, multiple networks, lost and resent
    data, and probably many other things.

  .spec
    A lot of them!


IMPL.GUI

  .def
    A graphical user interface is a metaphor for interacting with a
    computer system that is realized through interacting with a
    display that shows a graphical representation of the system's
    state, and associated input devices that allow those
    representations to be selected and manipulated.

  .document
	A document is a data structure that is to be shown on a display
    device and (usually) manipulated by the user.

  .window
    The window is the area on the display showing the document,
    possibly along with widgets for control of the window.  The window
    and its contained document are the primary representations of
    state in the GUI.

  .scrolling
	The document may be too large to be displayed all at once in the
	window, so the window may "scroll" the document horizontally or
	vertically to give the impression that the window is a view onto
	the document and one is moving the view to a different part of the
	document.

  .widgets
	Different graphical elements in a window are known as "widgets" or
	"controls".  These provide foci for data or command input (eg a
	scroller or an input box), or various stylized kinds of display
	(eg a tree view).

  .caret
	A "caret" is a graphical mark that shows the position at which
	input provided by the user will appear in a text input widget.

  .selection
	Parts of the document may be "selected" by various user actions;
	the parts thereby selected are usually highlighted in some way and
	become the focus of new user actions.

  .fonts
    A "font" or "typeface" is an abstract data structure that
    specifies how to draw "glyphs" from a character set.  Fonts are
    named.  Fonts are often defined only for subsets of the
    implementation's character set.


IMPL.JAVA

  .interfacing
	Java implementations are available on many systems, but though
    there is a standard interface to Java (known as JNI), the form in
    which Java is available is not standardized across platforms.
    Different manufacturers provide different means of accessing it.

  .threads
	Java is generally multi-threaded, and if the underlying operating
	system provides preemptive threads, then the Java implementation
	may use those.

  .std
    Java JNI specification


IMPL.TIME

  .utc
    UTC (Universal Time, Coordinated), also known as GMT and Zulu
    time, is a common reference point for clocks.  It is independent
    of time zone or daylight savings time.

  .localtime
    Local time is derived from UTC by adding a fixed time zone offset
    that is determined by geography, politics, and history, and then
    adding a daylight savings time (dst) offset that depends on the
    current time.

  .dst
    Daylight Savings Time is a mechanism in which the time in a locale
    is shifted forward in the spring and backward in the summer, in
    order to have more evening light in the summer.

    The amount of the DST adjustment, if any, and the period of the
    year in which the adjustment is in effect, are determined by the
    locale: the regional (usually state or national) government
    decides this.

    Note that the dst offset is always relative to a time in a locale.

  .source
    Time data are delivered by a timer chip in the system, modified
    by time and date settings that are controlled by the user.

    Some systems have high-resolution time chips, allowing their use
    in profiling and small-scale performance measurements.

  .drift
    Time can travel backwards if the device travels across time zones,
    if it is on during a translation from daylight savings time to
    normal time, or if the user changes the settings.

    Time can travel forwards faster than expected for pretty much the
    same reasons.

  .spec
    ISO 8601, at least

IMPL.INTEGRITY_AND_SECRECY

  .def
    "Integrity" means that data cannot be modified or forged without
    the owner's concent.  "Secrecy" means that data cannot be read
    without the owner's concent.

    In this context, we think of the owner as the browser user, and
    data includes both data in the running browser, data stored on the
    disk, and data in transit over a network.

  .mechanisms
    Operating systems offer some common mechanism for integrity and
    secrecy: permission bits on files; memory protection on processes;
    passwords for logins; firewalls to prevent break-ins; encrypted
    disks for added protection.

    Some systems offer APIs to cryptographic services.


