Module: DATASTREAM

About this module

The DataStream classes define an abstract interface for writing, reading and processing binary data (in this context, text-based representations can also be considered binary). An earlier version of these classes exists in the util module.

Subclasses of DataStream include the classes listed under Primary API classes below.

Presently this module is used to implement the DataFile and DataFile_Record classes in the url_4 module. This module is also used in unfinished code that implements both the PGP and SSL/TLS protocols.

Interface overview and API documentation

API documentation generated by Doxygen contains all necessary information.

NOTE: In addition to OOM, the API functions may LEAVE due to normal errors that do not permit further processing.

Primary API classes

DataStream
The basic API class for the module. All implementations derive from this class.
DataStream_GenericRecord
General record with a tag id, a payload length and a binary payload of the specified length. The binary encoding (enabled, length, big-endian/little-endian, MSB policy) of the tag and length fields can be configured by a DataRecord_Spec object. The order of tag and length may be swapped.
DataStream_GenericFile
General unidirectional file reader and writer (the direction is set on initialization). Reading and writing are done through the standard DataStream API.
DataStream_Pipe
Subclasses can use this class to retrieve information from another DataStream object through the normal DataStream API. This information may then be processed further. Examples: the PGP ASCII armor encoder/decoder, and DataStream_LengthLimitedPipe, which reads a specified number of bytes from a source.
DataStream_SequenceBase
Maintains a sequence of DataStream objects, and controls reading and writing of them. Implementations may perform actions between each read, e.g. to disable some objects in the sequence, or add new ones.

Additionally, there are classes for variable length arrays, for integers, for referencing another DataStream object, and for other special tasks.
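
The tag/length/payload layout described for DataStream_GenericRecord can be illustrated with a small standalone sketch. It does not use the module's classes: RecordSpec, Record, put_uint, get_uint, encode_record and decode_record are hypothetical names invented for this example, and only the field widths, the byte order and the tag/length order are modelled (the "enabled" flag and the MSB policy are left out).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the settings a DataRecord_Spec would carry.
struct RecordSpec {
    std::size_t tag_bytes = 1;     // width of the tag field (1..4)
    std::size_t length_bytes = 2;  // width of the length field (1..4)
    bool big_endian = true;        // byte order of both fields
    bool length_first = false;     // swap the order of tag and length
};

struct Record {
    std::uint32_t tag = 0;
    std::vector<std::uint8_t> payload;
};

// Append the low `width` bytes of `value` in the requested byte order.
void put_uint(std::vector<std::uint8_t>& out, std::uint32_t value,
              std::size_t width, bool big_endian) {
    assert(width >= 1 && width <= 4);
    for (std::size_t i = 0; i < width; ++i) {
        std::size_t shift = 8 * (big_endian ? width - 1 - i : i);
        out.push_back(static_cast<std::uint8_t>(value >> shift));
    }
}

// Decode `width` bytes starting at `pos` and advance `pos` past them.
std::uint32_t get_uint(const std::vector<std::uint8_t>& in, std::size_t& pos,
                       std::size_t width, bool big_endian) {
    assert(width >= 1 && width <= 4 && pos + width <= in.size());
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < width; ++i) {
        std::size_t shift = 8 * (big_endian ? width - 1 - i : i);
        value |= static_cast<std::uint32_t>(in[pos + i]) << shift;
    }
    pos += width;
    return value;
}

std::vector<std::uint8_t> encode_record(const Record& rec, const RecordSpec& spec) {
    std::vector<std::uint8_t> out;
    std::uint32_t len = static_cast<std::uint32_t>(rec.payload.size());
    if (spec.length_first) {
        put_uint(out, len, spec.length_bytes, spec.big_endian);
        put_uint(out, rec.tag, spec.tag_bytes, spec.big_endian);
    } else {
        put_uint(out, rec.tag, spec.tag_bytes, spec.big_endian);
        put_uint(out, len, spec.length_bytes, spec.big_endian);
    }
    out.insert(out.end(), rec.payload.begin(), rec.payload.end());
    return out;
}

Record decode_record(const std::vector<std::uint8_t>& in, const RecordSpec& spec) {
    Record rec;
    std::size_t pos = 0;
    std::uint32_t len = 0;
    if (spec.length_first) {
        len = get_uint(in, pos, spec.length_bytes, spec.big_endian);
        rec.tag = get_uint(in, pos, spec.tag_bytes, spec.big_endian);
    } else {
        rec.tag = get_uint(in, pos, spec.tag_bytes, spec.big_endian);
        len = get_uint(in, pos, spec.length_bytes, spec.big_endian);
    }
    assert(len <= in.size() - pos);
    rec.payload.assign(in.begin() + static_cast<std::ptrdiff_t>(pos),
                       in.begin() + static_cast<std::ptrdiff_t>(pos + len));
    return rec;
}
```

With the default spec in this sketch, a record with tag 0x2A and a three byte payload is encoded as one tag byte, two length bytes and then the payload. Changing only the spec changes the wire format without touching the record itself.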

Sample code

NOTE: Anchoring and error checks have been left out for clarity.

Reading a file (or DataStream)

This code reads several kinds of data from a file. Similar code also works for DataStream-derived objects that support this kind of operation, such as DataStream_ByteArray, DataStream_GenericRecord and DataStream_Pipe.
	
	// Allocate and open file for reading
	OpFile *myfile = new OpFile;
	
	myfile->Construct(UNI_L("myfile.dat"));
	myfile->Open(OPFILE_READ);
	
	// Init file reader
	DataStream_GenericFile data_file(myfile);
	
	data_file.InitL();
	
	// Sample 5 bytes
	byte buffer[5];
	unsigned long len;
	
	len = data_file.SampleDataL(buffer, 5);
	if(len == 5)
	{
	   // Do something, then commit the read bytes
	   data_file.CommitSampledData(5);
	}
	else
	{
	  // do something else
	}
	
	// Read 5 bytes
	len = data_file.ReadDataL(buffer, 5);
	
	// Read a 32 bit unsigned integer (big-endian, no MSB detection)
	unsigned int read_value;
	if(data_file.ReadIntegerL(read_value, 4, TRUE, FALSE) == OpRecStatus::FINISHED)
	{
	  // Use Integer
	}
	
	// Read a record from the file
	DataStream_ByteArray input_record;
	
	// Specify that the record is 32 bytes long (but don't allocate)
	input_record.ResizeL(32);
	
	// Then read it from the file
	if(input_record.ReadRecordFromStreamL(&data_file) == OpRecStatus::FINISHED)
	{
	   // use record
	}
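
The difference between sampling and reading in the code above is that sampled bytes stay in the stream until they are committed. A minimal standalone sketch of that idea, assuming an in-memory source, could look like this. PeekableBuffer and its Sample, Commit and Read members are hypothetical names for this illustration and are not the module's API.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical in-memory stream used only to illustrate sample/commit.
class PeekableBuffer {
public:
    explicit PeekableBuffer(std::vector<std::uint8_t> data)
        : data_(std::move(data)), pos_(0) {}

    // Copy up to `len` bytes to `out` without consuming them.
    std::size_t Sample(std::uint8_t* out, std::size_t len) const {
        std::size_t n = std::min(len, data_.size() - pos_);
        std::copy_n(data_.begin() + static_cast<std::ptrdiff_t>(pos_), n, out);
        return n;
    }

    // Consume up to `len` bytes, typically ones that were sampled before.
    void Commit(std::size_t len) {
        pos_ += std::min(len, data_.size() - pos_);
    }

    // A read behaves like a sample followed by a commit of what was copied.
    std::size_t Read(std::uint8_t* out, std::size_t len) {
        std::size_t n = Sample(out, len);
        Commit(n);
        return n;
    }

private:
    std::vector<std::uint8_t> data_;
    std::size_t pos_;  // index of the first unconsumed byte
};
```

Sampling twice in a row returns the same bytes. Only the commit (or a read) moves the position forward, which lets a caller inspect a header before deciding whether to consume it.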

Writing to a file (or DataStream)

This code writes several kinds of data to a file. Similar code also works for DataStream-derived objects that support this kind of operation, such as DataStream_ByteArray, DataStream_GenericRecord and DataStream_Pipe.
	
	// Allocate and open file for writing
	OpFile *myfile = new OpFile;
	
	myfile->Construct(UNI_L("myfile.dat"));
	myfile->Open(OPFILE_WRITE);
	
	// Init file writer
	DataStream_GenericFile data_file(myfile, TRUE);
	
	data_file.InitL();
	
	// Write 5 bytes (fill the buffer before writing it)
	byte buffer[5];
	
	data_file.WriteDataL(buffer, 5);
	
	// Write a 32 bit unsigned integer (big-endian, no MSB detection)
	unsigned int write_value=2046;
	data_file.WriteIntegerL(write_value, 4, TRUE, FALSE);
	
	// Write a record to the file
	DataStream_ByteArray output_record;
	
	// Fill output record
	
	// Then write it to the file
	output_record.WriteRecordL(&data_file);

Structured records

This code creates a structured record of 4 integers (8 bit, 16 bit, 8 bit, 24 bit), followed by a variable sized array and one integer (32 bit). The fourth integer specifies the size of the array.

	class Record : public DataStream_SequenceBase
	{
	   public:
	      DataStream_Octet	        first;
	      DataStream_UInt16         second;
	      DataStream_Octet          third;
	      DataStream_UIntVarLength  fourth;
	      DataStream_ByteArray	fifth;
	      DataStream_UInt32		sixth;
	      
	   public:
	     Record(): fourth(3) /* 3 byte length */{
	        // Build record structure;
	        first.Into(this); 
	        second.Into(this);
	        third.Into(this);
	        fourth.Into(this);
	        fifth.Into(this);
	        sixth.Into(this);

	        // Set field IDs
	        first.SetItemID(1); 
	        second.SetItemID(2);
	        third.SetItemID(3);
	        fourth.SetItemID(4);
	        fifth.SetItemID(5);
	        sixth.SetItemID(6);
	     }
	     
	  protected: 
	     // Special actions while reading a record
	     virtual void ReadActionL(uint32 step, int record_item_id)
	     {
	     	// When the fourth integer is read use it to resize the fifth element
	     	if(record_item_id == 4)
	     	   fifth.ResizeL(fourth);
	     }
	};
	
	
	Record  write_record;
	
	// Fill record, then write it to a file
	DataStream_GenericFile data_file_out(myfile_out, TRUE);
	
	write_record.WriteRecordL(&data_file_out);
	     
	Record  read_record;
	DataStream_GenericFile data_file_in(myfile_in);
	
	// Read record from the file
	
	if(read_record.ReadRecordFromStreamL(&data_file_in) == OpRecStatus::FINISHED)
	{
	  // Use record
	}

	// Class that reads a string of Records
	class File_Of_Records : public DataStream_GenericFile
	{
	public:
	  File_Of_Records(OpFile *file_p, BOOL _write = FALSE) : DataStream_GenericFile(file_p, _write){}
	  
	  Record *GetNextRecordL(){return (Record *) DataStream_GenericFile::GetNextRecordL();}

	protected:
	  virtual DataStream *CreateRecordL(){return new(ELeave) Record;}
	};

	File_Of_Records record_file(myrecord_file);

	record_file.InitL();

	// Read records
	Record *rec;
	
	while((rec = record_file.GetNextRecordL()) != NULL)
	{
	  if(rec->Finished())
	  {
	     // Use record
	     delete rec;
	  }
	  // Read the next record (or get the finished record; not really a problem with files)
	}
	
	record_file.Close();
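
The role of the fourth field in the Record class above, a length that sizes the array that follows it, can also be shown without the module's classes. The following is a minimal standalone sketch of parsing the same layout (8 bit, 16 bit, 8 bit, 24 bit length, array, 32 bit) from a byte vector. ParsedRecord, read_be and parse_record are hypothetical names, and big-endian byte order is an assumption made for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical plain struct mirroring the fields of the Record class.
struct ParsedRecord {
    std::uint8_t first = 0;
    std::uint16_t second = 0;
    std::uint8_t third = 0;
    std::vector<std::uint8_t> fifth;  // sized by the 24 bit fourth field
    std::uint32_t sixth = 0;
};

// Read `width` big-endian bytes at `pos`; returns false if input is too short.
bool read_be(const std::vector<std::uint8_t>& in, std::size_t& pos,
             std::size_t width, std::uint32_t& value) {
    if (width > in.size() - pos)
        return false;
    value = 0;
    for (std::size_t i = 0; i < width; ++i)
        value = (value << 8) | in[pos + i];
    pos += width;
    return true;
}

bool parse_record(const std::vector<std::uint8_t>& in, ParsedRecord& rec) {
    std::size_t pos = 0;
    std::uint32_t v = 0;
    std::uint32_t array_len = 0;

    if (!read_be(in, pos, 1, v)) return false;
    rec.first = static_cast<std::uint8_t>(v);
    if (!read_be(in, pos, 2, v)) return false;
    rec.second = static_cast<std::uint16_t>(v);
    if (!read_be(in, pos, 1, v)) return false;
    rec.third = static_cast<std::uint8_t>(v);

    // Fourth field: once it is known, it determines the size of the array.
    // This corresponds to the resize done in ReadActionL for item 4.
    if (!read_be(in, pos, 3, array_len)) return false;
    if (array_len > in.size() - pos) return false;
    rec.fifth.assign(in.begin() + static_cast<std::ptrdiff_t>(pos),
                     in.begin() + static_cast<std::ptrdiff_t>(pos + array_len));
    pos += array_len;

    if (!read_be(in, pos, 4, v)) return false;
    rec.sixth = v;
    return true;
}
```

The sketch returns false on truncated input instead of reporting an unfinished record. A sequence-based implementation can instead keep its state and continue when more data arrives.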

Implementation description

Footprint

The module is fairly small, and functionality only used by PGP and/or the new SSL/TLS code (diffie_3) is automatically disabled for builds that do not activate those modules.

Dynamic memory use and OOM handling

OOM policies

All functions in the module that allocate, or may allocate, memory will LEAVE in case of an OOM situation.

Who handles OOM?

The calling function must TRAP the LEAVE and then either handle the error itself or pass it on to its own caller, by returning a status or by performing a LEAVE.
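
The two options open to a caller can be sketched with standard C++ exceptions standing in for LEAVE and TRAP, which are not reproduced here. Everything in this example is hypothetical: Status, kBudget, make_buffer, fill_buffer_status and fill_buffer_or_throw are invented names, and the fixed budget only exists to make the out-of-memory case reproducible.

```cpp
#include <cstddef>
#include <new>
#include <vector>

enum class Status { Ok, NoMemory };

// Hypothetical memory budget, used to trigger the failure deterministically.
const std::size_t kBudget = 1024;

// Stands in for a function that LEAVEs on OOM: it throws std::bad_alloc.
std::vector<unsigned char> make_buffer(std::size_t size) {
    if (size > kBudget)
        throw std::bad_alloc();
    return std::vector<unsigned char>(size);
}

// Option 1: trap the failure and hand it to the caller as a status value.
Status fill_buffer_status(std::size_t size, std::vector<unsigned char>& out) {
    try {
        out = make_buffer(size);
    } catch (const std::bad_alloc&) {
        return Status::NoMemory;
    }
    return Status::Ok;
}

// Option 2: do not trap, so the failure propagates to the caller unchanged.
void fill_buffer_or_throw(std::size_t size, std::vector<unsigned char>& out) {
    out = make_buffer(size);
}
```

Whichever option is chosen, some function further up the call chain has to catch the failure. The choice only moves the point where that happens.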

Flow

All data retrieval API functions work through virtual functions implemented by the subclasses. The actual control flow depends on the implementation of the classes.

Some of the classes access internal data buffers; others access files or other DataStream objects.
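
A minimal standalone sketch of this flow, assuming a much smaller interface than the real one, is shown here. One implementation reads from an internal buffer and another reads from a second stream object, in the spirit of DataStream_LengthLimitedPipe. ByteSource, MemorySource, LengthLimitedSource and read_all are hypothetical names for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal interface: callers only see the virtual Read.
class ByteSource {
public:
    virtual ~ByteSource() {}
    virtual std::size_t Read(std::uint8_t* out, std::size_t len) = 0;
};

// Implementation backed by an internal data buffer.
class MemorySource : public ByteSource {
public:
    explicit MemorySource(const std::vector<std::uint8_t>& data)
        : data_(data), pos_(0) {}

    std::size_t Read(std::uint8_t* out, std::size_t len) override {
        std::size_t n = std::min(len, data_.size() - pos_);
        std::copy_n(data_.begin() + static_cast<std::ptrdiff_t>(pos_), n, out);
        pos_ += n;
        return n;
    }

private:
    std::vector<std::uint8_t> data_;
    std::size_t pos_;
};

// Implementation backed by another source, passing on at most `limit` bytes.
class LengthLimitedSource : public ByteSource {
public:
    LengthLimitedSource(ByteSource& source, std::size_t limit)
        : source_(source), remaining_(limit) {}

    std::size_t Read(std::uint8_t* out, std::size_t len) override {
        std::size_t n = source_.Read(out, std::min(len, remaining_));
        remaining_ -= n;
        return n;
    }

private:
    ByteSource& source_;
    std::size_t remaining_;
};

// Works with any implementation; the control flow is decided by the subclass.
std::vector<std::uint8_t> read_all(ByteSource& source) {
    std::vector<std::uint8_t> result;
    std::uint8_t chunk[4];
    std::size_t n;
    while ((n = source.Read(chunk, sizeof chunk)) > 0)
        result.insert(result.end(), chunk, chunk + n);
    return result;
}
```

In this sketch, draining a LengthLimitedSource wrapped around a MemorySource consumes only the limited prefix, so the remaining bytes can still be read from the underlying source afterwards.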

Heap memory usage

Both the actual allocations and the size of the allocated memory depend on the subclass implementation.

Stack memory usage

For the most part the DataStream functions use little stack memory, usually fewer than ten 32-bit words. In some cases, as in the DataStream_FlexibleSequence class, functions use at most two stack-allocated DataStream_ByteArray objects as temporary values, each in the range of 150-200 bytes.

Static memory usage

The module does not define any global variables.

Caching and freeing memory

No caching is performed by the module.

Freeing memory on exit

The owners of DataStream objects must delete them; memory allocated by the objects is then deallocated automatically.

Temp buffers

There is no check for external use of these buffers, and the different buffers should prevent internal collisions, unless implementations also use them in calls to/from these functions.

The integer functions use these buffers instead of a stack-allocated buffer. It would be trivial to replace the calls with heap allocation, but that might cause a minor performance reduction.

Memory tuning

At present there are no opportunities to tune memory use.

Tests

Selftests exist, but there are no checks on memory usage.

Coverage

Loading a cache index or cookie file will exercise most of the non-PGP related code. The PGP selftests will exercise most of the other code.

Design choices

The three-tier memory system in DataStream_ByteArray was chosen to avoid extra allocations for small payloads (of which there are many when loading cache index and cookie files), and to prevent reallocation and copying on successively larger payloads. This creates some overhead (in processing and footprint) in order to handle the different buffers and upgrade to new buffers.
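
This text does not spell out the three tiers, so the following standalone sketch is only one plausible reading of the trade-off: a small inline buffer that needs no heap allocation, a single heap block for medium payloads, and a list of chunks for large payloads so that earlier bytes are never copied again. TieredBuffer, its members and both size limits are hypothetical and are not taken from DataStream_ByteArray.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <list>
#include <vector>

class TieredBuffer {
public:
    enum class Tier { Inline, SingleBlock, Chunked };

    static constexpr std::size_t kInlineSize = 16;  // hypothetical limit
    static constexpr std::size_t kBlockLimit = 64;  // hypothetical limit

    void Append(const std::uint8_t* src, std::size_t len) {
        if (tier_ == Tier::Inline && size_ + len <= kInlineSize) {
            // Tier 1: small payloads fit in the object, no heap allocation.
            std::copy(src, src + len, inline_ + size_);
        } else if (tier_ != Tier::Chunked && size_ + len <= kBlockLimit) {
            // Tier 2: one heap block; the inline bytes are copied once.
            if (tier_ == Tier::Inline) {
                block_.assign(inline_, inline_ + size_);
                tier_ = Tier::SingleBlock;
            }
            block_.insert(block_.end(), src, src + len);
        } else {
            // Tier 3: new data goes into its own chunk, so the bytes
            // stored earlier are not reallocated or copied again.
            if (tier_ == Tier::Inline)
                block_.assign(inline_, inline_ + size_);
            tier_ = Tier::Chunked;
            chunks_.push_back(std::vector<std::uint8_t>(src, src + len));
        }
        size_ += len;
    }

    // Flatten the content, in the order it was appended.
    std::vector<std::uint8_t> ToVector() const {
        std::vector<std::uint8_t> out;
        if (tier_ == Tier::Inline) {
            out.assign(inline_, inline_ + size_);
            return out;
        }
        out = block_;
        for (const std::vector<std::uint8_t>& chunk : chunks_)
            out.insert(out.end(), chunk.begin(), chunk.end());
        return out;
    }

    Tier tier() const { return tier_; }
    std::size_t size() const { return size_; }

private:
    Tier tier_ = Tier::Inline;
    std::size_t size_ = 0;
    std::uint8_t inline_[kInlineSize] = {};
    std::vector<std::uint8_t> block_;
    std::list<std::vector<std::uint8_t>> chunks_;
};
```

The branching in Append is the processing overhead mentioned above, and the three sets of members are the footprint overhead. In exchange, small payloads never touch the heap and large payloads grow without copying what is already stored.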

Improvements

At present, no improvements are planned.