SIONlib: Scalable I/O library for parallel access to task-local files

SIONlib File data format

General structure of a sion file:

All blocks start at positions aligned to the filesystem blocksize (e.g. 2 MB on GPFS). The first meta data block, META1, contains all static meta data, which is independent of the number of chunks. The second meta data block, META2, is located at the end of the sion file and contains the data that depends on the number of chunks written by each task. META1 is mainly written while the sion file is opened; META2 is written while the file is closed. Each BLOCK contains one chunk of space for each task, sized according to the chunksize specified in sion_open(). All chunks are aligned to the filesystem blocksize, so there can be gaps between chunks if the requested chunksize is not a multiple of the fs blocksize. The size of such a BLOCK, including the additional alignment space, is stored internally in the variable globalskip (see also sion_get_locations).
If a task reaches the end of a chunk while writing data, sion_ensure_free_space moves the file pointer of this task to the next BLOCK. The new position is globalskip bytes from the starting position of the task's chunk in the current BLOCK and can be computed locally, without communication with other tasks. The information on how many chunks are used and how many bytes are written in each chunk is kept in memory until sion_close() is called. This function collects this information from all tasks on task 0, and task 0 writes the data to META2.
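
The position arithmetic described above can be illustrated with a small Python sketch. It is not taken from the SIONlib sources: the function names align_up, chunk_layout and chunk_start are hypothetical, and the offset of the first BLOCK (start_of_data) is assumed to be known and already aligned.

```python
def align_up(nbytes, fsblksize):
    """Round nbytes up to the next multiple of fsblksize."""
    return ((nbytes + fsblksize - 1) // fsblksize) * fsblksize


def chunk_layout(chunksizes, fsblksize, start_of_data):
    """Compute the chunk start of every task in the first BLOCK.

    chunksizes    -- requested chunksize per task (bytes)
    fsblksize     -- filesystem blocksize (bytes)
    start_of_data -- assumed aligned offset of the first BLOCK
    Returns (offsets, globalskip).
    """
    offsets = []
    pos = start_of_data
    for size in chunksizes:
        offsets.append(pos)
        # Each chunk occupies its requested size plus alignment padding.
        pos += align_up(size, fsblksize)
    globalskip = pos - start_of_data
    return offsets, globalskip


def chunk_start(offsets, globalskip, task, blocknr):
    """Start of the chunk of `task` in BLOCK number `blocknr` (0-based).

    Needs only local information, no communication with other tasks.
    """
    return offsets[task] + blocknr * globalskip
```

For example, chunk_start(offsets, globalskip, task, blocknr + 1) yields the position a task would move to when its chunk in BLOCK blocknr is full.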

Structure of the first meta data block META1:
   4 bytes:  'sion'              char* identification of sion file format 
   4 bytes:  0001                int   for identification of little/big endianness
   4 bytes:  version             int   version number of used sion library
   4 bytes:  version_patchlevel  int   patch level of used sion library
   4 bytes:  fileformat_version  int   version of sion file format
   4 bytes:  blocksize           int   fs blocksize used for access to this file
   4 bytes:  ntasks              int   number of tasks that wrote to this file
   4 bytes:  nfiles              int   number of physical files
   4 bytes:  filenumber          int   number of the current physical file
   8 bytes:  flag1               int   not used currently
   8 bytes:  flag2               int   not used currently
1024 bytes:  filenameprefix      char* prefix of filename (for multi-file support)
   
   8 bytes:  globalrank(1)       long  globally unique id of task 1
     ...
   8 bytes:  globalrank(numpe)   long  globally unique id of task numpe

   8 bytes:  size(1)             long  chunksize requested by processor 1
     ...       
   8 bytes:  size(numpe)         long  chunksize requested by processor numpe
   4 bytes:  maxchunks           int   maximum number of chunks used   
   8 bytes:  start_of_varheader  long  start position of META2
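
As an illustration of the table above, META1 could be decoded with Python's struct module roughly as follows. This is a sketch under assumptions, not the library's reader: it assumes the fields are stored back to back without padding, that the integers are signed, that numpe equals ntasks, and that the byte order can be detected from the second field, which holds the value 1. The function name parse_meta1 is hypothetical.

```python
import struct


def parse_meta1(buf):
    """Decode META1 from a bytes object that starts at file offset 0."""
    if buf[0:4] != b"sion":
        raise ValueError("not a sion file")
    # The second field holds the integer 1; try both byte orders.
    if struct.unpack_from("<i", buf, 4)[0] == 1:
        bo = "<"
    elif struct.unpack_from(">i", buf, 4)[0] == 1:
        bo = ">"
    else:
        raise ValueError("cannot determine endianness")

    # Seven 4-byte ints, two 8-byte flags, 1024-byte filename prefix.
    fixed = struct.Struct(bo + "7i2q1024s")
    (version, patchlevel, fileformat_version, blocksize,
     ntasks, nfiles, filenumber, flag1, flag2, prefix) = fixed.unpack_from(buf, 8)
    pos = 8 + fixed.size

    # Per-task arrays: globalrank(1..numpe), then size(1..numpe).
    globalranks = list(struct.unpack_from(f"{bo}{ntasks}q", buf, pos))
    pos += 8 * ntasks
    sizes = list(struct.unpack_from(f"{bo}{ntasks}q", buf, pos))
    pos += 8 * ntasks

    maxchunks, start_of_varheader = struct.unpack_from(bo + "iq", buf, pos)
    return {
        "byteorder": bo,
        "version": version,
        "version_patchlevel": patchlevel,
        "fileformat_version": fileformat_version,
        "blocksize": blocksize,
        "ntasks": ntasks,
        "nfiles": nfiles,
        "filenumber": filenumber,
        "filenameprefix": prefix.split(b"\0", 1)[0].decode("ascii", "replace"),
        "globalranks": globalranks,
        "sizes": sizes,
        "maxchunks": maxchunks,
        "start_of_varheader": start_of_varheader,
    }
```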

Structure of the second meta data block META2:
  8 bytes: chunks(1)            long  number of chunks written from task 1
    ...
  8 bytes: chunks(numpe)        long  number of chunks written from task numpe

  8 bytes: chunksize(1)         long  number of bytes written in chunk 1  from task 1
    ...
  8 bytes: chunksize(numpe)     long  number of bytes written in chunk 1  from task numpe

  8 bytes: chunksize(1)         long  number of bytes written in chunk 2  from task 1
    ...
  8 bytes: chunksize(numpe)     long  number of bytes written in chunk 2  from task numpe

             ...

  8 bytes: chunksize(1)         long  number of bytes written in chunk n  from task 1
    ...
  8 bytes: chunksize(numpe)     long  number of bytes written in chunk n  from task numpe

  If a chunk was not used by a specific task, the chunksize value is set to -1.
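
The META2 layout above could be read back along the following lines. This is only a sketch: it assumes that the number of per-chunk records n equals maxchunks from META1, that offset is start_of_varheader, and that the byte order has already been determined. The names parse_meta2 and bytes_per_task are hypothetical.

```python
import struct


def parse_meta2(buf, offset, ntasks, nblocks, byteorder="<"):
    """Decode META2 starting at `offset` inside `buf`.

    Returns (chunks, written), where chunks[t] is the number of chunks
    written by task t and written[b][t] is the number of bytes task t
    wrote into its chunk of BLOCK b (-1 if the chunk was not used).
    """
    fmt = f"{byteorder}{ntasks}q"
    chunks = list(struct.unpack_from(fmt, buf, offset))
    pos = offset + 8 * ntasks
    written = []
    for _ in range(nblocks):
        written.append(list(struct.unpack_from(fmt, buf, pos)))
        pos += 8 * ntasks
    return chunks, written


def bytes_per_task(written):
    """Total bytes written by each task, skipping unused chunks (-1)."""
    ntasks = len(written[0]) if written else 0
    return [sum(row[t] for row in written if row[t] != -1)
            for t in range(ntasks)]
```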

Structure of the mapping block (present only if multi-file support is used, and only in the first physical file):
  4 bytes: mapping_size                int  number of global ranks
  8 bytes: fnr(1),lrank(1)           2*int  file number and local rank for global rank 1
    ...
  8 bytes: fnr(numpe),lrank(numpe)   2*int  file number and local rank for global rank numpe
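
A reader for the mapping block could look like the sketch below. The position of the mapping block in the file is not specified here, so the offset is passed in by the caller; the function name parse_mapping is hypothetical, and the integers are assumed to be signed and stored without padding.

```python
import struct


def parse_mapping(buf, offset, byteorder="<"):
    """Decode the multi-file mapping starting at `offset` inside `buf`.

    Returns a list indexed by global rank (0-based) whose entries are
    (file number, local rank) pairs.
    """
    (mapping_size,) = struct.unpack_from(byteorder + "i", buf, offset)
    # mapping_size pairs of 4-byte ints follow the count.
    flat = struct.unpack_from(f"{byteorder}{2 * mapping_size}i", buf, offset + 4)
    return [(flat[2 * g], flat[2 * g + 1]) for g in range(mapping_size)]
```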