# The EDX/EDT file format

EDX/EDT files store large sets of simulation data. To keep both file size and access speed acceptable, the data are stored in an ordered binary format. Each “file” consists of two parts:

• the Metadata describing the content and the format of the binary file (EDX-Files)
• the pure Binary data themselves (EDT-Files)

The binary data files (EDT) hold only the raw data, with no information on the structure or the content of the data. The associated metadata file (EDX) is therefore required to decode and read the binary file. The EDX file is a simple ASCII file formatted following the EML conventions introduced with ENVI-met Version 4.

As the EDX files are EML-based, they are self-documenting and easy to extend for future uses. They replace the old EDI files, which did not provide these options. The EDX files also provide enhanced information on the background of the data produced, e.g. in which simulation run they were created and whether they are normal data or initial, check or panic data. In the following sections, the components of the EDX file are described.

Note: All data stored within the EDX files by ENVI-met are consistent. Several sections contain repeated (redundant) information. While redundant information is normally seen as a design flaw, here it serves a clear purpose: it allows quick access to the raw data without interpreting the thematic content of the file and its implications for the file format.

Example: A surface flux file is always 2-dimensional, so storing its dimension as separate information is redundant. However, an external program that wants to read the file may not understand the concept of a surface file. It therefore requires an explicit definition of the spatial dimensions of the file in terms of x-, y- and z-cells in order to work.

<ENVI-MET_Datafile>
<Header>
<filetype>EDX ENVI-met Simulation Data Definition</filetype>
<version>100</version>
<revisiondate>01.01.2014 00:00:01</revisiondate>
<remark>This file links to the EDT file of the same name</remark>
<encryptionlevel>0</encryptionlevel>
</Header>

The Header of the file is not simulation-specific. It includes version numbers and identifiers as described in the EML standards.
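As a rough illustration of how an external program might pull individual items out of the header, the following Python sketch extracts tag contents with a regular expression. The helper name `read_tag` is hypothetical, and the approach assumes that tags are not nested inside the item being read; it is not how ENVI-met or LEONARDO parse the file.

```python
import re

def read_tag(text, tag):
    """Return the stripped content of the first <tag>...</tag> pair, or None.

    Hypothetical helper: assumes the tag occurs as a simple, non-nested pair.
    """
    pattern = r"<{0}>(.*?)</{0}>".format(re.escape(tag))
    match = re.search(pattern, text, re.DOTALL)
    return match.group(1).strip() if match else None

# Sample text modelled on the header example in this section.
sample = """<ENVI-MET_Datafile>
<Header>
<filetype>EDX ENVI-met Simulation Data Definition</filetype>
<version>100</version>
<revisiondate>01.01.2014 00:00:01</revisiondate>
<remark>This file links to the EDT file of the same name</remark>
<encryptionlevel>0</encryptionlevel>
</Header>
"""

filetype = read_tag(sample, "filetype")
version = int(read_tag(sample, "version"))
print(filetype, version)
```

The same helper can be reused for the items of the other sections, since all of them follow the same tag layout.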

### Data Description

  <datadescription>
<data_type> 1 </data_type>
<data_content> 2 </data_content>
<data_health_status> 0 </data_health_status>
<data_spatial_dim> 2 </data_spatial_dim>
<nr_xdata> 100 </nr_xdata>
<nr_ydata> 100 </nr_ydata>
<nr_zdata> 1 </nr_zdata>
<spacing_x> 2.50000,2.50000, (...) ,2.50000 </spacing_x>
<spacing_y> 2.50000,2.50000, (...), 2.50000 </spacing_y>
<spacing_z> 0.00000 </spacing_z>
</datadescription>

The Data Description section of the EDX file contains metadata on the general structure of the data. The items are:

#### <data_type>

type
tSimFileType = (
ftUnknown = 0,
ft2DRaster = 1, // 2D Rasterdatafile, 1 data per grid
ft3DRaster = 2, // 3D Rasterdatafile, 1 data per grid
ft3DFacade = 3 // 3D Rasterdatafile, 3 data (x,y,z wall) per grid
);

The <data_type> item defines the general type of the data file. At the moment, there are only three types of data files:

• 2D raster files with one data value for each grid point (basically a special case of a 3D file with just one z-coordinate)
• 3D raster files with one data value for each grid point
• 3D façade files with 3 data values for each grid point, corresponding to the x-, y- and z-faces of the cell

The example given above is a 2D raster file. Note: More types are to come.
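A reader written in another language can mirror the tSimFileType enumeration. The Python sketch below is one possible way to do this; the class and function names are illustrative, and mapping unlisted codes to "unknown" is an assumption made here because further types are announced.

```python
from enum import IntEnum

class SimFileType(IntEnum):
    """Mirror of tSimFileType; member names are illustrative."""
    UNKNOWN = 0
    RASTER_2D = 1   # ft2DRaster
    RASTER_3D = 2   # ft3DRaster
    FACADE_3D = 3   # ft3DFacade

def parse_file_type(raw):
    """Map the text of <data_type> to SimFileType.

    Assumption: codes not listed above (or non-numeric text) are treated
    as UNKNOWN instead of raising an error.
    """
    try:
        return SimFileType(int(raw.strip()))
    except ValueError:
        return SimFileType.UNKNOWN

print(parse_file_type(" 1 ").name)
```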

#### <data_content>

type
tSimFileContent = (
fcUnknown = 0, // undefined format
fcAtmosphere = 1, // 3D atmosphere data _AT_
fcSurface = 2, // surface data _FX_
fcSoil = 3, // soil data  _SO_
fcPollutants = 4, // pollutants _POLU_
fcBiomet = 5, // biomet _BIO_
fcVegetation = 6, // vegetation _VEG_
fcSolarAccess = 8, // 2D solar access _SA_
fcFacadeStatic = 9, // constant building data _BLDG_
fcViewScape = 12, // view projections  _IVS_
fcPhotocat = 13   // photocatalytic data _PHX_
);


The <data_content> tag describes the thematic content of the file. It is not needed to read and display the values, but knowing what is in the file makes the output more informative (see http://www.envi-met.info/hg2e/doku.php?id=filereference:output:start).

The file content in ENVI-met is referenced in more than one way:

1. By the folder where it is stored. All output files are organised in different folders such as atmosphere or soil to make clear what the content is. This is a very unsafe identifier as files are moved around frequently.
2. By the filename. Each file produced by ENVI-met holds a unique identifier such as _AT_ for atmospheric data (see above). In versions before 4.0 this was the only identifier left to show what is actually stored in a file once it had been moved out of its original folder.
3. By the <data_content> tag. Version 4 adds a more explicit information tag about the content of the file. It replaces any analysis of the filename identifier mentioned above, which may be lost when a file is renamed.
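The order of preference described in this list can be sketched in Python as follows. The function name `content_code` and the example filenames are hypothetical; the codes and identifiers are taken from the tSimFileContent listing. The folder is deliberately ignored here because it is described as unsafe.

```python
import os

# Codes and filename identifiers as listed in tSimFileContent.
CONTENT_IDENTIFIERS = {
    1: "_AT_", 2: "_FX_", 3: "_SO_", 4: "_POLU_", 5: "_BIO_",
    6: "_VEG_", 8: "_SA_", 9: "_BLDG_", 12: "_IVS_", 13: "_PHX_",
}

def content_code(tag_value, filename):
    """Prefer the <data_content> tag; fall back to the filename identifier.

    Returns 0 (fcUnknown) if neither source gives a known code.
    """
    if tag_value is not None:
        try:
            code = int(tag_value.strip())
            if code in CONTENT_IDENTIFIERS:
                return code
        except ValueError:
            pass  # unreadable tag: fall through to the filename
    name = os.path.basename(filename)
    for code, identifier in CONTENT_IDENTIFIERS.items():
        if identifier in name:
            return code
    return 0

# Hypothetical filenames, for illustration only.
print(content_code(" 2 ", "renamed_file.EDX"))
print(content_code(None, "HillValley_1955_AT_example.EDX"))
```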

#### <data_health_status>

type
tSimFileState = (
fsNormal = 0, // normal file, usable
fsCheck = 1, // just a check file, no valid data inside
fsInitialisation = 2, // very first output of sim, no usable data inside
fsPanicDump = 3); // Panic dump, probably destroyed data inside


ENVI-met knows different situations under which files are produced.

While the formal format of the files is the same for all conditions, the usage of some files is limited.

The <data_health_status> flag in the EDX file records the state under which the file was produced, and thereby indicates whether the file should be used for analysis.

The data analysis routine in LEONARDO, for example, will ignore all files flagged as Check, Initialisation or PanicDump as they usually do not contain useful data.
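An external analysis script could apply a similar filter. The following Python sketch keeps only files whose status is fsNormal (0); the function name and the file list are made up for illustration and do not reflect LEONARDO's actual implementation.

```python
def is_usable(health_status):
    """True only for files flagged fsNormal (0) in <data_health_status>."""
    return int(health_status) == 0

# Hypothetical (filename, <data_health_status> text) pairs.
files = [("run_a.EDX", " 0 "), ("run_b.EDX", " 2 "), ("run_c.EDX", " 3 ")]
usable = [name for name, status in files if is_usable(status)]
print(usable)
```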

#### <data_spatial_dim>

The <data_spatial_dim> tag holds redundant information on the spatial layout of the binary file. It is provided so that the file can be read without interpreting the <data_type> tag.

#### Dimensions: <nr_xdata> <nr_ydata> <nr_zdata>

Defines the matrix layout of the binary file. This again repeats, to some extent, the information stored before.

#### Grid spacing: <spacing_x> <spacing_y> <spacing_z>

Defines the size of the individual grid cells in the x-, y- and z-dimension. For each grid cell counted by the <nr_xdata>, <nr_ydata> and <nr_zdata> tags, a grid size value needs to be defined.

In the example given above, there is only one value for the z-direction, because the example is a 2D surface file, which does not have a vertical extent.
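The spacing lists can be turned into coordinates by accumulating the cell sizes. The Python sketch below parses one comma-separated spacing list, checks it against the declared cell count, and derives cell-centre positions. Placing the origin at 0 and using cell midpoints are assumptions made for this illustration, and the sample uses four cells, not the 100 of the example above.

```python
from itertools import accumulate

def parse_spacing(raw, expected_count):
    """Parse a comma-separated spacing list and check its length."""
    values = [float(v) for v in raw.split(",")]
    if len(values) != expected_count:
        raise ValueError(
            "expected %d spacing values, found %d" % (expected_count, len(values))
        )
    return values

def cell_centres(spacing):
    """Midpoint of every cell, measured from an origin at 0 (assumption)."""
    edges = [0.0] + list(accumulate(spacing))
    return [(lo + hi) / 2.0 for lo, hi in zip(edges, edges[1:])]

dx = parse_spacing(" 2.50000,2.50000,2.50000,2.50000 ", 4)
print(cell_centres(dx))  # [1.25, 3.75, 6.25, 8.75]
```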

### Variables

  <variables>
<Data_per_variable> 1 </Data_per_variable>
<nr_variables> 18 </nr_variables>
<name_variables> z Topo (m),Shadow Flag, (...) ,NOx flux (µg/m²s) </name_variables>
</variables>

The variables section defines the different information layers stored in the binary EDT file.

While the preceding sections were used to interpret the format of the file, this section defines the information stored for each grid.

#### <Data_per_variable>

The <Data_per_variable> tag sets the number of values stored for each grid point. It duplicates the <data_type> tag to some extent. In most cases, there is just one value for each grid point. For the 3D building data coded with ft3DFacade = 3, however, there are 3 values for each grid point that must be taken into account.
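Because the two tags overlap, a reader can use one to cross-check the other. The Python sketch below is a minimal consistency check under the assumption that the mapping is exactly the one described in this document (one value for raster files, three for façade files); the names are illustrative.

```python
# Values per grid point implied by <data_type>, as described in the text.
EXPECTED_VALUES_PER_CELL = {1: 1, 2: 1, 3: 3}

def data_per_variable_matches(data_type, data_per_variable):
    """Check <Data_per_variable> against <data_type>.

    Unknown or future data types are not judged and pass the check.
    """
    expected = EXPECTED_VALUES_PER_CELL.get(data_type)
    return expected is None or expected == data_per_variable

print(data_per_variable_matches(1, 1))  # 2D raster, one value per grid point
print(data_per_variable_matches(3, 1))  # façade file declared with one value
```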

#### <nr_variables>

Simply defines the number of different variables in the file.

#### <name_variables>

Lists the names of the variables as a comma-separated list.

Please note: LEONARDO will interpret information given in “( )” as the UNITS of the variable and will place it in the correct position in legends and other map layout features.
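A comparable split of name and unit can be sketched in Python as shown below. This is only an approximation of the behaviour described for LEONARDO, not its actual logic: it treats a trailing parenthesised part as the unit and assumes that variable names contain no commas. The sample list is shortened to the three names visible in the example above.

```python
import re

def split_name_and_unit(entry):
    """Split 'name (unit)' into (name, unit); unit is None if absent."""
    entry = entry.strip()
    match = re.match(r"^(.*?)\s*\(([^()]*)\)\s*$", entry)
    if match:
        return match.group(1), match.group(2)
    return entry, None

raw = " z Topo (m),Shadow Flag,NOx flux (µg/m²s) "
variables = [split_name_and_unit(item) for item in raw.split(",")]
for name, unit in variables:
    print(name, "|", unit)
```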

### Model Description

  <modeldescription>
<title>  Hill Valley  Situation 1955 </title>
<simulation_basename> HillValley_1955 </simulation_basename>
<simulation_date> 06.09.1955 </simulation_date>
<simulation_time> 17:00:00 </simulation_time>
<projectname> Martys Climate Study </projectname>
<locationname> Hill Valley, California </locationname>
<location_latitude> +34.141417 </location_latitude>
<location_longitude> -118.349771 </location_longitude>
<model_rotation> 0 </model_rotation>
<location_georef_x> 0.00000 </location_georef_x>
<location_georef_y> 0.00000 </location_georef_y>
</modeldescription>

The <modeldescription> section describes the simulation time and the geographical details of the model area used in the simulation. It is not required to decode the .EDT file, but it is of great help when analysing simulation sets coming from external sources. The data fields correspond to the general model settings and hence are not explained in detail again.

Software like LEONARDO will interpret the <modeldescription> section to allow a project-orientated user interface.

### ENVI-met Reference

  <envi-met_reference>
<envi-met_version> ENVI-met V4.0 © Michael Bruse and team 1997-2014 </envi-met_version>
<envi-met_GUID> LORRAINE2_06.09.1985_20.13.18 </envi-met_GUID>
</envi-met_reference>

The ENVI-met reference is added to keep track of the version with which the simulation was actually executed.

In addition, the <envi-met_GUID> holds a reference to the computer and the time the simulation was started. This will be needed for distributed computing in the near future. The <licenseholder> tag watermarks the output data with the name of the license holder in case the professional version of ENVI-met, Biomet or other future products with special licenses is used.

  <additional_info>
<sunposition> 141.84860 </sunposition>
<windinflow> 45.00000 </windinflow>
</additional_info>

The additional info section contains some more information on the general simulation, namely the position of the sun (if shining) and the direction of the wind inflow. Both values are stored in decimal degrees relative to North. If the model area is rotated, both angles need to be corrected when placed on a map. This correction is applied automatically when using the DataNavigator in LEONARDO.

## The EDT Binary file

The EDT file holds only raw binary data in Intel Float format (single precision). In order to be decoded properly, the associated metadata from the EDX file must be evaluated correctly. The basic structure of the EDT file is created using the logic shown below:

    For i:=0 to nrVariables-1 do
    begin
      For z:=0 to ZZ-1 do
        For y:=0 to YY-1 do
          For x:=0 to XX-1 do
            For n:=0 to D-1 do
              Write Variable[i].Value[x][y][z][n];
    end;

Here XX, YY and ZZ are the dimensions of the model domain, counted from x=0, y=0, z=0 at the lower left corner of the model. D is the variable dimension of the file. For most files, D is 1 (one data value for each grid coordinate and variable); for façade files, for example, D is 3 (3 values for each grid cell, for the left x-, front y- and bottom z-face. The values for the right x-, rear y- and top z-face are stored in the neighbour cells).
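The write order above implies a fixed position for every value, so a reader can jump directly to a single value. The Python sketch below derives that position and reads one value from an in-memory buffer. It assumes that "Intel Float format (single precision)" means 4-byte IEEE 754 values in little-endian byte order; the function names and the tiny test grid are illustrative only.

```python
import struct

def value_index(i, x, y, z, n, XX, YY, ZZ, D):
    """Position (counted in values, not bytes) of Variable[i].Value[x][y][z][n]."""
    return (((i * ZZ + z) * YY + y) * XX + x) * D + n

def read_value(buf, i, x, y, z, n, XX, YY, ZZ, D):
    """Read one value; assumes 4-byte little-endian floats."""
    offset = 4 * value_index(i, x, y, z, n, XX, YY, ZZ, D)
    return struct.unpack_from("<f", buf, offset)[0]

# Build a tiny buffer in the same loop order as the logic shown above.
nr_variables, XX, YY, ZZ, D = 2, 3, 2, 2, 1
buf = bytearray()
for i in range(nr_variables):
    for z in range(ZZ):
        for y in range(YY):
            for x in range(XX):
                for n in range(D):
                    # Encode the indices in the value so reads can be checked.
                    buf += struct.pack("<f", float(i * 1000 + z * 100 + y * 10 + x))

print(read_value(buf, 1, 2, 1, 0, 0, XX, YY, ZZ, D))  # 1012.0
```

Since x varies fastest after n, all values of one variable form a contiguous block of XX·YY·ZZ·D floats, so a whole variable can be read in one go.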

In addition, the 3D files for building façades hold one field of dimension 1 at the very beginning:

    For z:=0 to ZZ-1 do
      For y:=0 to YY-1 do
        For x:=0 to XX-1 do
          Write ObjectData[x][y][z];

This field holds the general object information, as in the atmosphere files, marking the position of buildings, plants, DEM and sources.
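For façade files, the leading object field shifts every variable value by XX·YY·ZZ entries. The Python sketch below computes the expected file size and the byte offset of a façade value. It assumes that the object field is stored as 4-byte floats like the rest of the file, which the description above does not state explicitly; the function names are illustrative.

```python
def expected_edt_size(nr_variables, XX, YY, ZZ, D, has_object_field):
    """Expected EDT size in bytes, assuming 4 bytes per stored value."""
    cells = XX * YY * ZZ
    count = nr_variables * cells * D
    if has_object_field:
        count += cells  # leading field of dimension 1
    return 4 * count

def facade_value_offset(i, x, y, z, n, XX, YY, ZZ):
    """Byte offset of Variable[i].Value[x][y][z][n] in a façade file (D = 3)."""
    cells = XX * YY * ZZ
    index = cells + (((i * ZZ + z) * YY + y) * XX + x) * 3 + n
    return 4 * index

print(expected_edt_size(2, 3, 2, 2, 3, True))       # 336
print(facade_value_offset(0, 0, 0, 0, 0, 3, 2, 2))  # 48
```

Comparing the expected size with the actual size of the EDT file is a cheap way to detect a mismatch between the EDX metadata and the binary data before reading any values.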