Standard Name Convention#
The “Standard Name Convention” is one realization of a convention promoted by the toolbox. It is based on the idea that every dataset must have a physical unit (or none if it is dimensionless) and that datasets must be identifiable via an identifier attribute rather than the dataset name itself.
The key standard attributes are:
standard_name: A human- and machine-readable dataset identifier based on construction rules and listed in a “Standard Name Table”.
standard_name_table: List of standard_name entries together with their base unit (SI) and a comprehensive description. It also includes additional information about how a standard_name can be transformed into a new standard_name.
units: The unit attribute of a dataset. It need not be an SI unit, but it must be convertible to the SI unit registered in the Standard Name Table.
long_name: An alternative name if no standard_name is applicable.
This concept was first introduced by the Climate and Forecast (CF) community and is known as the CF convention. The h5RDMtoolbox adopts the concept and implements a generalized version of it, so that users can define their own discipline- or problem-specific standard name convention.
Main benefits of the convention are:
achieving self-describing files, which are both human- and machine-interpretable,
validating the correctness of dataset identifiers (standard_name) and their units,
allowing unit-aware processing of data.
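The validation and unit-aware processing mentioned above can be illustrated with a minimal sketch in plain Python. It does not use the toolbox: the names `TABLE`, `TO_SI` and `validate` are hypothetical, and the small hand-written conversion map stands in for what a real units library would provide.

```python
# Hypothetical mini standard name table: standard_name -> registered SI unit.
TABLE = {
    "static_pressure": "Pa",
    "temperature": "K",
    "time": "s",
}

# Hypothetical conversion map: unit -> (SI unit, factor to SI).
# A real implementation would rely on a units library instead.
TO_SI = {
    "Pa": ("Pa", 1.0),
    "kPa": ("Pa", 1e3),
    "bar": ("Pa", 1e5),
    "K": ("K", 1.0),
    "s": ("s", 1.0),
    "ms": ("s", 1e-3),
}


def validate(standard_name: str, units: str) -> float:
    """Return the factor that converts `units` to the registered SI unit.

    Raises KeyError for an unknown standard name or unit, and ValueError
    if the unit is not convertible to the registered SI unit.
    """
    si_unit = TABLE[standard_name]
    base, factor = TO_SI[units]
    if base != si_unit:
        raise ValueError(
            f"{units!r} is not convertible to {si_unit!r} for {standard_name!r}"
        )
    return factor


# Unit-aware processing: 1.2 bar expressed in the registered unit (Pa).
pressure_in_pa = 1.2 * validate("static_pressure", "bar")
```

A dataset with units of `ms` and the standard name `static_pressure` would be rejected by this check, because milliseconds cannot be converted to pascal.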
This chapter walks you through the concept and shows how to apply it.
import h5rdmtoolbox as h5tbx
import warnings
warnings.filterwarnings('ignore')
from h5rdmtoolbox.convention.standard_names.table import StandardNameTable
Standard Name Tables#
Example 1: cf-convention#
A standard name table should be defined in a document (typically XML or YAML). The corresponding object can then be initialized with the matching constructor method (from_yaml, from_web, …).
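To give an idea of what such a document contains, here is a minimal sketch that parses a small XML snippet with the Python standard library. The layout (`entry` elements with an `id` attribute and `canonical_units` and `description` children) is modeled on the CF table and should be treated as an assumption. The snippet itself and the helper `parse_table` are hypothetical and are not part of the toolbox.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened table document (not the real CF table).
XML = """
<standard_name_table>
  <entry id="x_wind">
    <canonical_units>m s-1</canonical_units>
    <description>Vector component along the grid x-axis.</description>
  </entry>
  <entry id="air_temperature">
    <canonical_units>K</canonical_units>
    <description>Bulk temperature of the air.</description>
  </entry>
</standard_name_table>
"""


def parse_table(xml_text: str) -> dict:
    """Map each entry id to its units and description."""
    root = ET.fromstring(xml_text.strip())
    table = {}
    for entry in root.iter("entry"):
        table[entry.get("id")] = {
            "units": entry.findtext("canonical_units", default=""),
            "description": entry.findtext("description", default=""),
        }
    return table


table = parse_table(XML)
print(table["x_wind"]["units"])
```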
For reading the original CF-convention table, do the following:
cf = StandardNameTable.from_web("https://cfconventions.org/Data/cf-standard-names/79/src/cf-standard-name-table.xml",
known_hash='4c29b5ad70f6416ad2c35981ca0f9cdebf8aab901de5b7e826a940cf06f9bae4')
cf
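The known_hash argument suggests that the downloaded file is checked for integrity. The following sketch shows such a check with the standard library, assuming a SHA-256 digest (the 64 hexadecimal digits above are consistent with that, but the algorithm actually used by from_web is an assumption). The helper `matches_known_hash` and the payload are hypothetical.

```python
import hashlib


def matches_known_hash(content: bytes, known_hash: str) -> bool:
    """Compare the SHA-256 hex digest of `content` with `known_hash`."""
    return hashlib.sha256(content).hexdigest() == known_hash.lower()


# Stand-in for a downloaded table file (not the real CF table).
payload = b"<standard_name_table></standard_name_table>"
expected = hashlib.sha256(payload).hexdigest()

print(matches_known_hash(payload, expected))         # unchanged content
print(matches_known_hash(payload + b" ", expected))  # modified content
```

Pinning the hash means that a table that has been changed after publication is detected rather than silently used.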
The standard names are items of the table object:
cf['x_wind']
-
- units : m/s
- description : "x" indicates a vector component along the grid x-axis, positive with increasing x. Wind is defined as a two-dimensional (horizontal) air velocity vector, with no vertical component. (Vertical motion in the atmosphere has the standard name upward_air_velocity.).
cf['x_wind'].units
cf['x_wind'].description
'"x" indicates a vector component along the grid x-axis, positive with increasing x. Wind is defined as a two-dimensional (horizontal) air velocity vector, with no vertical component. (Vertical motion in the atmosphere has the standard name upward_air_velocity.).'
Example 2: User defined table#
Initializing standard name tables from a web resource should be the standard process, because a project or community may have defined a table and published it under a DOI.
In particular, the h5rdmtoolbox supports tables that are published on Zenodo:
snt = StandardNameTable.from_zenodo(10428795)
snt
Here are the standard names of the table:
snt.names
['absolute_pressure',
'ambient_static_pressure',
'ambient_temperature',
'auxiliary_fan_rotational_speed',
'blade_inlet_angle',
'blade_inlet_diameter',
'blade_number',
'blade_outlet_angle',
'blade_outlet_diameter',
'coordinate',
'density',
'difference_of_total_pressure_to_static_pressure_between_across_fan',
'difference_of_wall_static_pressure_across_fan',
'difference_of_wall_static_pressure_across_orifice',
'dynamic_pressure',
'dynamic_viscosity',
'fan_efficiency',
'fan_flow_coefficient',
'fan_inlet_area',
'fan_outlet_area',
'fan_power_coefficient',
'fan_pressure_coefficient',
'fan_rotational_speed',
'fan_shaft_power',
'fan_specific_speed',
'fan_torque',
'fan_volume_flow_rate',
'impeller_diameter',
'impeller_inlet_width',
'impeller_outlet_width',
'impeller_volume_flow_rate',
'impeller_weight',
'inner_diameter_of_orifice',
'kinematic_viscosity',
'mass_flow_rate',
'outer_diameter_of_orifice',
'pulse_delay',
'relative_humidity',
'static_pressure',
'temperature',
'time',
'total_pressure',
'turbulent_kinetic_energy',
'velocity',
'vorticity',
'wall_static_pressure',
'xx_reynolds_stress',
'yx_reynolds_stress',
'yy_reynolds_stress',
'yz_reynolds_stress',
'zx_reynolds_stress',
'zy_reynolds_stress',
'zz_reynolds_stress']
In a notebook, we can also get a nice overview of the table by calling dump():
snt.dump()
| | description | units | vector | alias |
|---|---|---|---|---|
| absolute_pressure | Pressure is force per unit area. Absolute air pressure is pressure deviation to a total vacuum. | Pa | NaN | NaN |
| ambient_static_pressure | Static air pressure is the amount of pressure exerted by air that is not moving. Ambient static air pressure is the static air pressure of the surrounding air. | Pa | NaN | NaN |
| ambient_temperature | Air temperature is the bulk temperature of the air, not the surface (skin) temperature. Ambient air temperature is the temperature of the surrounding air. | K | NaN | NaN |
| auxiliary_fan_rotational_speed | Number of revolutions of an auxiliary fan. | 1/s | NaN | NaN |
| blade_inlet_angle | Angle of blade at inlet. | rad | NaN | NaN |
| blade_inlet_diameter | The inner diameter of the test fan (D1). | m | NaN | NaN |
| blade_number | The blade number is the number of blades of the test fan. | | NaN | NaN |
| blade_outlet_angle | Angle of blade at inlet. | rad | NaN | NaN |
| blade_outlet_diameter | The outer diameter of the test fan (D2). | m | NaN | NaN |
| coordinate | The spatial coordinate. | m | True | NaN |
| density | Air density is defined as the mass of air divided by its volume. | kg/m**3 | NaN | NaN |
| difference_of_total_pressure_to_static_pressure_between_across_fan | The difference of static pressure at fan outlet w.r.t. the total pressure upstream of the fan. The total pressure generally is not known at the fan inlet pipe but further upstream, e.g. in a settling chamber. The dataset must provide detailed information, e.g. referencing to the respective pressure measurement device containing the exact location in the setup. | Pa | NaN | difference_of_total_pressure_to_static_pressure_between_fan_outlet_and_fan_inlet |
| difference_of_wall_static_pressure_across_fan | Static air pressure is the amount of pressure exerted by air that is not moving. Difference of wall static air pressure across a fan is the difference between the static air pressure downstream (at fan_outlet) of the fan and the total air pressure upstream of the fan at the wall (at fan_inlet). | Pa | NaN | NaN |
| difference_of_wall_static_pressure_across_orifice | Differnece of static air pressure across orifice to compute volume flow rate according to DIN EN ISO 5167. | Pa | NaN | NaN |
| dynamic_pressure | Dynamic air pressure is a measure for kinetic energy per unit volume of moving air. | Pa | NaN | NaN |
| dynamic_viscosity | Dynamic air viscosity indicates the resistance of air towards deformation under shear stress. (https://doi.org/10.1016/B978-0-08-096949-7.00020-0). | Pa*s | NaN | NaN |
| fan_efficiency | Total fan efficiency as defined in (CAROLUS, Thomas. Ventilatoren-Aerodynamischer Entwurf, Schallvorhersage. Konstruktion, 2013, 2. Jg., p.5, eq.1.16). | | NaN | NaN |
| fan_flow_coefficient | Air flow coefficient is a dimensionless number as defined in (CAROLUS, Thomas. Ventilatoren-Aerodynamischer Entwurf, Schallvorhersage. Konstruktion, 2013, 2. Jg., p.2, eq.1.3). The addition "of_fan" indicates that this coefficient applies to the deployed fan. | | NaN | NaN |
| fan_inlet_area | The fan cross-sectional area at the location "fan_inlet" for fans with a casing. The position of the referred cross-sectional area is in the pipe upstream of the fan. The area is generally taken to compute the dynamic pressure at the inlet of the fan based on the volume flow rate. | m**2 | NaN | NaN |
| fan_outlet_area | The fan cross-sectional area at the location "fan_outlet" for fans with a casing. The position of the referred cross-sectional area is in the pipe downstream of the fan. The area is generally taken to compute the dynamic pressure at the outlet of the fan based on the volume flow rate. | m**2 | NaN | NaN |
| fan_power_coefficient | Power coefficient is a dimensionless number as defined in (CAROLUS, Thomas. Ventilatoren-Aerodynamischer Entwurf, Schallvorhersage. Konstruktion, 2013, 2. Jg., p.2, eq.1.5). The addition "of_fan" indicates that this coefficient applies for the deployed fan. | | NaN | NaN |
| fan_pressure_coefficient | Total pressure coefficient is a dimensionless number as defined in (CAROLUS, Thomas. Ventilatoren-Aerodynamischer Entwurf, Schallvorhersage. Konstruktion, 2013, 2. Jg., p.2, eq.1.4). The addition "of_fan" indicates that this coefficient applies for the deployed fan. | | NaN | NaN |
| fan_rotational_speed | Number of revolutions of the test fan. | 1/s | NaN | NaN |
| fan_shaft_power | Power of fan drive shaft. | W | NaN | NaN |
| fan_specific_speed | Specific speed of the fan as defined in (CAROLUS, Thomas. Ventilatoren-Aerodynamischer Entwurf, Schallvorhersage. Konstruktion, 2013, 2. Jg., p.2, eq.1.6). | | NaN | NaN |
| fan_torque | The torque acting on the impeller of the fan. | Nm | NaN | NaN |
| fan_volume_flow_rate | Air volume flow rate is the volume of air that passes a cross section per unit time. The volume flow rate of the fan is the volume flow entering and leaving the fan. Due to gaps between the impeller and the housing, the volume flow rate is lower than the volume flow rate through the impeller (see impeller_volume_flow_rate). | m**3/s | NaN | NaN |
| impeller_diameter | The diameter of the impeller of the test fan, also D3. For some fans D2 is equal to D3. | m | NaN | NaN |
| impeller_inlet_width | The width of the impeller inlet. | m | NaN | NaN |
| impeller_outlet_width | The width of the impeller outlet. | m | NaN | NaN |
| impeller_volume_flow_rate | Air volume flow rate is the volume of air that passes a cross section per unit time. The volume flow rate of the impeller is the volume flow entering and leaving the impeller. Due to gaps between the impeller and the housing, this volume flow rate is higher than the volume flow rate through the fan (see fan_volume_flow_rate). | m**3/s | NaN | NaN |
| impeller_weight | Weight of the impeller. | kg | NaN | NaN |
| inner_diameter_of_orifice | Inner diameter of an orifice. | m | NaN | NaN |
| kinematic_viscosity | Dynamic air viscosity indicates the resistance of air towards deformation under shear stress. Kinematic viscosity. Dynamic air viscosity divided by air denisity equals kinematic air viscosity. (https://doi.org/10.1016/B978-0-12-410461-7.00007-9). | m**2/s | NaN | NaN |
| mass_flow_rate | Air mass flow rate is the mass of air that passes a certain cross sectiont per unit time. | kg/s | NaN | NaN |
| outer_diameter_of_orifice | Outer diameter of an orifice. | m | NaN | NaN |
| pulse_delay | Time between two laser pulses. | s | NaN | NaN |
| relative_humidity | Relative humidity is a measure of the water vapor content of air. | | NaN | NaN |
| static_pressure | Static air pressure is the amount of pressure exerted by air that is not moving. | Pa | NaN | NaN |
| temperature | Air temperature is the bulk temperature of the air, not the surface (skin) temperature. (CF Conventions). | degC | NaN | NaN |
| time | Recording time since start of experiment. | s | NaN | NaN |
| total_pressure | The sum of dynamic and static air pressure. | Pa | NaN | NaN |
| turbulent_kinetic_energy | The kinetic energy per unit mass of a fluid. | m**2/s**2 | NaN | NaN |
| velocity | Velocity. | m/s | True | NaN |
| vorticity | Vorticity. | 1/s | True | NaN |
| wall_static_pressure | Static air pressure is the amount of pressure exerted by air that is not moving. Wall static air pressure is the static air pressure at the wall. | Pa | NaN | NaN |
| xx_reynolds_stress | Reynolds stress is a tensor quantity. "xx" indicates that the variations of x-velocity is used. | m**2/s**2 | NaN | NaN |
| yx_reynolds_stress | Reynolds stress is a tensor quantity. "yx" indicates that the variations of x- and y-velocity are used. | m**2/s**2 | NaN | NaN |
| yy_reynolds_stress | Reynolds stress is a tensor quantity. "yy" indicates that the variations of y-velocity is used. | m**2/s**2 | NaN | NaN |
| yz_reynolds_stress | Reynolds stress is a tensor quantity. "yz" indicates that the variations of y- and z-velocity are used. | m**2/s**2 | NaN | NaN |
| zx_reynolds_stress | Reynolds stress is a tensor quantity. "zx" indicates that the variations of z- and x-velocity are used. | m**2/s**2 | NaN | NaN |
| zy_reynolds_stress | Reynolds stress is a tensor quantity. "zy" indicates that the variations of z- and y-velocity are used. in y-axis direction. | m**2/s**2 | NaN | NaN |
| zz_reynolds_stress | Reynolds stress is a tensor quantity. "zy" indicates that the variations of z-velocity is used. | m**2/s**2 | NaN | NaN |
Transformation of base standard names#
Not every valid standard name needs to be listed in the table. Additional names can be derived from the listed ones through so-called transformations. There are two ways to transform a standard name:
Using affixes: adding a prefix or a suffix
Applying a mathematical operation to the name
1. Adding affixes#
Note that ‘x_velocity’ is not part of the table:
'x_velocity' in snt
False
… but ‘velocity’ is, and it is a vector. The vector property tells us whether we can add a “vector component name” as a prefix, e.g. “x” or “y”:
snt['velocity'].is_vector()
True
The available vector components are defined in the table:
snt.affixes['component'].values
{'x': 'X indicates the x-axis component of the vector.',
'y': 'Y indicates the y-axis component of the vector.',
'z': 'Z indicates the z-axis component of the vector.'}
Thus, when the table is indexed with “x_velocity”, it checks whether the prefix is valid and, if so, returns the new (transformed) standard name:
snt['x_velocity']
-
- units : m/s
- description : Velocity. X indicates the x-axis component of the vector.
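The lookup logic described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the toolbox's implementation: `BASE_NAMES`, `COMPONENTS` and `resolve` are hypothetical names, and splitting at the first underscore is a simplifying assumption.

```python
# Hypothetical excerpt of a table: base name -> units, description, vector flag.
BASE_NAMES = {
    "velocity": {"units": "m/s", "description": "Velocity.", "vector": True},
    "static_pressure": {
        "units": "Pa",
        "description": "Static air pressure.",
        "vector": False,
    },
}

# Allowed values of the "component" prefix, as listed in the table above.
COMPONENTS = {
    "x": "X indicates the x-axis component of the vector.",
    "y": "Y indicates the y-axis component of the vector.",
    "z": "Z indicates the z-axis component of the vector.",
}


def resolve(name: str) -> dict:
    """Return the entry for `name`, accepting a component prefix on vectors."""
    if name in BASE_NAMES:
        return dict(BASE_NAMES[name])
    # Assumption: the prefix is everything before the first underscore.
    prefix, _, base = name.partition("_")
    entry = BASE_NAMES.get(base)
    if entry is None or prefix not in COMPONENTS:
        raise KeyError(name)
    if not entry["vector"]:
        raise KeyError(f"{base!r} is not a vector, so {prefix!r} cannot be prefixed")
    return {
        "units": entry["units"],
        "description": f"{entry['description']} {COMPONENTS[prefix]}",
        "vector": False,
    }


print(resolve("x_velocity")["description"])
```

In this sketch, `resolve("x_static_pressure")` raises a `KeyError`, because a component prefix is only accepted on names flagged as vectors.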
2. Applying a mathematical operation#
During data processing, datasets are often transformed by mathematical functions, such as taking the square of a quantity or the derivative of one quantity with respect to (wrt) another. Some mathematical operations of this kind are supported in the current version, e.g.:
snt['derivative_of_x_velocity_wrt_x_coordinate']
-
- units : 1/s
- description : Derivative of x_velocity with respect to x_coordinate. Velocity. X indicates the x-axis component of the vector. The spatial coordinate. X indicates the x-axis component of the vector.
snt['square_of_static_pressure']
-
- units : Pa**2
- description : Square of static_pressure. Static air pressure is the amount of pressure exerted by air that is not moving.
snt['arithmetic_mean_of_static_pressure']
-
- units : Pa
- description : Arithmetic mean of static_pressure. Static air pressure is the amount of pressure exerted by air that is not moving.
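How the units of such transformed names follow from the base units can be sketched as below. This is a simplified illustration under stated assumptions, not the toolbox's implementation: units are stored as exponents of unit symbols, and the names `BASE`, `_combine` and `units_of` as well as the regular expressions are hypothetical.

```python
import re

# Hypothetical base table: units stored as exponents of unit symbols,
# e.g. m/s -> {"m": 1, "s": -1}.
BASE = {
    "x_velocity": {"m": 1, "s": -1},
    "x_coordinate": {"m": 1},
    "static_pressure": {"Pa": 1},
}


def _combine(a: dict, b: dict, sign: int) -> dict:
    """Add (sign=+1) or subtract (sign=-1) the exponents of b to/from a."""
    out = dict(a)
    for unit, exp in b.items():
        out[unit] = out.get(unit, 0) + sign * exp
    # Drop units whose exponents cancel out.
    return {u: e for u, e in out.items() if e != 0}


def units_of(name: str) -> dict:
    """Derive the unit exponents of a (possibly transformed) standard name."""
    if name in BASE:
        return dict(BASE[name])
    m = re.fullmatch(r"derivative_of_(.+)_wrt_(.+)", name)
    if m:  # units of the numerator divided by units of the denominator
        return _combine(units_of(m.group(1)), units_of(m.group(2)), -1)
    m = re.fullmatch(r"square_of_(.+)", name)
    if m:  # every exponent doubles
        inner = units_of(m.group(1))
        return _combine(inner, inner, +1)
    m = re.fullmatch(r"arithmetic_mean_of_(.+)", name)
    if m:  # averaging does not change the units
        return units_of(m.group(1))
    raise KeyError(name)


print(units_of("derivative_of_x_velocity_wrt_x_coordinate"))
```

The metre exponents cancel in the derivative, which leaves only the inverse second, in line with the 1/s shown in the table output above.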
Usage with HDF5 files#
Let’s apply the convention to HDF5 files. For convenience, we take the existing tutorial convention and remove some standard attributes to limit the example to the attributes relevant to the standard name convention:
zenodo_cv = h5tbx.convention.from_zenodo('https://zenodo.org/record/8357399')
sn_cv = zenodo_cv.pop('contact', 'comment', 'references', 'data_type')
sn_cv.name = 'standard name convention'
sn_cv.register()
h5tbx.use(sn_cv)
sn_cv
Convention("standard name convention")
Find out about the available standard names: We do this by creating a file and retrieving the attribute standard_name_table. The convention sets it by default, so it is available without setting it explicitly:
with h5tbx.File() as h5:
    snt = h5.standard_name_table
    print('The available (base) standard names are: ', snt.names)
The available (base) standard names are: ['coordinate', 'static_pressure', 'time', 'velocity']
One possible dataset based on the standard name table could be “x_velocity”. This is possible because “component” is available in the list of affixes. The transformation pattern shows that “component” is a prefix, and “x” is among the available components, so “x_velocity” is a valid transformed standard name from the given table:
print('Available affixes: ', snt.affixes.keys())
print('\nValues for the component prefix:')
snt.affixes['component']
Available affixes: dict_keys(['device', 'location', 'reference_frame', 'component'])
Values for the component prefix:
<Affix: name="component", description="Components are prefixes to the standard_name, e.g. x_velocity." transformation_pattern=^(.*)_(.*)$, values=['x', 'y', 'z']>
Let’s access the name from the table. It exists and the description is adjusted, too:
snt['x_velocity']
-
- units : m/s
- description : Velocity refers to the change of position over time. Velocity is a vector quantity. X indicates the x-axis component of the vector.
Creating an x-velocity dataset:
Usage with HDF5 files (update)#
from ontolutils import SSNO
with h5tbx.File(mode='w') as h5:
    ds = h5.create_dataset('u', data=3)
    ds.attrs['standard_name', SSNO.hasStandardName] = 'x_velocity'
    ds.rdf.object['standard_name'] = SSNO.StandardName # https://matthiasprobst.github.io/ssno#StandardName
    ds = h5.create_dataset('v', data=3)
    ds.attrs['standard_name', SSNO.hasStandardName] = 'y_velocity'
    ds.rdf.object['standard_name'] = SSNO.StandardName # https://matthiasprobst.github.io/ssno#StandardName
    h5.dump(collapsed=False)
    hdf_filename = h5.hdf_filename
-
-
3 [] (int64)
- standard_name
https://matthiasprobst.github.io/ssno#hasStandardName: x_velocity
https://matthiasprobst.github.io/ssno#StandardName
-
3 [] (int64)
- standard_name
https://matthiasprobst.github.io/ssno#hasStandardName: y_velocity
https://matthiasprobst.github.io/ssno#StandardName
- standard_name
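With the standard_name attribute attached, datasets can be identified by their identifier rather than by their dataset name, which is the core idea of the convention. The following sketch shows this with a plain dictionary as a hypothetical in-memory stand-in for the file written above. The helper `find_by_standard_name` is not a toolbox function.

```python
# Hypothetical in-memory stand-in for the file written above:
# dataset name -> attributes.
datasets = {
    "u": {"standard_name": "x_velocity"},
    "v": {"standard_name": "y_velocity"},
}


def find_by_standard_name(datasets: dict, standard_name: str) -> list:
    """Return the names of all datasets carrying the given standard_name."""
    return [
        name
        for name, attrs in datasets.items()
        if attrs.get("standard_name") == standard_name
    ]


print(find_by_standard_name(datasets, "x_velocity"))
```

The dataset names `u` and `v` carry no meaning for the search. Only the attribute value is compared.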