Data

Selections are subsets of multidimensional datasets, i.e. collections of data points with associated observations, observation flags, and observation metadata.

Most questionnaires focus on spatiotemporal slices. In the dominant use case, they solicit data from member countries for given years. A range of previous years is included for revision or retrospective completion.

The system has been entirely designed around this use case. It assumes that:
questionnaires target a single dataset with:
  a geographical dimension, typically countries.
  a temporal dimension expressed in years.
  an item dimension.
  a measure dimension.
they select exactly one country.
they select a reporting year plus a range of previous years.
they can select an arbitrary number of items and measures.
note: we've recently relaxed some of these assumptions: datasets can now have additional dimensions.
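
As an illustration only, the assumptions above can be pictured with a small Python sketch. The class names, field names, and the `fits_dominant_use_case` helper are hypothetical and do not come from the system itself; the country, item, and measure codes are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class DataPoint:
    """Coordinates of one data point, one code per dimension (hypothetical model)."""
    country: str
    year: int
    item: str
    measure: str


@dataclass
class Observation:
    """What may be recorded at a data point: value, flag, and metadata."""
    value: Optional[float] = None
    flag: Optional[str] = None
    metadata: Dict[str, str] = field(default_factory=dict)


def fits_dominant_use_case(selection: Dict[DataPoint, Observation],
                           reporting_year: int) -> bool:
    """Check the single-country and year-range assumptions listed above."""
    if not selection:
        return False
    countries = {point.country for point in selection}
    years = {point.year for point in selection}
    # Years must form an unbroken range that ends at the reporting year.
    expected_years = set(range(min(years), reporting_year + 1))
    return len(countries) == 1 and years == expected_years


# Illustrative selection: one country, three consecutive years, one item, one measure.
selection = {
    DataPoint("ITA", year, "wheat", "production"): Observation()
    for year in (2018, 2019, 2020)
}
print(fits_dominant_use_case(selection, reporting_year=2020))
```

The sketch deliberately ignores the additional dimensions mentioned in the note; supporting them would mean replacing the fixed fields with a mapping from dimension name to code.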

Specifications

Selections are specified by listing their coordinates along each dimension. They are then generated by a process that combines the specified coordinates to yield all the data points. Next, selections are loaded, i.e. filled with any observations, flags, and metadata that may already exist at their data points. Finally, loaded selections are pruned of all the empty points (i.e. points that have no associated observations) that do not meet the criteria for inclusion in questionnaires.
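
The generate, load, and prune steps can be sketched roughly as follows. This is a minimal illustration under assumptions: the function names, the tuple representation of data points, and the `keep_empty` predicate are invented for the example, and the real inclusion criteria are those described under Constraints.

```python
from itertools import product

# Assumed dimension order for the tuples that represent data points.
DIMENSIONS = ("country", "year", "item", "measure")


def generate(spec):
    """Combine the coordinates listed per dimension into all data points."""
    return list(product(*(spec[dim] for dim in DIMENSIONS)))


def load(points, dataset):
    """Attach the existing observation to each point, or None if there is none."""
    return {point: dataset.get(point) for point in points}


def prune(loaded, keep_empty):
    """Drop empty points unless the inclusion predicate retains them."""
    return {
        point: obs
        for point, obs in loaded.items()
        if obs is not None or keep_empty(point)
    }


spec = {
    "country": ["ITA"],
    "year": [2019, 2020],
    "item": ["wheat", "rice"],
    "measure": ["production"],
}
# Illustrative dataset content; the value is made up.
dataset = {("ITA", 2019, "wheat", "production"): 7.1}

points = generate(spec)
loaded = load(points, dataset)
# Hypothetical inclusion criterion: always keep points for the reporting year.
selection = prune(loaded, keep_empty=lambda point: point[1] == 2020)
```

Here the specification yields four points, and pruning drops only the empty 2019 point for rice, since it has no observation and does not satisfy the assumed criterion.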

Parameters

Some of the coordinates are specified as parameters, to be replaced with actual coordinates at generation time. Thus a single specification begets many questionnaires, each of which corresponds to a combination of the parameters, or parameter set. Typically, parameter sets are collected in parameter suites that match campaign requirements.

In the current system:
the country and the reporting year are parameters.
the full year range is specified in terms of a number of years before the reporting year.
all the other coordinates are explicitly specified.
Parameter sets are provided at generation time and recorded on generated questionnaires. Parameter suites, like campaigns, are not explicitly modelled.
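
To make the role of parameters concrete, here is a hedged sketch of how a parametric specification might be instantiated. `ParameterSet`, `instantiate`, and the `years_back` key are hypothetical names chosen for the example; only the idea (country and reporting year as parameters, year range as a number of years before the reporting year) comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParameterSet:
    """One combination of parameter values (hypothetical model)."""
    country: str
    reporting_year: int


def instantiate(spec, params):
    """Replace parameters with actual coordinates at generation time."""
    first_year = params.reporting_year - spec["years_back"]
    return {
        "country": [params.country],
        # Full range: the previous years plus the reporting year itself.
        "year": list(range(first_year, params.reporting_year + 1)),
        "item": list(spec["items"]),
        "measure": list(spec["measures"]),
    }


spec = {"years_back": 3, "items": ["wheat", "rice"], "measures": ["production"]}

# A parameter suite is just a collection of parameter sets, e.g. one per country.
suite = [ParameterSet(country, 2020) for country in ("ITA", "FRA")]
questionnaires = [instantiate(spec, params) for params in suite]
```

One specification thus yields one set of coordinates per parameter set, which matches the "single specification begets many questionnaires" behaviour described above.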

Structure

Specifications structure selections into thematic subsets, and structures can nest. Structure also gives scope to specification constraints, first and foremost the constraints we discuss below.

In the current system:
the structure of selections is directly derived from the structure of the questionnaire. In other words, data and layout share the same structure and this structure projects a presentation order on questionnaires: data points in one structure appear before those of the structures that follow.
two levels are available: sections and subsections.
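
The two-level structure and the presentation order it induces might be modelled as below. The `Section` and `Subsection` classes and the `presentation_order` helper are assumptions made for illustration, not the system's actual types.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Subsection:
    """Second structural level: a thematic group of items."""
    title: str
    items: List[str]


@dataclass
class Section:
    """First structural level: a list of subsections."""
    title: str
    subsections: List[Subsection]


def presentation_order(sections):
    """Flatten the structure; earlier structures come before later ones."""
    return [
        (section.title, subsection.title, item)
        for section in sections
        for subsection in section.subsections
        for item in subsection.items
    ]


# Illustrative titles and items only.
sections = [
    Section("Crops", [Subsection("Cereals", ["wheat", "rice"]),
                      Subsection("Pulses", ["lentils"])]),
    Section("Livestock", [Subsection("Cattle", ["beef"])]),
]
order = presentation_order(sections)
```

Because data and layout share the same structure, a single traversal like this one is enough to decide where each data point appears on the questionnaire.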

Constraints

Constraints override the basic specification mechanism, adding or removing data points within given structures based on given criteria. There are two types of constraints:

static constraints don't depend on the current contents of datasets and must be enforced ahead of selection loading.
dynamic constraints do depend on data and can only be enforced after selections are loaded.

The system supports two types of constraints:
item overrides: these are static constraints whereby the coordinates of measures and other dimensions inside subsections can be overridden on a per-item basis. Overrides are only available in relation to items.
item inclusions: these are dynamic constraints whereby items without observations are retained in subsections if observations about them exist in a historical year range, a superset of the range chosen for the questionnaire. Items can also be forcibly included, or included because they are already included in some previous section (only back-references are allowed).
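
The difference between the two kinds of constraints can be sketched as follows. Both functions are hypothetical simplifications: `apply_item_overrides` covers only the measure dimension, and `included_items` leaves out back-references to previous sections.

```python
def apply_item_overrides(items, default_measures, overrides):
    """Static constraint: needs no data, so it can run before loading.

    Returns (item, measure) pairs, using per-item measures where overridden.
    """
    return [
        (item, measure)
        for item in items
        for measure in overrides.get(item, default_measures)
    ]


def included_items(items, observed, years, historical_years, forced=frozenset()):
    """Dynamic constraint: needs loaded data.

    `observed` is a set of (item, year) pairs that have observations.
    An item is kept if it has observations in the questionnaire range or in
    the wider historical range, or if it is forcibly included.
    """
    kept = []
    for item in items:
        in_range = any((item, year) in observed for year in years)
        in_history = any((item, year) in observed for year in historical_years)
        if in_range or in_history or item in forced:
            kept.append(item)
    return kept


items = ["wheat", "rice", "maize", "barley"]
pairs = apply_item_overrides(items, ["production"], {"rice": ["area", "yield"]})

observed = {("wheat", 2020), ("rice", 2012)}  # illustrative observations
kept = included_items(
    items,
    observed,
    years=range(2018, 2021),
    historical_years=range(2010, 2021),
    forced={"maize"},
)
```

In this example rice survives only because of the historical range, maize only because it is forced, and barley is dropped.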

Reference Metadata

The coordinates of selections are drawn from codelists and may have associated metadata. The names of coordinates in multiple languages are a case in point, though more information may be available about the underlying concept.

Selections include this metadata so that it can be presented to correspondents in addition to, or in place of, codes. A selection is resolved when its coordinates are replaced with metadata for presentation purposes. Resolution may occur when questionnaires are generated or when they're rendered. The language of the questionnaire is a key parameter of the process.

In the current system:
multilingual names are a mandatory part of the model of dimensions. Additional metadata has been recently introduced as an optional extension of the model. In the extension, metadata has a JSON representation with no constraint other than well-formedness.
only item metadata can be included in selections, hence questionnaires.
the metadata of interest is identified by a path into the JSON representation, and the path is resolved at rendering time.
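
A rough sketch of resolution follows. The dot-separated path syntax, the fallback language, and the function names are assumptions made for the example; the source only states that a path into the JSON representation identifies the metadata of interest. The code `0111` and its names are illustrative.

```python
import json


def resolve_path(metadata, path):
    """Follow a dot-separated path (assumed syntax) into parsed JSON metadata."""
    node = metadata
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None  # the path does not exist in this metadata
        node = node[key]
    return node


def label(code, names, language, fallback="en"):
    """Pick the name of a code in the questionnaire language, else fall back."""
    by_language = names.get(code, {})
    return by_language.get(language) or by_language.get(fallback) or code


# Illustrative codelist names and item metadata.
names = {"0111": {"en": "Wheat", "fr": "Blé"}}
metadata = json.loads('{"concept": {"unit": "tonnes"}}')

french_label = label("0111", names, "fr")
unit = resolve_path(metadata, "concept.unit")
```

Returning the bare code when no name is available keeps rendering robust, though whether the system behaves this way is not stated in the source.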

Metadata

Along with data, questionnaires collect various forms of metadata.

Some metadata relates specifically to qualitative aspects of the data (e.g. source, completeness, frequency) or to the focal point directly responsible for editing the data, and may vary from selection to selection. Other metadata relates to the questionnaire as a whole, including its focal point at FAO and targeted feedback from the correspondent.

In the current system:
as with selections, metadata specifications include model and presentation aspects.
for each section dedicated to a selection, there's a corresponding section that gathers metadata about the data in the selection.
there are sections dedicated to metadata about the questionnaire as a whole (e.g. the so-called global metadata section).
metadata is divided into groups of correlated elements, where elements are named and have a value which is either free-form or else constrained to a set of predefined choices.
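
The group and element model described in the last point could look roughly like this. `Element`, `Group`, and their methods are hypothetical names, and the element names and choices are invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Element:
    """A named metadata element; `choices=None` means the value is free-form."""
    name: str
    choices: Optional[Tuple[str, ...]] = None

    def accepts(self, value):
        """Free-form elements take any string; others only a predefined choice."""
        return isinstance(value, str) and (
            self.choices is None or value in self.choices
        )


@dataclass
class Group:
    """A group of correlated metadata elements."""
    name: str
    elements: List[Element]

    def invalid(self, values: Dict[str, str]) -> List[str]:
        """Names of elements whose provided value is not acceptable."""
        return [
            element.name
            for element in self.elements
            if element.name in values and not element.accepts(values[element.name])
        ]


quality = Group("quality", [
    Element("source"),                                  # free-form
    Element("completeness", ("complete", "partial")),   # constrained
])
problems = quality.invalid({"source": "National census", "completeness": "unknown"})
```

Here only the `completeness` value is rejected, because it is not among the assumed predefined choices.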

At import time, all forms of metadata are stored within the system in close proximity to the data.

In the current system:
all metadata is stored as block metadata, in association with the selection, rather than the observations.
metadata elements and groups have types that map them unambiguously to the metadata model used throughout the system.

