Databank of basic data
Here you can read a general description of the databank of basic data in Denmark’s Data Portal (DDP).
Basic data (also called ‘DDP basic data’ or ‘data in the databank of basic data in Denmark’s Data Portal’) refers to the microdata that DDP offers to external users for research and statistical purposes. Statistics Denmark (DST) has collected a wide range of register data with historical information in a bank of basic data within the DDP App. The various registers come from both external sources and internal statistical offices. Thematically, the data cover a broad spectrum, and the statistical unit may be individuals, addresses, enterprises, library loans, motor vehicles, and more. All basic data must comply with a set of standards for formats, naming, and similar conventions.
The data undergo extensive processing before being placed in the bank of basic data. There are several reasons for this:
- Standardization saves users time-consuming preparation: by ensuring uniform data, external users can avoid a significant amount of manual data processing.
- Key variables must be standardized to enable data linkage: combining data across years and registers requires a common standard for key variables.
- Key variables must be standardized to enable pseudonymization: data can only be pseudonymized correctly if variables follow fixed standards and naming.
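As a rough illustration of the linkage point above, the following minimal sketch shows why a common standard for key variables matters when combining registers. Everything in it is an assumption made for the example: the variable names (`pnr`, `person_nr`, `PNR`), the 10-character zero-padded format, and the sample values are invented and do not describe DST's actual standards.

```python
# Hypothetical sketch: names, formats and values are invented for illustration.
KEY_ALIASES = {"pnr", "PNR", "person_nr"}  # assumed source-specific names for the same key
STANDARD_KEY = "PNR"                       # assumed standard name for the key variable


def standardize(record):
    """Rename the key variable to the standard name and normalize its format."""
    out = {}
    for name, value in record.items():
        if name in KEY_ALIASES:
            # Normalize to a zero-padded 10-character string without separators.
            out[STANDARD_KEY] = str(value).replace("-", "").zfill(10)
        else:
            out[name] = value
    return out


def link(left, right):
    """Inner join of two standardized registers on the standard key."""
    index = {row[STANDARD_KEY]: row for row in right}
    return [
        {**row, **index[row[STANDARD_KEY]]}
        for row in left
        if row[STANDARD_KEY] in index
    ]


# Two invented registers that name and format the same key differently.
income_2019 = [{"pnr": 101, "income": 250000}]
education_2020 = [{"person_nr": "0000000-101", "level": "MSc"}]

linked = link(
    [standardize(r) for r in income_2019],
    [standardize(r) for r in education_2020],
)
```

Without the `standardize` step, the two records would not match, because both the key name and the key format differ between the two sources.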
Purpose of the databank of basic data
The purpose of the databank of basic data is to collect microdata for research and analysis in a way that makes it easy to make them available to researchers.
Content and use of the databank of basic data
DDP aims to ensure that all DST data from the official statistical program are available as basic data. This primarily includes microdata related to individuals, enterprises, or addresses.
DST also holds data that are not part of the official statistical program but that DDP has received or collected for various reasons. These data are also made available as basic data to support reuse, so that the statistical offices do not have to design customized extracts for users.
To qualify as basic data, a number of conditions must be met. For certain data, special considerations regarding data confidentiality, funding arrangements or data quality may influence how the data can be used. DST enters into agreements with other authorities (and data owners) for regular deliveries of register data that can be made available to researchers. These data are also placed in the databank of basic data. Read more under Data from other data providers for the databank of basic data.
Before data may be used on the research server, variables that can directly identify individuals undergo pseudonymization. This means that all variables containing identification information, such as CPR numbers, CVR numbers, addresses, and property numbers, are recoded in a pseudonymization process before being transferred to the user’s project. Each register marks which variables must be pseudonymized. When new variables are added to a register, DDP assesses, together with the data owner and based on the data confidentiality policy, whether the variables must be pseudonymized before the data can be released for research and analysis.
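The idea of recoding only the marked variables can be sketched as follows. This is a minimal illustration, not a description of DST's actual pseudonymization process, which the text does not specify: the keyed HMAC-SHA256 recoding, the secret key, the set of marked variable names, and the sample record are all assumptions introduced for the example.

```python
import hashlib
import hmac

# Hypothetical values: the key, the marked variables and the record are invented.
SECRET_KEY = b"demo-key-not-for-real-use"
MARKED_FOR_PSEUDONYMIZATION = {"cpr", "address"}  # assumed per-register marking


def pseudonymize_value(value, key=SECRET_KEY):
    """Recode a value with a keyed hash so equal inputs give equal pseudonyms."""
    digest = hmac.new(key, str(value).encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # shortened only to keep the example readable


def pseudonymize_record(record, marked=MARKED_FOR_PSEUDONYMIZATION):
    """Recode only the variables marked as identifying; leave the rest unchanged."""
    return {
        name: pseudonymize_value(value) if name in marked else value
        for name, value in record.items()
    }


record = {"cpr": "0000000101", "address": "Example Street 1", "income": 250000}
released = pseudonymize_record(record)
```

Because the same input always maps to the same pseudonym under a given key, records recoded this way can still be linked across registers, which is why the key variables need a fixed format before recoding.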
Applying the procedures and guidelines described ensures that data stored in the databank of basic data and presented within DST follow a standardized format. This makes it easy for researchers to access the data and navigate the available datasets. Read more about where to find documentation for basic data on the page Documentation of data.