1. File Organization
You may have many projects or files. Organizing your files properly saves time when you need to find and manage them later. Creating subfolders in your storage helps you organize your files effectively.
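As a minimal sketch of creating subfolders for a project, the following Python snippet builds a small folder tree. The folder names and the function name `create_project_folders` are illustrative assumptions, not a scheme prescribed by this guide; adapt them to whatever scheme suits your project.

```python
import tempfile
from pathlib import Path

# Hypothetical layout: these folder names are assumptions for illustration only.
SUBFOLDERS = ["data/raw", "data/processed", "docs", "scripts", "outputs"]


def create_project_folders(root, subfolders=SUBFOLDERS):
    """Create each subfolder under root and return the resulting paths."""
    root = Path(root)
    created = []
    for name in subfolders:
        folder = root / name
        # parents=True creates intermediate folders; exist_ok=True makes reruns safe.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


if __name__ == "__main__":
    # Demonstrate in a temporary directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        for path in create_project_folders(tmp):
            print(path.relative_to(tmp))
```

Running the function again on the same root is harmless, so it can also be used to add missing folders to an existing project.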
There are many ways to organize the files in your storage. Some possible organization schemes are:
2. File Naming Conventions
When you name a file, it is advisable that the name is:
 | Preferred | Not preferred |
Format | Chronological order by yyyymmdd | Naming files by ddmmyyyy gives a messy date order |
Example | 20180809.txt, 20190302.txt, 20200103.txt | 02032019.txt, 03012020.txt, 09082018.txt |
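The effect of the two date formats on sort order can be checked with a short Python sketch. It uses the example file names from the table above; the helper name `to_sortable_name` is an assumption introduced for illustration, and it assumes every file name is exactly a ddmmyyyy date followed by an extension.

```python
from datetime import datetime


def to_sortable_name(filename):
    """Convert 'ddmmyyyy.ext' to 'yyyymmdd.ext' so names sort by date."""
    stem, dot, ext = filename.partition(".")
    date = datetime.strptime(stem, "%d%m%Y")  # raises ValueError on a bad date
    return date.strftime("%Y%m%d") + dot + ext


ddmmyyyy_names = ["02032019.txt", "03012020.txt", "09082018.txt"]

# Alphabetical sorting of ddmmyyyy names orders by day first, not by date.
print(sorted(ddmmyyyy_names))

# After conversion, alphabetical order and chronological order coincide.
print(sorted(to_sortable_name(name) for name in ddmmyyyy_names))
```

The yyyymmdd form works because the most significant part of the date (the year) comes first, so plain text sorting in any file browser gives chronological order.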
Here are some golden rules for storage and backup:
1. LOCKSS: Lots of Copies Keep Stuff Safe
2. 3-2-1 backup
3. Good storage
4. Poor storage
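For rules 1 and 2 above, the following Python sketch copies a file to several backup locations and verifies each copy with a checksum. The 3-2-1 rule is commonly stated as keeping three copies of the data, on two different storage media, with one copy off-site. The function names `sha256_of` and `back_up` are assumptions for illustration, and the backup directories are stand-ins: the script cannot tell whether they really sit on different media or at a different site, so that part remains your responsibility.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source, backup_dirs):
    """Copy source into every backup directory and verify each copy."""
    source = Path(source)
    expected = sha256_of(source)
    copies = []
    for directory in backup_dirs:
        directory = Path(directory)
        directory.mkdir(parents=True, exist_ok=True)
        target = directory / source.name
        shutil.copy2(source, target)  # copy2 also preserves timestamps
        if sha256_of(target) != expected:
            raise IOError(f"Checksum mismatch for {target}")
        copies.append(target)
    return copies
```

With the original plus two verified copies, for example one on an external drive and one on a network or cloud location, the file exists in three places.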
When data are preserved in a data repository, some file formats are preferred because software and file formats can become obsolete. Here is a reference:
Data type | Preferred format | Description | Non-preferred format |
Text | .txt .rtf .xml .pdf | Plain text format, Rich text format, eXtensible Mark-up Language, PDF | Word |
Tabular | .csv .tsv | Comma-separated values, Tab-separated values | Excel |
Image | .tif .svg .jp2 | TIFF, SVG, JPEG2000 | GIF, JPEG |
Audio | .mp3 .wav | MP3, WAVE | |
Video | .mp4 .avi | MPEG-4, AVI | |
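Before depositing, the table above can be applied to a list of file names with a short Python sketch. The extensions are copied from the "Preferred format" column; the function names `preferred_data_type` and `non_preferred` are assumptions for illustration, and a file flagged by the check may still be acceptable to a given repository.

```python
from pathlib import Path

# Preferred extensions, taken from the reference table.
PREFERRED_FORMATS = {
    "Text": {".txt", ".rtf", ".xml", ".pdf"},
    "Tabular": {".csv", ".tsv"},
    "Image": {".tif", ".svg", ".jp2"},
    "Audio": {".mp3", ".wav"},
    "Video": {".mp4", ".avi"},
}


def preferred_data_type(filename):
    """Return the data type if the extension is preferred, otherwise None."""
    extension = Path(filename).suffix.lower()
    for data_type, extensions in PREFERRED_FORMATS.items():
        if extension in extensions:
            return data_type
    return None


def non_preferred(filenames):
    """Return the file names whose extension is not in the preferred list."""
    return [name for name in filenames if preferred_data_type(name) is None]


if __name__ == "__main__":
    # Hypothetical file names, used only to demonstrate the check.
    print(non_preferred(["survey.csv", "survey.xlsx", "scan.tif", "notes.docx"]))
```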
Throughout the research life cycle, researchers should document the provenance, the content, and the ideas behind the data. This documentation supports the creation of metadata and a readme file in later stages of the data life cycle.
1. Data Documentation: Metadata
To support the discovery of data, metadata should accompany the data.
Metadata is data that describes other data. Metadata ensures that data can be discovered, identified, managed, retrieved, and reused. It is vital to successful curation, although creating it takes time.
Metadata should be structured, and both human- and machine-readable. It should be organized according to a metadata standard. Dublin Core (DC) and Data Documentation Initiative (DDI) are two examples of common metadata standards.
When you deposit your data in a data repository, you will be asked to provide a description of your data. The form you fill in and the descriptions shown in search results are human-readable metadata. When the data repository processes this metadata, it transforms it into a machine-readable format, such as a markup language. A well-developed repository has its chosen metadata standard(s). Some metadata standards used by repositories around the world are listed at https://www.re3data.org/.
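To make the step from human-readable to machine-readable metadata concrete, here is a minimal Python sketch that turns a small description into XML using Dublin Core element names. The namespace URI is the standard one for the Dublin Core element set; the wrapper element `metadata`, the function name `to_dublin_core_xml`, and the record values are assumptions for illustration. A real repository applies its own schema and performs this transformation for you.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def to_dublin_core_xml(description):
    """Serialize a dict of Dublin Core fields (e.g. title, creator) to XML."""
    # The wrapper element name "metadata" is an assumption, not a standard.
    root = ET.Element("metadata")
    for field, value in description.items():
        element = ET.SubElement(root, f"{{{DC_NS}}}{field}")
        element.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical record: every value below is a placeholder.
record = {
    "title": "Example survey dataset",
    "creator": "Chan, Tai Man",
    "date": "2020-01-03",
    "description": "Placeholder description of a hypothetical dataset.",
}

print(to_dublin_core_xml(record))
```

Each key becomes an element such as `dc:title`, which software can locate unambiguously regardless of how the deposit form labelled the field.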
2. Data Documentation: readme.txt
A readme.txt file tells prospective users how to open the data files and how to understand and reuse their content. It is advisable to deposit it along with the data in a data repository. A readme file is usually in .txt format, and contains:
3. Data Documentation: CUHK Research Data Repository
In the CUHK Research Data Repository, each dataset should be accompanied by Citation Metadata details to facilitate the discovery of the research data. Depending on the nature of the research data, the data owner can provide more details using other metadata templates:
The data owner can also provide a readme.txt along with the deposited data to help potential users understand and reuse the data.
When research outputs undergo peer review, publishers sometimes request access to the data to validate the research results and to prepare for data sharing. To protect your intellectual property before your research outputs are published, you can temporarily deposit your data in a data repository in private mode. You will be given a private URL to share with your publishers and trusted parties. For details on private data deposit at CUHK Research Data Repository, please refer to this page.
When your research outputs are ready to be published, you can publish the data for open access to support data sharing and reuse.