
CUHK Research Data Repository: Guidelines

This Guide introduces the CUHK Research Data Repository.

1. Scope

1.1 The CUHK Research Data Repository (the Repository) manages datasets created and deposited by CUHK faculty, students, staff, and other authorized persons.

2. Data Deposition

2.1 CUHK faculty, students, staff, and other authorized persons or organizations are eligible to deposit data into the Repository.

2.2 Depositors may deposit only their original work or datasets for which copyright permission has been granted.

2.3 Submitted datasets will be moderated to ensure accuracy and compliance with requirements for publishing in the Repository.

2.4 The Repository maintains metadata about each version of a dataset.

2.5 The Repository accepts files with a maximum size of 3 GB. If the file size exceeds this limit, depositors of datasets (Depositors) may contact the Repository Support at

2.6 Datasets may be stored in compressed format as necessary. Details on the compression software used and compression ratio will be stored with the metadata of the dataset.
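
As an illustration of the file size limit in 2.5, a Depositor could check a local folder before deposit. The following is a minimal sketch only and is not a tool provided by the Repository. The function name oversized_files and the reading of "3 GB" as 3 × 1024³ bytes are assumptions; the Repository may count the limit differently.

```python
from pathlib import Path

# Assumption: "3 GB" is read as 3 * 1024**3 bytes. The Repository may
# measure the limit differently (for example, 3 * 10**9 bytes).
MAX_FILE_BYTES = 3 * 1024**3


def oversized_files(folder, limit=MAX_FILE_BYTES):
    """Return (path, size_in_bytes) for every file under folder larger than limit."""
    found = []
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            size = path.stat().st_size
            if size > limit:
                found.append((path, size))
    return found


if __name__ == "__main__":
    import sys

    # Hypothetical usage: pass the dataset folder as the first argument.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for path, size in oversized_files(target):
        print(f"{path}: {size / 1024**3:.2f} GiB exceeds the limit")
```

Any file reported by this check would need to be raised with the Repository Support, as described in 2.5, before deposit.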

3. Metadata

3.1 Metadata is the detailed information and documentation that describes the dataset and the processes used to create it.

3.2 Datasets deposited in the Repository should be accompanied by metadata to make the datasets findable and to enable reuse by other researchers.

3.3 The metadata of datasets is openly accessible.

3.4 The metadata may be used in other media provided that written permission from the Depositor is obtained and the link between the metadata and the Repository is acknowledged.

3.5 Use of the metadata for commercial purposes is not permitted unless written permission from the Depositor is obtained.

4. Rights and Responsibilities of Repository

4.1 The Repository is responsible for managing the storage and access of datasets.

4.2 The Repository may convert and reformat datasets to ensure their future preservation and accessibility.

4.3 The Repository may make copies of deposited datasets for security and backup.

4.4 The Repository may include the metadata or documentation of datasets in public access catalogs such as the CUHK LibrarySearch.

4.5 The Repository will take every care to curate and preserve datasets; however, the Repository is not liable for any loss of or damage to the datasets or any other data while they are stored in the Repository or subsequently migrated.

4.6 The Repository accepts no responsibility for mistakes, omissions, or legal infringements in the deposited datasets.

4.7 The Repository does not warrant that the deposited datasets are timely, accurate, complete, reliable, or correct.

4.8 The Repository reserves the right to evaluate the quality of data according to the metadata and data documentation when deciding whether to accept a data deposit request.

4.9 The Repository reserves the right not to accept datasets that are beyond the capacity of the Repository to process in terms of resources, staff, facilities, and operation flow.

5. Rights and Responsibilities of Depositors

5.1 The Depositor and data creator are responsible for the quality of their research data.

5.2 The Depositor is responsible for any publications of datasets.

5.3 The Depositor is responsible for ensuring that the deposited dataset does not breach any law and does not infringe the copyright of any other person, organization, or institution.

5.4 If the dataset contains copyrighted material, the Depositor is responsible either for securing permission from the copyright holder(s) to reuse the material in the dataset and including this copyright information in the metadata, or for removing any copyrighted or third-party material from the dataset before deposit.

5.5 The Depositor is responsible for ensuring that the deposited dataset is not derived from a licensed or commercial product.

5.6 The Depositor is responsible for ensuring that all obligations to a sponsoring agency have been fulfilled.

5.7 The Depositor retains the right to deposit current or future versions of datasets outside the Repository.

6. Data Confidentiality

6.1 The Depositor should ensure that the data meet requirements of confidentiality and non-disclosure of personal data collected from human subjects.

6.2 Datasets should be free of direct and indirect identifiers that can be linked to specific individuals.

7. Embargo Status

7.1 Datasets may be deposited with an embargo period, during which they may not be available for access by the public.

7.2 The length of the embargo period and the embargo conditions are decided by the Depositor.

7.3 Embargoed datasets will receive the same processing arrangement as other deposited datasets at the time of deposit.

8. Access to Datasets

8.1 To promote open access to research data, deposited datasets may be downloaded by users from the Repository subject to the data sharing license agreement. All users should read and agree to the Terms and Conditions on Access and Reuse of Data from the CUHK Research Data Repository (Terms and Conditions) before downloading and reusing any datasets.

8.2 The Depositor may request users to register in order to gain access to the datasets. The registration information will only be used by the Depositor for access approval and contacting users with news, updates, or corrections of deposited datasets.

8.3 In some cases, the Repository may provide links to datasets created and deposited at other repositories by CUHK faculty, students, and staff.

9. Use and Reuse of Datasets

9.1 The Terms and Conditions include a stipulation for the datasets to be used in accordance with standards for ethical and responsible research practices.

9.2 The Depositor may assign specific terms and conditions for use and reuse of deposited datasets.

9.3 In general, the datasets in the Repository are licensed under CC BY-NC and may be used only for non-commercial, research, and instructional purposes.

9.4 For redistribution of datasets (whole or in part) in other media, users should seek permission from the Depositor beforehand.

9.5 Users should attribute and cite the datasets correctly in any published or unpublished research.

10. Preservation of Data

10.1 The Repository will follow established best practices for managing datasets over the long term and will enable continued access to datasets deposited into the Repository.

10.2 Some file formats may be proprietary, and long-term preservation of files in these formats may be limited or infeasible.

10.3 Some datasets may be deposited in the Repository “as is”. The Repository does not guarantee the continued use of the data.

11. Withdrawal of Data and Succession Plans

11.1 Deposited datasets may be withdrawn by the Repository Administration for any of the following reasons:

  • Copyright violation
  • Legal requirements and/or proven violations of legal requirements
  • Falsified research
  • Confidentiality concerns

11.2 A Depositor who wants to remove a dataset folder after withdrawing the datasets it contains should contact the Repository Administration.

11.3 If the Repository ceases operation in future, every effort will be made to transfer the deposited datasets to another appropriate archive.

The above terms and conditions are subject to revision without prior notice, and the latest version shall prevail.