eDNA along Houdong riverine zonation in Taiwan

Latest version published by Taiwan Biodiversity Information Facility (TaiBIF) on 15 November 2023
Publication date:
15 November 2023
CC-BY 4.0

Download the latest version of this resource data as a Darwin Core Archive (DwC-A) or the resource metadata as EML or RTF:

Data as a DwC-A file download 6,297 records in English (367 KB) - Update frequency: as needed
Metadata as an EML file download in English (21 KB)
Metadata as an RTF file download in English (18 KB)


[Data paper publication pending] This dataset contains species occurrence records and their associated DNA sequences (in the DNA-derived data extension), originating from a research project carried out in the Houdong River (猴洞坑), Jiaoxi Township, Yilan, Taiwan. Fieldwork was conducted on 28 April 2022. The primary objective of the study was to determine the occurrence of eukaryotic species in riverine ecosystems using environmental DNA (eDNA) methodology.

Data Records

The data in this occurrence resource has been published as a Darwin Core Archive (DwC-A), which is a standardized format for sharing biodiversity data as a set of one or more data tables. The core data table contains 6,297 records.

One extension data table also exists. An extension record supplies extra information about a core record. The number of records in each extension data table is illustrated below.

Occurrence (core)
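
As a rough illustration of how the downloaded archive is laid out, the sketch below opens a DwC-A zip file, looks up the core table named in its meta.xml descriptor, and reads the rows. It assumes the core file is tab-separated UTF-8 text with a single header row, which is a common IPT export layout but is not confirmed by this page; the function names are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET
import zipfile

# XML namespace used by the Darwin Core text archive descriptor (meta.xml).
DWC_TEXT_NS = "{http://rs.tdwg.org/dwc/text/}"


def core_file_name(meta_xml: bytes) -> str:
    """Return the core data file location declared in meta.xml."""
    root = ET.fromstring(meta_xml)
    location = root.find(
        f"{DWC_TEXT_NS}core/{DWC_TEXT_NS}files/{DWC_TEXT_NS}location"
    )
    if location is None or not location.text:
        raise ValueError("meta.xml does not declare a core file location")
    return location.text.strip()


def read_core_rows(archive: zipfile.ZipFile) -> list[dict[str, str]]:
    """Read the core table into a list of dicts keyed by column header.

    Assumption: tab-separated, UTF-8, one header row, no field quoting.
    """
    name = core_file_name(archive.read("meta.xml"))
    with archive.open(name) as handle:
        text = io.TextIOWrapper(handle, encoding="utf-8", newline="")
        reader = csv.DictReader(text, delimiter="\t", quoting=csv.QUOTE_NONE)
        return list(reader)
```

With a local copy of the archive (the file name here is only a placeholder), `read_core_rows(zipfile.ZipFile("dwca.zip"))` would return one dict per core record, so `len(...)` of the result can be compared with the record count stated above.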

This IPT archives the data and thus serves as the data repository. The data and resource metadata are available for download in the downloads section. The versions table lists other versions of the resource that have been made publicly available and allows tracking changes made to the resource over time.



How to cite

Researchers should cite this work as follows:

Hoh D (2023): eDNA along Houdong riverine zonation in Taiwan. v1.5. Taiwan Biodiversity Information Facility (TaiBIF). Dataset/Occurrence. https://ipt.taibif.tw/resource?r=houdongkeng_water_edna&v=1.5


Researchers should respect the following rights statement:

The publisher and rights holder of this work is Taiwan Biodiversity Information Facility (TaiBIF). This work is licensed under a Creative Commons Attribution (CC-BY) 4.0 License.

GBIF Registration

This resource has been registered with GBIF and assigned the following GBIF UUID: 2615342d-7349-4e75-ae34-cda6cb403e2e. Taiwan Biodiversity Information Facility (TaiBIF) publishes this resource and is itself registered in GBIF as a data publisher endorsed by Taiwan Biodiversity Information Facility.
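
The UUID can be used as a dataset key with the public GBIF API. The sketch below only builds the request URLs (it makes no network call); the helper names are hypothetical, and whether occurrence records are currently indexed under this key is not stated on this page.

```python
import uuid
from urllib.parse import urlencode

GBIF_API = "https://api.gbif.org/v1"
DATASET_KEY = "2615342d-7349-4e75-ae34-cda6cb403e2e"  # UUID quoted above


def dataset_url(key: str) -> str:
    """URL of the dataset's registry record in the GBIF API."""
    uuid.UUID(key)  # raises ValueError if the key is not a well-formed UUID
    return f"{GBIF_API}/dataset/{key}"


def occurrence_search_url(key: str, limit: int = 20, offset: int = 0) -> str:
    """URL for paging through occurrence records that belong to the dataset."""
    uuid.UUID(key)
    query = urlencode({"datasetKey": key, "limit": limit, "offset": offset})
    return f"{GBIF_API}/occurrence/search?{query}"
```

Either URL can be fetched with any HTTP client; both endpoints return JSON.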


SamplingEvent; river ecosystem; species occurrence; metabarcoding; cytochrome c oxidase I gene; Eukaryota


Daphne Hoh
  • Metadata Provider
  • Originator
  • Point Of Contact
Postdoctoral Researcher
Taiwan Biodiversity Information Facility
Min-Chen Wang
  • Point Of Contact
Postdoctoral Researcher
Institute of Cellular and Organismic Biology, Academia Sinica

Geographic Coverage

Houdong River (猴洞坑), Jiaoxi Township, Yilan County, Taiwan

Bounding Coordinates South West [24.824, 121.768], North East [24.871, 121.846]
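
A simple sanity check on downloaded records is to confirm that their coordinates fall inside this bounding box. The sketch below is one way to do that; the class name is hypothetical, and it assumes coordinates are given as decimal degrees in [latitude, longitude] order, as above. It does not handle boxes that cross the antimeridian, which is not an issue for this site.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    south: float
    west: float
    north: float
    east: float

    def contains(self, latitude: float, longitude: float) -> bool:
        """True if the point lies inside the box (edges included)."""
        return (
            self.south <= latitude <= self.north
            and self.west <= longitude <= self.east
        )


# Bounding coordinates as listed above: South West, then North East.
HOUDONG_BBOX = BoundingBox(south=24.824, west=121.768, north=24.871, east=121.846)
```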

Taxonomic Coverage

We detected eukaryotic organisms in the water samples using the COI mitochondrial gene. A total of 2,736 OTUs were identified and 421 of the OTUs were assigned to at least the kingdom level.

Domain Eukaryota
Kingdom Animalia, Chromista, Fungi, Plantae, Protozoa
Phylum Amoebozoa, Bryophyta, Cryptophyta, Nemertea, Sulcozoa, Annelida, Bryozoa, Gastrotricha, Ochrophyta, Tracheophyta, Arthropoda, Chaetognatha, Glomeromycota, Oomycota, Zygomycota, Ascomycota, Chlorophyta, Haptophyta, Platyhelminthes, Basidiomycota, Chordata, Mollusca, Rhodophyta, Blastocladiomycota, Cnidaria, Mycetozoa, Rotifera

Temporal Coverage

Start Date 2022-04-28

Project Data

No Description available

Title Environmental DNA-based biodiversity profiling along Houdong riverine zonation in Taiwan

The personnel involved in the project:

Min-Chen Wang

Sampling Methods

Water samples for eDNA were collected near the surface of the river. Before collection, the water containers were rinsed with local water at each sampling site. At each site, 1 L of water was collected for eDNA analysis and 200 mL for water quality measurements, which were taken on-site using handheld probes. Temperature, pH, and dissolved oxygen were measured with a multi-parameter meter (Multiline® Multi 3620 IDS, WTW, Weilheim, Germany) equipped with an IDS pH electrode (SenTix 940, WTW) and an optical IDS dissolved oxygen sensor (FDO® 925, WTW). Turbidity was measured with a turbidity meter (TUB-430, EZDO, Taiwan). Salinity was determined with a salinity refractometer (2491 MASTER-S/Milla Salinity Refractometer, Atago, Japan). Three replicate measurements of each parameter were taken. Water samples were transported to the Marine Research Station (MRS, Yilan, Taiwan) of the Institute of Cellular and Organismic Biology (Academia Sinica, Taipei, Taiwan) and processed there in two filtration steps. First, larger particles were removed with a 75 µm pore-size sieve. The 1 L water sample from each site was then vacuum-filtered through a 0.22 µm filter (PC651-0024, GeneDireX, USA). The filter membranes were placed in sterile Petri dishes and stored at -80°C until DNA extraction.

Study Extent This is a one-time sampling event. Water samples were collected along the Houdong River in Jiaoxi Township, Yilan County, Taiwan. The Houdong River is a popular tourist destination running across the city of Yilan. The river system originates east of the Sidu mountains and flows through primary and secondary forests, agricultural lands (rice), and developed areas (light industrial and residential) until it eventually drains into the Pacific Ocean. We selected four sites along the Houdong River. We collected water samples at each site for eDNA analysis and measured in-situ water quality.

Method step description:

  1. Wet lab process: DNA was extracted at the Biodiversity Research Center (Academia Sinica, Taipei, Taiwan). Each filter membrane was cut into quarters. Three of the four quarters were used as experimental replicates, and the fourth was kept as a sample backup. DNA from each quarter was extracted using the Presto™ Stool DNA Extraction Kit (STLD100, Geneaid Biotech Ltd., Taiwan) following the manufacturer's instructions (Instruction Manual Ver. 10.21.17). The quality and quantity of the extracted DNA were assessed using a Nanodrop 2000 (Thermo Fisher Scientific Inc., USA) and the Qubit 4 dsDNA High Sensitivity Assay Kit (Thermo Fisher Scientific Inc., USA). The MinibarF1 (5'TCCACTAATCACAARGATATTGGTAC) and MinibarR1 (5'GAAAATCATAATGAAGGCATGAGC) primers designed by Meusnier et al. (2008) were used to amplify the 5' region (ca. 120-150 bp) of the mitochondrial cytochrome c oxidase I (COI) gene. These primers were chosen for their universality, which has been recommended for distinguishing the highly diverse DNA in environmental mixtures. We conducted PCR using a one-step single-indexed approach, with a 13 bp tag attached to the MinibarR1 primer. The PCR reaction volume was 16 μl, comprising 8 μl KAPA HiFi HotStart ReadyMix (KK2602, Roche Molecular Systems Inc., USA), 5 μl ddH2O, 1 μl of each primer (10 μM), and 1 μl of DNA template. To optimize the protocol, we performed a preliminary PCR using an annealing temperature gradient and found that 54°C gave the best results. The PCR mixture was denatured at 95°C for 15 minutes, followed by 35 cycles of 94°C for 30 seconds, 54°C for 30 seconds, and a final elongation at 72°C for 10 minutes. The PCR products were checked on a 1.5% agarose gel and quantified with the Invitrogen Qubit 4 fluorometer (Thermo Fisher Scientific Inc., USA). Afterwards, all PCR products were pooled in one tube for next-generation sequencing. Sequencing was performed on the Illumina NovaSeq 6000 platform with 2×150 bp paired-end reads by Genomics Co., Taipei, Taiwan.
  2. Data processing and analysis: The Illumina raw reads were demultiplexed by Genomics Co., Taipei, Taiwan. FastQC (v0.11.9; https://github.com/s-andrews/FastQC) was used for quality checking. The forward and reverse primers were trimmed from the demultiplexed reads using Cutadapt (version 4.2; Martin 2011). The USEARCH platform (v11.0.667; Edgar 2010) was used to verify that the primer sequences had been completely removed from the demultiplexed reads. The "denoise-paired" function in QIIME 2 (v2023.2.0; Bolyen et al. 2019), which can automatically trim, filter, denoise, and merge reads and remove chimeric reads in one step, was used to create amplicon sequence variant (ASV) output from the demultiplexed reads. The maximum expected error of forward and reverse reads was set to 1.0. The reads with a quality score of less than 20 were truncated. The minimum overlap length for merging forward and reverse reads was set to 16 bp. Other parameters followed the default settings of the "denoise-paired" function. No reads were trimmed or truncated during the ASV creation process. The ASV output was then clustered with the "cluster-features-de-novo" function provided by QIIME 2: ASVs with more than 97% sequence identity were clustered into one operational taxonomic unit (OTU). Sequences shorter than 100 bp in the ASV and OTU results were discarded. Taxonomic assignments were made using Constax (v2.0.18; Liber et al. 2021) against the MIDORI database (vGB250; Machida et al. 2017). The R package phyloseq (v1.40.0; McMurdie and Holmes 2013) was used to analyze the preprocessed sequencing data. The pie chart figure was produced with ggplot2 (v3.4.2; Villanueva and Chen 2019). The lowest-level taxonomic annotation of each OTU was extracted for a secondary species mapping to the GBIF Backbone Taxonomy (GBIF Secretariat 2011) using the "name_backbone_checklist" function in the R package rgbif (v3.7.7; Chamberlain et al. 2022) in R (R Core Team 2023).
  3. Open data and code: DNA sequence data have been deposited in ENA at EMBL-EBI under accession number PRJEB60905. We converted the occurrence data into the Darwin Core Archive standard (Darwin Core Task Group 2009) and validated the event and occurrence data sheets using the GBIF Data Validator (Global Biodiversity Information Facility 2017). We then published the dataset, which contains one event core and four GBIF-registered extensions (occurrence, DNA-derived data, measurement or facts, and resource relationship), using the Integrated Publishing Toolkit (IPT) of GBIF installed under the Taiwan Biodiversity Information Facility (TaiBIF). All source code used in the project can be found in the project's GitHub repository.
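
Two numeric details in the method steps above can be checked with a short sketch: the PCR reaction volumes in step 1, and the 100 bp minimum length filter in step 2. This is an illustration only, not the project's own code (which is in its GitHub repository); all function names are hypothetical, and the FASTA parser is a minimal one that assumes plain-text records with ">" header lines.

```python
from typing import Iterable, Iterator

# Per-reaction volumes in microlitres, as listed in method step 1.
REACTION_UL = {
    "KAPA HiFi HotStart ReadyMix": 8.0,
    "ddH2O": 5.0,
    "forward primer (10 uM)": 1.0,
    "reverse primer (10 uM)": 1.0,
    "DNA template": 1.0,
}


def total_volume(mix: dict[str, float]) -> float:
    """Sum of all component volumes in one reaction (ul)."""
    return sum(mix.values())


def final_concentration(stock_um: float, added_ul: float, total_ul: float) -> float:
    """Concentration after dilution into the reaction (C1 * V1 / V2), in uM."""
    return stock_um * added_ul / total_ul


def parse_fasta(lines: Iterable[str]) -> Iterator[tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def drop_short(records: Iterable[tuple[str, str]], min_length: int = 100):
    """Keep sequences of at least min_length bases; shorter ones are discarded."""
    return [(h, s) for h, s in records if len(s) >= min_length]
```

Here `total_volume(REACTION_UL)` reproduces the stated 16 μl reaction volume, and `final_concentration(10.0, 1.0, 16.0)` gives the in-reaction concentration of each primer, 0.625 μM. To filter a sequence file, pass an open file handle to `parse_fasta` and the result to `drop_short`.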

Bibliographic Citations

  1. European Nucleotide Archive EMBL-EBI Project. https://www.ebi.ac.uk/ena/browser/view/PRJEB60905
  2. Chamberlain S, Barve V, Mcglinn D, Oldoni D, Desmet P, Geffert L, Ram K (2023). rgbif: Interface to the Global Biodiversity Information Facility API. R package version 3.7.7. https://CRAN.R-project.org/package=rgbif
  3. R Core Team (2022). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/

Additional Metadata

Alternative Identifiers 2615342d-7349-4e75-ae34-cda6cb403e2e