- Aman Goel
- Arun Subramaniyan
- Nishil Talati
- Samira Khan
- Sihang Liu
Submitted
Submission (3.9MB) Aug 1, 2021, 2:06:36 AM UTC b6ec382d0f99b699650cc2af82b2cae0e710aee788a3349e34b5926fc9a9758c
T. Dunn, H. Sadasivan, J. Wadden, K. Goliya, K. Chen, D. Blaauw, R. Das, S. Narayanasamy
- Tim Dunn (University of Michigan) <timdunn@umich.edu>
- Harisankar Sadasivan (University of Michigan) <hariss@umich.edu>
- Jack Wadden (University of Michigan) <jackwadden@gmail.com>
- Kush Goliya (University of Michigan) <kgoliya@umich.edu>
- Kuan-Yu Chen (University of Michigan) <knyuchen@umich.edu>
- David Blaauw (University of Michigan) <blaauw@umich.edu>
- Reetuparna Das (University of Michigan) <reetudas@umich.edu>
- Satish Narayanasamy (University of Michigan) <nsatish@umich.edu>
Artifact URL
https://doi.org/10.5281/zenodo.5150974
Hardware Dependencies
The SquiggleFilter Jupyter Notebook code requires approximately 5-10GB of RAM, and the datasets used require approximately 40GB of disk space. For hardware evaluation, Xilinx Vivado imposes additional processor requirements: an Intel Pentium 4, Intel Core Duo, or Xeon processor running at 2.2GHz or faster, with SSE2 support. If necessary, we can provide reviewers with remote access to a fully set-up Jupyter Notebook session or Vivado installation.
Software Dependencies
Any Linux OS can be used, but a recent Ubuntu release is recommended for ease of installation. The Jupyter Notebook containing our software artifact has multiple Python package dependencies, which will be installed by an installation script. For hardware evaluation, a recent installation of the licensed Vivado Design Suite is recommended; we used release 2019.1.
Data Dependencies
The installation script we provide will download two publicly available datasets (from the CADDE Centre and ONT Open Datasets) containing a total of approximately 40GB of raw nanopore signal (FAST5) files.
Key Results to be Reproduced
For evaluating the hardware, our artifact contains the RTL and testbench SystemVerilog code for our SquiggleFilter accelerator, which can be simulated using Vivado.
The Jupyter Notebook pipeline contains:
- Our custom software sDTW implementation, run on 1000 random human and viral reads from the selected datasets. Figures 11 and 17a from our paper are regenerated, showing the human and virus alignment cost distributions and classification accuracies, respectively.
- Our Read Until runtime model. This estimates and plots expected Read Until runtime (Figures 17b/c) based on statistics measured after running our sDTW algorithm on the random sub-sample of reads.
- The scripts used for generating multiple figures from our paper (Figures 2, 5, 10, 16a, 16b, 18, 19, and 21).
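For readers unfamiliar with sDTW (subsequence dynamic time warping), the following is a minimal textbook-style sketch of the cost computation. It is not the artifact's implementation: the function name `sdtw_cost`, the absolute-difference distance, and the unnormalized float inputs are all assumptions made for illustration. The query may start and end anywhere in the reference, and the returned value is the lowest alignment cost over all end positions.

```python
import numpy as np


def sdtw_cost(query, reference):
    """Return the minimum subsequence-DTW cost of `query` against `reference`.

    Hypothetical helper for illustration only; the distance metric and
    recurrence are the standard textbook form, not taken from the artifact.
    """
    query = np.asarray(query, dtype=float)
    reference = np.asarray(reference, dtype=float)
    if query.size == 0 or reference.size == 0:
        raise ValueError("query and reference must be non-empty")

    # Free start: the first query sample may align to any reference position.
    prev = np.abs(query[0] - reference)

    for q in query[1:]:
        dist = np.abs(q - reference)
        curr = np.empty_like(prev)
        # Column 0 can only be reached from the row above.
        curr[0] = prev[0] + dist[0]
        for j in range(1, reference.size):
            # Best of: vertical step, diagonal step, horizontal step.
            curr[j] = dist[j] + min(prev[j], prev[j - 1], curr[j - 1])
        prev = curr

    # Free end: the alignment may finish at any reference position.
    return float(prev.min())


if __name__ == "__main__":
    # The query occurs exactly inside the reference, so the cost is zero.
    print(sdtw_cost([1, 2, 3], [5, 5, 1, 2, 3, 5]))  # prints 0.0
```

A read would then be classified by comparing its cost against a threshold: a low cost suggests the read matches the reference, a high cost suggests it does not. The threshold value itself is not specified here and would have to be taken from the cost distributions the notebook regenerates.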
Expected Completion Time
The Jupyter Notebook pipeline runs in ~10 minutes on a 56-core system, but the dataset download (40GB) may take several hours. Vivado simulations run in ~20 minutes on a 4-core laptop.
- Accelerators
- Health/Bio
- Workload Characterization
Review | RevExp | ArtPubAva | ArtFun | KeyResRep.1 | DisArtNom
---|---|---|---|---|---
Review #12A | 2 | 2 | 2 | 1 | 1
Review #12B | 2 | 2 | 2 | 1 | 1