Quick clusters: a GPU-Parallel partitioning for efficient path tracing of unstructured volumetric grids

Date

2022-09-22

Editor(s)

Advisor

Supervisor

Co-Advisor

Co-Supervisor

Instructor

Source Title

IEEE Transactions on Visualization and Computer Graphics

Print ISSN

1077-2626

Electronic ISSN

1941-0506

Publisher

Institute of Electrical and Electronics Engineers

Volume

29

Issue

1

Pages

537 - 547

Language

English

Journal Title

Journal ISSN

Volume Title

Citation Stats
Attention Stats
Usage Stats
21
views
2
downloads

Series

Abstract

We propose a simple yet effective method for clustering finite elements to improve preprocessing times and rendering performance of unstructured volumetric grids without requiring auxiliary connectivity data. Rather than building bounding volume hierarchies (BVHs) over individual elements, we sort elements along with a Hilbert curve and aggregate neighboring elements together, improving BVH memory consumption by over an order of magnitude. Then to further reduce memory consumption, we cluster the mesh on the fly into sub-meshes with smaller indices using a series of efficient parallel mesh re-indexing operations. These clusters are then passed to a highly optimized ray tracing API for point containment queries and ray-cluster intersection testing. Each cluster is assigned a maximum extinction value for adaptive sampling, which we rasterize into non-overlapping view-aligned bins allocated along the ray. These maximum extinction bins are then used to guide the placement of samples along the ray during visualization, reducing the number of samples required by multiple orders of magnitude (depending on the dataset), thereby improving overall visualization interactivity. Using our approach, we improve rendering performance over a competitive baseline on the NASA Mars Lander dataset from 6× (1 frame per second (fps) and 1.0 M rays per second (rps) up to now 6 fps and 12.4 M rps , now including volumetric shadows) while simultaneously reducing memory consumption by 3× (33 GB down to 11 GB) and avoiding any offline preprocessing steps, enabling high-quality interactive visualization on consumer graphics cards. Then by utilizing the full 48 GB of an RTX 8000, we improve the performance of Lander by 17 × (1 fps up to 17 fps, 1.0 M rps up to 35.6 M rps) .

Course

Other identifiers

Book Title

Degree Discipline

Degree Level

Degree Name

Citation

Published Version (Please cite this version)