This dataset is shared in conjunction with the manuscript "MozzaVID: Mozzarella Volumetric Image Dataset".
The data is provided in three versions/sizes (Small, Base, Large), reflecting the setup proposed in the paper. Use the "models" folder to reproduce the reported results.
For details, see the links below:
1. Webpage of the project - https://papieta.github.io/MozzaVID/
2. Manuscript - https://arxiv.org/abs/2412.04880
3. GitHub repository (data loaders, examples) - https://github.com/PaPieta/MozzaVID/tree/main
--------
Unique scans can be explored through the "raw_dataset" folder. It contains 25 subfolders, one for each coarse-grained class. Inside the subfolders are .tiff files containing cleaned-up CT scans. Each scan is ~5.1 GB in size, with dimensions of (2156, 1601, 1601) px, saved with the uint8 data type. The scans are cropped to contain only the cheese microstructure (no surrounding air), and their intensities are unified.
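As a rough check of the stated scan size, the uncompressed footprint of one scan follows directly from its shape and data type. The sketch below assumes one byte per voxel (uint8) and ignores any TIFF header overhead; the constant and function names are illustrative only and are not part of the dataset tooling.

```python
# Shape and data type as stated above; one uint8 voxel occupies one byte.
SCAN_SHAPE = (2156, 1601, 1601)
BYTES_PER_VOXEL = 1


def scan_size_bytes(shape=SCAN_SHAPE, bytes_per_voxel=BYTES_PER_VOXEL):
    """Return the uncompressed size of one scan in bytes."""
    n_voxels = 1
    for dim in shape:
        n_voxels *= dim
    return n_voxels * bytes_per_voxel


size = scan_size_bytes()
# Report the size in decimal (GB) and binary (GiB) units.
print(f"{size} bytes = {size / 1e9:.2f} GB = {size / 2**30:.2f} GiB")
```

This comes to about 5.53 GB in decimal units, or about 5.15 GiB in binary units, so loading a full scan into memory needs at least that much RAM.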
The file names should be interpreted as {cheese_idx}_{sample_idx}_{scan_idx}.tiff, so e.g. the file 1_3_4.tiff is:
* Cheese idx/coarse-grained class: 1
* Sample idx/fine-grained class: 3
* Scan/local tomography idx: 4
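The naming scheme above can be parsed programmatically, for example to build class labels from file names. The following is a minimal sketch, not the loader from the GitHub repository; `ScanId` and `parse_scan_name` are hypothetical names introduced here for illustration.

```python
from pathlib import Path
from typing import NamedTuple


class ScanId(NamedTuple):
    cheese_idx: int  # coarse-grained class
    sample_idx: int  # fine-grained class
    scan_idx: int    # local tomography index


def parse_scan_name(path):
    """Split '{cheese_idx}_{sample_idx}_{scan_idx}.tiff' into its three indices."""
    parts = Path(path).stem.split("_")
    if len(parts) != 3:
        raise ValueError(f"unexpected scan name: {path!r}")
    cheese_idx, sample_idx, scan_idx = (int(p) for p in parts)
    return ScanId(cheese_idx, sample_idx, scan_idx)


print(parse_scan_name("1_3_4.tiff"))
# ScanId(cheese_idx=1, sample_idx=3, scan_idx=4)
```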
---------
To download the data, click on a chosen file/folder, then click the download button in the top right corner.
Alternatively, the data can be easily fetched through the command line, for example with:
> wget https://archive.compute.dtu.dk/downloads/public/projects/MozzaVID/[Small, Base, Large].zip
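The bracketed part of the URL above is read here as a placeholder for one of the three version names. Under that assumption, the sketch below expands it into concrete archive URLs and shows one way to fetch an archive with the Python standard library; `archive_url`, `download`, and the `target_dir` default are hypothetical and not part of the project's tooling.

```python
import urllib.request

BASE_URL = "https://archive.compute.dtu.dk/downloads/public/projects/MozzaVID"
VERSIONS = ("Small", "Base", "Large")


def archive_url(version):
    """Return the download URL of the zip archive for one dataset version."""
    if version not in VERSIONS:
        raise ValueError(f"version must be one of {VERSIONS}, got {version!r}")
    return f"{BASE_URL}/{version}.zip"


def download(version, target_dir="."):
    """Fetch one archive into target_dir (hypothetical default: current folder)."""
    destination = f"{target_dir}/{version}.zip"
    urllib.request.urlretrieve(archive_url(version), destination)
    return destination


# Only print the URLs here; call download("Small") explicitly to fetch data.
for version in VERSIONS:
    print(archive_url(version))
```

The archives can be large, so it is worth checking the available disk space before calling `download`.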
---------
If you are low on disk space, you can stream the dataset splits during training using our WebDataset setup on HuggingFace (check our GitHub for details):
https://huggingface.co/datasets/dtudk/[MozzaVID_Small, MozzaVID_Base, MozzaVID_Large]
If you use our data, please consider citing our work:
@misc{pieta2024b,
title={MozzaVID: Mozzarella Volumetric Image Dataset},
author={Pawel Tomasz Pieta and Peter Winkel Rasmussen and Anders Bjorholm Dahl and Jeppe Revall Frisvad and Siavash Arjomand Bigdeli and Carsten Gundlach and Anders Nymark Christensen},
year={2024},
howpublished={arXiv:2412.04880 [cs.CV]},
eprint={2412.04880},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2412.04880},
}
---------
| # | Change | User | Description |
|---|---|---|---|
| #3 | 16559 | hmkj | Update, copying raw_dataset from private to public depot |
| #2 | 16555 | tsal | Updating to Oct 2025 version |
| #1 | 16539 | hmkj | Readme to public |