I noticed that several audio datasets are built from LibriVox recordings. This is an obvious choice, since all LibriVox recordings are in the public domain. However, I am concerned that this involves using someone’s voice without their express consent. What is the general policy or rule of thumb for using this data for research purposes?