Usage
Discovering Corpora
import krank
# List all available corpora
krank.list_corpora()
# Filter by language
krank.list_corpora(language="en")
# Get metadata without downloading
krank.info("zhang2019")
# Corpus: zhang2019
# Title: Zhang & Wamsley, 2019
# Description: Dream reports collected from a laboratory polysomnography study
# Version: 1
# Citations: Zhang, J., & Wamsley, E. J. (2019); Wong, W., Herzog, R., ... (2025)
Loading Data
corpus = krank.load("zhang2019")
# Print corpus info
print(corpus)
# Corpus: zhang2019
# Title: Zhang & Wamsley, 2019
# Description: Dream reports collected from a laboratory polysomnography study
# Version: 1
# Citations: Zhang, J., & Wamsley, E. J. (2019); Wong, W., Herzog, R., ... (2025)
The data is not downloaded until you access it.
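As a rough illustration of this deferred behavior, the sketch below mimics the pattern with a hypothetical `LazyCorpus` class and a `fake_download` helper. Neither is part of krank, and krank's actual implementation may differ. It assumes the download is triggered by the first access to an attribute such as `reports` and that the result is reused afterwards.

```python
from functools import cached_property


class LazyCorpus:
    """Hypothetical stand-in for a corpus object; not krank's actual class."""

    def __init__(self, name, fetch):
        self.name = name
        self._fetch = fetch   # callable that performs the (expensive) download
        self.fetch_count = 0  # how many times the download actually ran

    @cached_property
    def reports(self):
        # Runs on the first attribute access only; the result is cached afterwards.
        self.fetch_count += 1
        return self._fetch(self.name)


def fake_download(name):
    # Placeholder for a real download; returns a small in-memory list instead.
    return [{"corpus": name, "report": "example text"}]


corpus = LazyCorpus("zhang2019", fake_download)
created_without_fetch = corpus.fetch_count == 0  # constructing fetched nothing
first = corpus.reports    # first access triggers the fetch
second = corpus.reports   # second access reuses the cached value
print(created_without_fetch, corpus.fetch_count)
```

The practical consequence, under these assumptions, is that creating a corpus object and printing its metadata stays cheap, while the cost of fetching the data is paid once at first use.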
Corpus Attributes
| Attribute | Description |
|---|---|
| corpus.reports | DataFrame of dream reports (tidy format) |
| corpus.authors | DataFrame of author metadata (deduplicated) |
| corpus.n_reports | Number of reports in corpus |
| corpus.n_authors | Number of unique authors in corpus |
| corpus.metadata | Dict of corpus metadata from registry |
| corpus.path | Local path to cached file |
| corpus.name | Corpus name |
Collections
Some corpora belong to collections (e.g., DreamBank):
# List collections
krank.list_collections()
# Get collection info
krank.collection_info("dreambank")
Caching
Downloaded files are cached locally. krank uses pooch for caching, storing files in your system's default cache directory.
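To see where that cache directory would be on a given machine, pooch provides `pooch.os_cache()`, which returns the default per-user cache path for a project name. The sketch below assumes krank passes the project name "krank" to pooch, which this page does not confirm, so treat the printed path as a likely location only. The helper `default_cache_dir` is illustrative and not part of krank.

```python
from pathlib import Path
from typing import Optional

try:
    import pooch  # krank uses pooch for caching, per the text above
except ImportError:  # keeps the sketch runnable where pooch is absent
    pooch = None


def default_cache_dir(project: str = "krank") -> Optional[Path]:
    """Return pooch's default cache path for `project`, or None without pooch.

    The project name "krank" is an assumption, not confirmed by the docs.
    """
    if pooch is None:
        return None
    return Path(pooch.os_cache(project))


cache_dir = default_cache_dir()
print(cache_dir)
```

For a specific corpus, the `corpus.path` attribute listed under Corpus Attributes is the more reliable source, since it reports the local path of the cached file directly.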