Getting Started

Installation

pip install --upgrade liwca

For development:

git clone https://github.com/remrama/liwca.git
cd liwca
uv pip install -e ".[dev]"

Fetching dictionaries

Each registered dictionary has a dedicated fetch_* function that downloads the file on first use and returns it as a DataFrame:

import liwca

dx = liwca.fetch_threat()

See Fetching Dictionaries for all available dictionaries and their options.

Downloaded files are cached locally via Pooch, by default in your OS data directory. To use a different location, set the LIWCA_DATA_DIR environment variable before importing liwca:

export LIWCA_DATA_DIR=/path/to/my/cache
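
The same override can be applied from inside Python, for example in a notebook where exporting a shell variable is inconvenient. This is a minimal sketch that relies only on the requirement stated above (the variable must be set before liwca is imported); the cache path is a placeholder:

```python
import os

# Placeholder path; point this at any writable directory.
cache_dir = "/path/to/my/cache"

# Set the variable before the first `import liwca`. Per the note above,
# the override only applies if it is in place before the import.
os.environ["LIWCA_DATA_DIR"] = cache_dir

# import liwca  # uncomment once liwca is installed; keep it after the line above
```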

Counting words

count takes a list of texts and a dictionary DataFrame and returns a documents × categories table:

texts = ["The threat of danger loomed over the city", "A calm morning"]
results = liwca.count(texts, dx)

Values are percentages of total words per document by default. See count for options including raw counts and custom tokenizers.
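
To make the default output concrete, here is a rough sketch of what "percentage of total words per document" means. It is not liwca's implementation: the whitespace tokenizer, the exact-match lookup, and the two-word threat_words set are illustrative assumptions.

```python
# Hypothetical category word list; a real dictionary is much larger.
threat_words = {"threat", "danger"}


def percent_in_category(text, category_words):
    """Return the share of tokens in `text` found in `category_words`, as a percentage."""
    tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
    if not tokens:
        return 0.0
    hits = sum(token in category_words for token in tokens)
    return 100.0 * hits / len(tokens)


texts = ["The threat of danger loomed over the city", "A calm morning"]
scores = [percent_in_category(t, threat_words) for t in texts]
print(scores)  # [25.0, 0.0]
```

The first text has 8 tokens, 2 of which match the category, which gives 25%. The second has no matches and scores 0%.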

DDR (semantic scoring)

ddr scores texts against dictionary categories using cosine similarity in embedding space, following the Distributed Dictionary Representation method (Garten et al., 2018). This captures semantic proximity even when exact dictionary words are absent from the text.

Pass the name of a gensim model to download pretrained embeddings automatically (requires the ddr extra: pip install "liwca[ddr]"):

results = liwca.ddr(texts, dx, "glove-wiki-gigaword-100")

Or bring your own embeddings as a dict-like mapping:

results = liwca.ddr(texts, dx, my_embeddings)

Values are cosine similarities in [-1, 1]. See ddr for full parameter details.
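
The idea behind DDR can be sketched in a few lines of plain Python: average the vectors of the dictionary words, average the vectors of the words in a text, and take the cosine similarity of the two averages. The code below is an illustration under stated assumptions, not liwca's implementation. The 2-dimensional embeddings are made-up toy values, the tokenizer is a whitespace split, and mean_vector and cosine are hypothetical helpers. The toy embeddings dict has the same word-to-vector shape as a bring-your-own mapping.

```python
import math

# Toy 2-d embeddings (made-up values, for illustration only).
embeddings = {
    "threat": [1.0, 0.0],
    "danger": [0.8, 0.6],
    "storm": [0.6, 0.8],
    "calm": [-1.0, 0.0],
    "morning": [0.0, 1.0],
}


def mean_vector(words, emb):
    """Average the vectors of the words that have an embedding."""
    vectors = [emb[w] for w in words if w in emb]
    if not vectors:
        raise ValueError("no word has an embedding")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Category representation: mean of the dictionary word vectors.
category = mean_vector(["threat", "danger"], embeddings)

# Document representations: mean of the embedded tokens in each text.
doc_a = mean_vector("a storm this morning".split(), embeddings)
doc_b = mean_vector("a calm morning".split(), embeddings)

score_a = cosine(category, doc_a)
score_b = cosine(category, doc_b)
print(round(score_a, 3))  # 0.6
print(round(score_b, 3))  # -0.447
```

Neither text contains a dictionary word, yet the first scores well above the second because "storm" lies close to the category words in this toy space.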

Reading and writing local files

dx = liwca.read_dx("my_dictionary.dicx")   # auto-detects .dic or .dicx
liwca.write_dx(dx, "my_dictionary.dic")    # save a dictionary to disk
merged = liwca.merge_dx(dx_a, dx_b)        # combine two dictionary DataFrames

LIWC-22 wrapper

If LIWC-22 is installed, you can call its command-line interface (CLI) from Python. The LIWC-22 desktop application (or its license server) must be running when you call the CLI:

liwca.liwc22("wc", input="data.csv", output="results.csv")

Pass auto_open=True to let liwca start and stop LIWC-22 automatically.
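
For orientation, the sketch below builds the kind of command line that the LIWC-22 CLI accepts for a word-count run. That the wrapper does something equivalent internally is an assumption. The LIWC-22-cli executable name and the --mode, --input, and --output flags follow the LIWC CLI documentation, and build_liwc22_command is a hypothetical helper, not part of liwca.

```python
import shutil


def build_liwc22_command(mode, input_path, output_path):
    """Build an argument list for the LIWC-22 CLI (hypothetical helper)."""
    return [
        "LIWC-22-cli",
        "--mode", mode,
        "--input", input_path,
        "--output", output_path,
    ]


cmd = build_liwc22_command("wc", "data.csv", "results.csv")
print(" ".join(cmd))

# The command is only built here, not executed. Running it requires
# LIWC-22 to be installed, on the PATH, and running.
if shutil.which("LIWC-22-cli") is None:
    print("LIWC-22-cli not found on PATH; command not run.")
```

On a machine where LIWC-22 is installed and running, such a list could be passed to subprocess.run(cmd, check=True).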

See liwc22 for the full argument reference, and the LIWC CLI documentation and Python CLI example for more details.