---
title: "Python Environment"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Python Environment}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(message = FALSE, warning = FALSE)
```

```{r}
library(TextAnalysisR)

tokens <- quanteda::tokens(SpecialEduTech$abstract[1:5])
dispersion <- calculate_lexical_dispersion(
  tokens,
  terms = c("learning", "instruction")
)
head(dispersion)
```

Several features depend on Python: NLP with spaCy, embeddings, and neural sentiment analysis.

## Quick Setup

`setup_python_env()` automatically:

1. Creates a virtual environment named `textanalysisr-env`
2. Installs **spacy** and **pdfplumber**
3. Downloads the spaCy English model (`en_core_web_sm`)

It uses virtualenv (or conda if available).

## Check Status

Run `check_python_env()` to verify the environment.
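
A minimal sketch of the setup-then-verify workflow, assuming both functions can be called with no arguments (their defaults are not documented in this vignette). The chunk is marked `eval = FALSE` because it creates an environment and downloads packages, which should not happen while the vignette is built.

```{r, eval = FALSE}
library(TextAnalysisR)

# One-time setup: create the environment and install the Python dependencies.
setup_python_env()

# Verify the result; restart R first if another Python was already in use.
check_python_env()
```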

## Common Issues

### "Another Python already initialized"

Set the preferred environment in `.Rprofile` with `Sys.setenv(RETICULATE_PYTHON_ENV = "textanalysisr-env")`, then restart R so the setting takes effect.
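
reticulate binds to a single Python per R session, so the preference has to be in place before Python is first initialized. The sketch below shows the line as it might appear in `.Rprofile`, followed by a check to run after the restart. The choice of a user-level or project-level `.Rprofile` is an assumption; either should work.

```{r, eval = FALSE}
# Add this line to ~/.Rprofile (or a project-level .Rprofile) so it runs
# before reticulate initializes Python.
Sys.setenv(RETICULATE_PYTHON_ENV = "textanalysisr-env")

# After restarting R, confirm that the variable is set.
Sys.getenv("RETICULATE_PYTHON_ENV")
```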

### Environment in OneDrive

Avoid creating the environment under a OneDrive-synced path. Create it by name instead with `setup_python_env(envname = "textanalysisr-env")`.
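
On Windows, `~` usually resolves to the Documents folder, which OneDrive often syncs, so the default virtualenv location can end up inside OneDrive. The sketch below checks for that case. `in_onedrive()` is a hypothetical helper, not part of TextAnalysisR. The sketch also assumes that `setup_python_env()` creates the environment under reticulate's virtualenv root, which is read from `WORKON_HOME` and falls back to `~/.virtualenvs`.

```{r, eval = FALSE}
# Hypothetical helper (not part of TextAnalysisR): TRUE when a path
# mentions a OneDrive folder.
in_onedrive <- function(path) {
  grepl("onedrive", path, ignore.case = TRUE)
}

# Where reticulate looks for virtual environments by default.
venv_root <- path.expand(Sys.getenv("WORKON_HOME", unset = "~/.virtualenvs"))

if (in_onedrive(venv_root)) {
  message("Virtualenv root is inside OneDrive: ", venv_root)
  # Example location only; pick any local folder that OneDrive does not sync,
  # then set it in .Rprofile before creating the environment.
  # Sys.setenv(WORKON_HOME = "C:/venvs")
}
```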

## spaCy Models

The default `en_core_web_sm` model is installed automatically. For word vectors (similarity), install the medium model (91 MB) or the large model (560 MB):

```bash
python -m spacy download en_core_web_md
python -m spacy download en_core_web_lg
```
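
Once a model with word vectors is installed, similarity can be tried from R. The sketch below calls spaCy directly through reticulate rather than through a TextAnalysisR function, and it assumes the active Python is the environment where the model was downloaded. The two example words are arbitrary.

```{r, eval = FALSE}
# Direct reticulate calls to spaCy; this is not a TextAnalysisR function.
has_md <- reticulate::py_module_available("en_core_web_md")

if (has_md) {
  spacy <- reticulate::import("spacy")
  nlp <- spacy$load("en_core_web_md")

  # Doc.similarity() compares the averaged word vectors of the two texts.
  doc1 <- nlp("learning")
  doc2 <- nlp("instruction")
  print(doc1$similarity(doc2))
} else {
  message("en_core_web_md is not installed in the active Python environment.")
}
```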

## Deep Learning (Optional)

For embeddings and neural sentiment analysis, install the following packages:

```bash
pip install sentence-transformers transformers torch
```
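
To confirm from R that the optional packages are importable, a check along these lines may help. It uses `reticulate::py_module_available()` directly and assumes the active Python is the environment where `pip install` was run. It initializes Python if it is not already running.

```{r, eval = FALSE}
# Python import names; note the underscore in sentence_transformers.
modules <- c("sentence_transformers", "transformers", "torch")

# TRUE for each module that the active Python can import.
available <- vapply(modules, reticulate::py_module_available, logical(1))
available
```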

## Diagnostics

Use `reticulate::py_config()` to inspect the active Python configuration and `reticulate::virtualenv_list()` to list the available virtual environments.
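
A short sketch of these two checks, assuming the default environment name `textanalysisr-env`. Note that `virtualenv_list()` only reports virtualenv environments, so an environment created with conda would not appear in it.

```{r, eval = FALSE}
# Virtual environments reticulate can see, and whether ours is among them.
envs <- reticulate::virtualenv_list()
env_found <- "textanalysisr-env" %in% envs
env_found

# Python binary, version, and library paths for the current session.
# This initializes Python if it has not been initialized yet.
reticulate::py_config()
```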