Docker usage

Below are examples for using the llm-dataset-converter library via its Docker images.

Interactive session

The following command starts an interactive session, mapping the current working directory to /workspace:

docker run --rm -u $(id -u):$(id -g) \
    -v `pwd`:/workspace \
    -it waikatodatamining/llm-dataset-converter:latest

Conversion pipeline

The following converts the Alpaca dataset from JSON into CSV format:

docker run --rm -u $(id -u):$(id -g) \
    -v `pwd`:/workspace \
    -it waikatodatamining/llm-dataset-converter:latest \
    llm-convert \
      -l INFO \
      from-alpaca \
        --input /workspace/alpaca_data_cleaned.json \
        -l INFO \
      to-csv-pr \
        --output /workspace/alpaca_data_cleaned.csv \
        -l INFO

NB: Only the current working directory (pwd) is mounted into the container as /workspace, so the input and output files must be located in or below it.
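
If the conversion is run often, the command above can be wrapped in a small shell function. The following is only a sketch: `convert_alpaca` and the `DRY_RUN` variable are hypothetical names introduced here and are not part of llm-dataset-converter. It assumes the input is given as a path relative to the current working directory and that the output should use the same base name with a `.csv` extension.

```shell
# Hypothetical helper, not part of llm-dataset-converter.
# Usage: convert_alpaca <input.json>   (path relative to the current directory)
# Set DRY_RUN to any non-empty value to print the command instead of running it.
convert_alpaca() {
    input="$1"

    # Only the current directory is mounted, so the file must exist below it.
    if [ ! -f "$input" ]; then
        echo "no such file: $input" >&2
        return 1
    fi

    # Derive the output name by swapping a trailing .json for .csv.
    output="${input%.json}.csv"

    # ${DRY_RUN:+echo} expands to "echo" when DRY_RUN is set, else to nothing.
    ${DRY_RUN:+echo} docker run --rm -u "$(id -u):$(id -g)" \
        -v "$(pwd):/workspace" \
        -it waikatodatamining/llm-dataset-converter:latest \
        llm-convert \
          -l INFO \
          from-alpaca \
            --input "/workspace/$input" \
            -l INFO \
          to-csv-pr \
            --output "/workspace/$output" \
            -l INFO
}
```

Calling `convert_alpaca alpaca_data_cleaned.json` from the directory that contains the file would then issue the same pipeline as shown above; with `DRY_RUN=1` set, the command is only printed, which is a cheap way to check the container paths first.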