ParlAI (pronounced “par-lay”) is a framework for dialog AI research, implemented in Python.

Its goal is to provide researchers:

Over 20 tasks are currently supported, including popular datasets such as SQuAD, bAbI tasks, MS MARCO, MCTest, WikiQA, WebQuestions, SimpleQuestions, WikiMovies, QACNN & QADailyMail, CBT, BookTest, bAbI Dialog tasks, Ubuntu Dialog, OpenSubtitles, Cornell Movie, VQA-COCO2014, VisDial and CLEVR. See here for the current complete task list.

Included are examples of training neural models with PyTorch and Lua Torch, with batch training on GPU or hogwild training on CPUs. Using Theano or TensorFlow instead is also straightforward.

Our aim is for the number of tasks and agents that train on them to grow in a community-based way.

ParlAI is described in the following paper: “ParlAI: A Dialog Research Software Platform”, arXiv:1705.06476.

We are in an early-release Beta. Expect some adventures and rough edges.
See the news page for the latest additions & updates, and the website for further docs.

Please also note that there is a ParlAI Request For Proposals funding university teams. 7 awards are available, and the deadline is Aug 25.


Goals

Unified framework for evaluation of dialogue models

End goal is general dialogue, which includes many different skills

End goal is real dialogue with people

Set of datasets to bootstrap a working dialogue model for human interaction


Basic Examples

Note: If any of these examples fail, check the requirements section to see if you have missed something.

Display 10 random examples from task 1 of the "1k training examples" bAbI task:

python examples/ -t babi:task1k:1

Display 100 random examples from multi-tasking on the bAbI task and the SQuAD dataset at the same time:

python examples/ -t babi:task1k:1,squad -n 100

Evaluate on the bAbI validation set with a human agent (using the local keyboard as input):

python examples/ -m local_human -t babi:Task1k:1 -dt valid

Evaluate an IR baseline model on the validation set of the Movies Subreddit dataset:

python examples/ -m ir_baseline -t "#moviedd-reddit" -dt valid

Display the predictions of that same IR baseline model:

python examples/ -m ir_baseline -t "#moviedd-reddit" -dt valid

Train a seq2seq model on the "1k training examples" bAbI task 1 with a batch size of 8 examples for one epoch (requires PyTorch):

python examples/ -m seq2seq -t babi:task1k:1 -bs 8 -e 1 -mf /tmp/model_s2s

Train an attentive LSTM model on the SQuAD dataset with a batch size of 32 examples (requires PyTorch and regex):

python examples/ -m drqa -t squad -bs 32 -mf /tmp/model_drqa


Requirements

ParlAI currently requires Python 3.

Dependencies of the core modules are listed in requirements.txt.

Several models included (in parlai/agents) have additional requirements. DrQA requires installing PyTorch, and the MemNN model requires installing Lua Torch. See their respective websites for installation instructions.

Installing ParlAI

Run the following commands to clone the repository and install ParlAI:

git clone ~/ParlAI
cd ~/ParlAI; python develop

This will link the cloned directory to your site-packages.

This is the recommended installation procedure, as it provides ready access to the examples and allows you to modify anything you might need. This is especially useful if you want to submit another task to the repository.

All needed data will be downloaded to ~/ParlAI/data, and any non-data files (such as the MemNN code) will, if requested, be downloaded to ~/ParlAI/downloads. If you need to free the space used by these files, you can safely delete these directories; any files needed later will be downloaded again.

Worlds, agents and teachers

The main concepts (classes) in ParlAI:

After defining a world and the agents in it, a main loop can be run for training, testing or displaying, which calls the function world.parley(). The skeleton of an example main is given in the left panel, and the actual code for parley() on the right.
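The relationship between a world, its agents and parley() can be sketched in plain Python. This is a minimal, self-contained illustration and not ParlAI's actual implementation: the classes `FixedTeacher`, `RepeatAgent` and `TwoAgentWorld` are hypothetical, and the method names `act` and `observe` are assumptions chosen to mirror the acting/observing terminology used in this README.

```python
class FixedTeacher:
    """Hypothetical teacher that asks a fixed list of questions."""

    def __init__(self, questions):
        self.questions = list(questions)
        self.index = 0
        self.received = []

    def act(self):
        # Emit the next question as an observation/action dict.
        text = self.questions[self.index]
        self.index += 1
        done = self.index >= len(self.questions)
        return {'text': text, 'episode_done': done}

    def observe(self, message):
        # Store the other agent's reply (a real teacher might score it).
        self.received.append(message)


class RepeatAgent:
    """Hypothetical agent that replies by repeating what it last heard."""

    def __init__(self):
        self.last = None

    def observe(self, message):
        self.last = message

    def act(self):
        return {'text': self.last['text']}


class TwoAgentWorld:
    """Hypothetical world in which a teacher and an agent take turns."""

    def __init__(self, teacher, agent):
        self.teacher = teacher
        self.agent = agent
        self.done = False

    def parley(self):
        # One exchange: the teacher speaks, the agent listens and
        # replies, and the teacher listens to that reply.
        query = self.teacher.act()
        self.agent.observe(query)
        reply = self.agent.act()
        self.teacher.observe(reply)
        self.done = query['episode_done']


def main():
    # Skeleton of a main loop: build the world, then call parley()
    # until the episode is over.
    world = TwoAgentWorld(FixedTeacher(['hello', 'goodbye']), RepeatAgent())
    while not world.done:
        world.parley()
    return [message['text'] for message in world.teacher.received]
```

Calling `main()` runs two exchanges and returns the replies the teacher collected. Swapping in a different agent only requires that it provide the same two methods.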

Actions and Observations

All agents (including teachers) speak to each other with a single format -- the observation/action object (a python dict). This is used to pass text, labels and rewards between agents. It’s the same object type when talking (acting) or listening (observing), but a different view (with different values in the fields). The fields are as follows:

Each of these fields is technically optional, depending on your dataset, though the 'text' field will most likely be used in nearly all exchanges.

For a fixed supervised learning dataset like bAbI, a typical exchange from the training set might be as follows (the test set would not include labels):

Teacher: {
    'text': 'Sam went to the kitchen\nPat gave Sam the milk\nWhere is the milk?',
    'labels': ['kitchen'],
    'label_candidates': ['hallway', 'kitchen', 'bathroom'],
    'episode_done': False
}
Student: {
    'text': 'hallway'
}
Teacher: {
    'text': 'Sam went to the hallway\nPat went to the bathroom\nWhere is the milk?',
    'labels': ['hallway'],
    'label_candidates': ['hallway', 'kitchen', 'bathroom'],
    'episode_done': True
}
Student: {
    'text': 'hallway'
}
Teacher: {
    ... # starts next episode
}
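
To make the exchange above concrete, the sketch below replays the two teacher messages as Python dicts and scores a student's replies against the 'labels' field. The helper `score_episode` and the fixed-answer student `always_hallway` are hypothetical, introduced only for illustration; they are not part of ParlAI.

```python
# The two teacher messages from the exchange above, as plain dicts.
EPISODE = [
    {
        'text': 'Sam went to the kitchen\nPat gave Sam the milk\nWhere is the milk?',
        'labels': ['kitchen'],
        'label_candidates': ['hallway', 'kitchen', 'bathroom'],
        'episode_done': False,
    },
    {
        'text': 'Sam went to the hallway\nPat went to the bathroom\nWhere is the milk?',
        'labels': ['hallway'],
        'label_candidates': ['hallway', 'kitchen', 'bathroom'],
        'episode_done': True,
    },
]


def always_hallway(observation):
    """Hypothetical student: ignores the text and always answers 'hallway'."""
    return {'text': 'hallway'}


def score_episode(messages, student):
    """Return the fraction of student replies that match a teacher label."""
    correct = 0
    total = 0
    for observation in messages:
        reply = student(observation)
        total += 1
        # 'labels' is optional (absent at test time), so use .get().
        if reply['text'] in observation.get('labels', []):
            correct += 1
        if observation.get('episode_done', False):
            break  # the episode ends here; later messages belong to the next one
    return correct / total


accuracy = score_episode(EPISODE, always_hallway)
```

Here the student is wrong on the first question and right on the second, so `accuracy` is 0.5.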


The code is set up into several main directories:

Each directory is described in more detail below, ordered by dependencies.


The core library contains the following files:


The agents directory contains agents that have been approved into the ParlAI framework for shared use. We encourage you to contribute new ones! Currently available within this directory:


This directory contains a few particular examples of basic loops.


Our first release included the following datasets (shown in the left panel), and accessing one of them is as simple as specifying the name of the task as a command line option, as shown in the dataset display utility (right panel):

Over 20 tasks were supported in the first release, including popular datasets such as SQuAD, bAbI tasks, MCTest, WikiQA, WebQuestions, SimpleQuestions, WikiMovies, QACNN, QADailyMail, CBT, BookTest, bAbI Dialog tasks, Ubuntu, OpenSubtitles, Cornell Movie, VQA-COCO2014. Since then, several datasets have been added such as VQAv2, VisDial, MNIST_QA, Personalized Dialog, InsuranceQA, MS MARCO, TriviaQA, and CLEVR. See here for the current complete task list.

Choosing a task in ParlAI is as easy as specifying it on the command line, as shown in the above image (right). If the dataset has not been used before, ParlAI will automatically download it. As all datasets are treated in the same way in ParlAI (with a single dialog API), a dialog agent can in principle switch training and testing between any of them. Even better, one can specify many tasks at once (multi-tasking) by simply providing a comma-separated list, e.g. the command line “-t babi,squad”, to use those two datasets, or even all the QA datasets at once (-t #qa) or indeed every task in ParlAI at once (-t #all). The aim is to make it easy to build and evaluate very rich dialog models.
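
As a rough illustration of the comma-separated task syntax, the sketch below splits a `-t` argument into individual entries and their colon-separated sub-parts. The function `split_task_spec` is hypothetical and is not ParlAI's own parser. In particular, it only flags category tags such as #qa or #all and does not expand them into the tasks they stand for.

```python
def split_task_spec(spec):
    """Split a '-t' style string into a list of task descriptions.

    Each description records the task name, any colon-separated
    sub-parts, and whether the entry is a '#'-prefixed category tag.
    """
    tasks = []
    for entry in spec.split(','):
        entry = entry.strip()
        if not entry:
            continue  # tolerate stray commas
        is_tag = entry.startswith('#')
        name, *parts = entry.lstrip('#').split(':')
        tasks.append({'name': name, 'parts': parts, 'is_tag': is_tag})
    return tasks


parsed = split_task_spec('babi:task1k:1,squad')
```

With this input, `parsed` holds two entries: `babi` with the sub-parts `['task1k', '1']`, and `squad` with no sub-parts.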

Each task folder contains:

To add your own task:


An important part of ParlAI is seamless integration with Mechanical Turk for data collection, training and evaluation.

Human Turkers are also viewed as agents in ParlAI and hence person-person, person-bot, or multiple people and bots in group chat can all converse within the standard framework, switching out the roles as desired with no code changes to the agents. This is because Turkers also receive and send via a (pretty printed) version of the same interface, using the fields of the observation/action dict.
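
The idea that a person can sit behind the same interface as a bot can be sketched as follows. This is an assumption-laden illustration, not the MTurk integration itself: `pretty_print` and `HumanAgent` are hypothetical names, the `act`/`observe` method names are assumed, and the person is reached through injectable `read_line` and `show` callables (defaulting to the local keyboard and screen) rather than through Mechanical Turk.

```python
def pretty_print(observation):
    """Render an observation/action dict as text a person can read."""
    lines = []
    if 'text' in observation:
        lines.append(observation['text'])
    if 'label_candidates' in observation:
        lines.append('Options: ' + ', '.join(observation['label_candidates']))
    return '\n'.join(lines)


class HumanAgent:
    """Hypothetical agent that relays messages to and from a person."""

    def __init__(self, read_line=input, show=print):
        # read_line and show are injected so that the same class could sit
        # in front of a keyboard, a web form, or a test harness.
        self.read_line = read_line
        self.show = show

    def observe(self, observation):
        # The person sees a pretty-printed view of the same dict a bot gets.
        self.show(pretty_print(observation))

    def act(self):
        # Whatever the person types is wrapped back into the dict format.
        return {'text': self.read_line()}
```

Because the agent exposes the same two methods as any other agent in this sketch, the code driving the conversation does not need to know whether a person or a model is answering.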

We currently provide three examples: collecting data, human evaluation of a bot, and round-robin chat between local humans and remote Turkers.

The mturk library contains the following directories:

To run an MTurk task:

To add your own MTurk task:

Please see the MTurk tutorial to learn more about the MTurk examples and how to create and run your own task.


If you have any questions, bug reports or feature requests, please don't hesitate to post on our GitHub Issues page.

The Team

ParlAI is currently maintained by Alexander H. Miller, Will Feng and Jason Weston. A non-exhaustive list of other major contributors includes: Adam Fisch, Jiasen Lu, Antoine Bordes, Devi Parikh, Dhruv Batra, Filipe de Avila Belbute Peres and Chao Pan.


Please cite the arXiv paper if you use ParlAI in your work:

  title={ParlAI: A Dialog Research Software Platform},
  author={{Miller}, A.~H. and {Feng}, W. and {Fisch}, A. and {Lu}, J. and {Batra}, D. and {Bordes}, A. and {Parikh}, D. and {Weston}, J.},
  journal={arXiv preprint arXiv:{1705.06476}},


ParlAI is BSD-licensed. We also provide an additional patent grant.