DBDC3

Dataset

This page describes the development and evaluation data for English and Japanese, as well as the file format of the dialogues in the dataset.

Development Data

Development Data for English

Each zip file contains 100-115 dialogue sessions as individual JSON files. All utterances are annotated with dialogue breakdown labels by 30 annotators.

For the CIC dataset, context files are also provided. They serve as the topics of the conversation during the dialogue. The contexts come from the SQuAD dataset.

Development Data for Japanese

The following data can be used as development data for Japanese. These are the data used in past DBDCs.

Evaluation Data

Evaluation Data for English

At the time of the formal run, we will distribute 50 dialogues each for IRIS, TickTock, CIC, and YI (200 dialogues in total).

Evaluation Data for Japanese

At the time of the formal run, we will distribute 50 dialogues each for DCM, DIT, and IRS (150 dialogues in total).

Format of the JSON file

File name

Each JSON file contains one dialogue session and follows the naming convention <dialogue-id>.log.json. Each context file (CIC dataset only) is a plain text file named <dialogue-id>.log.context.
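
As an illustration of this naming convention, the Python sketch below pairs each dialogue log with its context file; the directory name is hypothetical and stands for an unpacked CIC development zip.

    from pathlib import Path

    # Hypothetical directory containing an unpacked CIC development-data zip.
    data_dir = Path("dbdc3_dev_en/CIC")

    for log_path in sorted(data_dir.glob("*.log.json")):
        # <dialogue-id>.log.json -> <dialogue-id>.log.context
        context_path = log_path.with_suffix(".context")
        dialogue_id = log_path.name[:-len(".log.json")]
        print(dialogue_id, "has context" if context_path.exists() else "no context")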

File format

Each file is in JSON format with UTF-8 encoding. The user and the system speak alternately, and each utters ten times.

Following are the top-level fields:

Each element of the ‘turns’ field contains the following fields:

Each element of the ‘annotations’ field contains the following fields:

NOTE: Only the turn-index field is numerical. All the others are textual.
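
As a minimal sketch, the Python code below reads one dialogue file and tallies the breakdown labels given for each turn. The ‘turns’, ‘turn-index’, and ‘annotations’ field names are taken from the description above; the ‘speaker’, ‘utterance’, and ‘breakdown’ field names, as well as the file name, are assumptions for illustration and should be checked against the actual data.

    import json
    from collections import Counter

    # Placeholder file name; use any <dialogue-id>.log.json from the dataset.
    with open("example.log.json", encoding="utf-8") as f:
        dialogue = json.load(f)

    for turn in dialogue["turns"]:
        # 'breakdown' is assumed to hold the label given by each annotator.
        labels = Counter(a["breakdown"] for a in turn["annotations"])
        print(turn["turn-index"], turn["speaker"], turn["utterance"])
        print("  breakdown labels:", dict(labels))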

Speakers in the dialogues

For the IRIS and TickTock datasets, refer to the WOCHAT website. For the CIC dataset, also refer to the CIC website. For YI, the speakers were from AMT.

Annotators

For the IRIS and TickTock datasets, we used crowd workers from CrowdFlower for annotation. They are ‘level-2’ annotators from Australia, Canada, New Zealand, the United Kingdom, and the United States. We asked non-native English-speaking workers to refrain from joining this annotation task, but this is not guaranteed. For the CIC and YI datasets, we used crowd workers from AMT.

Miscellaneous Notes

Due to the subjective nature of this task, we did not provide any check questions in CrowdFlower. Actual IRIS dialogue sessions start with a fixed system prompt; we removed this initial prompt from the data.

Acknowledgements

The development of these datasets was supported by the track sponsors and the Japanese Society for Artificial Intelligence (JSAI). We thank these supporters and the providers of the original dialogue data.