Dataset Splits

Dataset splits define the subsets of the test set on which submissions are evaluated. Most challenges have three splits:

train_split (Allow participants to make a large number of submissions, see how they are doing, and even overfit)
test_split (Allow only a small number of submissions so that participants cannot overfit to the test set. Use this split to decide the winners of the challenge)
val_split (Allow participants to make submissions and have them evaluated on the validation split)

A dataset split has the following subfields:

id (required)

Type: integer

Description: Unique numeric identifier for the dataset split. Used internally to reference this split in phase-split mappings.

Example:
```
id: 1
```
name (required)

Type: string

Constraints: Must be unique.

Description: Human-readable name of the dataset split. This will be shown in the EvalAI UI and should clearly describe the split’s purpose.

Example:
```
name: Train Split
```
codename (required)

Type: string

Constraints: Must be unique and must match the codename used in the evaluation script.

Description: A unique identifier used to map evaluation results to the correct dataset split. This is critical for EvalAI to interpret the scores returned by your evaluation script.

Example:
```
codename: train_split
```
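To make the matching requirement concrete, here is a minimal sketch of how an evaluation script might key its scores by split codename. The `evaluate` signature and the `result` list layout are assumptions based on the usual EvalAI evaluation script pattern, and the metric name and value are hypothetical placeholders rather than real results.

```python
def evaluate(test_annotation_file, user_submission_file, phase_codename, **kwargs):
    """Return scores keyed by dataset split codename (illustrative sketch)."""
    # Hypothetical placeholder metric; a real script would compute this by
    # comparing the submission file against the annotation file.
    scores = {"Accuracy": 0.0}

    # The key "train_split" must be spelled exactly like the `codename`
    # declared for the split in challenge_config.yaml.
    return {"result": [{"train_split": scores}]}


output = evaluate("annotations.json", "submission.json", "dev")
print(list(output["result"][0].keys()))  # prints ['train_split']
```

If the key in the returned dictionary differs from the configured codename, the scores would presumably have no split to be attached to, which is why the two must agree.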
Example

Here’s how the dataset splits configuration looks in the challenge_config.yaml file of a sample challenge:
```
dataset_splits:
  - id: 1
    name: Train Split
    codename: train_split
  - id: 2
    name: Test Split
    codename: test_split
  - id: 3
    name: Validation Split
    codename: val_split
```
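Before uploading a configuration, it can help to check the constraints described above (required fields present, integer `id`, unique `id`, `name`, and `codename`). The following is a minimal sketch of such a check. The `validate_dataset_splits` helper is hypothetical and not part of EvalAI, and the split entries are written as plain Python dictionaries mirroring the YAML above so that no YAML parser is needed.

```python
from collections import Counter

REQUIRED_FIELDS = ("id", "name", "codename")


def validate_dataset_splits(splits):
    """Return a list of human-readable problems found in the split entries."""
    problems = []
    for index, split in enumerate(splits):
        # Every split needs all three required subfields.
        for field in REQUIRED_FIELDS:
            if field not in split:
                problems.append(f"entry {index}: missing required field '{field}'")
        # The id subfield is documented as an integer.
        if "id" in split and not isinstance(split["id"], int):
            problems.append(f"entry {index}: id must be an integer")
    # Each of id, name, and codename must be unique across splits.
    for field in REQUIRED_FIELDS:
        counts = Counter(split[field] for split in splits if field in split)
        for value, count in counts.items():
            if count > 1:
                problems.append(f"duplicate {field}: {value!r}")
    return problems


# Same entries as the sample challenge_config.yaml above.
dataset_splits = [
    {"id": 1, "name": "Train Split", "codename": "train_split"},
    {"id": 2, "name": "Test Split", "codename": "test_split"},
    {"id": 3, "name": "Validation Split", "codename": "val_split"},
]

print(validate_dataset_splits(dataset_splits))  # prints []
```

An empty list means none of the checked constraints were violated. This sketch does not verify that the codenames match those used in the evaluation script, which still has to be checked separately.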