Remote Evaluation
Each challenge has an evaluation script that evaluates participants' submissions and returns the scores that populate the leaderboard. The logic for evaluating and judging a submission is customizable and varies from challenge to challenge, but the overall structure of evaluation scripts is fixed for architectural reasons.
Writing Remote Evaluation Script
The starter template for remote challenge evaluation can be found here.
Here are the steps to configure remote evaluation:
Setup Configs:
To configure authentication for the challenge, set the following environment variables:
AUTH_TOKEN: Go to profile page -> Click on Get your Auth Token -> Click on the Copy button. The auth token will get copied to your clipboard.
API_SERVER: Use https://eval.ai when setting up challenge on production server. Otherwise, use https://staging.eval.ai
QUEUE_NAME: Go to the challenge manage tab to fetch the challenge queue name.
CHALLENGE_PK: Go to the challenge manage tab to fetch the challenge primary key.
SAVE_DIR: (Optional) Path to submission data download location.
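As a minimal sketch of how a remote evaluation script might collect these settings, the snippet below reads the five variables from the environment and fails early if a required one is missing. The helper name `load_remote_eval_config` and the `"./"` fallback for SAVE_DIR are assumptions made for illustration; they are not taken from the starter template.

```python
import os

# The four variables the steps above describe as required.
REQUIRED_VARS = ("AUTH_TOKEN", "API_SERVER", "QUEUE_NAME", "CHALLENGE_PK")


def load_remote_eval_config(environ=None):
    """Collect remote evaluation settings (hypothetical helper, not part of EvalAI)."""
    environ = os.environ if environ is None else environ

    # Report every missing variable at once instead of failing on the first.
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    config = {name: environ[name] for name in REQUIRED_VARS}
    # SAVE_DIR is optional; the "./" default is an assumption of this sketch.
    config["SAVE_DIR"] = environ.get("SAVE_DIR", "./")
    return config
```

Calling `load_remote_eval_config()` with no argument reads from `os.environ`; passing a plain dictionary instead makes the helper easy to exercise in tests.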
Write evaluate method: Evaluation scripts are required to have an evaluate() function. This is the main function, which is used by workers to evaluate the submission messages.
The syntax of evaluate function for a remote challenge is:
def evaluate(user_submission_file, phase_codename, test_annotation_file=None, **kwargs):
    pass
It receives three arguments, namely:
user_submission_file: It represents the local path of the file submitted by the user for a particular challenge phase.
phase_codename: It is the codename of the challenge phase from the challenge configuration yaml. This is passed as an argument so that the script can take actions according to the challenge phase.
test_annotation_file: It represents the local path to the annotation file for the challenge. This is the file uploaded by the Challenge host while creating a challenge.
You may pass the test_annotation_file as default argument or choose to pass separately in the main.py depending on the case. The phase_codename is passed automatically but is left as an argument to allow customization.
After reading the files, some custom actions can be performed. This varies per challenge.
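To illustrate how the three arguments might be used together, here is a hedged sketch of an evaluate() body. It assumes that both files are JSON objects mapping an example id to a label, that the phases are named "dev" and "test", and that a plain dictionary of scores is an acceptable return value. All of these are assumptions for illustration; the actual file formats, codenames, and return shape depend on the challenge.

```python
import json


def evaluate(user_submission_file, phase_codename, test_annotation_file=None, **kwargs):
    if test_annotation_file is None:
        raise Exception("No annotation file was provided for evaluation.")

    # Assumption: both files hold a JSON object of {example_id: label}.
    with open(user_submission_file) as f:
        predictions = json.load(f)
    with open(test_annotation_file) as f:
        annotations = json.load(f)

    # Count the annotated examples whose predicted label matches.
    correct = sum(
        1 for key, label in annotations.items() if predictions.get(key) == label
    )
    accuracy = correct / len(annotations) if annotations else 0.0

    # "dev" and "test" are hypothetical codenames; real ones come from the
    # challenge configuration yaml. The returned dict shape is also assumed.
    if phase_codename == "dev":
        return {"phase": "dev", "accuracy": accuracy}
    if phase_codename == "test":
        return {"phase": "test", "accuracy": accuracy, "total": len(annotations)}
    raise Exception("Unknown phase codename: " + str(phase_codename))
```

Branching on `phase_codename` keeps a single script usable across phases, for example by reporting extra fields only for the final test phase.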
The evaluate() method also accepts keyword arguments.
IMPORTANT ⚠️: If the evaluate() method fails for any reason or there is a problem with the submission, make sure to raise an Exception with an appropriate message.
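One way to follow this rule is to validate the submission up front and turn each failure into an Exception with a readable message. The sketch below assumes a JSON submission and uses a hypothetical helper named `load_submission`; neither is prescribed by the source.

```python
import json
import os


def load_submission(user_submission_file):
    """Read a JSON submission, raising Exception with a clear message on problems.

    Hypothetical helper; the JSON format is an assumption of this sketch.
    """
    if not os.path.isfile(user_submission_file):
        raise Exception("Submission file not found: " + str(user_submission_file))

    try:
        with open(user_submission_file) as f:
            data = json.load(f)
    except json.JSONDecodeError as err:
        # Re-raise with a message that explains the problem with the submission.
        raise Exception("Submission is not valid JSON: " + str(err)) from err

    if not isinstance(data, dict) or not data:
        raise Exception("Submission must be a non-empty JSON object.")
    return data
```

An evaluate() implementation could call this helper first, so that a malformed file produces a specific message rather than an unrelated error later in the scoring code.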