EvalAI: Evaluating AI systems at scale
Artificial Intelligence research has progressed tremendously in the last few years. The introduction of many new multi-modal datasets and tasks has made it increasingly difficult to compare new algorithms with existing ones. To address this problem, this thesis introduces EvalAI, an open-source platform for evaluating and comparing machine learning and artificial intelligence algorithms at scale. The platform provides an open, standardized, and scalable solution for evaluating learned models both with automatic metrics and with human-in-the-loop evaluation. By simplifying and standardizing the benchmarking process, EvalAI seeks to lower the barrier to entry for participating in the global scientific effort to push the frontiers of machine learning and artificial intelligence, increasing the rate of measurable progress in these communities.