Generation of Linux commands using natural language descriptions
Translating natural language into source code or programs is an important problem in natural language understanding, both for its practical applications and for what it reveals about how language is used to affect action. Within this domain, we consider the problem of translating natural language descriptions of Linux commands into the corresponding commands. This is useful for users who want to run commands but lack the expertise to compose them at the bash terminal. The major contribution of this thesis is a parallel corpus for translating natural language into Linux commands. The corpus contains 4561 unique commands and 3-4 descriptions for each command, making a total of 11177 pairs. Along with the corpus, we study simple classification settings using Support Vector Machines and translation settings using sequence-to-sequence recurrent neural network models, which provide benchmarks for machine learning model performance on the collected dataset. This document analyzes the collected dataset and describes the results and findings from models trained on it.
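To make the shape of such a parallel corpus concrete, the following is a minimal Python sketch of how description/command pairs might be stored and then viewed either as a classification problem or as a translation problem. It is an illustration under stated assumptions, not the thesis's actual data format or pipeline: the example pairs, the function names, and the whitespace tokenisation are all hypothetical.

```python
from collections import defaultdict

# Hypothetical description/command pairs written for illustration only;
# they are NOT taken from the thesis corpus.
PAIRS = [
    ("list all files including hidden ones", "ls -a"),
    ("show every file in the directory, hidden files too", "ls -a"),
    ("print the current working directory", "pwd"),
    ("show which directory I am in", "pwd"),
    ("display the path of the present directory", "pwd"),
]


def group_by_command(pairs):
    """Map each unique command to the list of descriptions paired with it."""
    grouped = defaultdict(list)
    for description, command in pairs:
        grouped[command].append(description)
    return dict(grouped)


def corpus_stats(pairs):
    """Count pairs, unique commands, and mean descriptions per command."""
    grouped = group_by_command(pairs)
    return {
        "pairs": len(pairs),
        "unique_commands": len(grouped),
        "mean_descriptions_per_command": len(pairs) / len(grouped),
    }


def classification_examples(pairs):
    """Classification view (assumed): each unique command is one class label."""
    commands = sorted(group_by_command(pairs))
    labels = {command: index for index, command in enumerate(commands)}
    return [(description, labels[command]) for description, command in pairs]


def translation_examples(pairs):
    """Translation view (assumed): whitespace-tokenised source and target sequences."""
    return [(description.split(), command.split()) for description, command in pairs]


if __name__ == "__main__":
    print(corpus_stats(PAIRS))
    print(classification_examples(PAIRS)[0])
    print(translation_examples(PAIRS)[0])
```

In the classification view a model can only predict commands it has already seen as labels, while the translation view generates the command token by token; this difference is one plausible reason to benchmark both kinds of setting.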