Show simple item record

dc.contributor.author: Xu, Ruinian
dc.contributor.author: Chen, Hongyi
dc.contributor.author: Lin, Yunzhi
dc.contributor.author: Vela, Patricio A.
dc.date.accessioned: 2022-03-04T21:40:45Z
dc.date.available: 2022-03-04T21:40:45Z
dc.date.issued: 2022-02
dc.identifier.uri: http://hdl.handle.net/1853/66305
dc.description: This item contains one of three datasets which supplement the manuscript https://arxiv.org/abs/2202.12912 [en_US]
dc.description.abstract: The affiliated paper investigates robot manipulation based on human instructions with ambiguous requests. The intent is to compensate for imperfect natural language via visual observations. Early symbolic methods, based on manually defined symbols, built modular frameworks consisting of semantic parsing and task planning to produce sequences of actions from natural language requests. Modern connectionist methods employ deep neural networks to automatically learn visual and linguistic features and map them to a sequence of low-level actions in an end-to-end fashion. These two approaches are blended to create a hybrid, modular framework: it formulates instruction following as symbolic goal learning via deep neural networks, followed by task planning via symbolic planners. Connectionist and symbolic modules are bridged with the Planning Domain Definition Language (PDDL). The vision-and-language learning network predicts a symbolic goal representation, which is sent to a planner to produce a task-completing action sequence (see the illustrative sketch following this record). To improve flexibility in handling natural language, we further incorporate implicit human intents alongside explicit human instructions. To learn generic features for vision and language, we propose to separately pretrain the vision and language encoders on scene graph parsing and semantic textual similarity tasks. Benchmarking evaluates the impact of different components of, or options for, the vision-and-language learning model and shows the effectiveness of the pretraining strategies. Manipulation experiments conducted in the AI2THOR simulator show the robustness of the framework to novel scenarios. [en_US]
dc.description.sponsorship: National Science Foundation Award #2026611 [en_US]
dc.publisher: Georgia Institute of Technology [en_US]
dc.subject: Deep learning in grasping and manipulation [en_US]
dc.subject: AI-enabled robotics [en_US]
dc.subject: Representation learning [en_US]
dc.title: SGL: Symbolic Goal Learning in a Hybrid, Modular Framework for Human Instruction Following - Symbolic Goal Learning Dataset [en_US]
dc.title.alternative: SGL: Symbolic Goal Learning in a Hybrid, Modular Framework for Human Instruction Following [en_US]
dc.type: Dataset [en_US]
dc.contributor.corporatename: Georgia Institute of Technology. School of Electrical and Computer Engineering [en_US]
dc.relation.issupplementto: https://arxiv.org/abs/2202.12912v1
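
The abstract describes the bridge between the connectionist and symbolic modules: the vision-and-language network predicts a symbolic goal, which is expressed in PDDL and handed to a planner that produces the action sequence. The minimal Python sketch below illustrates that hand-off only in spirit; the goal_to_pddl_problem helper, the predicate names, object names, and domain name are illustrative assumptions and are not part of this dataset or the authors' released code.

# Hypothetical sketch: serializing a predicted symbolic goal into a PDDL problem.
# All names here are illustrative assumptions, not the paper's actual interface.

def goal_to_pddl_problem(objects, goal_atoms, domain="kitchen"):
    """Build a PDDL problem string from objects and goal atoms like ("on", "apple", "plate")."""
    objs = " ".join(objects)
    goals = " ".join(f"({pred} {' '.join(args)})" for pred, *args in goal_atoms)
    return (
        "(define (problem instruction-following)\n"
        f"  (:domain {domain})\n"
        f"  (:objects {objs})\n"
        "  (:init )\n"  # initial state omitted here; it would come from perception
        f"  (:goal (and {goals})))"
    )

if __name__ == "__main__":
    # Example: a network predicts that the apple should end up sliced and on the plate.
    pddl = goal_to_pddl_problem(
        objects=["apple", "knife", "plate"],
        goal_atoms=[("sliced", "apple"), ("on", "apple", "plate")],
    )
    print(pddl)

The resulting problem string, paired with a matching domain file, could then be passed to any off-the-shelf PDDL planner to obtain the task-completing action sequence.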


