Spatial reasoning via deep vision models for robotic sequential manipulation

Zhou, H.; Schubert, I.; Toussaint, M.; Öğüz, Salih Özgür

Files

Spatial_reasoning_via_deep_vision_models_for_robotic_sequential_manipulation.pdf (330.64 KB)

Date

2023-10-01

Abstract

In this paper, we propose using deep neural architectures (i.e., vision transformers and ResNet) as heuristics for sequential decision-making in robotic manipulation problems. This formulation enables predicting the subset of objects that are relevant for completing a task. Such problems are often addressed by task and motion planning (TAMP) formulations that combine symbolic reasoning and continuous motion planning. In essence, the action-object relationships are resolved for discrete, symbolic decisions that are used to solve manipulation motions (e.g., via nonlinear trajectory optimization). However, solving long-horizon tasks requires considering all possible action-object combinations, which limits the scalability of TAMP approaches. To overcome this combinatorial complexity, we introduce a visual perception module integrated with a TAMP solver. Given a task and an initial image of the scene, the learned model outputs the relevance of each object to accomplishing the task. Incorporating the model's predictions into a TAMP formulation as a heuristic significantly reduces the size of the search space. Results show that our framework finds feasible solutions more efficiently than a state-of-the-art TAMP solver.
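
The combinatorial argument in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the relevance scores are made-up numbers standing in for a vision model's output, and the 0.5 threshold, the object and action names, and the helpers `select_relevant`, `candidate_sequences`, and `count_sequences` are hypothetical. It only shows how restricting the object set shrinks the number of action-object sequences a planner would have to consider over a fixed horizon.

```python
from itertools import product


def select_relevant(scores, threshold=0.5):
    """Keep objects whose relevance score meets the threshold (assumed rule)."""
    return [name for name, score in scores.items() if score >= threshold]


def candidate_sequences(actions, objects, horizon):
    """Lazily enumerate every action-object sequence of length `horizon`."""
    steps = list(product(actions, objects))
    return product(steps, repeat=horizon)


def count_sequences(actions, objects, horizon):
    """Number of sequences: (|actions| * |objects|) ** horizon."""
    return (len(actions) * len(objects)) ** horizon


# Made-up scores standing in for the output of a learned vision model.
scores = {
    "red_block": 0.93,
    "blue_block": 0.88,
    "mug": 0.07,
    "book": 0.12,
    "tray": 0.61,
}
actions = ["pick", "place"]  # hypothetical symbolic actions
horizon = 3                  # number of sequential decisions

all_objects = list(scores)
relevant = select_relevant(scores)

full = count_sequences(actions, all_objects, horizon)
pruned = count_sequences(actions, relevant, horizon)

print(f"objects kept: {relevant}")
print(f"sequences without pruning: {full}")
print(f"sequences with pruning:    {pruned}")
```

With five objects, two actions, and a horizon of three, the unpruned count is (2 × 5)³ = 1000; keeping the three objects scored above the threshold leaves (2 × 3)³ = 216. A real TAMP solver would use such predictions to guide or order its search rather than enumerate sequences exhaustively.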

Source Title

IEEE International Conference on Intelligent Robots and Systems

Publisher

Institute of Electrical and Electronics Engineers

Keywords

Computer vision, Decision making, Deep learning, Intelligent robots

Permalink

https://hdl.handle.net/11693/114794

Published Version (Please cite this version)

https://dx.doi.org/10.1109/IROS55552.2023.10342010

Collections

Scholarly Publications - Computer Engineering

Language

English

Type

Conference Paper
