Spatial reasoning via deep vision models for robotic sequential manipulation
buir.contributor.author | Öğüz, Salih Özgür | |
dc.citation.epage | 11335 | en_US |
dc.citation.spage | 11328 | |
dc.contributor.author | Zhou, H. | |
dc.contributor.author | Schubert, I. | |
dc.contributor.author | Toussaint, M. | |
dc.contributor.author | Öğüz, Salih Özgür | |
dc.coverage.spatial | Detroit, USA | |
dc.date.accessioned | 2024-03-15T11:09:40Z | |
dc.date.available | 2024-03-15T11:09:40Z | |
dc.date.issued | 2023-10-01 | |
dc.department | Department of Computer Engineering | |
dc.description | Conference Name: IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2023 | |
dc.description | Date of Conference: 1 October 2023 through 5 October 2023 | |
dc.description.abstract | In this paper, we propose using deep neural architectures (i.e., vision transformers and ResNet) as heuristics for sequential decision-making in robotic manipulation problems. This formulation enables predicting the subset of objects that are relevant for completing a task. Such problems are often addressed by task and motion planning (TAMP) formulations combining symbolic reasoning and continuous motion planning. In essence, the action-object relationships are resolved for discrete, symbolic decisions that are used to compute manipulation motions (e.g., via nonlinear trajectory optimization). However, solving long-horizon tasks requires considering all possible action-object combinations, which limits the scalability of TAMP approaches. To overcome this combinatorial complexity, we introduce a visual perception module integrated with a TAMP solver. Given a task and an initial image of the scene, the learned model outputs the relevance of each object for accomplishing the task. By incorporating the model's predictions into a TAMP formulation as a heuristic, the size of the search space is significantly reduced. Results show that our framework finds feasible solutions more efficiently than a state-of-the-art TAMP solver. | |
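The pipeline the abstract describes can be sketched as follows, assuming a ResNet-18 backbone that maps an initial scene image to per-object relevance scores, which are then thresholded to prune the object set handed to the TAMP solver. The class and function names (ObjectRelevanceNet, select_relevant_objects, the 0.5 threshold) are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of a learned object-relevance heuristic for TAMP pruning.
# Hypothetical names; the paper's actual architecture and interface may differ.
import torch
import torch.nn as nn
from torchvision.models import resnet18

class ObjectRelevanceNet(nn.Module):
    """Predicts a relevance score in [0, 1] for each of max_objects candidate objects."""
    def __init__(self, max_objects: int = 10):
        super().__init__()
        backbone = resnet18(weights=None)        # scene-image encoder
        backbone.fc = nn.Identity()              # keep the 512-d feature vector
        self.backbone = backbone
        self.head = nn.Linear(512, max_objects)  # one logit per candidate object

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        # image: (B, 3, H, W) initial observation of the scene
        features = self.backbone(image)
        return torch.sigmoid(self.head(features))  # (B, max_objects) relevance scores

def select_relevant_objects(scores: torch.Tensor, object_ids, threshold: float = 0.5):
    """Keep only objects the model deems relevant, shrinking the TAMP search space."""
    keep = (scores.squeeze(0) >= threshold).nonzero(as_tuple=True)[0].tolist()
    return [object_ids[i] for i in keep]

# Usage: score one scene image and prune the object set before symbolic search.
model = ObjectRelevanceNet(max_objects=4).eval()
image = torch.rand(1, 3, 224, 224)               # placeholder scene image
with torch.no_grad():
    scores = model(image)
relevant = select_relevant_objects(scores, ["red_box", "blue_box", "tray", "obstacle"])
# `relevant` would then be passed to the TAMP solver instead of the full object set.
```

Used as a heuristic in this way, the predictor only restricts which objects the symbolic search may bind to actions; the TAMP solver itself is unchanged.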
dc.identifier.doi | 10.1109/IROS55552.2023.10342010 | en_US |
dc.identifier.eissn | 2153-0866 | en_US |
dc.identifier.issn | 2153-0858 | en_US |
dc.identifier.uri | https://hdl.handle.net/11693/114794 | en_US |
dc.language.iso | English | en_US |
dc.publisher | Institute of Electrical and Electronics Engineers | en_US |
dc.relation.isversionof | https://dx.doi.org/10.1109/IROS55552.2023.10342010 | |
dc.source.title | IEEE International Conference on Intelligent Robots and Systems | |
dc.subject | Computer vision | |
dc.subject | Decision making | |
dc.subject | Deep learning | |
dc.subject | Intelligent robots | |
dc.title | Spatial reasoning via deep vision models for robotic sequential manipulation | |
dc.type | Conference Paper |
Files
Original bundle
- Name: Spatial_reasoning_via_deep_vision_models_for_robotic_sequential_manipulation.pdf
- Size: 330.64 KB
- Format: Adobe Portable Document Format
License bundle
- Name: license.txt
- Size: 2.01 KB
- Format: Item-specific license agreed upon to submission
- Description: