Multimodal
TouchThinker: Scaling Tactile Commonsense Reasoning to the Open World with Large-scale Data and Action-aware Representation
TouchThinker is a tactile-language framework for commonsense reasoning in open-world settings. It targets two limitations of prior work: the tactile reasoning datasets and the representation methods. It introduces TouchThinker-1M, a dataset of over a million entries covering 415 objects, 8 scenarios, and 7 sensor types, along with TouchThinker-Bench, an open-world benchmark spanning diverse tasks. The framework uses an action-aware modeling mechanism to improve representation efficiency and achieves performance competitive with state-of-the-art models. This matters for practitioners building embodied agents that need to understand tactile interactions in real-world environments.
tactile reasoning, embodied agents, commonsense reasoning