Research
What Really Matters for Table LLMs? A Meta-Evaluation of Model and Data Effects
This study presents a meta-evaluation of table LLMs. It replicates four models by instruction-tuning three foundation models on four datasets, yielding 12 distinct models (3 base models × 4 datasets) that are evaluated across 16 benchmarks. The findings indicate that the choice of base model has a larger effect on performance than the choice of training data, and that generalization and reasoning remain open challenges in table modeling. For practitioners, the results underscore the importance of base-model selection when optimizing for table-related tasks.
table llm, evaluation, instruction tuning