Models
Datalab Releases lift: A 9B Open-Weights Vision Model That Extracts Structured JSON From PDFs Using Schemas
Datalab has released lift, a 9B open-weights vision model designed to convert PDFs and images into structured JSON that adheres to predefined schemas. It employs schema-constrained decoding to ensure valid output and utilizes trained abstention to avoid hallucinating absent fields, achieving a field accuracy of 90.2% on a benchmark of 225 documents. This model is significant for practitioners as it enhances the extraction of structured data from unstructured documents, improving data processing workflows in various applications.
datalabvision_modeljson