  • The Slides for Guiding Large Language Models to Generate Computer-Parsable Content

    Subjects: Computer Science >> Computer Software; Linguistics and Applied Linguistics >> Linguistics and Applied Linguistics
    Submitted: 2024-04-21

    Abstract: This slide deck presents the research on Guiding Large Language Models to Generate Computer-Parsable Content, covering its background, motivation, method, results, outlook, and acknowledgements. For the full paper, please refer to: https://arxiv.org/abs/2404.05499

  • Constraining Large Language Model for Generating Computer-Parsable Content

    Subjects: Computer Science >> Computer Software; Linguistics and Applied Linguistics >> Linguistics and Applied Linguistics
    Submitted: 2024-04-07

    Abstract: Large language models (LLMs) have demonstrated remarkable capabilities in learning patterns from massive text corpora, including word relationships, sentence structures, and even complex semantic and pragmatic information. However, it remains challenging to induce pre-trained language models to generate structured content that strictly follows specific conventions. We propose a scheme for guiding LLMs to generate highly computer-usable content without fine-tuning or additional neural network inference: coroutine-based generation constraints derived from a pre-agreed context-free grammar (CFG) guide the autoregressive Transformer to sample only valid tokens during its decoding phase, so that the output forms a formal language conforming to the program's conventions. This effectively improves the stability and consistency of LLMs in generating target data structures, types, or instructions, and reduces the difficulty of application development and integration. Through a bracket-pair matching experiment, we first verify that the error rate of models such as GPT-2 and Gemma reaches 95% once the generated DSLs exceed lengths of 36 and 282, respectively, which illustrates the performance problems of some current LLMs in generating specific DSLs. We then present YieldLang, a coroutine-based DSL generation framework, and evaluate LLMs on multiple task datasets, including JSON, Mermaid flowchart, and function-call expression generation. These experiments show that our approach improves accuracy by a factor of 1.09 to 11.6 over the baselines and, in the best case, reduces the number of samples the LLM needs to generate valid JSON to about 16.5% of the baseline, effectively improving the usability of LLM-generated content for computer programs.
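
    The abstract does not show YieldLang's actual interface, but the core idea it describes (a coroutine encodes the CFG and, at each decoding step, yields the set of grammar-legal next tokens so the sampler can mask out everything else) can be sketched as below. Everything in this sketch is invented for illustration and is not the paper's implementation: the toy bracket grammar, the token list, and the fake_logits stand-in for a real LLM forward pass are all assumptions.

```python
# Minimal sketch of coroutine-guided constrained decoding (illustrative only,
# NOT the YieldLang API). A generator-based coroutine encodes a tiny CFG for
# well-nested brackets; the decoder restricts sampling to whatever the
# coroutine says is legal at each step.
import random

TOKENS = ["{", "}", "[", "]", '"k"', '"v"', ":", ",", "<eos>"]  # toy vocabulary

def balanced_brackets(max_depth=4):
    """Coroutine for a tiny CFG: well-nested {} / [] pairs.

    At each step it yields the set of tokens the grammar allows next;
    the caller sends back the token that was actually sampled.
    """
    stack = []
    token = yield {"{", "["}              # must start by opening a bracket
    while True:
        if token in ("{", "["):
            stack.append(token)
        elif token == "}":
            stack.pop()
        elif token == "]":
            stack.pop()
        if not stack:                     # everything closed: only <eos> is legal
            yield {"<eos>"}
            return
        allowed = set()
        if len(stack) < max_depth:        # may open another bracket
            allowed |= {"{", "["}
        allowed.add("}" if stack[-1] == "{" else "]")  # must close the matching one
        token = yield allowed

def fake_logits():
    """Stand-in for an LLM forward pass: random scores over the vocabulary."""
    return {t: random.random() for t in TOKENS}

def constrained_decode(grammar):
    """Sample tokens while masking the 'model' to the coroutine's allowed set."""
    allowed = grammar.send(None)                       # prime the coroutine
    out = []
    while True:
        scores = fake_logits()
        token = max(allowed, key=lambda t: scores[t])  # greedy over legal tokens only
        if token == "<eos>":
            return "".join(out)
        out.append(token)
        allowed = grammar.send(token)

print(constrained_decode(balanced_brackets()))         # e.g. "{[]}" or "[{}[]]"
```

    In the paper's setting, fake_logits would be replaced by the LLM's next-token distribution, and the allowed set would be applied as a mask over the real logits before sampling; the coroutine structure is what lets the grammar be consumed incrementally, token by token, as decoding proceeds.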