Platform for AI: LLM-Remove LaTeX Bibliography (DLC)

Last Updated: Jun 20, 2026

The LLM-Remove LaTeX Bibliography (DLC) component processes TeX documents by removing the bibliography section from the end of the text. Input data must come from an OSS file in JSONL format, in which each line is a valid JSON object although the file as a whole is not a single valid JSON document.
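The JSONL requirement can be illustrated with a minimal Python sketch. The field names `id` and `text` and the helper `parse_jsonl` are hypothetical and chosen only for illustration. The component's actual reader is not documented here.

```python
import json

# Hypothetical records; the field names "id" and "text" are assumptions.
records_in = [
    {"id": 1, "text": r"\section{Intro} Some body text."},
    {"id": 2, "text": r"\section{Method} More body text."},
]

# JSONL: one JSON object per line, separated by newlines.
jsonl_payload = "\n".join(json.dumps(r) for r in records_in) + "\n"


def parse_jsonl(payload):
    """Parse each non-empty line as an independent JSON object."""
    return [json.loads(line) for line in payload.splitlines() if line.strip()]


records = parse_jsonl(jsonl_payload)

# The payload as a whole is not one JSON document, so a single parse fails.
try:
    json.loads(jsonl_payload)
    whole_file_is_json = True
except json.JSONDecodeError:
    whole_file_is_json = False
```

Parsing line by line recovers every record, while parsing the whole payload in one call raises an error as soon as the second object begins.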

Supported computing resources

DLC

Algorithm

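This component identifies the bibliography in a LaTeX text using the regular expression r'(\\appendix|\\begin\{references\}|\\begin\{REFERENCES\}|\\begin\{thebibliography\}|\\bibliography\{.*\}).*$'. In this expression, the alternative patterns are separated by a vertical bar (|): the match starts at the first occurrence of \appendix, \begin{references}, \begin{REFERENCES}, \begin{thebibliography}, or a \bibliography{} command, and the trailing .*$ extends it to the end of the text.

The component removes all strings that match the regular expression. Because \appendix is one of the alternatives, any appendix that follows it is removed together with the bibliography. The following example shows the results before and after processing.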

Before

The Current field value pop-up displays a LaTeX code snippet from the end of the sample-sigconf.tex file. This snippet includes statements such as \begin{references}, \end{document}, and \endinput, as well as comment lines at the beginning and end.

After

The Current field value pop-up shows the result after processing. Only the header comment text from the LaTeX file remains: %% This is file `sample-sigconf.tex\clearpage.
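The matching behavior described above can be sketched in Python with the same regular expression. This is not the component's source code. The `re.DOTALL` flag is an assumption made here so that `.*$` can span multiple lines, and `remove_bibliography` is a hypothetical helper name.

```python
import re

# The pattern is taken from the documentation; the DOTALL flag is an assumption
# so that ".*$" reaches across line breaks to the end of the text.
BIBLIOGRAPHY_PATTERN = re.compile(
    r'(\\appendix|\\begin\{references\}|\\begin\{REFERENCES\}'
    r'|\\begin\{thebibliography\}|\\bibliography\{.*\}).*$',
    flags=re.DOTALL,
)


def remove_bibliography(tex):
    """Drop everything from the first bibliography marker to the end of the text."""
    return BIBLIOGRAPHY_PATTERN.sub('', tex)


sample = (
    "\\section{Conclusion}\nWe are done.\n"
    "\\begin{thebibliography}{9}\n"
    "\\bibitem{a} A. Author.\n"
    "\\end{thebibliography}\n"
    "\\end{document}\n"
)

cleaned = remove_bibliography(sample)
unchanged = remove_bibliography("Plain text without any marker.")
```

In this sketch the trailing `\end{document}` is removed along with the bibliography, and text that contains none of the markers is returned unchanged.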

Configure component

In Designer, add the LLM-Remove LaTeX Bibliography (DLC) component to your workflow and configure its parameters in the right-side panel.

| Parameter type | Parameter | Required | Description | Default |
| --- | --- | --- | --- | --- |
| Field settings | Field to process | Yes | The name of the field to process. | None |
| Field settings | Output OSS directory | No | The OSS directory for storing the processed data. If this parameter is left empty, the default workspace path is used. | None |
| Execution tuning | Number of processes | No | The number of concurrent processes to use for the job. | 8 |
| Execution tuning | Select resource group: public resource group | No | Allows you to configure the node specification (CPU or GPU instance), number of nodes, and VPC. | None |
| Execution tuning | Select resource group: dedicated resource group | No | Allows you to configure the number of CPU cores, memory, shared memory, number of GPUs, and number of nodes. | None |
| Execution tuning | Maximum runtime | No | The maximum time allowed for the job to run. The job is terminated if it exceeds this limit. | None |
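How the Field to process parameter relates to a JSONL record can be sketched as follows. This is an illustrative assumption about the behavior, not the component's implementation: the helper `process_jsonl_line`, the field name `text`, the `re.DOTALL` flag, and the choice to leave other fields and non-string values untouched are all hypothetical.

```python
import json
import re

# Pattern from the Algorithm section; the DOTALL flag is an assumption.
PATTERN = re.compile(
    r'(\\appendix|\\begin\{references\}|\\begin\{REFERENCES\}'
    r'|\\begin\{thebibliography\}|\\bibliography\{.*\}).*$',
    flags=re.DOTALL,
)


def process_jsonl_line(line, field):
    """Clean one JSONL record: only the named field is rewritten."""
    record = json.loads(line)
    value = record.get(field)
    if isinstance(value, str):  # skip missing or non-string values
        record[field] = PATTERN.sub('', value)
    return json.dumps(record, ensure_ascii=False)


# Hypothetical input line; "text" plays the role of the configured field.
input_line = json.dumps(
    {"id": 7, "text": "Body.\n\\bibliography{refs}\n\\end{document}"}
)
output_record = json.loads(process_jsonl_line(input_line, "text"))
```

Each output line stays a valid JSON object, so the result is again a JSONL file that can be written to the configured output directory.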