Unifying Large Language Models and Knowledge Graphs: A Roadmap (Paper Reading Notes)
Key Words:
NLP, LLM, Generative Pre-training, KGs, Roadmap, Bidirectional Reasoning
Abstract:
LLMs are black-box models and often fall short of capturing and accessing factual knowledge. KGs are structured knowledge models that explicitly store rich factual knowledge. Combining KGs and LLMs yields three frameworks:
KG-enhanced LLMs: incorporate KGs during the pre-training and inference stages to provide external knowledge, and use KGs to analyze LLMs and provide interpretability.
LLM-augmented KGs: apply LLMs to KG embedding, KG completion, KG construction, KG-to-text generation, and KGQA.
Synergized LLMs + KGs: LLMs and KGs work together, enhancing performance in both knowledge representation and reasoning.
Background
Introduction of LLMs
Encoder-only LLMs
Use the encoder to encode the sentence and capture the relationships between words, then predict the masked words in the input sentence. Typical tasks: text classification, named entity recognition. A sketch follows.
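A minimal sketch of masked-word prediction with an encoder-only LLM, assuming the Hugging Face transformers library and bert-base-uncased (my model choice, not the paper's):

```python
from transformers import pipeline

# Encoder-only model: attends to context on both sides of [MASK].
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

for pred in fill_mask("Knowledge graphs store [MASK] knowledge."):
    print(pred["token_str"], round(pred["score"], 3))
```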
Encoder-decoder LLMs
Adopt both encoder and decoder modules: the encoder encodes the input into a hidden space, and the decoder generates the target output text. Typical tasks: summarization, translation, question answering.
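A minimal encoder-decoder sketch, assuming t5-small (an illustrative choice): the encoder maps the input into a hidden space and the decoder generates the target text.

```python
from transformers import pipeline

# Encoder-decoder model: encode English input, decode German output.
translator = pipeline("translation_en_to_de", model="t5-small")
result = translator("Knowledge graphs store rich factual knowledge.")
print(result[0]["translation_text"])
```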
Decoder-only LLMs
Adopt only the decoder module, autoregressively generating the target output text.
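A minimal decoder-only sketch, assuming gpt2 (again an illustrative choice): the decoder continues the prompt token by token.

```python
from transformers import pipeline

# Decoder-only model: autoregressive continuation of the prompt.
generator = pipeline("text-generation", model="gpt2")
print(generator("Large language models and knowledge graphs",
                max_new_tokens=30)[0]["generated_text"])
```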
Prompt Engineering
A prompt is a sequence of natural language input for an LLM, specified for the task, typically including:
Instruction: instructs the model to do a specific task.
Context: provides the context for the input text or few-shot examples.
Input text: the text that needs to be processed by the model.
Prompts improve the capacity of LLMs on diverse complex tasks. Chain-of-thought (CoT) prompting enables complex reasoning capabilities through intermediate reasoning steps (sketched below).
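A minimal sketch of the three prompt components plus a CoT cue; all template strings here are illustrative, not from the paper.

```python
# Instruction: tells the model what task to do.
instruction = "Classify the sentiment of the input text as positive or negative."

# Context: few-shot examples (illustrative).
context = ("Example: 'The movie was wonderful.' -> positive\n"
           "Example: 'The plot made no sense.' -> negative")

# Input text: the text to be processed by the model.
input_text = "The acting was stiff, but the soundtrack saved the evening."

prompt = f"{instruction}\n\n{context}\n\nInput: {input_text}\nAnswer:"

# CoT prompting adds a cue that elicits intermediate reasoning steps.
cot_prompt = prompt + " Let's think step by step."
print(cot_prompt)
```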
Introduction of KGs

Roadmap
KG-enhanced LLMs
Pre-training stage
Integrating KGs into Training Objective
Integrating KGs into LLMs Input
KGs Instruction-tuning
Inference stage
Retrieval-Augmented Knowledge Fusion
RAG: retrieve relevant knowledge from an external source and fuse it into the LLM input at inference time.
KGs Prompting: verbalize retrieved KG triples into the prompt (sketched below).
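A minimal sketch of KG prompting at inference time: retrieved triples are verbalized and prepended to the question. The triples, helper, and format are illustrative assumptions.

```python
def verbalize(triples):
    """Turn (head, relation, tail) triples into plain sentences."""
    return " ".join(f"{h} {r} {t}." for h, r, t in triples)

# Pretend these triples were retrieved from a KG for the question below.
retrieved = [("Paris", "is the capital of", "France"),
             ("France", "is located in", "Europe")]

question = "Which continent is Paris in?"
prompt = f"Context: {verbalize(retrieved)}\nQuestion: {question}\nAnswer:"
print(prompt)  # feed this to any LLM
```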
Interpretability
KGs for LLM probing, e.g., converting KG triples into cloze statements to test which facts an LLM has memorized (sketched after this list).
KGs for LLM Analysis
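A minimal LAMA-style probing sketch; the model choice and cloze wording are my assumptions. A KG triple is rewritten as a cloze statement, and the masked LM's predictions reveal whether it holds the fact.

```python
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# KG triple (France, capital, Paris) rewritten as a cloze statement.
cloze = "The capital of France is [MASK]."
for pred in fill_mask(cloze)[:3]:
    print(pred["token_str"], round(pred["score"], 3))
```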
LLM-augmented KGs
Knowledge Graph embedding aims to map each entity and relation into a low-dimensional vector space.
Text encoders for KG-related tasks: LLMs encode the textual descriptions of entities and relations (a sketch follows).
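A minimal sketch of the embedding idea in the TransE style (one classic method; the paper surveys many, and in the LLM-augmented setting an LLM text encoder would supply these vectors instead of random initialization):

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 64  # the low-dimensional vector space

# Randomly initialized here; in LLM-augmented KG embedding these would come
# from encoding the textual descriptions of entities and relations.
entity_emb = {e: rng.normal(size=dim) for e in ["Paris", "France"]}
relation_emb = {"capital_of": rng.normal(size=dim)}

def transe_score(h, r, t):
    # TransE: a triple (h, r, t) is plausible when h + r is close to t,
    # so a smaller distance means a higher score.
    return -np.linalg.norm(entity_emb[h] + relation_emb[r] - entity_emb[t])

print(transe_score("Paris", "capital_of", "France"))
```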
LLMs process the original corpus to extract entities and relations for KG construction (see the distillation sketch after this list).
End-to-End KG Construction
Distilling Knowledge Graphs from LLMs
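A minimal sketch of distilling triples from an LLM by prompting. The prompt wording and the `complete` callable are hypothetical stand-ins for a real LLM API.

```python
def extract_triples(text: str, complete) -> str:
    """Ask an LLM (via the `complete` callable) to distill triples from text."""
    prompt = ("Extract (head, relation, tail) triples from the text below.\n"
              f"Text: {text}\nTriples:")
    return complete(prompt)

# Stub standing in for a real LLM call.
stub_llm = lambda p: "(Marie Curie, won, Nobel Prize in Physics)"
print(extract_triples("Marie Curie won the Nobel Prize in Physics in 1903.",
                      stub_llm))
```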
For KG completion and KG reasoning, LLMs can be prompted in two paradigms (contrasted in the sketch below):
PaE (LLM as Encoders)
PaG (LLM as Generators)
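A minimal sketch contrasting the two paradigms; the models and the (untrained) scoring head are illustrative assumptions, not the paper's setup.

```python
import torch
from transformers import AutoModel, AutoTokenizer, pipeline

# PaE: the LLM encodes a verbalized triple; a small head scores plausibility.
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
enc = AutoModel.from_pretrained("bert-base-uncased")
score_head = torch.nn.Linear(enc.config.hidden_size, 1)  # untrained demo head

inputs = tok("Paris [SEP] capital of [SEP] France", return_tensors="pt")
cls = enc(**inputs).last_hidden_state[:, 0]  # [CLS] representation
print("PaE plausibility score:", score_head(cls).item())

# PaG: the LLM generates the missing entity directly.
gen = pipeline("text2text-generation", model="t5-small")
print("PaG output:", gen("Paris is the capital of <extra_id_0>.")[0]["generated_text"])
```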
LLM-augmented KG-to-text Generation
Leveraging Knowledge from LLMs
Constructing large corpora of weakly aligned KG-text pairs (a linearize-and-generate sketch follows).
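A minimal linearize-and-generate sketch for KG-to-text. The linearization format is my assumption, and an off-the-shelf t5-small is not fine-tuned for graph verbalization, which is exactly why a weakly aligned KG-text corpus is needed for training.

```python
from transformers import pipeline

triples = [("Alan Turing", "field", "computer science"),
           ("Alan Turing", "birth place", "London")]

# Linearize the subgraph into a flat string the seq2seq model can consume.
linearized = " | ".join(f"{h} : {r} : {t}" for h, r, t in triples)

generator = pipeline("text2text-generation", model="t5-small")
print(generator(f"Describe the graph in one sentence: {linearized}")
      [0]["generated_text"])
```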
LLM-augmented KG Question Answering
LLMs as Entity/relation Extractors
LLMs as Answer Reasoners (both roles are sketched below)
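A minimal sketch of the two LLM roles in KGQA, with a stub in place of a real LLM call; the prompts, stub, and KG lookup are illustrative assumptions.

```python
def llm(prompt: str) -> str:
    # Stub: a real system would call an LLM here.
    return "Paris" if prompt.startswith("Facts") else "capital_of(France, ?)"

question = "What is the capital of France?"

# Role 1: extractor - identify the entities/relations needed to query the KG.
kg_query = llm(f"Extract the KG relation and entities from: {question}")

# Pretend KG lookup with the extracted query.
facts = ["(France, capital_of, Paris)"]

# Role 2: answer reasoner - reason over the retrieved facts to answer.
answer = llm(f"Facts: {facts}\nQuestion: {question}\nAnswer:")
print(kg_query, "->", answer)
```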
Synergized LLMs + KGs

Synergized Knowledge Representation
Aims to design a synergized model that can represent knowledge from both LLMs and KGs (a fusion sketch follows).
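A minimal sketch of one fusion idea (concatenate an LLM text embedding with a KG entity embedding and project them jointly); the dimensions and fusion choice are assumptions, not the paper's design.

```python
import torch

text_dim, kg_dim, joint_dim = 768, 64, 256
fuse = torch.nn.Linear(text_dim + kg_dim, joint_dim)

text_emb = torch.randn(1, text_dim)  # e.g., from an LLM encoder
kg_emb = torch.randn(1, kg_dim)      # e.g., from a KG embedding model
joint = fuse(torch.cat([text_emb, kg_emb], dim=-1))
print(joint.shape)  # torch.Size([1, 256])
```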
Synergized Reasoning
LLM-KG Fusion Reasoning
