Knowledge-Learning-Toolkits

An implementation of "Memorizing is Not Enough: Deep Knowledge Injection Through Reasoning" (ACL 2025).
For questions and suggestions, please contact Ruoxi Xu (ruoxi2021@iscas.ac.cn).

Key Innovation

This toolkit moves beyond simple memorization toward deep knowledge reasoning, combining automated data augmentation with multi-level evaluation.

- Diversified data augmentation
- Multi-level test case generation
  - Recall-level: Basic knowledge recall
  - Extraction-level: Knowledge extraction from complex contexts
  - Reasoning-level: Logical inference and knowledge application

Project Structure

knowledge-learning-toolkits
├── data_augmentation  # Knowledge augmentation module
│   ├── augment.py  # Data augmentation
│   └── prompt_hub.py  # Defines prompts for data generation and functions for parsing the returned results
├── data_loader  # Data preprocessing module
│   └── load_data.py  # Reads and preprocesses data; converts structured and unstructured data into a unified format: [{'idx': 0, 'text': text, 'knowledges': [{'text': text, 'triplet': [subject, relation, object]}]}]
├── evaluator  # Evaluation module
│   └── generate_test_cases.py  # Automatically constructs test cases
├── trainer  # Training module
├── utils  # Common utility functions
├── config.yaml  # Configuration settings
└── main.py  # Main entry script
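
To make the unified format noted for load_data.py more concrete, here is a minimal sketch of how a single input record might be mapped into it. The helper name `to_unified` is hypothetical and is not taken from the repository. How a triple is turned into text (joined with spaces here) and what `knowledges` holds for free text before any extraction (an empty list here) are both assumptions; the toolkit's actual behavior may differ.

```python
def to_unified(record):
    """Map one input record to the unified format (hypothetical helper).

    Accepts either {'idx': ..., 'triplet': [subject, relation, object]}
    or {'idx': ..., 'text': ...}.
    """
    if "triplet" in record:
        subject, relation, obj = record["triplet"]
        # Assumption: verbalize the triple by joining its parts with spaces.
        text = f"{subject} {relation} {obj}"
        knowledges = [{"text": text, "triplet": [subject, relation, obj]}]
    elif "text" in record:
        text = record["text"]
        # Assumption: free text carries no triples until a later extraction step.
        knowledges = []
    else:
        raise ValueError("record needs a 'triplet' or a 'text' field")
    return {"idx": record["idx"], "text": text, "knowledges": knowledges}


# Illustrative records only; they are not part of the toolkit's data.
unified = [
    to_unified({"idx": 0, "triplet": ["Marie Curie", "born in", "Warsaw"]}),
    to_unified({"idx": 1, "text": "Marie Curie was born in Warsaw."}),
]
```

Keeping both input types in one shape lets downstream steps such as augmentation and test case generation iterate over `knowledges` without branching on the input type.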

Quick Start

Data Preparation

The toolkit currently accepts two input formats, both stored in a .jsonl file:

- Triples: each entry has the format {'idx': 0, 'triplet': [subject, relation, object]}
- Free text: each entry has the format {'idx': 0, 'text': text}
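
As a rough sketch of preparing such files, the snippet below writes and reads one JSON object per line using only the Python standard library. The entry formats above are shown with Python-style single quotes; this sketch assumes the toolkit parses each line as standard JSON, which requires double quotes, and that assumption has not been checked against load_data.py. The helper names, file names, and sample records are illustrative only.

```python
import json
import tempfile
from pathlib import Path


def write_jsonl(path, records):
    """Write one JSON object per line (JSON Lines)."""
    with Path(path).open("w", encoding="utf-8") as f:
        for record in records:
            # ensure_ascii=False keeps non-ASCII text readable in the file.
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read a JSON Lines file back into a list of dicts, skipping blank lines."""
    with Path(path).open("r", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Illustrative records only; they are not part of the toolkit's data.
triples = [
    {"idx": 0, "triplet": ["Marie Curie", "born in", "Warsaw"]},
    {"idx": 1, "triplet": ["Warsaw", "capital of", "Poland"]},
]
texts = [{"idx": 0, "text": "Marie Curie was born in Warsaw."}]

# Write to a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    triple_path = Path(tmp) / "triples.jsonl"  # hypothetical file name
    text_path = Path(tmp) / "texts.jsonl"      # hypothetical file name
    write_jsonl(triple_path, triples)
    write_jsonl(text_path, texts)
    print(triple_path.read_text(encoding="utf-8"))
```

Where the input files should live and how the pipeline is pointed at them is presumably set in config.yaml; the relevant keys are not documented here.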

Run Pipeline

python main.py

Citation

If you find this project helpful, please cite it as follows:

@article{xu2025memorizing,
  title={Memorizing is Not Enough: Deep Knowledge Injection Through Reasoning},
  author={Xu, Ruoxi and Ji, Yunjie and Cao, Boxi and Lu, Yaojie and Lin, Hongyu and Han, Xianpei and He, Ben and Sun, Yingfei and Li, Xiangang and Sun, Le},
  journal={arXiv preprint arXiv:2504.00472},
  year={2025}
}
