note/知识图谱/教科书-数学/all/README.md
2025-11-19 10:16:05 +08:00

305 lines
7.1 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# 高中数学知识图谱数据库方案
这是一个基于PostgreSQL的高中数学知识图谱数据库系统支持知识的结构化存储、关系网络分析、全文搜索和个性化推荐。
## 📁 文件结构
```
all/
├── database_schema.sql # 数据库架构SQL文件
├── data_import.py # 数据导入脚本
├── config.py # 配置文件
├── query_examples.sql # 常用查询示例
├── analysis_queries.sql # 分析查询SQL
├── README.md # 说明文档(本文件)
├── knowledge-*.json # 知识点JSON文件
├── method-*.json # 方法JSON文件
└── problems-*.json # 题目JSON文件
```
## 🏗️ 数据库架构
### 三层架构设计
1. **Knowledge知识层**
- 知识点定义、原理说明、适用条件
- 支持前置关系、包含关系、关联内容
2. **Method方法层**
- 解题方法、步骤说明、常见错误
- 支撑知识点、数学思想、解题策略
3. **Problem题目层**
- 题目内容、题型分类、难度评估
- 知识点标注、方法标注、来源信息
### 主要数据表
- `knowledge_points` - 知识点主表
- `methods` - 解题方法主表
- `problems` - 题目主表
- `knowledge_relations` - 知识点关系表
- `problem_knowledge_tags` - 题目知识点标注
- `problem_method_tags` - 题目方法标注
## 🚀 快速开始
### 1. 环境准备
```bash
# 安装依赖
pip install psycopg2-binary
# 确保PostgreSQL已启动
pg_ctl -D /usr/local/var/postgres start
```
### 2. 创建数据库
```bash
# 连接到PostgreSQL
psql -U postgres
# 创建数据库
CREATE DATABASE math_knowledge_graph;
\c math_knowledge_graph
# 执行架构脚本
\i database_schema.sql
```
### 3. 配置数据库连接
编辑 `config.py` 文件,修改数据库连接信息:
```python
DATABASE_CONFIG = {
'host': 'localhost',
'port': 5432,
'database': 'math_knowledge_graph',
'user': 'postgres',
'password': 'your_actual_password', # 修改为实际密码
}
```
### 4. 导入数据
```bash
python data_import.py
```
## 📊 使用示例
### 基础查询
```sql
-- 查询所有核心知识点
SELECT code, name, level, importance
FROM knowledge_points
WHERE importance = '核心'
ORDER BY code;
-- 查询特定知识点的详细信息
SELECT * FROM knowledge_detail_view
WHERE code = 'K1-1-1-01';
-- 查询题目及其知识点标注
SELECT p.code, p.stem, kp.name, pkt.tag_type
FROM problems p
JOIN problem_knowledge_tags pkt ON p.id = pkt.problem_id
JOIN knowledge_points kp ON pkt.knowledge_id = kp.id
WHERE p.code = 'T1-1-1-E01';
```
### 关系网络查询
```sql
-- 查询知识点的前置关系
WITH RECURSIVE knowledge_chain AS (
SELECT k.id, k.code, k.name, 1 as level
FROM knowledge_points k
WHERE k.code = 'K1-1-3-01' -- 目标知识点
UNION ALL
SELECT k.id, k.code, k.name, kc.level + 1
FROM knowledge_points k
JOIN knowledge_relations kr ON k.id = kr.source_knowledge_id
JOIN knowledge_chain kc ON kr.target_knowledge_id = kc.id
WHERE kr.relation_type = '前置知识'
)
SELECT * FROM knowledge_chain ORDER BY level;
-- 查询题目的知识点覆盖度
SELECT
c.chapter_name,
COUNT(DISTINCT kp.id) as knowledge_count,
COUNT(DISTINCT p.id) as problem_count
FROM chapters c
JOIN sections s ON c.id = s.chapter_id
JOIN knowledge_points kp ON s.id = kp.section_id
LEFT JOIN problem_knowledge_tags pkt ON kp.id = pkt.knowledge_id
LEFT JOIN problems p ON pkt.problem_id = p.id
GROUP BY c.id, c.chapter_name;
```
### 全文搜索
```sql
-- 搜索知识点
SELECT code, name, importance
FROM knowledge_points
WHERE search_vector @@ to_tsquery('chinese', '函数 & 定义')
ORDER BY ts_rank(search_vector, to_tsquery('chinese', '函数 & 定义')) DESC;
-- 搜索题目
SELECT code, problem_type, stem
FROM problems
WHERE search_vector @@ to_tsquery('chinese', '定义域')
ORDER BY ts_rank(search_vector, to_tsquery('chinese', '定义域')) DESC;
```
### 数据分析
```sql
-- 难度分布统计
SELECT
type,
COUNT(*) as count,
AVG(difficulty_level) as avg_difficulty
FROM methods
GROUP BY type
ORDER BY avg_difficulty;
-- 知识点考查频次统计
SELECT
kp.name,
COUNT(pkt.id) as problem_count,
AVG(p.difficulty_info->>'整体难度')::float as avg_problem_difficulty
FROM knowledge_points kp
LEFT JOIN problem_knowledge_tags pkt ON kp.id = pkt.knowledge_id
LEFT JOIN problems p ON pkt.problem_id = p.id
WHERE pkt.tag_type = '主要考查'
GROUP BY kp.id, kp.name
HAVING COUNT(pkt.id) > 0
ORDER BY problem_count DESC
LIMIT 10;
```
## 🛠️ 数据维护
### 数据备份
```bash
# 备份数据库
pg_dump -U postgres math_knowledge_graph > backup.sql
# 恢复数据库
psql -U postgres math_knowledge_graph < backup.sql
```
### 索引重建
```sql
-- 重建搜索向量索引
UPDATE knowledge_points SET search_vector =
setweight(to_tsvector('chinese', COALESCE(name, '')), 'A') ||
setweight(to_tsvector('chinese', COALESCE(core_content::text, '')), 'B');
-- 重建所有索引
REINDEX DATABASE math_knowledge_graph;
```
### 性能优化
```sql
-- 分析查询性能
EXPLAIN ANALYZE SELECT * FROM knowledge_points WHERE importance = '核心';
-- 更新表统计信息
ANALYZE knowledge_points;
ANALYZE methods;
ANALYZE problems;
```
## 📈 扩展功能
### 个性化学习路径
```sql
-- 基于知识点关系的学习路径推荐
WITH RECURSIVE learning_path AS (
SELECT k.id, k.code, k.name, k.level, 0 as step, ARRAY[k.id] as path_ids
FROM knowledge_points k
WHERE k.importance = '核心' AND k.level = '二级'
UNION ALL
SELECT k.id, k.code, k.name, k.level, lp.step + 1, lp.path_ids || k.id
FROM knowledge_points k
JOIN knowledge_relations kr ON k.id = kr.source_knowledge_id
JOIN learning_path lp ON kr.target_knowledge_id = lp.id
WHERE kr.relation_type = '前置知识'
AND k.id != ALL(lp.path_ids)
AND lp.step < 5
)
SELECT code, name, level, step
FROM learning_path
WHERE step <= 3
ORDER BY step, name;
```
### 错误模式分析
```sql
-- 常见错误类型统计
SELECT
m.name as method_name,
COUNT(mce.id) as error_count,
ARRAY_AGG(DISTINCT mce.error_description) as error_types
FROM methods m
LEFT JOIN method_common_errors mce ON m.id = mce.method_id
GROUP BY m.id, m.name
HAVING COUNT(mce.id) > 0
ORDER BY error_count DESC;
```
## 🔍 故障排除
### 常见问题
1. **连接失败**
- 检查PostgreSQL服务是否启动
- 验证数据库配置信息
- 确认防火墙设置
2. **导入失败**
- 检查JSON文件格式是否正确
- 验证文件编码应为UTF-8
- 查看导入日志文件
3. **查询缓慢**
- 检查索引是否创建
- 分析查询执行计划
- 考虑优化SQL语句
### 日志文件
```bash
# 查看导入日志
tail -f data_import.log
# 查看PostgreSQL日志
tail -f /usr/local/var/postgres.log
```
## 📞 技术支持
如有问题,请检查:
1. PostgreSQL版本要求≥ 12
2. Python版本要求≥ 3.7
3. 所需Python包psycopg2-binary
## 📄 许可证
本项目采用MIT许可证详见LICENSE文件。