# 高中数学知识图谱数据库方案 这是一个基于PostgreSQL的高中数学知识图谱数据库系统,支持知识的结构化存储、关系网络分析、全文搜索和个性化推荐。 ## 📁 文件结构 ``` all/ ├── database_schema.sql # 数据库架构SQL文件 ├── data_import.py # 数据导入脚本 ├── config.py # 配置文件 ├── query_examples.sql # 常用查询示例 ├── analysis_queries.sql # 分析查询SQL ├── README.md # 说明文档(本文件) ├── knowledge-*.json # 知识点JSON文件 ├── method-*.json # 方法JSON文件 └── problems-*.json # 题目JSON文件 ``` ## 🏗️ 数据库架构 ### 三层架构设计 1. **Knowledge(知识层)** - 知识点定义、原理说明、适用条件 - 支持前置关系、包含关系、关联内容 2. **Method(方法层)** - 解题方法、步骤说明、常见错误 - 支撑知识点、数学思想、解题策略 3. **Problem(题目层)** - 题目内容、题型分类、难度评估 - 知识点标注、方法标注、来源信息 ### 主要数据表 - `knowledge_points` - 知识点主表 - `methods` - 解题方法主表 - `problems` - 题目主表 - `knowledge_relations` - 知识点关系表 - `problem_knowledge_tags` - 题目知识点标注 - `problem_method_tags` - 题目方法标注 ## 🚀 快速开始 ### 1. 环境准备 ```bash # 安装依赖 pip install psycopg2-binary # 确保PostgreSQL已启动 pg_ctl -D /usr/local/var/postgres start ``` ### 2. 创建数据库 ```bash # 连接到PostgreSQL psql -U postgres # 创建数据库 CREATE DATABASE math_knowledge_graph; \c math_knowledge_graph # 执行架构脚本 \i database_schema.sql ``` ### 3. 配置数据库连接 编辑 `config.py` 文件,修改数据库连接信息: ```python DATABASE_CONFIG = { 'host': 'localhost', 'port': 5432, 'database': 'math_knowledge_graph', 'user': 'postgres', 'password': 'your_actual_password', # 修改为实际密码 } ``` ### 4. 导入数据 ```bash python data_import.py ``` ## 📊 使用示例 ### 基础查询 ```sql -- 查询所有核心知识点 SELECT code, name, level, importance FROM knowledge_points WHERE importance = '核心' ORDER BY code; -- 查询特定知识点的详细信息 SELECT * FROM knowledge_detail_view WHERE code = 'K1-1-1-01'; -- 查询题目及其知识点标注 SELECT p.code, p.stem, kp.name, pkt.tag_type FROM problems p JOIN problem_knowledge_tags pkt ON p.id = pkt.problem_id JOIN knowledge_points kp ON pkt.knowledge_id = kp.id WHERE p.code = 'T1-1-1-E01'; ``` ### 关系网络查询 ```sql -- 查询知识点的前置关系 WITH RECURSIVE knowledge_chain AS ( SELECT k.id, k.code, k.name, 1 as level FROM knowledge_points k WHERE k.code = 'K1-1-3-01' -- 目标知识点 UNION ALL SELECT k.id, k.code, k.name, kc.level + 1 FROM knowledge_points k JOIN knowledge_relations kr ON k.id = kr.source_knowledge_id JOIN knowledge_chain kc ON kr.target_knowledge_id = kc.id WHERE kr.relation_type = '前置知识' ) SELECT * FROM knowledge_chain ORDER BY level; -- 查询题目的知识点覆盖度 SELECT c.chapter_name, COUNT(DISTINCT kp.id) as knowledge_count, COUNT(DISTINCT p.id) as problem_count FROM chapters c JOIN sections s ON c.id = s.chapter_id JOIN knowledge_points kp ON s.id = kp.section_id LEFT JOIN problem_knowledge_tags pkt ON kp.id = pkt.knowledge_id LEFT JOIN problems p ON pkt.problem_id = p.id GROUP BY c.id, c.chapter_name; ``` ### 全文搜索 ```sql -- 搜索知识点 SELECT code, name, importance FROM knowledge_points WHERE search_vector @@ to_tsquery('chinese', '函数 & 定义') ORDER BY ts_rank(search_vector, to_tsquery('chinese', '函数 & 定义')) DESC; -- 搜索题目 SELECT code, problem_type, stem FROM problems WHERE search_vector @@ to_tsquery('chinese', '定义域') ORDER BY ts_rank(search_vector, to_tsquery('chinese', '定义域')) DESC; ``` ### 数据分析 ```sql -- 难度分布统计 SELECT type, COUNT(*) as count, AVG(difficulty_level) as avg_difficulty FROM methods GROUP BY type ORDER BY avg_difficulty; -- 知识点考查频次统计 SELECT kp.name, COUNT(pkt.id) as problem_count, AVG(p.difficulty_info->>'整体难度')::float as avg_problem_difficulty FROM knowledge_points kp LEFT JOIN problem_knowledge_tags pkt ON kp.id = pkt.knowledge_id LEFT JOIN problems p ON pkt.problem_id = p.id WHERE pkt.tag_type = '主要考查' GROUP BY kp.id, kp.name HAVING COUNT(pkt.id) > 0 ORDER BY problem_count DESC LIMIT 10; ``` ## 🛠️ 数据维护 ### 数据备份 ```bash # 备份数据库 pg_dump -U postgres math_knowledge_graph > backup.sql # 恢复数据库 psql -U postgres math_knowledge_graph < backup.sql ``` ### 索引重建 ```sql -- 重建搜索向量索引 UPDATE knowledge_points SET search_vector = setweight(to_tsvector('chinese', COALESCE(name, '')), 'A') || setweight(to_tsvector('chinese', COALESCE(core_content::text, '')), 'B'); -- 重建所有索引 REINDEX DATABASE math_knowledge_graph; ``` ### 性能优化 ```sql -- 分析查询性能 EXPLAIN ANALYZE SELECT * FROM knowledge_points WHERE importance = '核心'; -- 更新表统计信息 ANALYZE knowledge_points; ANALYZE methods; ANALYZE problems; ``` ## 📈 扩展功能 ### 个性化学习路径 ```sql -- 基于知识点关系的学习路径推荐 WITH RECURSIVE learning_path AS ( SELECT k.id, k.code, k.name, k.level, 0 as step, ARRAY[k.id] as path_ids FROM knowledge_points k WHERE k.importance = '核心' AND k.level = '二级' UNION ALL SELECT k.id, k.code, k.name, k.level, lp.step + 1, lp.path_ids || k.id FROM knowledge_points k JOIN knowledge_relations kr ON k.id = kr.source_knowledge_id JOIN learning_path lp ON kr.target_knowledge_id = lp.id WHERE kr.relation_type = '前置知识' AND k.id != ALL(lp.path_ids) AND lp.step < 5 ) SELECT code, name, level, step FROM learning_path WHERE step <= 3 ORDER BY step, name; ``` ### 错误模式分析 ```sql -- 常见错误类型统计 SELECT m.name as method_name, COUNT(mce.id) as error_count, ARRAY_AGG(DISTINCT mce.error_description) as error_types FROM methods m LEFT JOIN method_common_errors mce ON m.id = mce.method_id GROUP BY m.id, m.name HAVING COUNT(mce.id) > 0 ORDER BY error_count DESC; ``` ## 🔍 故障排除 ### 常见问题 1. **连接失败** - 检查PostgreSQL服务是否启动 - 验证数据库配置信息 - 确认防火墙设置 2. **导入失败** - 检查JSON文件格式是否正确 - 验证文件编码(应为UTF-8) - 查看导入日志文件 3. **查询缓慢** - 检查索引是否创建 - 分析查询执行计划 - 考虑优化SQL语句 ### 日志文件 ```bash # 查看导入日志 tail -f data_import.log # 查看PostgreSQL日志 tail -f /usr/local/var/postgres.log ``` ## 📞 技术支持 如有问题,请检查: 1. PostgreSQL版本要求:≥ 12 2. Python版本要求:≥ 3.7 3. 所需Python包:psycopg2-binary ## 📄 许可证 本项目采用MIT许可证,详见LICENSE文件。