## External APIs

### 1. Exam paper segmentation API:

Request:

```bash
curl --location --request POST 'https://ekt1.suzhou.edu.cn/autocut/detect' \
  --header 'Accept: */*' \
  --header 'Host: ekt1.suzhou.edu.cn' \
  --header 'Connection: keep-alive' \
  --header 'Content-Type: multipart/form-data; boundary=--------------------------687058231513459413708701' \
  --form 'image=@"/Users/robertmaxwell/code/zstp/AutoCut/pic.png"' \
  --form 'enable_error_detection="false"'
```

Response:

```json
{
  "data": {
    "crop_results": {
      "cropped_count": 14,
      "cropped_images": [
        { "bbox": [970, 69, 1909, 246], "filename": "question_001.jpg", "question_id": 1, "vertical_id": 1 },
        { "bbox": [41, 107, 970, 253], "filename": "question_002.jpg", "question_id": 2, "vertical_id": 2 },
        { "bbox": [970, 245, 1909, 479], "filename": "question_003.jpg", "question_id": 3, "vertical_id": 3 },
        { "bbox": [41, 254, 970, 394], "filename": "question_004.jpg", "question_id": 4, "vertical_id": 4 },
        { "bbox": [41, 395, 970, 506], "filename": "question_005.jpg", "question_id": 5, "vertical_id": 5 },
        { "bbox": [970, 480, 1909, 661], "filename": "question_006.jpg", "question_id": 6, "vertical_id": 6 },
        { "bbox": [41, 507, 970, 655], "filename": "question_007.jpg", "question_id": 7, "vertical_id": 7 },
        { "bbox": [41, 656, 970, 820], "filename": "question_008.jpg", "question_id": 8, "vertical_id": 8 },
        { "bbox": [970, 662, 1909, 904], "filename": "question_009.jpg", "question_id": 9, "vertical_id": 9 },
        { "bbox": [41, 821, 970, 968], "filename": "question_010.jpg", "question_id": 10, "vertical_id": 10 },
        { "bbox": [970, 905, 1909, 1183], "filename": "question_011.jpg", "question_id": 11, "vertical_id": 11 },
        { "bbox": [41, 969, 970, 1111], "filename": "question_012.jpg", "question_id": 12, "vertical_id": 12 },
        { "bbox": [41, 1112, 970, 1364], "filename": "question_013.jpg", "question_id": 13, "vertical_id": 13 },
        { "bbox": [970, 1184, 1909, 1364], "filename": "question_014.jpg", "question_id": 14, "vertical_id": 14 }
      ],
      "mode": "all_questions",
      "zip_url": "/download/20251125_082929_f1975f5c"
    },
    "error_count": 8,
    "error_details": [
      {
        "error_boxes": [
          { "bbox": [1772, 678, 44, 44], "class_name": "cuo", "confidence": 0.7618222236633301 }
        ],
        "match_method": "中心点包含",
        "question_box": { "bbox": [970, 662, 939, 242] }
      },
      {
        "error_boxes": [
          { "bbox": [1209, 163, 31, 32], "class_name": "cuo", "confidence": 0.7555574178695679 }
        ],
        "match_method": "中心点包含",
        "question_box": { "bbox": [970, 69, 939, 177] }
      },
      {
        "error_boxes": [
          { "bbox": [1225, 533, 46, 44], "class_name": "cuo", "confidence": 0.7304249405860901 }
        ],
        "match_method": "中心点包含",
        "question_box": { "bbox": [970, 480, 939, 181] }
      },
      {
        "error_boxes": [
          { "bbox": [797, 827, 40, 42], "class_name": "cuo", "confidence": 0.7287607192993164 }
        ],
        "match_method": "中心点包含",
        "question_box": { "bbox": [41, 821, 929, 147] }
      },
      {
        "error_boxes": [
          { "bbox": [1299, 363, 33, 33], "class_name": "cuo", "confidence": 0.7262661457061768 }
        ],
        "match_method": "中心点包含",
        "question_box": { "bbox": [970, 245, 939, 234] }
      },
      {
        "error_boxes": [
          { "bbox": [1226, 1028, 36, 36], "class_name": "cuo", "confidence": 0.7171130776405334 }
        ],
        "match_method": "中心点包含",
        "question_box": { "bbox": [970, 905, 939, 278] }
      },
      {
        "error_boxes": [
          { "bbox": [814, 1116, 36, 37], "class_name": "cuo", "confidence": 0.7121084332466125 }
        ],
        "match_method": "中心点包含",
        "question_box": { "bbox": [41, 1112, 929, 252] }
      },
      {
        "error_boxes": [
          { "bbox": [793, 127, 39, 30], "class_name": "xie", "confidence": 0.5237091183662415 }
        ],
        "match_method": "中心点包含",
        "question_box": { "bbox": [41, 107, 929, 146] }
      }
    ],
    "error_symbol_count": 8,
    "question_count": 14,
    "session_id": "20251125_082929_f1975f5c",
    "visualization_images": {
      "error_symbols": "/visualizations/20251125_082929_f1975f5c/error_symbols.jpg",
      "matched_errors": "/visualizations/20251125_082929_f1975f5c/matched_errors.jpg",
      "questions": "/visualizations/20251125_082929_f1975f5c/questions.jpg"
    }
  },
  "success": true
}
```
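
A minimal sketch of how a client might consume this response, assuming the cropped files are served under `https://ekt1.suzhou.edu.cn/autocut/files/` followed by the `session_id` and the `filename`, as the example URLs in this document suggest. The function name `cropped_image_urls` and the constant `FILES_BASE` are illustrative and not part of the service; the sample is a trimmed copy of the response above.

```python
from typing import Any, Dict, List

# Assumed prefix for cropped files, taken from the example URLs in this document.
FILES_BASE = "https://ekt1.suzhou.edu.cn/autocut/files"


def cropped_image_urls(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return one {"question_id", "url"} record per cropped question image."""
    if not response.get("success"):
        # Error responses carry only "message" and "success".
        raise ValueError(response.get("message", "detect request failed"))

    data = response["data"]
    session_id = data["session_id"]
    records = []
    for item in data["crop_results"]["cropped_images"]:
        records.append({
            "question_id": item["question_id"],
            "url": f"{FILES_BASE}/{session_id}/{item['filename']}",
        })
    return records


# Trimmed copy of the sample response (two of the 14 questions).
sample = {
    "data": {
        "crop_results": {
            "cropped_count": 2,
            "cropped_images": [
                {"bbox": [970, 69, 1909, 246], "filename": "question_001.jpg",
                 "question_id": 1, "vertical_id": 1},
                {"bbox": [41, 107, 970, 253], "filename": "question_002.jpg",
                 "question_id": 2, "vertical_id": 2},
            ],
        },
        "session_id": "20251125_082929_f1975f5c",
    },
    "success": True,
}

for record in cropped_image_urls(sample):
    print(record["question_id"], record["url"])
```

Only `question_id`, `filename`, and `session_id` are read here; raising on `"success": false` keeps the two documented error responses from being mistaken for an empty result.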

URLs of the cropped images are assembled as:

`https://ekt1.suzhou.edu.cn/autocut/files/<session_id>/<filename>`

In the example above, the paper was cut into 14 images:

```
https://ekt1.suzhou.edu.cn/autocut/files/20251125_082929_f1975f5c/question_001.jpg
https://ekt1.suzhou.edu.cn/autocut/files/20251125_082929_f1975f5c/question_002.jpg
https://ekt1.suzhou.edu.cn/autocut/files/20251125_082929_f1975f5c/question_003.jpg
…………
…………
https://ekt1.suzhou.edu.cn/autocut/files/20251125_082929_f1975f5c/question_014.jpg
```

The `question_id` must be recorded as well. All other fields can be ignored.

Error responses include:

Unsupported file format (only image formats are accepted): `{ "message": "不支持的文件格式", "success": false }`

Missing or invalid field (no image file was uploaded): `{ "message": "没有上传图片文件", "success": false }`

### 2. OCR API:

Input: a URL generated by the exam paper segmentation API above.

Output: the question text in Markdown, including its math formulas.

LLM API endpoint: https://ekt1.suzhou.edu.cn/llm/v1/chat/completions (OpenAI-compatible format)

Model name: `qwen-vl`, a vision-language (VL) model.

Before uploading, the image behind the URL must be converted to base64; the model cannot read a URL directly.

Set the temperature to 0.

Use the following prompt:

```
你是一名数学文字识别专家。你的任务是,将图片中的文字,严格转换为markdown格式的文字;如图中有数学公式,你必须使用$或者$$进行包裹,生成markdown可解析的latex格式的数学公式;你无需输出其他任何解释性文字,只需严格输出图片中的文字。
```

Request example:

`POST: https://ekt1.suzhou.edu.cn/llm/v1/chat/completions`

`HEADER: Authorization: Bearer sk-PqY5sI1od5Hbl6hVsniLDp7G0DOOBlUzE8E22p8HdVSTM5Rs`

```json
{
  "model": "qwen-vl",
  "messages": [
    {
      "role": "system",
      "content": "你是一名数学图片文字识别专家。"
    },
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "你的任务是,将图片中的文字,严格转换为markdown格式的文字;如图中有数学公式,你必须使用$或者$$进行包裹,生成markdown可解析的latex格式的数学公式;你无需输出其他任何解释性文字,只需严格输出图片中的文字。"
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAABfYAAAGCCAIAAAAt4wW+AAAACXBIWXMAAAsTAAALEwEAm==CC"
          }
        }
      ]
    }
  ],
  "temperature": 0
}
```

Response format:

```json
{
  "id": "chatcmpl-08cc11398014472fa86bd8cbdb2ded8b",
  "object": "chat.completion",
  "created": 1764146440,
  "model": "qwen-vl",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "12. 已知集合 $A=\\{1, 2, 4, 8\\}$, $B=\\{z|z=xy, x \\in A, y \\in A\\}$, 则下列四个结论正确的是 ( $\\not A B$ )\n\nA. $A \\nsubseteq B$\n\nB. $\\{1, 4, 16, 64\\} \\subseteq B$\n\nC. 集合 $B$ 中元素的个数为 8\n\nD. 集合 $B$ 的真子集的个数为 127",
        "refusal": null,
        "annotations": null,
        "audio": null,
        "function_call": null,
        "tool_calls": [],
        "reasoning_content": null
      },
      "logprobs": null,
      "finish_reason": "stop",
      "stop_reason": null
    }
  ],
  "service_tier": null,
  "system_fingerprint": null,
  "usage": {
    "prompt_tokens": 841,
    "total_tokens": 972,
    "completion_tokens": 131,
    "prompt_tokens_details": null
  },
  "prompt_logprobs": null,
  "kv_transfer_params": null
}
```

The text in the `"content"` field is the OCR result.

### 3. Question analysis API: