DM_rewrite_3.31/booway_kg_api/Untitled.ipynb
2025-03-31 15:17:47 +08:00

{
"cells": [
{
"cell_type": "code",
"execution_count": 5,
"id": "e8f39ebb-71ab-4389-8bc3-29577470f948",
"metadata": {},
"outputs": [],
"source": [
"from WikijsTool import WikijsTool"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "0a6bde6e-5507-48d9-8e64-d804f6085723",
"metadata": {},
"outputs": [],
"source": [
"info = WikijsTool.get_all_documents()"
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "009c3e8d-6ff6-4e0b-83b8-740ed195b5c5",
"metadata": {},
"outputs": [],
"source": [
"# Fetch the HTML content of document 8663.\n",
"html_text = WikijsTool.query_doc_info(8663)['content']"
]
},
{
"cell_type": "code",
"execution_count": 29,
"id": "621bb76a-aa5c-4f57-8574-c852f24b64e3",
"metadata": {},
"outputs": [],
"source": [
"import re\n",
"\n",
"# Strip <img> tags, with or without attributes, before extracting text.\n",
"cleaned_img_text = re.sub(r'<img\\b[^>]*>', '', html_text)"
]
},
{
"cell_type": "code",
"execution_count": 31,
"id": "941e1e3b-7b8e-47f1-96ff-fa89898a1dd7",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
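"'<h1>Usage scenario</h1>\\n<p>You want to change the project division that an assembly, or a consumption item under an assembly, belongs to.</p>\\n<h1>Feature entry point</h1>\\n<p>[Assembly] screen - \"Assembly List\" tab - select an assembly with recorded consumption items, or a consumption item under an assembly, and right-click - choose \"Set Owning Project\".</p>\\n<figure class=\"image\">\\n <figcaption>Set owning project division</figcaption>\\n</figure>\\n<h1>Procedure</h1>\\n<p>1. Set the owning project division</p>\\n<p>Method 1: [Assembly] screen - \"Assembly List\" tab - select an assembly with recorded consumption items, or a consumption item under an assembly, and right-click - choose \"Set Owning Project Division\", select the project division in the dialog, and click OK;</p>\\n<figure class=\"image\">\\n <figcaption>Batch setting</figcaption>\\n</figure>\\n<p>Method 2: For a single quantity item under an assembly, double-click its \"Owning Project Division\" column, select the project division in the dialog, and click OK;</p>\\n<figure class=\"image\">\\n <figcaption>Single-item setting</figcaption>\\n</figure>\\n<h1>Internal notes</h1>\\n<p>1. If a quantity item has child consumption items beneath it, perform the operation on the parent consumption item; \"Set Owning Project\" is grayed out and unavailable for child quantity items.</p>\\n<figure class=\"image\">\\n <figcaption>Not settable for child items</figcaption>\\n</figure>\\n<p>&nbsp;</p>\\n<p>&nbsp;</p>\\n'"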
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"cleaned_img_text"
]
},
{
"cell_type": "code",
"execution_count": 35,
"id": "13973a4a-1f19-4b4d-b3c8-441d69e0091b",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
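"Usage scenario\n",
"You want to change the project division that an assembly, or a consumption item under an assembly, belongs to.\n",
"Feature entry point\n",
"[Assembly] screen - \"Assembly List\" tab - select an assembly with recorded consumption items, or a consumption item under an assembly, and right-click - choose \"Set Owning Project\".\n",
"\n",
"Set owning project division\n",
"\n",
"Procedure\n",
"1. Set the owning project division\n",
"Method 1: [Assembly] screen - \"Assembly List\" tab - select an assembly with recorded consumption items, or a consumption item under an assembly, and right-click - choose \"Set Owning Project Division\", select the project division in the dialog, and click OK;\n",
"\n",
"Batch setting\n",
"\n",
"Method 2: For a single quantity item under an assembly, double-click its \"Owning Project Division\" column, select the project division in the dialog, and click OK;\n",
"\n",
"Single-item setting\n",
"\n",
"Internal notes\n",
"1. If a quantity item has child consumption items beneath it, perform the operation on the parent consumption item; \"Set Owning Project\" is grayed out and unavailable for child quantity items.\n",
"\n",
"Not settable for child items\n",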
"\n",
" \n",
" \n",
"\n"
]
}
],
"source": [
"from bs4 import BeautifulSoup\n",
"\n",
"# Parse the cleaned HTML and keep only its text content.\n",
"soup = BeautifulSoup(cleaned_img_text, \"html.parser\")\n",
"plain_text = soup.get_text()\n",
"print(plain_text)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2802c4bd-34a7-4c61-bb15-4dd9739ecc1d",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "dify_lab",
"language": "python",
"name": "dify_lab"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}