{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "e8f39ebb-71ab-4389-8bc3-29577470f948",
   "metadata": {},
   "outputs": [],
   "source": [
    "from WikijsTool import WikijsTool"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "0a6bde6e-5507-48d9-8e64-d804f6085723",
   "metadata": {},
   "outputs": [],
   "source": [
    "info = WikijsTool.get_all_documents()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "id": "009c3e8d-6ff6-4e0b-83b8-740ed195b5c5",
   "metadata": {},
   "outputs": [],
   "source": [
    "html_text = WikijsTool.query_doc_info(8663)['content']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "id": "621bb76a-aa5c-4f57-8574-c852f24b64e3",
   "metadata": {},
   "outputs": [],
   "source": [
    "import re\n",
    "\n",
    "cleaned_img_text = re.sub(r'<img\\s+[^>]*>', '', html_text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "id": "941e1e3b-7b8e-47f1-96ff-fa89898a1dd7",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
"'<h1>使用场景</h1>\\n<p>组合件或组合件下的消耗量想要修改所属项目划分</p>\\n<h1>功能入口</h1>\\n<p>【组合件】界面-——“组合件列表”页签-——选择已录入消耗量的组合件或组合件下的消耗量,鼠标右键-——选择”设置所属项目“。</p>\\n<figure class=\"image\">\\n <figcaption>设置所属项目划分</figcaption>\\n</figure>\\n<h1>操作步骤</h1>\\n<p>1.设置所属项目划分</p>\\n<p>方法一:【组合件】界面——“组合件列表”页签-——选择已录入消耗量的组合件或组合件下的消耗量,鼠标右键——选择”设置所属项目划分“,在弹窗中选择项目划分,点击确定;</p>\\n<figure class=\"image\">\\n <figcaption>批量设置</figcaption>\\n</figure>\\n<p>方法二:对于组合件下单条工程量,在此工程量的所属项目划分列双击,在弹窗中选择项目划分,点击确定;</p>\\n<figure class=\"image\">\\n <figcaption>单条设置</figcaption>\\n</figure>\\n<h1>内部补充</h1>\\n<p>1.如工程量下方还有子级消耗量,需选择到父级消耗量进行操作,子级工程量的“设置所属项目”为灰色不可选。</p>\\n<figure class=\"image\">\\n <figcaption>子级不可设置</figcaption>\\n</figure>\\n<p> </p>\\n<p> </p>\\n'"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cleaned_img_text"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "id": "13973a4a-1f19-4b4d-b3c8-441d69e0091b",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "使用场景\n",
      "组合件或组合件下的消耗量想要修改所属项目划分\n",
      "功能入口\n",
      "【组合件】界面-——“组合件列表”页签-——选择已录入消耗量的组合件或组合件下的消耗量,鼠标右键-——选择”设置所属项目“。\n",
      "\n",
      "设置所属项目划分\n",
      "\n",
      "操作步骤\n",
      "1.设置所属项目划分\n",
      "方法一:【组合件】界面——“组合件列表”页签-——选择已录入消耗量的组合件或组合件下的消耗量,鼠标右键——选择”设置所属项目划分“,在弹窗中选择项目划分,点击确定;\n",
      "\n",
      "批量设置\n",
      "\n",
      "方法二:对于组合件下单条工程量,在此工程量的所属项目划分列双击,在弹窗中选择项目划分,点击确定;\n",
      "\n",
      "单条设置\n",
      "\n",
      "内部补充\n",
      "1.如工程量下方还有子级消耗量,需选择到父级消耗量进行操作,子级工程量的“设置所属项目”为灰色不可选。\n",
      "\n",
      "子级不可设置\n",
      "\n",
" \n",
" \n",
"\n"
     ]
    }
   ],
   "source": [
    "from bs4 import BeautifulSoup\n",
    "\n",
    "soup = BeautifulSoup(cleaned_img_text, \"html.parser\")\n",
    "plain_text = soup.get_text()\n",
    "print(plain_text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2802c4bd-34a7-4c61-bb15-4dd9739ecc1d",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "dify_lab",
   "language": "python",
   "name": "dify_lab"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}