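Optimize the intent recognition example: add command-line argument parsing with support for input/output file paths and a debug mode, improving code readability and flexibility. Also update the Dify tool, adjusting how retrieval information is fetched so that rerank score information is passed through correctly.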
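The command-line handling mentioned in the commit message is not part of the hunks shown here. As a rough sketch only, assuming hypothetical flag names (`--input`, `--output`, `--debug`) and hypothetical default paths, such parsing could look like this with the standard `argparse` module:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Flag names and defaults are assumptions for illustration;
    # the options actually added by the commit are not visible in this diff.
    parser = argparse.ArgumentParser(
        description="Intent recognition example (illustrative sketch)"
    )
    parser.add_argument("--input", default="input.json", help="path to the input file")
    parser.add_argument("--output", default="output.json", help="path to the output file")
    parser.add_argument("--debug", action="store_true", help="enable debug mode")
    return parser


# Parse an explicit argument list so the sketch behaves the same everywhere.
args = build_parser().parse_args(["--input", "queries.json", "--debug"])
print(args.input, args.output, args.debug)
```

Passing an explicit list to `parse_args` keeps the sketch deterministic; a real script would call `parse_args()` with no arguments to read `sys.argv`.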
@@ -160,7 +160,7 @@ class SoftwareFunctionSlots(SlotBase):
         self.project_type="单工程"
         missing_slots = {}
         if not self.software_name:
-            missing_slots["software_name"] = f"{SoftwareFunctionSlots.model_fields['software_name'].description},可选值:{', '.join([name.value for name in SoftwareName if name not in [SoftwareName.UNKNOWN, SoftwareName.ALIASES]])}"
+            missing_slots["software_name"] = f"{SoftwareFunctionSlots.model_fields['software_name'].description},支持的软件:{', '.join([name.value for name in SoftwareName if name not in [SoftwareName.UNKNOWN, SoftwareName.ALIASES]])}"
         if not self.function_name:
             missing_slots["function_name"] = SoftwareFunctionSlots.model_fields["function_name"].description
         if not self.operation:
@@ -181,7 +181,7 @@ class SoftwareTroubleShootingSlots(SlotBase):
         """Check that all required slots are present."""
         missing_slots = {}
         if not self.software_name:
-            missing_slots["software_name"] = f"{SoftwareTroubleShootingSlots.model_fields['software_name'].description},可选值:{', '.join([name.value for name in SoftwareName if name not in [SoftwareName.UNKNOWN, SoftwareName.ALIASES]])}"
+            missing_slots["software_name"] = f"{SoftwareTroubleShootingSlots.model_fields['software_name'].description},支持的软件:{', '.join([name.value for name in SoftwareName if name not in [SoftwareName.UNKNOWN, SoftwareName.ALIASES]])}"
         if not self.function_name:
             missing_slots["function_name"] = SoftwareTroubleShootingSlots.model_fields["function_name"].description
         if not self.error_message:
@@ -191,7 +191,7 @@ class SoftwareTroubleShootingSlots(SlotBase):
 # 2. Business questions
 # 2.1 Professional consulting
 class ProfessionalConsultingSlots(SlotBase):
-    scene_subject: str = Field(default="", description="场景主体")
+    scene_subject: str = Field(default="", description="业务主体。即询问的业务对象(规范、标准、费用等)")
     business_scene: str = Field(default="", description="业务场景描述")
     software_name: Optional[str] = Field(default="", description="软件名称")
 
@@ -266,7 +266,6 @@ class InstallationDownloadSlots(SlotBase):
         missing_slots = {}
         if not self.software_name and not self.file_name:
             missing_slots["software_name"] = f"{InstallationDownloadSlots.model_fields['software_name'].description},"
-            f"可选值:{', '.join([name.value for name in SoftwareName if name not in [SoftwareName.UNKNOWN, SoftwareName.ALIASES]])}"
             missing_slots["file_name"] = InstallationDownloadSlots.model_fields["file_name"].description
         if not self.operation_stage:
             missing_slots["operation_stage"] = InstallationDownloadSlots.model_fields["operation_stage"].description
@@ -304,8 +304,8 @@ class IntentRecognizer:
 
         rewrite_start_time = time.time()
         # Prepare the query-rewrite prompt
-        # terms_dict = [term.model_dump(exclude={"description"}) for term in keywords.terms]
-        terms_dict = [term.model_dump() for term in keywords.terms]
+        terms_dict = [term.model_dump(exclude={"description"}) for term in keywords.terms]
+        # terms_dict = [term.model_dump() for term in keywords.terms]
         keywords_str = json.dumps(terms_dict, ensure_ascii=False)
         query_rewrite_parser = PydanticOutputParser(pydantic_object=QueryRewrite)
         # formatted_prompt = query_rewrite_prompt.format(query=query,
@@ -401,27 +401,27 @@ class IntentRecognizer:
         )
 
         # Step 3: perform intent recognition and slot filling
-        result = self._process_intent_and_slot(rewrite.rewrite, conversation_context, chat_history, previous_slots)
-        result.update({"keywords": keywords_terms.model_dump(),
-                        "rewrite": rewrite.model_dump(),
-                        "query_keys": query_keys})
-        return result
-        # # Step 3: classify the intent
-        # classification = self._classify_intent(query)
+        # result = self._process_intent_and_slot(rewrite.rewrite, conversation_context, chat_history, previous_slots)
+        # result.update({"keywords": keywords_terms.model_dump(),
+        #                 "rewrite": rewrite.model_dump(),
+        #                 "query_keys": query_keys})
+        # return result
+        # Step 3: classify the intent
+        classification = self._classify_intent(rewrite.rewrite, conversation_context, chat_history, previous_slots)
 
-        # # Step 4: fill slots
-        # # Fill slots only when the classification is valid
-        # slot_filling_result = {}
-        # if classification.vertical_classification not in ["其他", "闲聊"] and classification.sub_classification not in ["其他", "闲聊"]:
-        #     slot_filling_result = self._fill_slots(rewrite.rewrite, classification)
+        # Step 4: fill slots
+        # Fill slots only when the classification is valid
+        slot_filling_result = {}
+        if classification.vertical_classification not in ["其他", "闲聊"] and classification.sub_classification not in ["其他", "闲聊"]:
+            slot_filling_result = self._fill_slots(rewrite.rewrite, classification, conversation_context, chat_history, previous_slots)
 
-        # return {
-        #     "classification": classification.model_dump(),
-        #     "keywords": keywords_terms.model_dump(),
-        #     "rewrite": rewrite.model_dump(),
-        #     "query_keys": query_keys,
-        #     "slot_filling": slot_filling_result
-        # }
+        return {
+            "classification": classification.model_dump(),
+            "keywords": keywords_terms.model_dump(),
+            "rewrite": rewrite.model_dump(),
+            "query_keys": query_keys,
+            "slot_filling": slot_filling_result
+        }
 
 
     def _fill_slots(self, query: str, classification: Classification, conversation_context: str = "",
@@ -127,11 +127,13 @@ query_rewrite_prompt_pro_old="""
 query_rewrite_prompt_pro="""
 # 电力造价问答优化工程师(精简版)
 **角色**:基于历史对话和术语库重构问题,提升知识库检索准确率。
+最高准则:保持问题核心意图,但允许在指代消除、背景继承下添加隐含功能词。但重构后的问题,所有引入的主体背景等均要来源于历史对话、聊天背景或术语库,不得凭空捏造未提及的内容。
 
 ## 核心原则
-1. 语义保真 → 保持问题核心意图
-2. 术语规范 → 同义词转标准词并【】标记
-3. 背景继承 → 补充历史对话的隐含信息
+1. **指代消除 → 当指示代词("那"/"这")出现时,强制继承历史对话的最新核心主题(如功能或任务),并应用到当前主体。**
+2. 背景继承 → 补充历史对话和聊天背景中的隐含信息(包括主题和功能)。
+4. 术语规范 → 同义词转标准词并【】标记。提问中的同义词(synonymous)替换为标准词(name)
+5. 语义保真 → 保持问题核心意图,但允许在指代消除、背景继承下添加隐含功能词。
 
 ## 处理流程
 ### 一、输入解析
@@ -155,37 +157,30 @@ query_rewrite_prompt_pro="""
 ### 二、重构决策树
 ```mermaid
 graph TD
-A[输入问题] --> B{{匹配关键词或上下文?}}
-B -- 是 --> C[执行重构]
-B -- 否 --> D[直接输出原始问题]
-C --> E[补充缺失背景]
-E --> F[同义词替换+【】标记]
-F --> G[保留原生专业术语]
+A[输入问题] --> B{{包含指示代词?}}
+B -- 是 --> C[提取历史最新主题]
+C --> D{{主题是否明确?}}
+D -- 是 --> E[继承主题到当前问题]
+E --> F[执行重构]
+D -- 否 --> F
+F --> G[补充缺失背景]
+G --> H[同义词替换+【】标记]
+H --> I[保留原生专业术语]
+B -- 否 --> I
 ```
 
 ### 三、重构优先级
-1. **背景补充**
-- 历史对话中确定的背景信息需要保留(例:"这软件"→"【配网工程计价通D3软件】")
-
-2. **术语处理**
-- 同义词转标准词 → 将提问中的同义词(synonymous)替换为标准词(name)
-- 存在即标记 → 【计算式】
-
-3. **结构优化**
-- 保持原问题的5W2H特征,确保问题意图不发生改变。
-- 明确指代关系("该功能"→"【批量导入】功能")
+1. **指代消除 → 当指示代词出现时,优先继承历史对话的核心主题(如功能词),并替换当前问题的动词部分。**
+2. 背景继承 → 历史对话中确定的背景信息需要保留。
+3. 术语处理 → 同义词转标准词 + 【】标记。
+4. 同义词转标准词 → 将提问中的同义词(synonymous)替换为标准词(name)
+4. 结构优化 → 保持原问题的5W2H特征,指代消除、背景继承下允许微调意图。

## 输出规范
{output_format}

## 典型案例
| 场景 | 输入问题 | 输出结果 |
|---------------------|-----------------------------------|------------------------------------------|
| 强上下文关联 | “怎么升级旧版工程” | {{"rewrite":"【西藏Z1】如何执行【老版本定额升级】?"}} |
| 弱术语匹配 | “界面文字太小怎么办” | 原样输出 |
| 代词+背景继承 | “这个定额如何导入” | {{"rewrite":"【山东定额】如何执行【批量导入定额】?"}}|

## 质量自检
- [] **主题是否合理继承?**(当有代词时,历史主题必须注入)
- [] 核心诉求是否保留?
- [] 背景信息是否合理补充?
- [] 术语标记是否完整【】?