Error when running graphrag.index after enabling claim_extraction [Bug] #776

Open
3 tasks
mavershang opened this issue Jul 30, 2024 · 8 comments
Assignees
Labels
awaiting_response Maintainers or community have suggested solutions or requested info, awaiting filer response bug Something isn't working stale Used by auto-resolve bot to flag inactive issues

Comments

@mavershang

Do you need to file an issue?

  • I have searched the existing issues and this bug is not already filed.
  • My model is hosted on OpenAI or Azure. If not, please look at the "model providers" issue and don't file a new one here.
  • I believe this is a legitimate bug, not just a question. If this is a question, please use the Discussions area.

Describe the bug

After enabling claim_extraction, graphrag.index fails with an error in the create_final_covariates step. The error is shown in the attached screenshots. The run completes without error if I disable claim_extraction.

image

image

Steps to reproduce

No response

Expected Behavior

No response

GraphRAG Config Used

# Paste your config here

Logs and screenshots

No response

Additional Information

  • GraphRAG Version:
  • Operating System:
  • Python Version:
  • Related Issues:
@mavershang mavershang added bug Something isn't working triage Default label assignment, indicates new issue needs reviewed by a maintainer labels Jul 30, 2024
@shaoqing404

It seems this feature should not be enabled as-is. In the community I was told that the prompt needs to be modified before turning on covariate (claim) extraction. Until a new version is released, my suggestion is to keep it turned off.
Also, GraphRAG is not very friendly to Chinese text; the Chinese word segmentation needs to be updated to improve results.

@AlonsoGuevara AlonsoGuevara self-assigned this Jul 30, 2024
@9prodhi

9prodhi commented Aug 1, 2024

I am unable to reproduce the issue on my end. Could you please provide additional details so that I can replicate the problem? Below is my current configuration:

llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat # or azure_openai_chat
  model: mistral
  model_supports_json: true # Recommended if available for your model.
  api_base: http://localhost:11434/v1

parallelization:
  stagger: 0.3

async_mode: threaded # or asyncio

embeddings:
  async_mode: threaded # or asyncio
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding # or azure_openai_embedding
    model: nomic-ai/nomic-embed-text-v1.5-GGUF

claim_extraction:
  enabled: true
  prompt: "prompts/claim_extraction.txt"
  description: "Any claims or facts that could be relevant to information discovery."
  max_gleanings: 1

@huqianghui

If you use Chinese documents, I think you can reproduce the same issue, as I did below.

Screenshot 2024-08-02 at 10 02 24 AM

@Guiwith

Guiwith commented Aug 5, 2024

I used an English document, but got the same error.
2024-08-05 152733
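
Because another commenter associates the failure with Chinese documents, it may be worth checking whether a nominally English input also contains non-ASCII characters (curly quotes and similar). The following is a minimal sketch, not part of GraphRAG; the function names, the `input` directory, and the `*.txt` pattern are assumptions chosen for illustration.

```python
from pathlib import Path


def find_non_ascii(text):
    """Return (line_number, column, character) for each non-ASCII character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if not ch.isascii():
                hits.append((line_no, col, ch))
    return hits


def scan_input_dir(input_dir, pattern="*.txt"):
    """Map each matching file name to its non-ASCII hits; clean files are omitted."""
    report = {}
    for path in sorted(Path(input_dir).glob(pattern)):
        hits = find_non_ascii(path.read_text(encoding="utf-8"))
        if hits:
            report[path.name] = hits
    return report


if __name__ == "__main__":
    # "input" is an assumed location for the source documents.
    for name, hits in scan_input_dir("input").items():
        print(f"{name}: {len(hits)} non-ASCII character(s), first at line {hits[0][0]}")
```

An empty report would suggest the error is not tied to character encoding for that input, although it does not rule out other causes.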

@shaoqing404

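I could not reproduce these problems on my end. I used DeepSeek through an API compatible with JSON-format output, and both Chinese and English documents work correctly.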

@natoverse natoverse added awaiting_response Maintainers or community have suggested solutions or requested info, awaiting filer response and removed triage Default label assignment, indicates new issue needs reviewed by a maintainer labels Aug 8, 2024

This issue has been marked stale due to inactivity after repo maintainer or community member responses that request more information or suggest a solution. It will be closed after five additional days.

@github-actions github-actions bot added the stale Used by auto-resolve bot to flag inactive issues label Aug 16, 2024
@AlonsoGuevara
Contributor

Folks, we recently added better support for non ASCII characters.
Can you confirm if this still happens on 0.3.0?
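
To answer this, it helps to first confirm which version is actually installed. A minimal sketch using only the standard library follows; the helper names are hypothetical, the package name `graphrag` is assumed to match the installed distribution, and pre-release suffixes such as `rc1` are deliberately ignored in the comparison.

```python
from importlib.metadata import PackageNotFoundError, version
from itertools import takewhile


def parse_version(text, width=3):
    """Convert a dotted version such as '0.3.0' into a tuple of ints."""
    numbers = []
    for piece in text.split(".")[:width]:
        # Keep only the leading digits, so '0rc1' is read as 0.
        digits = "".join(takewhile(str.isdigit, piece))
        numbers.append(int(digits) if digits else 0)
    # Pad short versions so that '0.3' compares equal to '0.3.0'.
    numbers.extend([0] * (width - len(numbers)))
    return tuple(numbers)


def installed_at_least(package, minimum):
    """Return True or False for an installed package, or None if it is missing."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return None
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    result = installed_at_least("graphrag", "0.3.0")
    if result is None:
        print("graphrag is not installed in this environment")
    elif result:
        print("graphrag is at least 0.3.0; re-run the index to test the fix")
    else:
        print("graphrag is older than 0.3.0; upgrade before re-testing")
```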

@shaoqing404

I tested this for you. Version 0.3.0 handles non-ASCII characters well.


7 participants