Looking for Mental Health Support Datasets for building a Multi-turn Chatbot

Hi everyone,

I’m currently developing a multi-turn chatbot aimed at helping users manage anxiety, depression, and other mental health challenges. I’m seeking datasets related to mental health support conversations to train the chatbot. So far, I have found two datasets: ESConv and AUGESC.

However, these kinds of datasets seem to be quite rare. Could anyone recommend similar mental health conversation datasets or point me to other resources? I’d greatly appreciate any help or suggestions!

Thank you in advance!

I’ve seen some of these in my search for other datasets, but I don’t remember where…
There seem to be a few mental health LLMs out there, so why not look for the datasets that were used to train them?
A search for “GGUF” will surface a lot of LLMs; then add mental-health-related keywords to narrow it down.

Someone may have curated a collection. If you can find a good one, your search is over quickly.

https://huggingface.co/victunes/TherapyBeagle-11B-v2
https://huggingface.co/victunes/TherapyLlama-8B-v1
https://huggingface.co/models?sort=modified&search=gguf
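If you would rather run that search programmatically, here is a minimal stdlib sketch against the Hub's public REST API (it assumes the `/api/models` and `/api/datasets` endpoints with a `search` query parameter; check the current Hub API docs, and note the official `huggingface_hub` client wraps these same endpoints):

```python
# Sketch: keyword search against the Hugging Face Hub REST API.
# Assumes the public endpoints https://huggingface.co/api/models and
# https://huggingface.co/api/datasets accept a `search` parameter.
import json
import urllib.parse
import urllib.request

HUB = "https://huggingface.co"


def hub_search_url(kind: str, query: str, limit: int = 10) -> str:
    """Build a Hub API search URL; kind is 'models' or 'datasets'."""
    params = urllib.parse.urlencode({"search": query, "limit": limit})
    return f"{HUB}/api/{kind}?{params}"


def hub_search(kind: str, query: str, limit: int = 10) -> list:
    """Fetch matching repo ids (requires network access)."""
    with urllib.request.urlopen(hub_search_url(kind, query, limit)) as resp:
        return [item["id"] for item in json.load(resp)]


if __name__ == "__main__":
    # Narrow GGUF models down with a mental-health keyword:
    print(hub_search_url("models", "gguf mental health"))
    # print(hub_search("datasets", "counseling"))  # uncomment when online
```

The same query strings work in the website's search box; the API just makes it easy to loop over many keyword variants.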

P.S.

Here is a user I found with the method above; they seem to have datasets as well. This is one way to find them.
https://huggingface.co/CalebE

By the way, this is what a collection looks like. The link below is my HF utility collection as an example; it also includes Spaces that help with searching for datasets. You might try using something like that.
https://huggingface.co/collections/John6666/spaces-for-model-space-useful-utilities-in-hugging-face-6685598385e2e2adac9d35a2

P.S.

This is a last resort (though in a sense a perfectly legitimate one): if you find someone who seems knowledgeable, open a Discussion on one of their relevant repos and ask them directly. It can shorten the whole process in one fell swoop.

Thank you so much <3

You're welcome. It seems like a worthwhile project, and I hope this helps.
HF's search function misses quite a few results, so it might be faster to do this on Google.
https://www.google.co.jp/search?q=mental+site%3Ahuggingface.co

I really appreciate your guidance. I’ll try your suggestion, and if I have any further questions, I hope it’s okay to ask for your advice again. Your help means a lot! ^^

Hi thanhcao,
I am currently working on a research project focused on fine-tuning the Llama 2 model with the LoRA and QLoRA techniques. I am using the “Amod/mental health counseling dataset”. However, I noticed that there is no paper or detailed documentation associated with the dataset.

Could you please provide any additional information or resources related to this dataset? If there are any related research papers or documentation that you are aware of, I would greatly appreciate it.
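For context on the setup mentioned above, here is a hedged configuration sketch of a typical LoRA + QLoRA fine-tune using the `peft`, `transformers`, and `bitsandbytes` libraries; the model ID and hyperparameters are illustrative defaults, not details taken from this project:

```python
# Configuration sketch (requires GPU + model weights, so not run here):
# QLoRA loads the frozen base model in 4-bit, then LoRA adds small
# trainable adapter matrices on top of selected layers.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb = BitsAndBytesConfig(            # QLoRA: 4-bit NF4 quantized base weights
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",      # illustrative model ID
    quantization_config=bnb,
    device_map="auto",
)
lora = LoraConfig(                   # LoRA: low-rank adapters, rank r
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections in Llama
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora)
model.print_trainable_parameters()   # only the adapters are trainable
```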

Hi Mohsinali046,
I came across the GitHub repository associated with the dataset you’re interested in; the raw data is available there in CSV format: Counsel Chat Dataset. After reviewing the CSV file, it looks like the data originates from counseling questions asked on the Counsel Chat website, such as this example.

Let me know if you need further assistance!

Hi, good question, and you are right: high-quality mental health dialogue data is rare for privacy and safety reasons.

A few directions that usually help:

  1. Broaden search terms beyond “mental health chatbot”
    Try “empathetic dialogue”, “supportive conversation”, “counseling dialogue”, “distress support”, “crisis counseling”, “peer support forum”.

  2. Look at adjacent dialogue datasets
    Even if they are not strictly clinical, empathetic and supportive conversation datasets can work well for training multi-turn responses; you can then add your own safety rules for self-harm or crisis escalation.

  3. Consider a hybrid approach
    Use RAG with vetted mental health resources for factual guidance, and use conversation data mainly to learn tone, reflection, and asking gentle clarifying questions. This reduces the risk of the model inventing advice.

  4. Safety note
    If you deploy for anxiety or depression, plan for guardrails. Clear disclaimers, crisis escalation paths, and refusal behaviors for self-harm content are important.
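Point 4 can be sketched as a minimal pre-filter in front of the model. The patterns and wording below are placeholders; a real deployment would rely on a trained safety classifier and clinically reviewed escalation policies, not a keyword list:

```python
# Naive sketch of a pre-response safety gate for a mental health chatbot.
# Crisis-looking inputs are intercepted and routed to an escalation
# message instead of the model. Patterns here are illustrative only.
import re
from typing import Optional

CRISIS_PATTERNS = [
    r"\bsuicid(e|al)\b",
    r"\bself[- ]?harm\b",
    r"\bkill (myself|themselves)\b",
]

ESCALATION_MESSAGE = (
    "I'm really sorry you're going through this. I'm not able to help in a "
    "crisis - please contact a local crisis line or emergency services."
)


def guardrail(user_message: str) -> Optional[str]:
    """Return an escalation message if the input looks like a crisis, else None."""
    text = user_message.lower()
    if any(re.search(p, text) for p in CRISIS_PATTERNS):
        return ESCALATION_MESSAGE
    return None  # safe to pass through to the chatbot
```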

Quick question so people can give better pointers: are you aiming for general supportive coaching or clinical-style counseling, and do you need English only or multilingual support?
