PlanetZ Chatbot (LLM vector DB)


Moderators: valis, garyb

valis
Posts: 7677
Joined: Sun Sep 23, 2001 4:00 pm
Location: West Coast USA

PlanetZ Chatbot (LLM vector DB)

Post by valis »

The database for this site is rather large, which is half the reason it required my primary hosting platform — the one scaled for all my clients/projects — to support this place (DDoSes were part of it too, of course).

No, this isn't a donation post. Rather, know that building a RAG or vector DB required pruning it down to just five tables (posts, attachments, topics, users, and the forum structure). At less than half the original size, it's still ~220MB on disk and larger in RAM while processing. The resulting vector DB can run 300-700MB in JSONL or another simple format; prune further and you lose the context. Preprocessing the data to restore more structure before the final vector output might help this somewhat, but since it already takes me hours to crunch output from the SQL input(s), iterating on that is slow.
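For anyone curious what that pruning/chunking stage can look like, here's a minimal sketch. The field names (post_id, topic, user) are hypothetical, not the actual phpBB schema — the idea is just: read a JSONL export of posts, split long bodies into overlapping chunks so each embedding input stays a manageable size, and carry topic/user metadata alongside each chunk so the vector DB can restore some context at query time.

```python
import json

def chunk_post(text, max_chars=1200, overlap=200):
    """Split a post body into overlapping character chunks; the overlap
    keeps sentences that straddle a boundary retrievable from either side."""
    if len(text) <= max_chars:
        return [text]
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        start += max_chars - overlap
    return chunks

def posts_to_records(jsonl_path):
    """Yield one record per chunk, with metadata kept next to the text
    so context survives into the vector DB."""
    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            post = json.loads(line)
            for i, chunk in enumerate(chunk_post(post["text"])):
                yield {
                    "id": f'{post["post_id"]}-{i}',
                    "text": chunk,
                    "meta": {"topic": post["topic"], "user": post["user"]},
                }
```

Chunk size and overlap are knobs worth tuning per embedding model; the values above are just placeholders.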

The goal of this wasn't to put a chatbot online here, but rather to play around with using various (local and remote) LLMs to query this site on topics I knew well. I did get AnythingLLM working with LMStudio, Jan, and a few external API modules in ComfyUI (which is surprisingly flexible as an inference host and can run many models in parallel). I used the current versions of Gemma, Qwen2.x/3, QwQ, etc., often Unsloth distills, to control where portions of the model load so I can keep as much VRAM free for context as possible on my 24GB 3090.

All processing into embeddings is just Python scripting, and of course I was using Gemini, Claude, ChatGPT and Grok to generate the scripts and prompts (comparing outputs and testing how well each in turn interpreted our fairly narrow, vertical interest). For outputs I currently chose JSON (for portability and the ability to process it into any other format), and ChromaDB for a direct conversion to native embeddings, as it's supported by a few tools including AnythingLLM. This gives you essentially the same thing as Nvidia's ChatRTX or any other chatbot that can access files directly, but in the form of a much faster pre-embedded database that better matches the model's native structure.
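Under the hood, the retrieval step a vector DB like Chroma performs at query time boils down to nearest-neighbor search over the stored embeddings. Here's a toy pure-Python version of that step, just to show the mechanics — real stores use indexed approximate search, not a full sort like this:

```python
import math

def cosine_sim(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec, records, k=3):
    """records: list of {"id": ..., "embedding": [...]}. Returns the k
    records closest to the query embedding, i.e. the chunks that get
    stuffed into the LLM's context for the RAG answer."""
    ranked = sorted(records,
                    key=lambda r: cosine_sim(query_vec, r["embedding"]),
                    reverse=True)
    return ranked[:k]
```

The query text gets embedded with the same model as the documents, then its vector goes through a search like this; the top chunks (plus their metadata) become the context the LLM answers from.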

That's about as far as I've gotten, at least on that task.
garyb
Moderator
Posts: 23379
Joined: Sun Apr 15, 2001 4:00 pm
Location: ghetto by the sea

Re: PlanetZ Chatbot (LLM vector DB)

Post by garyb »

still quite a task, thank you.
valis

Re: PlanetZ Chatbot (LLM vector DB)

Post by valis »

I know there are a few people here who have played with AI at various points, so I'm open to ideas on where to take this.