Based on my experience, locally deploying large language models requires one or both of two things: 1) a computer with many cores, or 2) a computer with (almost definitely) an NVIDIA GPU. Why? Because generative AI is computationally intensive, and it only really scales if the process is run in parallel.
I have gotten away with a 64-core computer running Linux to do generative AI, but that was only good for my specific applications. I don't think it would scale to an entire institution. Having a computer with an NVIDIA card makes this MUCH more scalable.
Either way, you might consider using Open WebUI:
https://openwebui.com/
Open WebUI is an open source tool/interface that allows you to run large language model applications on a central computer while letting many people use them. Remember, as open source software, you get what you pay for. It works, but installation and deployment take practice.
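For what it's worth, the deployment I have seen documented is a single Docker container. The image name, port mapping, and volume below follow the project's quickstart as I remember it; double-check them against the current Open WebUI documentation before running:

```shell
# Run Open WebUI in Docker; the interface then appears at http://localhost:3000
# (Image name and flags are from the project's quickstart; verify before use.)
docker run -d \
  -p 3000:8080 \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
```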
Another suggestion is to use Ollama. Ollama is a server that runs on just about any computer:
http://ollama.com
You then install large language models and interact with the server through a Web interface or any number of programming languages. (I use Python.) Ollama now supports "cloud" models. These work exactly like the locally deployed models, but they run on Ollama's hardware, and the response times are very fast.
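Interacting with Ollama from Python can be as simple as posting JSON to its HTTP API. The sketch below uses only the standard library and assumes a server running on Ollama's default port (11434) with a model already pulled; the model name "llama3.2" is just an example, so substitute whatever you have installed:

```python
import json
import urllib.request

# Default Ollama address; adjust if your server lives elsewhere
OLLAMA = "http://localhost:11434"

def build_payload(prompt, model="llama3.2"):
    """Assemble the JSON body for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt, model="llama3.2"):
    """Send a prompt to a locally running Ollama server; return the response text."""
    request = urllib.request.Request(
        OLLAMA + "/api/generate",
        data=json.dumps(build_payload(prompt, model)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Calling something like generate("What is honor?") then returns the model's answer as a plain string.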
Once you get this far, I think RAG (retrieval-augmented generation) is the best use case for large language models in LAM. More specifically, RAG takes search results and then either summarizes them or answers questions posed against them. Simple examples might include:
* search a library catalog and return the
  results as a JSON stream which then gets
  converted to any number of citation formats
* search a set of EAD files and ask the system
to summarize the results
* given a pile o' plain text files, output the
names of people, places, and/or organizations
mentioned in the files (but this can easily
be done sans the use of generative AI)
Personally, I have created collections of classic literature and entire runs of scholarly journals. I have then posed questions to the collections such as "What is honor?", "How has librarianship changed over time?", or "Who is Ishmael and why should I care?" The responses I get are more than plausible, but I never accept them as truth. Instead, the responses are intended as discussion points and food for thought.
HTH
--
Eric Lease Morgan
Librarian Emeritus, Hesburgh Libraries
University of Notre Dame