The first thing you need is training data, many terabytes of it. Then you need enough computing hardware that training finishes within your lifetime, which usually means a data-center-sized facility. A Chinese lab astonished the world a couple of months ago by creating an LLM from scratch on a budget of “only” a few tens of millions of dollars.
I don’t think you want to start from zero.
The best way is to download a pretrained model from Hugging Face and then fine-tune it to fit your needs. That is a realistic project for one person working alone. You will still need to buy hardware, and what you need depends on the number of parameters in the model you download. A reasonable size for private use is about 8 billion parameters. For that you would need, at minimum, a GPU with more than 8GB of VRAM. Apple Macs with their unified memory handle 8-billion-parameter models well if the Mac has at least 16GB of RAM. I’ve run one at decent speed on mine.
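To make that concrete, here is a minimal sketch of pulling a model down from Hugging Face and asking it a question, using the transformers library. The model id is just an example (Meta’s Llama weights are gated, so you have to accept their license on huggingface.co first), and fine-tuning on top of this is a separate step:

```python
# Minimal sketch: download an 8B model from Hugging Face and ask it a question.
# Assumes `pip install transformers torch accelerate`. The model id is an
# example; Llama weights require accepting Meta's license on huggingface.co.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # example 8B model

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision: ~2 bytes per parameter
    device_map="auto",          # place layers on GPU / Apple silicon / CPU as available
)

prompt = "How many legs does a dog have?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```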
Going larger than 8 billion gets expensive because you will need a correspondingly larger GPU. There is a limit to what you can install in a home computer, and that limit is far below what the current state-of-the-art 400B-parameter LLMs require.
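The arithmetic behind that is simple: weights alone cost roughly two bytes per parameter at 16-bit precision and half a byte with 4-bit quantization. This little back-of-the-envelope script shows why 8B fits on a hobbyist machine and 400B does not:

```python
# Back-of-the-envelope weight-memory arithmetic. Weights only; real usage adds
# activation and KV-cache overhead, so treat these numbers as lower bounds.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_gb(params_billion: float, precision: str) -> float:
    return params_billion * 1e9 * BYTES_PER_PARAM[precision] / 1e9

for params in (8, 70, 400):
    for prec in ("fp16", "int4"):
        print(f"{params:>4}B @ {prec}: ~{weight_gb(params, prec):.0f} GB")

# 8B @ fp16 ≈ 16 GB, 8B @ int4 ≈ 4 GB  -> fits a 16GB Mac or a >8GB GPU
# 400B @ fp16 ≈ 800 GB                 -> far beyond any home machine
```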
I found that an 8B-parameter Llama model could understand most English input and had very basic “smarts”. It did not compare to ChatGPT, but I think it could handle questions like “How many legs does a dog have?” or “How many beers in a six-pack?” These small models will “BS” badly, though. For example, I asked for a review of Beethoven’s 13th symphony (he wrote only nine) and got a detailed three-paragraph essay, all made up. Then I asked for a short biography of George Washington’s daughter “Kate” (he had no daughter) and got another nicely written fictional story.
You have to understand the limits of an LLM that will run on a personal-sized computer. 8B is the current sweet spot. I would go with Meta’s “Llama”, whatever the current version is. As I said, it runs reasonably well on my Apple M2 Pro Mac mini with 16GB of RAM. A Linux PC with a new 16GB Nvidia card would be better, but the Mac is half the price and I already had it. Do not even try a Raspberry Pi. A gamer-style PC would be best.
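If you go the Mac route, the usual trick is to run a 4-bit quantized GGUF build of the model through llama.cpp rather than the full-precision weights. Here is a rough sketch with the llama-cpp-python package; the model file path is a placeholder, since you would first download a quantized GGUF from Hugging Face:

```python
# Rough sketch: run a 4-bit quantized Llama locally with llama-cpp-python
# (`pip install llama-cpp-python`). On an M-series Mac it uses Metal; on a PC
# it can offload layers to a CUDA GPU. The model path below is hypothetical.
from llama_cpp import Llama

llm = Llama(
    model_path="./llama-3-8b-instruct.Q4_K_M.gguf",  # placeholder local file
    n_ctx=4096,       # context window
    n_gpu_layers=-1,  # offload all layers to the GPU / Metal if possible
)

result = llm("Q: How many beers in a six-pack?\nA:", max_tokens=32)
print(result["choices"][0]["text"])
```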
Here is where I got my copy