Should I run Free-GPT2 solo?

I’m looking at getting a Lenovo ThinkCentre M910q to run Start9 on. I’ve already got one set up for all of the Bitcoin and Lightning things and was going to get a separate one for running everything else. I would like to run Free-GPT2 to do some light LLM work, but I don’t know whether it would hog too many resources from everything else. Should Free-GPT2 be standalone on its own machine, or can it play well with other things?

You can run Free-GPT2 alongside other services without much trouble, as long as you have sufficient resources available. The two things that matter most are how much RAM your server has and which specific model you choose to run.

The bigger the model, the more RAM and processing power will be required.

From some moderate research, I’ll probably run something around 8B. Nothing too fancy, but hopefully not too slow either. It’ll have 32GB of RAM but no real graphics card.

You should be fine with 32GB running 8B models.
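For a rough sense of why 32GB is comfortable, here is a back-of-the-envelope sketch. It only counts the model weights (parameters times bits per weight) and ignores context/KV cache and runtime overhead, so treat the numbers as a lower bound. The function names and the `reserved_gb` allowance for the OS and other services are my own assumptions, not anything Free-GPT2 reports.

```python
def estimate_weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate RAM needed for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    total_bits = params_billions * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9


def fits_in_ram(params_billions: float, bits_per_weight: float,
                total_ram_gb: float, reserved_gb: float = 8.0) -> bool:
    """Check whether the weights fit after setting aside RAM for everything else.

    reserved_gb is a hypothetical allowance for the OS, other services, and
    the model's context cache; tune it for your own box.
    """
    needed = estimate_weight_memory_gb(params_billions, bits_per_weight)
    return needed <= total_ram_gb - reserved_gb


# An 8B model at common precisions on a 32GB machine.
for bits in (4, 8, 16):
    size = estimate_weight_memory_gb(8, bits)
    print(f"8B at {bits}-bit: ~{size:.0f} GB of weights, fits: {fits_in_ram(8, bits, 32)}")
```

By this estimate an 8B model at 4-bit quantization needs about 4 GB for weights, which leaves plenty of headroom for other services.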

No worries about the graphics card, since Free-GPT2 does not use it.

I’m curious whether you can run Ollama entirely on the CPU (100% CPU, no GPU offload) and how it performs on your hardware.

For example, with one of the newer models:

Allegedly aimed at laptops: gemma3n

I’m wondering about this even more now that 0.11.5 (now 0.11.6) performs much better with the new memory estimation approach, which is opt-in via an environment variable.

I’d recommend the Docker image, with these environment variables:

    environment:
      - OLLAMA_FLASH_ATTENTION=1
      - OLLAMA_KV_CACHE_TYPE=q8_0
      - OLLAMA_NEW_ESTIMATES=1
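
If you want to measure how that compares on your hardware, something like the sketch below could work. It assumes Ollama is listening on its default port on the same machine and that the model (here `gemma3n`, as an example) has already been pulled. As far as I understand the Ollama REST API, `num_gpu` set to 0 keeps all layers on the CPU, and the response reports `eval_count` and `eval_duration` (in nanoseconds); the helper names are made up for illustration.

```python
import json
import urllib.request

# Default local Ollama endpoint; change the host/port to match your setup.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str, cpu_only: bool = True) -> dict:
    """Build the JSON payload for a single, non-streaming generation."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if cpu_only:
        # num_gpu is the number of layers offloaded to the GPU; 0 means CPU only.
        payload["options"] = {"num_gpu": 0}
    return payload


def tokens_per_second(response: dict) -> float:
    """Generation speed from the timing fields in the response."""
    seconds = response["eval_duration"] / 1e9  # eval_duration is in nanoseconds
    return response["eval_count"] / seconds


def run_benchmark(model: str, prompt: str, url: str = OLLAMA_URL) -> float:
    """Send one prompt to Ollama and return the measured tokens per second."""
    data = json.dumps(build_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return tokens_per_second(json.load(resp))


# Example call (needs a running Ollama instance):
# print(run_benchmark("gemma3n", "Explain what a Lightning channel is."))
```

Running the same prompt a few times and comparing the numbers with and without the environment variables above should give a fair idea of whether the new estimates help on a CPU-only box.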