Setting up speech to text - STT settings

positron · March 9, 2025, 8:06pm

Is there a step by step guide to setting up the audio speech to text settings? And loading in an mp3?

Alvaro · March 9, 2025, 9:18pm

On which service in the StartOS Marketplace are you trying to use speech to text?

The specific service documentation would likely be the place to search for further instructions.

positron · March 9, 2025, 9:32pm

Hi thanks for the quick response. Yes I should have been more specific.
On FreeGPT-2 in settings under audio. I am not sure how to load the model kokoro.js or if that is even the right one to use.

Alvaro · March 10, 2025, 11:58pm

If you take a look at the FreeGPT-2 service instructions, you will find the following steps:

Open the Admin Panel and navigate to Settings > Models. Here, you will find an entry box titled "Pull a model from Ollama.com". In this box, you can type the tag of the model you want to use. We recommend starting with the mistral:7b model.
After entering the model tag (e.g., mistral:7b), click the download button on the right. The software will automatically download the model.
To discover other compatible models and their tags, visit the Ollama Library. This page provides information about all the supported models. You can explore these models and choose the one that best suits your needs.
Please note that the size of the model you choose should not exceed the available RAM memory of your server. Using a model larger than your available memory could lead to performance issues or crashes.
Gradually explore larger models as needed, keeping in mind the memory constraint.

Note: You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.

You can adapt these steps to install any model you want, but always make sure that you have enough resources to run them.

positron · March 11, 2025, 7:31am

Thank you again or the response. I have pulled several models via ollama so I am familiar with that process. Where I am at a loss is loading the speech to text under the audio section. I do not see where I can use the models I pulled. It also starts to load kokoro.js and stays loading after it goes to 100%. Am I doing something wrong? Am I able to load a model like mistral and use it with the audio settings for speech to text? Basically I am trying to transcribe a bunch of mp3s of lectures. Thanks again for all your help.

positron · March 12, 2025, 11:17am

Ok I found a whisper model and pulled it, but I guess I never tested the other ones I already had and I am getting this error when I try and start a chat with any of them:

500: Ollama: 500, message=‘Internal Server Error’, url=‘http://localhost:11434/api/chat’

Also if I just download and import a gguf file would that simplify things?

h0mer · March 12, 2025, 4:25pm

Hi positron!

The latest version of FreeGPT-2 introduces an experimental feature designed to handle this task effortlessly.

To enable it, navigate to Admin Panel → Settings → Models, then click the Download icon and select Show under the Experimental section.

For the best results, ensure you download the GGUF version of the model. Use the download link provided on the Hugging Face models page, which should look like this:

https://huggingface.co/mradermacher/SatoshiNv5-GGUF/resolve/main/SatoshiNv5.Q4_K_S.gguf?download=true

If you encounter a 500 Error, ensure you are using the latest version of FreeGPT-2 and try downloading a smaller model. Please note that some models may not work at all. It’s somewhat unpredictable which ones will be supported, as this is subject to rapid and dynamic changes.

positron · March 13, 2025, 11:20am

Thank you for all this info! I will try that. I do see there is a pop up for a later version however in the updates section of the start os it does not show the option to update. I am downloading the zip file now from github. Do you by any chance know the process to update it manually? I am going to delete all the models that I pulled that are not working as I would rather the ggufs. I also got whisper to work on another computer which is awesome. But was unable to get insanely fast whisper to work because of some python enviroment conflicts via numpy and torch which is annoying.

h0mer · March 14, 2025, 2:54pm

You can’t update FreeGPT-2 from inside the service itself. Updates must be packaged and released by Start9 developers, after which you will see them in the update tab of your StartOS GUI. The latest version of FreeGPT-2 is 2.511.0514, released on February 24, 2025, at 11:07:03 PM. I hope this helps.

positron · March 14, 2025, 7:46pm

Thanks H0mer!
Yes I see it now, have to learn how to read better ha.
And yes that is the version I have.

Is there a list of models on huggingface that might work with FreeGPT-2, or any GGUF might?

Thanks again