Posts Tagged ‘AI’

No Comments

Running the Dutch LLM fietje-2-chat with reasonable speed on your Android Phone

Thursday, October 10th, 2024

To run Fietje-2-Chat locally on your Android phone at a reasonable speed, you’ll need to create a specially quantized version of fietje-2b-chat.

One of the best apps to run LLMs on your Android phone is ChatterUI (https://github.com/Vali-98/ChatterUI).

You can download the APK from GitHub, transfer it to your phone and install it. It’s not yet available in F-Droid.

Since most Android phones have an ARM CPU, use special `quants` that run faster on ARM because they use the NEON, int8mm and SVE instruction set extensions.

Note that these optimized kernels require the model to be quantized into one of the formats: Q4_0_4_4 (Arm Neon), Q4_0_4_8 (int8mm) or Q4_0_8_8 (SVE). The SVE mulmat kernel specifically requires a vector width of 256 bits. When running on devices with a different vector width, it is recommended to use the Q4_0_4_8 (int8mm) or Q4_0_4_4 (Arm Neon) formats for better performance.

https://github.com/ggerganov/llama.cpp/blob/master/docs/build.md#arm-cpu-optimized-mulmat-kernels
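Which of the three formats fits your phone depends on the CPU features it reports. The following is a minimal sketch, not part of llama.cpp or ChatterUI: `pick_quant` is a hypothetical helper that maps the `Features` line of `/proc/cpuinfo` to one of the formats above. It assumes the Linux flag names `asimd` (or `neon` on 32-bit kernels), `i8mm` and `sve`. The SVE vector width is not listed in `/proc/cpuinfo`, so it has to be passed in by hand, and the plain `Q4_0` fallback is my own assumption.

```shell
#!/bin/sh
# pick_quant FEATURES [SVE_BITS]
#   FEATURES: the "Features" line from /proc/cpuinfo (space-separated flags)
#   SVE_BITS: SVE vector width in bits, if known (hypothetical input,
#             /proc/cpuinfo does not report it)
pick_quant() {
    features=" $1 "          # pad so every flag is surrounded by spaces
    sve_bits="${2:-0}"

    # The SVE kernel is only suggested for a 256-bit vector width.
    case "$features" in
        *" sve "*)
            if [ "$sve_bits" -eq 256 ]; then
                echo "Q4_0_8_8"
                return
            fi
            ;;
    esac

    # Otherwise prefer int8mm, then plain NEON.
    case "$features" in
        *" i8mm "*)              echo "Q4_0_4_8" ;;
        *" asimd "*|*" neon "*)  echo "Q4_0_4_4" ;;
        *)                       echo "Q4_0" ;;   # assumed generic fallback
    esac
}
```

On the phone itself (for example in a terminal app or over `adb shell`) this could be called as `pick_quant "$(grep -m1 Features /proc/cpuinfo)"`.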

How to create a special ARM optimized version of Fietje-2-Chat

Download the f16 GGUF version of Fietje-2-Chat into a models directory:

mkdir -p ~/llama/models
wget -O ~/llama/models/fietje-2b-chat-f16.gguf "https://huggingface.co/BramVanroy/fietje-2-chat-gguf/resolve/main/fietje-2b-chat-f16.gguf?download=true"

Install the Docker version of llama.cpp to do the conversion (replace /home/user with your own home directory):

sudo docker run -v /home/user/llama/models:/models ghcr.io/ggerganov/llama.cpp:full --all-in-one "/models/" 7B

To convert an f32 or f16 GGUF to another format, here Q4_0_4_4 (Arm Neon):

docker run --rm -v /home/user/llama/models:/models ghcr.io/ggerganov/llama.cpp:full --quantize "/models/fietje-2b-chat-f16.gguf" "/models/fietje-2b-chat-Q4_0_4_4.gguf" "Q4_0_4_4"
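Before copying the result to your phone, a quick sanity check can save a failed import. GGUF files begin with the four ASCII bytes `GGUF`, so a truncated or failed conversion is often easy to spot. The sketch below uses a hypothetical helper name, `is_gguf`, and only checks those first four bytes. It says nothing about whether the tensors inside are valid.

```shell
#!/bin/sh
# is_gguf FILE: succeed if FILE exists and starts with the GGUF magic bytes.
is_gguf() {
    [ -f "$1" ] && [ "$(head -c 4 "$1")" = "GGUF" ]
}
```

For example: `is_gguf ~/llama/models/fietje-2b-chat-Q4_0_4_4.gguf && echo "looks like a GGUF file"`.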

Transfer the fietje-2b-chat-Q4_0_4_4.gguf to your Android Device.

Open ChatterUI

Import Model:

Go to menu -> API -> Model -> import -> fietje-2b-chat-Q4_0_4_4.gguf

Load Model:

Go to menu -> API -> Model -> select fietje-2b-chat-Q4_0_4_4.gguf -> Load

Then leave the settings and start typing in the prompt. The first cold run will be a little slow, but once it’s running you’ll get about 10 tokens/s on a Snapdragon 865 phone.

That’s not bad.

If you’re interested in an LLM that can generate much better Dutch than Llama 3.2 or Phi-3 on your phone, give Fietje a try.

Creating graphics with LLMs: generate SVG!

Wednesday, September 25th, 2024

LLMs can generate text, and Scalable Vector Graphics (SVG) is an XML-based vector image format for defining two-dimensional graphics. XML is human-readable text, so it can be generated by LLMs.
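To see how little text a drawing needs, here is a minimal hand-written example (not LLM output). The helper name `emit_svg` and the shape itself are purely illustrative: it prints a single circle on a 100 by 100 canvas.

```shell
#!/bin/sh
# emit_svg: print a minimal SVG document (one circle) to stdout.
emit_svg() {
    cat <<'EOF'
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100" width="100" height="100">
  <!-- one shape, described entirely in readable text -->
  <circle cx="50" cy="50" r="40" fill="gold" stroke="black" stroke-width="2"/>
</svg>
EOF
}
```

Running `emit_svg > circle.svg` and opening the file in a browser should show the circle. An LLM asked for a dog or a chair has to produce the same kind of text, just with many more shapes.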

How well do they perform?

https://duckduckgo.com/?q=DuckDuckGo+AI+Chat&ia=chat&duckai=1

Generate a dog in svg

Looks like a cute mix between a rubber duck and a dog to me.

Generate a chair in svg

More a stool than a chair.

"A mix between a rubber duck and a dog."

LLMs have a sense of humor, in a way.

Open source coding with Ollama as Copilot

Tuesday, September 24th, 2024

Among the many tools available to developers today, an AI coding assistant is close to a must-have.

Zed is a Rust-written modern open source code editor designed for collaborative editing and multiplayer teamwork. It works fine as a stand-alone editor with git support.

Ollama offers easy, privacy-friendly LLM support that runs locally. Get up and running with large language models on your own machine.

Zed does offer AI integration with ChatGPT or Claude, but it can also connect to your local Ollama install.

To try this out, just add this to the settings file in Zed (open it with CTRL + ,):

 

"assistant": {
  "version": "1",
  "provider": {
    "default_model": {
      "name": "qwen2.5-coder",
      "display_name": "codeqwen",
      "max_tokens": 2048,
      "keep_alive": -1
    },
    "name": "ollama",
    // Recommended setting to allow for model startup
    "low_speed_timeout_in_seconds": 30
  }
}
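Zed can only use the model if Ollama is running and the model has already been pulled. A small sketch for checking that from a terminal, assuming Ollama listens on its default address http://localhost:11434 and that its /api/tags endpoint lists the local models: `has_model` is a hypothetical helper that does a simple text match on that JSON, not a real JSON parse.

```shell
#!/bin/sh
# has_model NAME: read the JSON from Ollama's /api/tags on stdin and
# succeed if a model called NAME (with any tag) is listed.
# Note: this is a plain grep, so a "." in NAME matches any character.
has_model() {
    grep -q "\"name\":[[:space:]]*\"$1[:\"]"
}
```

Usage: `curl -s http://localhost:11434/api/tags | has_model qwen2.5-coder || ollama pull qwen2.5-coder`, which pulls the model only when it is not there yet.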

Open a new assistant tab and you can change context of the assistant to your tab content, and let the AI assistant annotate your script:

/tab
Annotate the JS file

And can LLMs create graphics? Of course they can.

How to create images with a Large Language Model (LLM)

SVG!

SVG is an XML-based format that can be understood by humans and machines, so if a coding assistant can write code, it can also write SVG. 😉

Does it create sensible graphics? Not really.

I asked qwen2.5-coder in Zed:

Create a SVG file that shows a chair.

Does this look like a chair?

 

Another attempt:

Does it really look like a chair? What does it look like? Let me know in the comments!