Posts Tagged ‘llm’


Setting up unified memory for Strix Halo correctly on Ubuntu 25.04 or 25.10

Wednesday, November 12th, 2025

You have followed the online instructions, but when running the Strix Halo Toolbox you still hit memory errors on your 128GB Strix Halo system; for example, qwen-image-studio fails to run:

File "/opt/venv/lib64/python3.13/site-packages/torch/utils/_device.py", line 104, in torch_function
return func(*args, **kwargs)
torch.OutOfMemoryError: HIP out of memory. Tried to allocate 3.31 GiB. GPU 0 has a total capacity of 128.00 GiB of which 780.24 MiB is free. Of the allocated memory 57.70 GiB is allocated by PyTorch, and 75.82 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_HIP_ALLOC_CONF=expandable_segments:True to avoid fragmentation. 

(https://github.com/kyuz0/amd-strix-halo-image-video-toolboxes/issues)
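The traceback itself suggests a fragmentation workaround. A minimal sketch of applying it, assuming you launch the workload from a shell inside the toolbox; note that this only mitigates allocator fragmentation and won’t fix an undersized GTT/TTM limit:

# Fragmentation workaround suggested by the error message itself; it does not
# raise the actual memory limit. Set it before launching the workload:
export PYTORCH_HIP_ALLOC_CONF=expandable_segments:True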

You’ve verified your configuration using sudo dmesg | grep "amdgpu:.*memory", and the output indicates that the GTT size is correct.

It is likely that you configured the GTT size using the outdated, deprecated parameter amdgpu.gttsize, which would explain why the setting is not taking effect. Alternatively, you may have used the wrong prefix (amdttm. instead of the correct ttm.).

Please verify your configuration to ensure the proper syntax is used:

How to check the unified memory setting on AMD Strix Halo/Krackan Point:

cat /sys/module/ttm/parameters/p* | awk '{print $1 / (1024 * 1024 / 4)}'

The last two lines must be the same, and the number you see is the amount of unified memory in GB.
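The p* glob in that one-liner expands to the TTM parameters that matter here; on my kernel those are page_pool_size and pages_limit (an assumption worth verifying on yours). You can also read them directly:

# Read the two TTM parameters individually (values are counts of 4096-byte pages):
cat /sys/module/ttm/parameters/page_pool_size
cat /sys/module/ttm/parameters/pages_limit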

How to set up unified memory correctly

In the BIOS, set the GMA (Graphics Memory Allocation) to the minimum value: 512MB. Then, add a kernel boot parameter to enable unified memory support.

Avoid outdated methods; they no longer work. Also note that the approach differs depending on your hardware: AMD Ryzen processors use one parameter prefix (ttm.), while Instinct-class (professional workstation) GPUs use another (amdttm.).

To max out unified memory:

Edit /etc/default/grub and change the GRUB_CMDLINE_LINUX_DEFAULT line to one of the following:

128GB HALO

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash amd_iommu=off ttm.pages_limit=33554432 ttm.page_pool_size=33554432"

96GB HALO

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash amd_iommu=off ttm.pages_limit=25165824 ttm.page_pool_size=25165824"

64GB HALO

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash amd_iommu=off ttm.pages_limit=16777216 ttm.page_pool_size=16777216"

32GB HALO

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash amd_iommu=off ttm.pages_limit=8388608 ttm.page_pool_size=8388608"

The math here for 32GB: 32 × 1024 × 1024 × 1024 bytes ÷ 4096 bytes per page = 32 × 1024 × 256 = 8388608 pages.

The default page size is 4096 bytes.
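If your machine has a different amount of memory, the same arithmetic as a small shell sketch (adjust GB, and subtract headroom as discussed below):

# Compute ttm.pages_limit / ttm.page_pool_size for a given size in GiB,
# assuming the default 4096-byte page size:
GB=32
echo $(( GB * 1024 * 1024 * 1024 / 4096 ))   # 8388608 for 32GB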

After you’ve edited /etc/default/grub:

sudo update-grub2
sudo reboot

You should probably leave some memory (~4GB) for the rest of your system to run smoothly, so adjust the lines above accordingly.
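For example, reserving ~4GB on a 128GB machine means configuring 124GB, i.e. 124 × 1024 × 256 = 32505856 pages:

# 128GB Strix Halo, leaving ~4GB for the rest of the system:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash amd_iommu=off ttm.pages_limit=32505856 ttm.page_pool_size=32505856"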

Check your config

To check if you’ve done it correctly, reboot and check:

cat /sys/module/ttm/parameters/p* | awk '{print $1 / (1024 * 1024 / 4)}'
96
96

Here the full 96GB is configured as unified memory on a 96GB Strix Halo.

If the last two numbers are not the same, try debugging the problem.

Debugging unified memory problems

Check dmesg for AMD VRAM and GTT size:

sudo dmesg | grep "amdgpu:.*memory"

[ 10.290438] amdgpu 0000:64:00.0: amdgpu: amdgpu: 512M of VRAM memory ready
[ 10.290440] amdgpu 0000:64:00.0: amdgpu: amdgpu: 131072M of GTT memory ready.

This seems correct, but if you set the GTT size with the old amdgpu.gttsize method, dmesg will still report the right amount of GTT memory, yet ROCm can’t use it unless the TTM limits are set correctly. You’ll notice another warning in the dmesg output during early boot.

sudo dmesg | grep "amdgpu"

[ 17.652893] amdgpu 0000:c5:00.0: amdgpu: [drm] Configuring gttsize via module parameter is deprecated, please use ttm.pages_limit
[ 17.652895] amdgpu 0000:c5:00.0: amdgpu: [drm] GTT size has been set as 103079215104 but TTM size has been set as 48956567552, this is unusual
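If the numbers don’t match, also confirm that your parameters actually made it onto the kernel command line after update-grub; a quick check, assuming a standard GRUB boot:

# Show the ttm parameters the kernel was actually booted with:
grep -o 'ttm\.[a-z_]*=[0-9]*' /proc/cmdline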

Furthermore, you’ll see a lot of sources mentioning amdttm.pages_limit or amdttm.page_pool_size. Those settings are for AMD Instinct machines and won’t work on your Strix Halo.
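If in doubt, you can check which TTM module your kernel actually loaded; the assumption here is that the stock Ubuntu kernel ships ttm, while the out-of-tree ROCm/amdgpu DKMS stack ships amdttm:

# Whichever of these directories exists is the module whose parameters apply:
ls -d /sys/module/ttm /sys/module/amdttm 2>/dev/null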

Confusing, yes, but just be careful to use the right settings in /etc/default/grub for GRUB_CMDLINE_LINUX_DEFAULT.

And don’t forget to check it with the one-liner mentioned above:

cat /sys/module/ttm/parameters/p* | awk '{print $1 / (1024 * 1024 / 4)}'


Links and resources:

  • https://github.com/ROCm/ROCm/issues/5562#issuecomment-3452179504
  • https://strixhalo.wiki/
  • https://blog.linux-ng.de/2025/07/13/getting-information-about-amd-apus/
  • https://www.jeffgeerling.com/blog/2025/increasing-vram-allocation-on-amd-ai-apus-under-linux



Creating graphics with LLM’s: generate SVG!

Wednesday, September 25th, 2024

LLMs can generate text, and Scalable Vector Graphics (SVG) is an XML-based vector image format for defining two-dimensional graphics. XML is human readable, so it can be generated by LLMs.
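For reference, a minimal hand-written SVG (not LLM output) to show how little XML a valid image needs; a sketch using a shell heredoc, assuming any browser or image viewer to render it:

# Write a tiny valid SVG by hand and open it:
cat > circle.svg <<'EOF'
<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">
  <circle cx="50" cy="50" r="40" fill="gold"/>
</svg>
EOF
xdg-open circle.svg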

How well do they perform?

https://duckduckgo.com/?q=DuckDuckGo+AI+Chat&ia=chat&duckai=1

Generate a dog in svg

Looks like a cute mix between a rubber duck and a dog to me.

Generate a chair in svg

More a stool than a chair.

"A mix between a rubber duck and a dog."

LLMs have humor, in a way.


Open source coding with Ollama as Copilot

Tuesday, September 24th, 2024

Among the many tools available to developers today, support from AI as a coding assistant is a nice must-have.

Zed is a modern open source code editor written in Rust, designed for collaborative editing and multiplayer teamwork. It works fine as a stand-alone editor with Git support.

Ollama offers easy, privacy-friendly, locally run LLM support: get up and running with large language models on your own machine.
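A sketch of getting started, assuming the official Linux install script and the model used later in this post:

# Install Ollama and fetch a local coding model:
curl -fsSL https://ollama.com/install.sh | sh
ollama pull qwen2.5-coder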

Zed does offer AI integration with ChatGPT or Claude, but it can also connect to your local Ollama install.

To try this out, just add this to the settings file in Zed (Ctrl + ,):


"assistant": {
"version": "1",
"provider": {
"default_model": {
"name": "qwen2.5-coder",
"display_name": "codeqwen",
"max_tokens": 2048,
"keep_alive": -1
},
"name": "ollama",
// Recommended setting to allow for model startup
"low_speed_timeout_in_seconds": 30
}
}
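Zed talks to Ollama’s local HTTP API, which listens on port 11434 by default; before opening the assistant you can verify the server is reachable and the model is present:

# List the locally available models via Ollama's API:
curl -s http://localhost:11434/api/tags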

Open a new assistant tab, where you can change the assistant’s context to your tab content and let the AI assistant annotate your script:

/tab
Annotate the JS file

And can LLMs create graphics? Of course they can.

How to create images with a Large Language Model (LLM)

SVG!

SVG is an XML-based format that can be understood by humans and machines, so when a coding assistant can write code, it can also write SVG. 😉
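You can even try it outside the editor; a sketch using the Ollama CLI, with the caveat that the model often wraps its reply in markdown fences that you’ll have to strip by hand:

# Ask the local model for SVG on the command line and save the reply:
ollama run qwen2.5-coder "Output only a valid SVG file that shows a chair" > chair.svg
xdg-open chair.svg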

Does it create sensible graphics? Not really.

I asked qwen2.5-coder in Zed:

Create a SVG file that shows a chair.

Does this look like a chair?


Another attempt:

Does it really look like a chair? What does it look like to you? Let me know in the comments!