Baffled at PC freezing on Linux, but not on Windows for the same workload

This issue has been a long time coming. I have a mini PC (ASRock DeskMini H110, i5-6400, 16 GB) that I used for a long time with Kubuntu/KDE Neon, and for most of its life it worked great. Some years ago it started freezing, especially under graphics-intensive workloads, so I assumed a hardware issue and converted it into a NAS, where it worked absolutely fine for a couple of years. Recently my wife needed a Windows PC to do some work, and since I had upgraded my NAS, I repurposed the same PC and installed Windows on it, and it worked absolutely fine for her too. Then I decided to try some graphics-intensive workloads, like 3D benchmarks, and it didn’t freeze once. I was delighted, and thought maybe I hadn’t investigated the issue properly the first time and the PC had been fine all along. So I installed Debian 13, and lo and behold, the issue came back. I found out while using IKEA’s 3D kitchen planner. So I switched distros, and it froze on Ubuntu and CachyOS as well. I tried switching between Wayland and X11 and switched browsers, but the PC freezes within seconds of logging into IKEA’s kitchen planner (as soon as the 3D graphics load). I reinstalled Windows, and my wife has been designing a kitchen in IKEA’s 3D kitchen planner for over an hour now, and it hasn’t frozen once. What’s going on? How do I even investigate this?

I have reinstalled Linux and had sudo dmesg -w running, but no logs are captured before it freezes. I have reproduced the issue multiple times now on Linux, and it has not frozen once on Windows. I have also run memtests and tried multiple disks, both NVMe and SATA. I have also tried multiple browsers, installed via apt and as Flatpaks. I really need Lemmy’s collective intelligence to help me here.

Update: Well, the system stopped hanging with ’nomodeset’ as a boot parameter, confirming that it’s the Intel i915 driver. I tried a variety of Intel-related kernel parameters like psr, dc, guc, and even messed with the Intel C-states, but it hung whenever the i915 driver was loaded at all. So ???
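
For anyone repeating this kind of test, it helps to confirm which graphics-related parameters the running kernel actually booted with, since a typo in the bootloader config silently does nothing. Below is a minimal sketch that filters /proc/cmdline for nomodeset and i915.* options. The function names are made up for illustration, and i915.enable_psr=0 in the usage note is only an example of the parameter style mentioned above, not a recommended fix.

```python
from pathlib import Path


def parse_cmdline(cmdline: str) -> dict[str, str]:
    """Split a kernel command line into {parameter: value}; bare flags map to ''."""
    params = {}
    for token in cmdline.split():
        # Split on the first '=' only, so 'root=UUID=abc' keeps 'UUID=abc' as its value.
        key, _, value = token.partition("=")
        params[key] = value
    return params


def graphics_params(cmdline: str) -> dict[str, str]:
    """Keep only 'nomodeset' and options in the i915 module namespace."""
    return {
        key: value
        for key, value in parse_cmdline(cmdline).items()
        if key == "nomodeset" or key.startswith("i915.")
    }


if __name__ == "__main__":
    try:
        text = Path("/proc/cmdline").read_text()
    except OSError:
        text = ""  # not on Linux, or /proc is unavailable
    print(graphics_params(text))
```

On a machine booted with "quiet nomodeset i915.enable_psr=0", this would report both the nomodeset flag and the i915 option, so an empty result means the parameters never reached the kernel.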

18 Comments

A quick google search shows that many people have issues with the Intel Integrated Graphics on this particular PC (including on Windows) and this seems to be the solution:
https://forum.asrock.com/forum_posts.asp?TID=17021&title=solution-for-intel-graphics-freezing-system

The main solution is Windows-based, but someone does offer a Linux route to the same solution, although a linked file that sounds like it would make the setup easier is missing. Essentially, it looks like the chipset needs tweaking to throttle the GPU slightly to prevent the flaw from triggering.

Well, you sent me down a rabbit hole, and for a while I thought that was definitely the issue. I found a script that does what the suggested solution in the link says, and I adjusted the iccmax values for the GPU, but no matter which parameters I adjusted, it kept freezing. I tried BIOS firmware downgrades and updates, changed undervolt parameters and so many other things, but nothing helped. In fact, it kept freezing even without any graphics-intensive work, so I tried reinstalling a different distro with a different DE, yet all of them froze so much that I gave up. I reinstalled Windows and it hasn’t frozen once. It’s hard to admit as a Linux evangelist, but Windows works for some bloody reason where Linux simply fails.

It’s hard to admit as a Linux evangelist, but Windows works for some bloody reason where Linux simply fails.

Why would that be hard to admit? This is a bug in specific hardware that has a crappy Linux driver. Intel doesn’t care about Linux; if Linux were 90% of the market instead of 2%, I assure you this bug wouldn’t exist.

Because I have been asking my wife to switch to Linux for some time, and the one time she finally agrees, this is what happens. Your reasons are valid, but even if I could explain all that to her, she already got one of the worst impressions of Linux from this fiasco. I don’t know how to explain this, but this one small thing, this bug, has not only kind of killed my credibility as a tech person in my house, it has also killed the hope of moving away from big corporations towards self-hosted services.

Are you sure you are not hitting swap too hard? Windows by default makes a bonkers massive page file, while most distributions try to limit wasting so much for swap.

If so, try installing SwapSpace

I have 12 GB of swap, which I think is plenty. I don’t think that’s the reason.
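
One way to test the swap theory rather than guess is to watch how much swap is in use right up to a freeze. The sketch below assumes the standard /proc/meminfo fields SwapTotal and SwapFree; the helper names are hypothetical. If the figure stays near zero until the moment the machine locks up, swap pressure is an unlikely culprit.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict[str, int]:
    """Return the numeric values from /proc/meminfo (kB for most fields)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def swap_used_percent(meminfo: dict[str, int]) -> float:
    """Percentage of swap currently in use; 0.0 when no swap is configured."""
    total = meminfo.get("SwapTotal", 0)
    if total == 0:
        return 0.0
    return 100.0 * (total - meminfo.get("SwapFree", 0)) / total


if __name__ == "__main__":
    try:
        info = parse_meminfo(Path("/proc/meminfo").read_text())
    except OSError:
        info = {}  # not on Linux, or /proc is unavailable
    print(f"swap used: {swap_used_percent(info):.1f}%")
```

Running it in a loop in a terminal next to the browser gives a last reading that stays visible on the frozen screen.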

Yeah I tried that, didn’t help. I don’t think it’s swap. It has to do with graphics or display driver.

This is unrelated, but I wonder if I could get better Windows VM performance by disabling swap for the VM. I use an old laptop with a slow drive. I wonder if aggressive swapping could be the reason why my Windows machine feels frozen all the time.

Windows 10 and 11 really dislike HDDs, that’s probably why you can’t admit to using HDDs online without getting stones thrown at you (I’ve been there before).

I’ve disabled paging files (= swap) for one of my Windows VMs. Unfortunately, and to my surprise, that gave only a small performance boost, and I still need to let the VM chug for a few minutes before it even lets me open File Explorer.

… but it does improve performance, definitely consider doing it if you don’t need swap/paging/whatever they call it now.

Not sure this is the answer, but I’m going to throw in my two cents. If you try it and it works, let us know.

When I got back into Linux a couple of years ago, I hopped through all of the distros you mentioned, the last one being KDE Neon. When I first found it I absolutely loved it. I decided that was going to be my main distro and started migrating all my systems: a couple of laptops, a gaming desktop, and a mini PC.

Over time I found that I was having minor, but consistent hardware issues. Similar to yours, freezing, and other gpu issues. It was most apparent on systems that had newer hardware. Looking at the specs for your mini PC it seems a bit older than what I have so again, not sure this applies to you, but I found my saving grace in Fedora. My issue specifically was the older kernel Neon uses not interacting well with my newer hardware and in some cases not having access to some hardware features. Fedora had a KDE spin otherwise I wouldn’t have done it. It has been my daily driver on all systems since.

TL;DR: Try the Fedora KDE Spin or any distro that ships with a more up-to-date kernel

journalctl -f (-f is follow mode) or as root user (sudo journalctl -f), open kitchen planner with the terminal visible and look at the (frozen) logs.

Already did. No logs were added before the system froze.
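
A follow-up worth trying after the forced reboot is to read the kernel messages of the previous boot, in case something was written to disk that never reached the screen. The sketch below wraps journalctl with its -k (kernel messages) and -b -1 (previous boot) options. It assumes the journal is stored persistently, and a hard hang may well leave nothing behind; the function name is made up for illustration.

```python
import shutil
import subprocess


def previous_boot_kernel_log_cmd(lines: int = 200) -> list[str]:
    """Build a journalctl call for the last kernel messages of the previous boot."""
    # -k: kernel messages only, -b -1: the boot before the current one,
    # -n: number of most recent lines to show.
    return ["journalctl", "--no-pager", "-k", "-b", "-1", "-n", str(lines)]


if __name__ == "__main__":
    cmd = previous_boot_kernel_log_cmd()
    if shutil.which("journalctl") is None:
        print("journalctl not found; would have run:", " ".join(cmd))
    else:
        result = subprocess.run(cmd, capture_output=True, text=True, check=False)
        print(result.stdout or result.stderr)
```

If the output ends abruptly with ordinary messages and no i915 or GPU errors, that is at least consistent with a hang that stops the kernel before it can log anything.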

Linux is kind of dogshit at memory management, unfortunately. Due to a thing called overcommit, it can essentially commit more memory than actually exists on the system; cue hard locks, stutters and freezing. There are some mitigations.
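
To see which overcommit policy a given machine is using, the setting can be read from /proc/sys/vm/overcommit_memory. This is a small sketch that maps the three documented modes to a description; the wording of the descriptions is a paraphrase and the names are hypothetical.

```python
from pathlib import Path

# Modes as documented for the vm.overcommit_memory sysctl.
OVERCOMMIT_MODES = {
    0: "heuristic: obvious overcommits are refused (kernel default)",
    1: "always: allocations are never refused",
    2: "strict: commit limit is swap plus overcommit_ratio percent of RAM",
}


def describe_overcommit(raw: str) -> str:
    """Translate the contents of overcommit_memory into a short description."""
    mode = int(raw.strip())
    return OVERCOMMIT_MODES.get(mode, f"unknown mode {mode}")


if __name__ == "__main__":
    try:
        raw = Path("/proc/sys/vm/overcommit_memory").read_text()
        print(describe_overcommit(raw))
    except OSError:
        print("overcommit setting not readable on this system")
```

Most desktop installs report mode 0, in which the kernel still hands out more virtual memory than is physically present.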

Now, the common recommendation is to brute-force the issue by making a swap partition or file. This is generally a bad idea on a modern system: because SSDs and even some fast spinning hard drives are so quick, it’s possible for the kernel to get confused and think there’s enough spare memory, so it becomes less snappy about clearing memory, often ironically making the problem worse. Generally, only use a swap partition if you need it for something else, like hibernation.

What should be done instead these days is setting up zram, which essentially compresses your memory in RAM so that it stays away from swap space for as long as possible. zswap does something similar but also falls back to a swap partition soon after. The two are very similar, so just set up zram on installs that don’t have or need a swap partition, and set up zswap on ones that need or already have a swap partition (as they’re annoying to remove from an existing install).

But we’re not done yet, because while this will buy you a lot of time, there is still another massive issue at play: the Linux kernel’s OOM killer doesn’t generally care how full your memory is unless it starts biting into the kernel’s needs. This means it can take ages before it does anything, even if everything is so overloaded that your display server has been hard frozen for 20 minutes, and since it has no real sense of priority outside of the kernel, it may kill vital processes first rather than secondary ones.

To fix this you want a userspace OOM killer that keeps the entire OS in mind and prioritises what needs to go first. Many exist, but if you want something that just works, then EarlyOOM is great.
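
The idea behind a userspace OOM killer can be sketched in a few lines: act early, while the desktop is still responsive, once both available memory and free swap fall under a threshold, and then pick the process the kernel itself scores as the worst offender. This is an illustration only and not how EarlyOOM is actually implemented; the 10% threshold, the function names, and the sample scores in the usage note are assumptions.

```python
from typing import Optional


def below_threshold(available_kb: int, total_kb: int, min_percent: float) -> bool:
    """True when the free share of a resource is under min_percent."""
    if total_kb <= 0:
        return True  # the resource does not exist, so it cannot help
    return 100.0 * available_kb / total_kb < min_percent


def should_intervene(
    mem_available_kb: int,
    mem_total_kb: int,
    swap_free_kb: int,
    swap_total_kb: int,
    min_percent: float = 10.0,  # illustrative threshold, an assumption
) -> bool:
    """Intervene only when RAM and swap are both nearly exhausted."""
    return below_threshold(mem_available_kb, mem_total_kb, min_percent) and (
        below_threshold(swap_free_kb, swap_total_kb, min_percent)
    )


def pick_victim(oom_scores: dict[int, int]) -> Optional[int]:
    """Return the PID with the highest score, or None if there are no candidates."""
    if not oom_scores:
        return None
    return max(oom_scores, key=oom_scores.get)
```

Fed with MemAvailable and SwapFree from /proc/meminfo and the per-process values from /proc/<pid>/oom_score, should_intervene decides when to act and pick_victim decides on what. With hypothetical scores {100: 50, 200: 700, 300: 10}, PID 200 would be chosen.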

With these mitigations in place your OS should become much less prone to freezing and stutters. If you really want a nuclear option, there’s also cgroups.

I thought many distros set up zram automatically? I know Fedora does. I would expect Ubuntu to do the same.
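
Rather than guessing per distro, it is easy to check whether a given install already has zram-backed swap active by looking at /proc/swaps. This is a minimal sketch that assumes zram devices appear under /dev/zram*, as they do by default; the function name and the sample device names in the usage note are made up.

```python
from pathlib import Path


def zram_swap_devices(proc_swaps: str) -> list[str]:
    """Return the active swap devices in /proc/swaps that are zram-backed."""
    devices = []
    for line in proc_swaps.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if fields and fields[0].startswith("/dev/zram"):
            devices.append(fields[0])
    return devices


if __name__ == "__main__":
    try:
        text = Path("/proc/swaps").read_text()
    except OSError:
        text = ""  # not on Linux, or /proc is unavailable
    found = zram_swap_devices(text)
    print("zram swap:", ", ".join(found) if found else "none active")
```

A system with both /dev/zram0 and a disk partition listed would report only /dev/zram0, and an empty result means no zram swap is set up.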

+1 for zram and earlyoom!