
A2115 / 2020 / Processors from 3.1 GHz 6-core i5, up to 3.8 GHz 8-core i7. Released August 4, 2020.

Kernel panic when CPU is hot for a long time

Hi everyone!

TLDR: T2 / PCH related kernel panic if CPU is hot for a long time, boots only after I let it cool down. No kernel panic during normal/light use. Possibly a faulty component on the motherboard that has a bad connection? Possible fix if opened?

EDIT: More tests in the comments

Long version:

I obtained a faulty 2020 iMac 5K with an i7-10700K and 5500XT to be used as a DIY 5K project base. The ad said that the GPU was faulty: the machine restarts randomly, but there are no problems during normal use.

The screen is the most important part for me (only a slight pink hue around the edge, no problem), so I did not really care about the issue at first. But what I discovered made me more interested in fixing it.

I benchmarked the GPU using Heaven Benchmark for 1-2 hours running at max fan speed, the GPU was at 80-90 degrees and it did not restart.

Then I benchmarked the CPU using Cinebench. It survived the 10-minute single-core run but crashed 2-3 seconds after the multi-core run started. Later, when it had cooled down, I ran the multi-core test again and it lasted a lot longer, but still not the full 10 minutes.

When it restarts, it sometimes crashes during boot, but mostly it reaches the login screen and can stay there for hours. After I enter my password and it starts loading everything, it crashes, and it keeps doing so until I let it cool down so it has some thermal headroom or something. Macs Fan Control turns up the fan speed immediately after login, but that is still not early enough. I also turned off Intel Turbo Boost to reduce heat generation.

The kernel panic logs (when present) show T2 / PCH / SEP related crashes (BAD MAGIC, x86 global reset detected - CORE 0 is the one that panicked / void AppleEmbeddedPCIeUpLinkMgmt::_linkInterruptAction(IOInterruptEventSource *, int): A link timeout has been seen after 650000 microseconds and 49999 iterations - CORE 0 is the one that panicked).
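To see which of these signatures shows up most often across several crashes, the saved panic reports could be tallied with a short script. This is only a rough sketch: the signature strings are the ones quoted above, while the `SIGNATURES` table, the `tally` function and the idea of passing each report in as plain text are assumptions made for illustration, not a macOS tool.

```python
import re
from collections import Counter

# Patterns built from the strings quoted in the panic logs above.
# The labels on the left are made-up names for this sketch.
SIGNATURES = {
    "bad_magic": re.compile(r"BAD MAGIC"),
    "x86_global_reset": re.compile(r"x86 global reset detected"),
    "pcie_link_timeout": re.compile(
        r"AppleEmbeddedPCIeUpLinkMgmt::_linkInterruptAction.*link timeout",
        re.S,
    ),
}


def tally(reports):
    """Count how many reports contain each signature at least once."""
    counts = Counter()
    for text in reports:
        for name, pattern in SIGNATURES.items():
            if pattern.search(text):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    # Stand-in report texts assembled from the quoted strings.
    reports = [
        "BAD MAGIC ... x86 global reset detected",
        "void AppleEmbeddedPCIeUpLinkMgmt::_linkInterruptAction("
        "IOInterruptEventSource *, int): A link timeout has been seen",
    ]
    for name, count in tally(reports).most_common():
        print(name, count)
```

If one signature dominates in the hot crashes and never appears otherwise, that would at least narrow down which link is failing.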

But the weird thing is that I have been using this thing every day for basic tasks: logging in, sleeping, password authentication, everything seems to be working as usual. I guess normal tasks use the T2 as well, but maybe it does not heat up that much?

What I'm planning to do in the coming weeks is to open it up, check for visible defects on the motherboard, and get an LGA1200 PC motherboard to test whether the CPU is okay.

This whole issue seems to happen only when the CPU is over 75-80 degrees for a longer period of time, when the nearby components have heated up as well. I suspect a faulty connection somewhere that stops making proper contact when hot. Maybe the T2 chip's connection is bad or something?

What do you think, what would be the best steps to troubleshoot this issue? Is there a tool that only stress tests the T2 chip and not the CPU? Maybe a feature in macOS that really stresses that?

Thank you in advance!

Score 0

2 Answers

Most Helpful Answer

I would try replacing the thermal pads or paste... seems like thermal throttling.

Score 1

19 Comments:

Will definitely try, but the previous owner said it has already been replaced on the CPU/GPU. During thermal throttling, the CPU/GPU would reduce performance, not kernel panic, no?

Maybe the T2 has a thermal pad/paste as well? I suspect that the T2 overheats or something there and that is the reason for the crash. That is why I want to try stressing only the T2, not the CPU, to validate this theory.

by

Could also be a GPU failure

by

You can also try Apple Diagnostics. Just turn off your Mac, turn it back on and immediately press and hold the D key until you see a language selection or progress bar.

by

I tested the GPU under full load for 1-2 hours with GPU temps reaching above 80 degrees and it did not restart. I once tried Apple Diagnostics after many reboots and it also crashed during the test, so I did not get any result code. I will try to take the computer outside and run the test at 10 degrees ambient temperature.

by

Just ran a diagnostic test outside, no issues were found.

by


Have you installed a good thermal monitoring app which also allows you to boost the fans' RPM? I personally like TG-Pro: it will allow you to see what's getting too hot, and you can boost the fans' RPM so you don't cook things. I also like it because it can create a log (CSV file) tracking the temps, so you can see what was happening when the error pops up.

I would also make sure the fan blades and the heatsink fin area are fully clean of dust and debris.
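Once you have a temperature log, something like the sketch below could pull out the hottest reading per sensor in the minutes before a crash. Treat it as a rough example only: the column names (`timestamp`, `cpu_core`, `gpu_die`, `vrm`), the ISO timestamp format and the sample values are all made up for illustration and will not match the real export of any particular app, so the header handling would need adjusting.

```python
import csv
import io
from datetime import datetime, timedelta


def hottest_before(csv_text, crash_time, window_minutes=5):
    """Return {sensor: peak temp} for samples in the window before crash_time."""
    start = crash_time - timedelta(minutes=window_minutes)
    peaks = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        stamp = datetime.fromisoformat(row["timestamp"])
        if not (start <= stamp <= crash_time):
            continue  # sample is outside the window of interest
        for sensor, value in row.items():
            if sensor == "timestamp" or value in (None, ""):
                continue  # skip the time column and missing readings
            peaks[sensor] = max(peaks.get(sensor, float("-inf")), float(value))
    return peaks


# Invented sample data, only to show the expected shape of the log.
SAMPLE = """timestamp,cpu_core,gpu_die,vrm
2024-01-01T20:00:00,62,55,58
2024-01-01T20:04:00,91,60,88
2024-01-01T20:05:00,95,61,97
"""

if __name__ == "__main__":
    crash = datetime(2024, 1, 1, 20, 5, 30)  # hypothetical crash time
    peaks = hottest_before(SAMPLE, crash, window_minutes=2)
    for sensor, peak in sorted(peaks.items(), key=lambda kv: -kv[1]):
        print(f"{sensor}: {peak:.0f}")
```

Comparing the peaks from several crashes side by side might show whether some sensor other than the CPU or GPU is consistently the hottest right before the panic.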

Score 1

14 Comments:

I performed many tests this evening; the results are documented as comments under Amazing FiXeR’s answer. I use Macs Fan Control to check the temps and set the fan speed.

The previous owner said that it has been cleaned in a tech shop, but I will open it up when I have time, perform a visual inspection and maybe replace the thermal paste. According to my tests, though, the issue is not with the CPU, GPU or RAM.

The temps are normal, at least the ones visible in the app. The CPU under heavy load can reach 100 degrees, but it throttles down to 90-95 as usual, and the test keeps going for 20 minutes or more (outside, with ambient temps below 10 Celsius) if the GPU test is not running. The GPU test can go for hours, even inside.

During normal use (Safari, code editing, document editing, chatting) it does not heat up. I also set the fan speed manually to ramp up when the CPU temp reaches 55 degrees, but I suspect there is another component, either on the motherboard or in the PSU itself, that heats up and causes the crash.

by

@scania471 - Yes, I think you're right, this is a deeper logic board fault. If I remember correctly, there are six VRM modules in this series which regulate the power to the CPU, and they can overheat as they sit quite close to the CPU. It could be as simple as a cold solder joint on one of these or their support components.

by

@danj I just opened the iFixit teardown of this iMac this afternoon and saw a comment about these VRM modules. I have been thinking about them for hours, and about the fact that the i5 version has fewer of these modules than the i7/i9 versions, but this machine came with an i7 from the factory, so no problem there.

If one of these modules is bad, that could explain the crashes in a warm environment and the fact that it makes it past 20 minutes in cold weather. What I can't explain is why it crashes within 2-3 minutes if I start both the CPU and GPU tests, even when they are not fully loaded (like 70 degrees max). I will definitely check those modules visually when I get to open this thing.

by

@danj I took it apart, checked the VRM modules and this is the only weird thing I could find: https://imgur.com/a/HrKm5N0

It does not seem to be cracked, just a scratch or similar. I tried to poke every component on the motherboard that is this size but none of them moved. If one side is not connecting perfectly, it should move at least a little bit, right?

by

@scania471 - Sorry, I don't see the crack. Are you speaking about the darker mold seam line on the inductor? They look OK from what I can see. The view of the VRM chips is blocked; can you take a picture straight down, nice and tight, like this one?

As far as a cold solder joint goes, that doesn't mean the part is physically loose. I was thinking of the VRM chip itself, or maybe the capacitors or resistors around it.

by


Martin Terhes will be eternally grateful.