- Persistent problem with GPU 0
- GPU instability/bad performance during bakes in Toolbag/Painter after Windows Fall Creators Update
- Console: GPU error
- Not enough memory, failed to allocate memory — VRAM, BUFFER, or DAG
- Unresponsive GPUs
- Temperature limit reached
- OpenCL or CUDA crash and/or other unknown errors
- Run lolminer 1.3 error report #652
Persistent problem with GPU 0
webkost
Good day.
I'm asking for help because I've been fighting this problem for a very long time.
(Rig: 6× ASUS 1060 3GB, PSU Corsair TX750M 2×750, motherboard ASUS Prime Z270-P, RAM Kingston 4GB)
GPU 0 fails constantly. On Windows, when Claymore starts it sees 6 cards, then throws an error, restarts, sees only 5 cards, and runs normally. I tried various overclock settings and also reset everything to stock, which did not help. Windows reports all 6 cards as fine.
At first I suspected the card, the slot, or the riser, but after digging around I realized that Claymore complains specifically about GPU 0: if I connect a single card, nothing works; if I connect two, GPU 0 drops out and the second card works fine. The most interesting part is that this does not depend on the PCIe slot or on the card. I swapped the cards around and plugged them into different slots, and the result is always the same: the problem is with GPU 0.
After a while I installed Hive OS, thinking Windows was at fault, but the problem turned out to be exactly the same. No matter which card it is, how many cards are connected, or which PCIe slot is used, GPU 0 never works, while the others run perfectly both with and without overclocking. Here is what Hive OS reports: >Claymore Reboot: WATCHDOG: GPU error, you need to restart miner (in the logs: watchdog - thread 0 (gpu0), hb time 8171). As I understand it, the watchdog polls the card and, getting no response, triggers a reboot.
I'd really appreciate any help. If anyone has run into this, knows how to fix it, or can offer advice, I'd be very grateful.
GPU instability/bad performance during bakes in Toolbag/Painter after Windows Fall Creators Update
First off, I’m not sure if this is the best place to post this so mods feel free to move/close the thread.
Alright, so I’m having a bit of an issue with my GPU. After the Windows Fall Creators Update my GPU tends to hang quite a bit and it’s generally slower when using Toolbag 3 and Substance Painter. Toolbag loads projects super slowly, the baker takes a long time to initialize and baking a 4K AO map is a guaranteed crash.
In Marmoset, whenever I try baking a 4K AO map I always get a «Fatal GPU error». Then Toolbag crashes and I get a desktop notification saying toolbag «has been denied Graphics Device usage». This is the error message I see inside Toolbag:
I tried changing the «Baker GPU Priority» but none of the settings had an effect on the baker’s stability. The baker doesn’t just crash, it’s also very, very slow compared to what it was before the Windows update. Before I could easily crank out a 4K AO map in around a minute — now I can barely bake a 2K AO map without crashing the baker and the baking process takes forever. Normal map bakes are almost instant though, even at 16bit 4K + max AA, which is kinda weird.
In Painter loading a project is very slow. Painter’s baker doesn’t crash though, it’s just very slow compared to before the update. Then, after the update, I was greeted by this notification when booting up Painter. This never came up before the update.
I added the «TdrDelay» and «TdrDdiDelay» registry keys into the registry and set their values to 60 seconds. Now Painter doesn’t mention it on startup, but the baker is still very slow. Changing the TDR values didn’t have an effect on Toolbag’s baker’s stability either.
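For anyone wanting to reproduce the TDR change described above, here is a minimal sketch that generates a `.reg` file setting both values to 60 seconds. It assumes the values live under `HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers` as `REG_DWORD` entries; the function name `build_tdr_reg` is made up for this example, and the script only produces text, it does not touch the registry.

```python
# Sketch: build the text of a .reg file that sets TdrDelay and TdrDdiDelay.
# Assumption: both values are REG_DWORDs under the GraphicsDrivers key.
GRAPHICS_DRIVERS_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers"
)


def build_tdr_reg(seconds: int) -> str:
    """Return .reg file content that sets both TDR timeouts to `seconds`."""
    if not 0 <= seconds <= 0xFFFFFFFF:
        raise ValueError("a DWORD must fit in 32 bits")
    dword = f"{seconds:08x}"  # .reg files store DWORDs as 8 hex digits
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{GRAPHICS_DRIVERS_KEY}]",
        f'"TdrDelay"=dword:{dword}',
        f'"TdrDdiDelay"=dword:{dword}',
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(build_tdr_reg(60))
```

Saving the printed text as a `.reg` file and importing it requires administrator rights, and a reboot is typically needed before the new timeouts apply.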
What’s worth mentioning is that I don’t have performance issues with games. I have no trouble running games on the usual, almost maxed out settings. The GPU’s instability is the most apparent when doing a long bake or loading a project.
So far I’ve:
- Uninstalled all graphics drivers using DDU and reinstalled the most recent NVIDIA drivers manually from their site.
- Rolled back the GPU driver 6 versions to see if one of the older drivers would be better suited to the new Windows version.
- Done a complete Windows reinstall, wiping my C: drive clean. The installation left me with the Fall Creators Update, which is a little unfortunate.
- Run «sfc /scannow», with no unusual results.
- Run Disk Cleanup, including system files.
- Checked the NVIDIA control panel for odd settings.
- Made sure Windows game mode was turned off.
- Changed the TDR registry values to 60.
- Updated all other components’ drivers to see if they were conflicting with the GPU.
- Contacted NVIDIA support. They advised me to reinstall the OS to eliminate the possibility of any corruption.
- Contacted Microsoft support twice. The first time wasn’t all that useful; the second time the support agent managed to mess up my computer during a remote session.
- Run several GPU benchmarks to see if the issue is hardware related, but nothing unusual came up. The GPU itself is fine.
- Reinstalled Toolbag 3 and Substance Painter.
- Done countless reboots and shutdowns.
Oh, and I can’t roll back to a previous version of Windows since my C: drive was completely wiped during the Windows reinstallation process.
Console: GPU error
When you see a «GPU error» in your 24h logs or in a worker’s latest activity, the software had trouble reading information from one or more of your GPUs. In some cases you will also be able to see which GPUs are the problematic ones.
We suggest double-checking that all devices are properly connected and detected, then re-checking your overclocking settings and adjusting them to make your GPUs more stable.
Several groups of GPU errors can appear in the console. They tend to share the same underlying causes, but each reports a different event.
Not enough memory, failed to allocate memory — VRAM, BUFFER, or DAG
- If you are using Claymore, switch to another mining client, as Claymore is outdated at this point.
- If you have 3GB or 4GB GPUs and you get this error while mining ETH, you will need to switch to another coin, because the DAG size has already exceeded the memory available on those cards. See the DAG calculator to learn more about DAG size.
- If your GPUs have enough memory and you still get this error, check your overclocking settings, as overly aggressive or missing overclocking settings can also cause it.
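To see why 3GB and 4GB cards run out of room, the DAG size can be estimated from the epoch number. The sketch below follows the dataset-size rule from the original Ethash specification (1 GiB initial size, 8 MiB growth per epoch, trimmed so that the size divided by 128 bytes is prime). Treat it as an illustration rather than a replacement for the DAG calculator: the helper names are invented here, epoch 425 is only an example value, and a miner needs some memory on top of the DAG itself.

```python
# Sketch of the Ethash full-dataset (DAG) size rule; helper names are
# illustrative, constants are taken from the original Ethash specification.
DATASET_BYTES_INIT = 2**30    # 1 GiB at epoch 0
DATASET_BYTES_GROWTH = 2**23  # grows by 8 MiB per epoch
MIX_BYTES = 128


def is_prime(n: int) -> bool:
    """Trial division; fast enough here because n is only a few tens of millions."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True


def dag_size_bytes(epoch: int) -> int:
    """Largest size below the linear bound whose MIX_BYTES quotient is prime."""
    size = DATASET_BYTES_INIT + DATASET_BYTES_GROWTH * epoch - MIX_BYTES
    while not is_prime(size // MIX_BYTES):
        size -= 2 * MIX_BYTES
    return size


def fits_in_vram(epoch: int, vram_gib: float) -> bool:
    """Optimistic check: ignores the extra memory a miner needs besides the DAG."""
    return dag_size_bytes(epoch) < vram_gib * 2**30


if __name__ == "__main__":
    epoch = 425  # example value only
    print(f"epoch {epoch}: {dag_size_bytes(epoch) / 2**30:.2f} GiB")
    for vram in (3, 4, 6):
        print(f"{vram} GiB card fits: {fits_in_vram(epoch, vram)}")
```

Because the check is optimistic, a card that only barely passes it may still fail to allocate the DAG in practice.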
Unresponsive GPUs
To fix this error, we recommend checking your overclocking settings and adjusting them to make your GPUs more stable.
Temperature limit reached
You can try to fix this error by setting proper temperature triggers or enabling auto-fan control in your overclocking settings.
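A temperature trigger boils down to comparing each GPU's reading against a limit. As a rough illustration for Nvidia cards, the sketch below parses the output of `nvidia-smi --query-gpu=index,temperature.gpu --format=csv,noheader,nounits` and lists the GPUs at or above a threshold. The function names, the sample readings and the 80 °C limit are assumptions made for this example; they are not minerstat settings.

```python
# Sketch: find GPUs at or above a temperature limit.
# Expected input is the text printed by:
#   nvidia-smi --query-gpu=index,temperature.gpu --format=csv,noheader,nounits
# which yields one "index, temperature" pair per line, e.g. "0, 61".


def parse_temperatures(csv_text: str) -> dict[int, int]:
    """Map GPU index to temperature in degrees Celsius."""
    temps = {}
    for line in csv_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        index, temp = (field.strip() for field in line.split(","))
        temps[int(index)] = int(temp)
    return temps


def gpus_over_limit(temps: dict[int, int], limit_c: int) -> list[int]:
    """Return the indices of GPUs whose temperature is >= limit_c."""
    return sorted(i for i, t in temps.items() if t >= limit_c)


if __name__ == "__main__":
    sample = "0, 61\n1, 83\n2, 79\n"  # made-up readings
    print(gpus_over_limit(parse_temperatures(sample), limit_c=80))
```

In a real setup the sample string would be replaced by the command's actual output, and the reaction (raising fan speed, lowering the power limit, stopping the miner) depends on the rig.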
OpenCL or CUDA crash and/or other unknown errors
One case where this error can appear is when you are using AMD GPUs but added the worker as Nvidia on the minerstat dashboard, or vice versa. If this is the case, write to us, or delete your worker and re-add it.
Run lolminer 1.3 error report #652
wulong2020 commented Jul 6, 2021
After running for a few hours, lolMiner reports the following error and then restarts:
Unrecoverable memory error by GPU 2.
Reset of all Cuda GPUs required.
Please check your (memory) OC & UV settings on this card.
New job received: 0x542ce7 Epoch: 425 Target: 000000016e80fe03
Device 0 detected as crashed.
Closing miner and trying to call external script: ./emergency.sh (--watchdog script)
I tried reducing the video memory frequency from 2500 to 2400, but the error still occurs.
jgonzis commented Jul 6, 2021
Could you try reducing the memory OC to 2200, for example? 2400 is still quite high and could be what is causing the crash.
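When tuning memory OC per card, it helps to know which device each message points at. The log excerpt in this thread mentions two indices: the memory error is attributed to GPU 2, while the watchdog flags Device 0. Here is a minimal sketch for pulling both out of a log; the regular expressions are written against the exact wording of the lines quoted above and may not match other lolMiner versions, and whether the two messages use the same numbering is not confirmed by the thread.

```python
import re

# Patterns based only on the two log lines quoted in this thread.
MEMORY_ERROR = re.compile(r"Unrecoverable memory error by GPU (\d+)")
CRASHED = re.compile(r"Device (\d+) detected as crashed")


def summarize(log_text: str) -> dict[str, list[int]]:
    """Collect the device indices named in memory-error and crash messages."""
    return {
        "memory_error_gpus": sorted({int(n) for n in MEMORY_ERROR.findall(log_text)}),
        "crashed_devices": sorted({int(n) for n in CRASHED.findall(log_text)}),
    }


if __name__ == "__main__":
    log = """Unrecoverable memory error by GPU 2.
Reset of all Cuda GPUs required.
Please check your (memory) OC & UV settings on this card.
New job received: 0x542ce7 Epoch: 425 Target: 000000016e80fe03
Device 0 detected as crashed.
"""
    print(summarize(log))
```

Running the same summary over several hours of logs would show whether the memory error always names the same GPU, which is the card whose memory clock is worth lowering first.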