Difference between revisions of "Troubleshooting guide/Hardware problems"

From PCGamingWiki, the wiki about fixing PC games
m (spelling + reordered relevant software list)
m (→‎CPU + GPU: General maintenance with AutoWikiBrowser in Bot mode)
 
(8 intermediate revisions by 4 users not shown)
Line 5: Line 5:
  
 
'''Relevant software'''
* [http://www.piriform.com/speccy/ Speccy] - System information tool for Windows
+
* [https://www.cpuid.com/ CPUID] - System information tools for Windows
* [http://www.resplendence.com/whocrashed WhoCrashed] - Kernel crash dump analyzer for Windows
+
* [https://www.resplendence.com/whocrashed WhoCrashed] - Kernel crash dump analyzer for Windows
* [http://www.ultimatebootcd.com/ Ultimate Boot CD (UBCD)]
+
* [https://www.ultimatebootcd.com/ Ultimate Boot CD (UBCD)]
* [http://www.hirensbootcd.org/ Hiren's Boot CD]
+
* [https://www.hirensbootcd.org/ Hiren's Boot CD]
  
 
==Show hardware components==
Line 32: Line 32:
  
 
===Power supply unit (PSU)===
Non-deterministic problems are sometimes caused by a bad power supply unit (PSU).<ref>[http://ask-leo.com/could_my_power_supply_be_causing_memory_errors.html Could my power supply be causing memory errors? - Ask Leo!]</ref> If the power supply is not stable, it is futile to test other parts of the system because they will yield inconsistent results. Power supplies do not indicate whether they are having problems because they generally do not include self-testing hardware. Sometimes electrical noise (buzzing) may be heard though.  
+
Non-deterministic problems are sometimes caused by a bad power supply unit (PSU).<ref>{{Refurl|url=http://ask-leo.com/could_my_power_supply_be_causing_memory_errors.html|title=Could my power supply be causing memory errors? - Ask Leo!|date=May 2023}}</ref> If the power supply is not stable, it is futile to test other parts of the system because they will yield inconsistent results. Power supplies do not indicate whether they are having problems because they generally do not include self-testing hardware. Sometimes electrical noise (buzzing) may be heard though.  
  
 
The best way to test a power supply is with a special PC power supply tester. If one is not available, try to load the PSU as much as possible, generally by running [[#CPU + GPU|all]] stress tests and connecting as many external devices as available (preferably in "cold" conditions). If this hamper system stability in any way, swapping the PSU with a different known working one should be enough to rule out or not the problem.  
 +
 +
===Motherboard===
 +
The other big source of nebulous, whimsical and unpredictable issues is the motherboard. Putting aside blatant boot errors caused by innocuous misconfiguration (that most of times can be diagnosed with a [[Wikipedia:PC speaker|buzzer]], or from LEDs and seven-segment displays on more premium models), there are infinitely more insidious ways electrical components here can fail.<ref name=caps>[https://www.badcaps.net/forum/showthread.php?t=425#post2045 The Bad Capacitor FAQ - Badcaps Forums]</ref> The problem should be OS-independent and may even be reproduced in the BIOS setup utility (if it's not load-specific and doesn't compromise [[Wikipedia:Power-on self-test|POST]] altogether).
 +
 +
Causes can in turn range from something trifling like dust bridging circuit traces<ref>{{Refurl|url=https://badcaps.net/forum/showthread.php?t=86912#post978344|title=Asus P5QPL-AM problem - Badcaps Forums|date=May 2023}}</ref> or covering pins in the socket, all the way to fried VRMs.<ref>[https://web.archive.org/web/20151103100533/https://www.overclock.net/t/943109/about-vrms-mosfets-motherboard-safety-with-125w-tdp-processors About VRMs & Mosfets / Motherboard Safety with 125W+ TDP processors | Overclock.net] (retrieved)</ref> They should all be eventually quantifiable with enough measuring and (de)soldering<ref>[https://www.vogons.org/viewtopic.php?t=75819 Testing capacitors in circuit \ VOGONS]</ref><ref>{{Refurl|url=https://www.badcaps.net/forum/showthread.php?t=18141|title=Measuring ESR in circuit - Badcaps Forums|date=May 2023}}</ref> equipment, and patience.<ref>{{Refurl|url=https://www.badcaps.net/forum/showthread.php?t=600|title=Testing VRMs - Badcaps Forums|date=May 2023}}</ref><ref>[https://www.eevblog.com/forum/testgear/finding-short-on-motherboards-with-a-shorty-(with-display)/ finding short on motherboards with a shorty (with display) - EEVblog Electronics Community Forum]</ref>
 +
 +
Woes with capacitors deserve a special mention too. In particular the so-called [[Wikipedia:Capacitor plague|capacitor plague]] affected a disproportionate number of boards made in the first decade of the century (all manufacturers have since switched to solid electrolytes). Most of times it's an easy defect to spot just with the naked eye<ref>{{Refurl|url=https://www.badcaps.net/index.php?pageid=identity|title=Badcaps.net Forums - How To Identify|date=May 2023}}</ref><ref>{{Refurl|url=http://www.capacitorlab.com/visible-failures/|title=Capacitor Lab - Visual Signs of Capacitor Failure|date=May 2023}}</ref> (including identification of known unreliable part numbers<ref>{{Refurl|url=https://badcaps.net/forum/showthread.php?t=388&page=9#post182333|title=List of Bad Cap Manufacturers - Badcaps Forums|date=May 2023}}</ref><ref>{{Refurl|url=https://www.badcaps.net/forum/showthread.php?t=30655#post364184|title=Bad Fake? Nichichon TMV 4v 680uF Caps - Badcaps Forums|date=May 2023}}</ref>), but some bad cases may still look completely inconspicuous without an [[Wikipedia:ESR meter|ESR meter]] for actual testing.<ref name=caps/><ref>{{Refurl|url=https://www.eevblog.com/forum/repair/bad-electrolytic-capacitor-looks-ok/|title=Bad Electrolytic capacitor looks OK - EEVblog Electronics Community Forum|date=May 2023}}</ref> Take note that protracted inactivity can be even ''more'' taxing on capacitors than constant relentless usage.<ref>{{Refurl|url=https://badcaps.net/forum/showthread.php?t=48850|title=Gigabyte GA-8IPE1000 - KZG caps bloat in storage - Badcaps Forums|date=May 2023}}</ref><ref>{{Refurl|url=https://www.badcaps.net/forum/showthread.php?t=50530|title=How to Recondition (Reform) Electrolytic Capacitors and Why - Badcaps Forums|date=May 2023}}</ref> Conversely, it could happen systems with totally blown caps still run fine, but the risk of destroying the rest of the components and even the CPU itself raises 
significantly.<ref>{{Refurl|url=https://www.badcaps.net/forum/showthread.php?t=4306|title=Bad caps and stable system (or vice versa)? - Badcaps Forums|date=May 2023}}</ref>
  
 
===Memory (RAM)===
Line 41: Line 48:
 
Windows Vista+ have a built in memory tester also, which can be found by running mdsched.exe.
  
In case errors don't present in a random pattern (meaning the issue is confined only to determinate RAM addresses) it might be possible to just bypass the affected locations.<ref>[http://unix.stackexchange.com/a/76188/163877 linux - How to blacklist a correct bad RAM sector according to MemTest86+ error indication? - Unix & Linux Stack Exchange]</ref><ref>[https://superuser.com/a/490522/567466 memory - Running Windows with defective RAM - Super User]</ref>
+
In case errors don't present in a random pattern (meaning the issue is confined only to determinate RAM addresses) it might be possible to just bypass the affected locations.<ref>{{Refurl|url=http://unix.stackexchange.com/a/76188/163877|title=linux - How to blacklist a correct bad RAM sector according to MemTest86+ error indication? - Unix & Linux Stack Exchange|date=May 2023}}</ref><ref>{{Refurl|url=https://superuser.com/a/490522/567466|title=memory - Running Windows with defective RAM - Super User|date=May 2023}}</ref>
  
 
===Drive (SSD, SSHD, HDD, etc.)===
Line 56: Line 63:
  
 
AMD does not release diagnostic software for end users. Use [http://www.mersenne.org/download/ Prime95] to stress test the CPU and see if it fails. The Windows Event Log may record a machine check error code; this will provide more specific information on what caused the problem.
 +
 +
Make sure temperature is well within the operating margins (if it's just ''slightly'' below, certain spots around the processor may still be hotter). Sudden case movements may untight the heatsink pins.<ref>{{Refurl|url=https://old.reddit.com/r/pcmasterrace/comments/l1hn0r/stock_intel_cpu_cooler_came_loose/|title=Stock Intel CPU Cooler came loose? : pcmasterrace - Reddit|date=May 2023}}</ref>
  
 
===Graphics card (GPU)===
If you are experiencing visual artifacts or sluggishness in visual applications, then there may indeed be a problem with you GPU. First, use a utility such as [http://www.techpowerup.com/gpuz/ GPU-Z] or [http://www.cpuid.com/softwares/hwmonitor.html HWMonitor] to see if your card is running over the recommended temperature (the max is usually around 80 Celsius), if so, then you card will be throttling itself in self-preservation. Check then, to see if the fans or blocks on your card are functioning correctly, and blow out any dust or debris build-up with a can of compressed air. If that isn't the problem, check the video drivers, and see if there is a new version or if the one you are using is reported as being unstable, in either case, perform a clean install of the drivers. Next, if you have integrated video as well as a discreet card, make sure that the computer switches when in game to your discreet card correctly instead of staying on integrated. Last try to [https://www.raymond.cc/blog/having-problems-with-video-card-stress-test-its-memory/ memtest] VRAM and stress test it with [http://www.ozone3d.net/benchmarks/fur/ FurMark]. If all else fails, check the warranty on your GPU (most are 2-3yrs) and RMA the card for repairs or a rebate of some sort.
+
If you are experiencing visual artifacts or sluggishness in visual applications, then there may indeed be a problem with you GPU. First, use a utility such as [https://www.techpowerup.com/gpuz/ GPU-Z] or [http://www.cpuid.com/softwares/hwmonitor.html HWMonitor] to see if your card is running over the recommended temperature (the max is usually around 80 Celsius), if so, then you card will be throttling itself in self-preservation. Check then, to see if the fans or blocks on your card are functioning correctly, and blow out any dust or debris build-up with a can of compressed air. If that isn't the problem, check the video drivers, and see if there is a new version or if the one you are using is reported as being unstable, in either case, perform a clean install of the drivers. Next, if you have integrated video as well as a discreet card, make sure that the computer switches when in game to your discreet card correctly instead of staying on integrated. Last try to [https://www.raymond.cc/blog/having-problems-with-video-card-stress-test-its-memory/ memtest] VRAM and stress test it with [http://www.ozone3d.net/benchmarks/fur/ FurMark]. If all else fails, check the warranty on your GPU (most are 2-3yrs) and RMA the card for repairs or a rebate of some sort.
  
 
===CPU + GPU===
Run both tests together. Remember to lower CPU test priority to avoid bottlenecks in the GPU one.<ref>[http://blog.szynalski.com/2012/11/the-right-way-to-stress-test-an-overclocked-pc/ The right way to stress-test an overclocked PC « Trying To Be Helpful]</ref>
+
Run both tests together. Remember to lower CPU test priority to avoid bottlenecks in the GPU one.<ref>{{Refurl|url=http://blog.szynalski.com/2012/11/the-right-way-to-stress-test-an-overclocked-pc/|title=The right way to stress-test an overclocked PC « Trying To Be Helpful|date=9 June 2023}}</ref>
  
 
{{References}}
Line 67: Line 76:
 
[[Category:Hardware]]
 
[[Category:Guide]]
[[Category:Troubleshooting]]
 

Latest revision as of 03:49, 9 June 2023

Hardware diagnosis software can be used to determine whether the problems on your PC are being caused by faulty or broken hardware. There are many utilities that are designed to scan the physical components of your computer to check whether they are in good condition.

Relevant software

Show hardware components

Windows

Open the DirectX Diagnostic Tool:

  • Windows Vista and later: open the Start Screen/Start Menu, type dxdiag and press Enter.
  • Windows XP: press Win+R, type dxdiag and press Enter.

Open the System Information utility:

  • Windows Vista and later: open the Start Screen/Start Menu, type msinfo32 and press Enter.
  • Windows XP: press Win+R, type msinfo32 and press Enter.
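
Both tools can also write their findings to a text file, which is handy when attaching system details to a bug report. The following is a minimal Python sketch, assuming the `/t` switch of dxdiag and the `/report` switch of msinfo32; the function names and the output file names are made up for illustration.

```python
import subprocess
import sys
from pathlib import Path


def report_commands(out_dir):
    """Build command lines that save dxdiag and msinfo32 reports as text files."""
    out = Path(out_dir)
    return [
        # dxdiag /t <file> writes the DirectX diagnostic report to a text file
        ["dxdiag", "/t", str(out / "dxdiag.txt")],
        # msinfo32 /report <file> writes the System Information report
        ["msinfo32", "/report", str(out / "msinfo32.txt")],
    ]


def save_reports(out_dir):
    """Run the report commands; only meaningful on Windows."""
    if sys.platform != "win32":
        raise RuntimeError("dxdiag and msinfo32 are Windows-only tools")
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for cmd in report_commands(out_dir):
        subprocess.run(cmd, check=True)
```

Calling `save_reports("reports")` on Windows should leave two text files in the `reports` folder; both tools can take a while to finish collecting data.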

Linux

Through the Terminal

$ lspci
$ lsusb

See also Linux.
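
To capture the output of the two commands above for later comparison, they can be wrapped in a short script. This is only a sketch: `list_devices` and `filter_lines` are illustrative helper names, and a tool that is not installed is reported as `None` rather than raising an error.

```python
import shutil
import subprocess


def list_devices(tools=("lspci", "lsusb")):
    """Return {tool: output lines} for each listing tool found on PATH."""
    results = {}
    for tool in tools:
        if shutil.which(tool) is None:
            results[tool] = None  # tool not installed
            continue
        proc = subprocess.run([tool], capture_output=True, text=True, check=False)
        results[tool] = proc.stdout.splitlines()
    return results


def filter_lines(lines, keyword):
    """Case-insensitive filter, e.g. keyword='vga' to find graphics adapters."""
    return [line for line in lines if keyword.lower() in line.lower()]
```

For example, `filter_lines(list_devices()["lspci"] or [], "vga")` narrows the PCI listing down to graphics adapters.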

Stability testing

Many parts of a PC work together to run a game. Crashes are often caused by problems where two or more parts interact. The first question to be asked when a crash occurs is whether the PC is stable without the game running.

Power supply unit (PSU)

Non-deterministic problems are sometimes caused by a bad power supply unit (PSU).[1] If the power supply is not stable, it is futile to test other parts of the system because they will yield inconsistent results. Power supplies generally do not include self-testing hardware, so they give no indication when they are failing, although electrical noise (buzzing) can sometimes be heard.

The best way to test a power supply is with a dedicated PC power supply tester. If one is not available, load the PSU as much as possible, generally by running all stress tests together and connecting as many external devices as are available (preferably in "cold" conditions). If this hampers system stability in any way, swapping in a different, known-working PSU should be enough to confirm or rule out the problem.

Motherboard

The other big source of nebulous, erratic and unpredictable issues is the motherboard. Setting aside blatant boot errors caused by innocuous misconfiguration (which can usually be diagnosed with a buzzer, or from LEDs and seven-segment displays on more premium models), there are far more insidious ways in which its electrical components can fail.[2] The problem should be OS-independent and may even be reproducible in the BIOS setup utility (provided it is not load-specific and does not prevent POST altogether).

Causes range from something as trifling as dust bridging circuit traces[3] or covering pins in the socket, all the way to fried VRMs.[4] All of them should eventually be identifiable given enough measuring and (de)soldering[5][6] equipment, and patience.[7][8]

Woes with capacitors deserve a special mention too. In particular, the so-called capacitor plague affected a disproportionate number of boards made in the first decade of the century (all manufacturers have since switched to solid electrolytes). Most of the time the defect is easy to spot with the naked eye[9][10] (including by identifying known unreliable part numbers[11][12]), but some bad capacitors look completely normal and can only be caught by actual testing with an ESR meter.[2][13] Note that protracted inactivity can be even more taxing on capacitors than constant, relentless usage.[14][15] Conversely, a system with totally blown caps may still run fine, but the risk of destroying the rest of the components, and even the CPU itself, rises significantly.[16]

Memory (RAM)

Memory stability testing is performed using the memtest86+ utility.

Windows Vista and later also include a built-in memory tester, which can be launched by running mdsched.exe.

If the errors do not appear in a random pattern (meaning the fault is confined to specific RAM addresses), it might be possible to simply bypass the affected locations.[17][18]
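
On Linux, one way to do this is the `memmap=size$start` kernel parameter, which marks a memory region as reserved so the kernel never uses it. The sketch below turns a list of failing addresses reported by the memory tester into such a parameter. It assumes a 4 KiB page size and adds one page of margin on each side; the function name and the example addresses are made up for illustration.

```python
PAGE = 4096  # assumed page size in bytes


def memmap_param(bad_addresses, margin_pages=1):
    """Format a 'memmap=size$start' parameter covering all failing addresses."""
    lo = min(bad_addresses)
    hi = max(bad_addresses)
    # Align the start down and the end up to page boundaries, plus a safety margin.
    start = max(0, (lo // PAGE - margin_pages) * PAGE)
    end = (hi // PAGE + 1 + margin_pages) * PAGE
    size_kib = (end - start) // 1024
    return f"memmap={size_kib}K${start:#x}"


# Two hypothetical failing addresses in neighbouring pages
print(memmap_param([0x3A4F1000, 0x3A4F2FFC]))  # memmap=16K$0x3a4f0000
```

This only makes sense when the failures are clustered, since widely scattered addresses would reserve a large part of the RAM. When the parameter is placed in a GRUB configuration file, the `$` usually has to be escaped.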

Drive (SSD, SSHD, HDD, etc.)

Drive stability testing is performed using smartmontools. Run its smartctl utility with the -x argument and verify the following:

  • The drive is not overheating (SSDs may not have a temperature sensor; in that case, measure their temperature some other way).
  • The drive is not reporting read or write faults in its error log.
  • The drive is not reporting a pre-fail condition.

If each of those items is true, follow the directions to perform a short self-test and verify that the drive executes and passes it. If it does not, go to the support section of the drive vendor's web site and follow the directions to download their drive analysis software. Use it to obtain a specific problem report and return the drive if it is under warranty. If the drive is not under warranty, swap it for a new one.
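
The checks listed above can be partly automated. The following sketch assumes smartmontools 7.0 or later, whose `smartctl` accepts `--json`, and it assumes the report uses the keys `temperature.current`, `smart_status.passed` and `ata_smart_error_log` (verify these against the output of your own version). The 55 °C limit is an arbitrary placeholder, not a vendor figure.

```python
import json
import subprocess


def read_smart(device):
    """Run 'smartctl -x --json <device>' and return the parsed report."""
    proc = subprocess.run(
        ["smartctl", "-x", "--json", device],
        capture_output=True, text=True, check=False,  # exit status is a bitmask
    )
    return json.loads(proc.stdout)


def check_drive(report, max_temp_c=55):
    """Return a list of human-readable problems found in a smartctl JSON report."""
    problems = []
    temp = report.get("temperature", {}).get("current")
    if temp is None:
        problems.append("no temperature reported; measure it another way")
    elif temp > max_temp_c:
        problems.append(f"temperature {temp} C exceeds {max_temp_c} C")
    if report.get("smart_status", {}).get("passed") is False:
        problems.append("SMART overall health check failed")
    error_log = report.get("ata_smart_error_log", {})
    for section in ("extended", "summary"):
        count = error_log.get(section, {}).get("count", 0)
        if count:
            problems.append(f"{count} error(s) in the {section} error log")
    return problems
```

A typical call would be `check_drive(read_smart("/dev/sda"))`, usually run with root privileges; an empty list means none of the three checks raised a flag. The short self-test itself is started separately with `smartctl -t short <device>`.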

CPU

Intel CPU testing is performed using the Intel Processor Diagnostic Tool.

AMD does not release diagnostic software for end users. Use Prime95 to stress test the CPU and see if it fails. The Windows Event Log may record a machine check error code; this will provide more specific information on what caused the problem.
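
Machine check errors can be pulled from the Event Log on the command line as well. The sketch below builds a `wevtutil` query for the System log, assuming the errors are recorded by the `Microsoft-Windows-WHEA-Logger` provider, as is generally the case on Windows Vista and later; the function names are illustrative.

```python
import subprocess
import sys

# Assumption: hardware errors are logged by the WHEA-Logger provider.
WHEA_QUERY = "*[System[Provider[@Name='Microsoft-Windows-WHEA-Logger']]]"


def whea_command(max_events=10):
    """Build a wevtutil command listing the newest WHEA-Logger events as text."""
    return [
        "wevtutil", "qe", "System",
        f"/q:{WHEA_QUERY}",
        f"/c:{max_events}",  # limit the number of events returned
        "/rd:true",          # newest events first
        "/f:text",           # plain-text output
    ]


def recent_hardware_errors(max_events=10):
    """Return the text of recent WHEA events; only works on Windows."""
    if sys.platform != "win32":
        raise RuntimeError("wevtutil is only available on Windows")
    proc = subprocess.run(whea_command(max_events),
                          capture_output=True, text=True, check=True)
    return proc.stdout
```

An empty result from `recent_hardware_errors()` suggests that no hardware error has been logged, which does not by itself prove the CPU is healthy.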

Make sure the temperature is well within the operating margins (if it is only slightly below the limit, certain spots around the processor may still be hotter). Sudden movements of the case may loosen the heatsink pins.[19]
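
On Linux, temperatures can be read without extra software from the thermal sysfs interface, which reports millidegrees Celsius. The sketch below flags any sensor that comes within a chosen headroom of a limit. The limit has to be taken from the processor's specifications, and the 10 °C default headroom is an arbitrary assumption.

```python
from pathlib import Path


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {zone name:type: degrees Celsius} from the Linux thermal sysfs."""
    temps = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            kind = (zone / "type").read_text().strip()
            millideg = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            continue  # zone missing or unreadable
        temps[f"{zone.name}:{kind}"] = millideg / 1000.0
    return temps


def too_hot(temps, limit_c, headroom_c=10.0):
    """List sensors that are within headroom_c degrees of limit_c or above it."""
    return [name for name, t in temps.items() if t >= limit_c - headroom_c]
```

Polling `too_hot(read_thermal_zones(), limit_c=...)` while a stress test is running gives a rough picture of how much margin is left. Which zones exist, and what they measure, varies between systems.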

Graphics card (GPU)

If you are experiencing visual artifacts or sluggishness in visual applications, there may indeed be a problem with your GPU. First, use a utility such as GPU-Z or HWMonitor to see whether the card is running above its recommended temperature (the maximum is usually around 80 °C); if it is, the card will be throttling itself in self-preservation. In that case, check that the fans or blocks on the card are working correctly, and blow out any dust or debris build-up with a can of compressed air. If that is not the problem, check the video drivers: if a newer version is available, or the one you are using is reported as unstable, perform a clean install of the drivers. Next, if you have integrated video as well as a discrete card, make sure the computer actually switches to the discrete card in game instead of staying on the integrated one. Lastly, try to memtest the VRAM and stress test the card with FurMark. If all else fails, check the warranty on your GPU (most last 2-3 years) and RMA the card for repair or a refund of some sort.
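
For NVIDIA cards, the temperature check can also be scripted around the `nvidia-smi` command-line tool that ships with the driver. The sketch below only covers that vendor; it parses the CSV output of the query shown in `QUERY` and applies the 80 °C rule of thumb mentioned above. The helper names are illustrative.

```python
import subprocess

# Query understood by nvidia-smi: one "name, temperature" row per GPU.
QUERY = ["nvidia-smi", "--query-gpu=name,temperature.gpu",
         "--format=csv,noheader,nounits"]


def parse_gpu_temps(csv_text):
    """Parse 'name, temperature' rows into (name, degrees Celsius) tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        name, _, temp = line.rpartition(",")
        rows.append((name.strip(), int(temp.strip())))
    return rows


def overheating(rows, limit_c=80):
    """Return the names of GPUs running above limit_c."""
    return [name for name, temp in rows if temp > limit_c]


def current_gpu_temps():
    """Run the query; requires an NVIDIA driver with nvidia-smi on PATH."""
    proc = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_temps(proc.stdout)
```

Running `overheating(current_gpu_temps())` during a stress test lists any card above the limit.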

CPU + GPU

Run both tests together. Remember to lower the priority of the CPU test so that it does not bottleneck the GPU test.[20]
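
Lowering the priority can be done by hand in the task manager, or when the CPU test is launched. The sketch below starts a command at reduced priority using Python's standard library: below-normal priority class on Windows, raised niceness elsewhere. The command passed in is a placeholder for whichever CPU stress tool is used, and the niceness of 10 is an arbitrary choice.

```python
import os
import subprocess
import sys


def low_priority_kwargs(niceness=10):
    """Return Popen keyword arguments that start a process at reduced priority."""
    if sys.platform == "win32":
        return {"creationflags": subprocess.BELOW_NORMAL_PRIORITY_CLASS}
    # POSIX: raise the niceness in the child just before it executes.
    return {"preexec_fn": lambda: os.nice(niceness)}


def start_cpu_test(command, niceness=10):
    """Launch the CPU stress test (placeholder command) at reduced priority."""
    return subprocess.Popen(command, **low_priority_kwargs(niceness))
```

For example, `start_cpu_test(["path/to/cpu-stress-tool"])` starts the CPU test in the background, after which the GPU test can be launched normally.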


References

  1. Could my power supply be causing memory errors? - Ask Leo! - last accessed on May 2023
  2. The Bad Capacitor FAQ - Badcaps Forums
  3. Asus P5QPL-AM problem - Badcaps Forums - last accessed on May 2023
  4. About VRMs & Mosfets / Motherboard Safety with 125W+ TDP processors | Overclock.net (retrieved)
  5. Testing capacitors in circuit \ VOGONS
  6. Measuring ESR in circuit - Badcaps Forums - last accessed on May 2023
  7. Testing VRMs - Badcaps Forums - last accessed on May 2023
  8. finding short on motherboards with a shorty (with display) - EEVblog Electronics Community Forum
  9. Badcaps.net Forums - How To Identify - last accessed on May 2023
  10. Capacitor Lab - Visual Signs of Capacitor Failure - last accessed on May 2023
  11. List of Bad Cap Manufacturers - Badcaps Forums - last accessed on May 2023
  12. Bad Fake? Nichichon TMV 4v 680uF Caps - Badcaps Forums - last accessed on May 2023
  13. Bad Electrolytic capacitor looks OK - EEVblog Electronics Community Forum - last accessed on May 2023
  14. Gigabyte GA-8IPE1000 - KZG caps bloat in storage - Badcaps Forums - last accessed on May 2023
  15. How to Recondition (Reform) Electrolytic Capacitors and Why - Badcaps Forums - last accessed on May 2023
  16. Bad caps and stable system (or vice versa)? - Badcaps Forums - last accessed on May 2023
  17. linux - How to blacklist a correct bad RAM sector according to MemTest86+ error indication? - Unix & Linux Stack Exchange - last accessed on May 2023
  18. memory - Running Windows with defective RAM - Super User - last accessed on May 2023
  19. Stock Intel CPU Cooler came loose? : pcmasterrace - Reddit - last accessed on May 2023
  20. The right way to stress-test an overclocked PC « Trying To Be Helpful - last accessed on 9 June 2023