Date: Tue, 23 Apr 2013 13:26:09 -0700
Thank you Steven and Todd,
atop is now installed and waiting for the next time it happens.
nvidia-smi reports the fan at 40% and the temperature at 33°C, but I do
have a 550 Ti sitting around, so I will swap it in to see if it makes a
difference.
liveCD is being downloaded.
Thank you again. I was running out of things to try.
Joe
On 04/23/2013 12:43 PM, Steven J. Yellin wrote:
> The atop service from epel logs processes into /var/log/atop
> files. You can run 'atop -r ...' interactively on the file being
> updated at the time the computer froze in order to see what was
> happening just before it happened.
>
> Steven Yellin
>
> On Tue, 23 Apr 2013, Joseph Areeda wrote:
>
>>
>> On 04/23/2013 11:44 AM, Joseph Areeda wrote:
>>> Greetings,
>>>
>>> I'm having this strange behavior that I think is a hardware problem
>>> I can't find.
>>>
>>> I can usually run for 4-8 hrs without a problem then all of a sudden
>>> I get one of the following:
>>>
>>> * System freezes, mouse and keyboard dead, sshd sometimes
>>> unresponsive
>>> * If the keyboard is alive, going to an open terminal gives one of
>>> the following errors, about equally probable:
>>> o input/output error
>>> o too many open files
>>> o bus error
>>> o maybe others that haven't happened for a while
>>>
>>> I've run memtest for 10 hrs, no problem. Fsck shows no problem, and
>>> disk utility shows that the disks with SMART are all fine.
>>>
>>> I have not found any particular program or operation that causes the
>>> failure.
>>>
>>> Any suggestions on how to find the cause?
>>>
>>> I'm just about ready to sacrifice a small animal as soon as I find
>>> the old gypsy woman who reads the entrails and tells me which part
>>> to replace.
>>>
>>> Thanks,
>>> Joe
>>>
>> Sorry about the typos in my first message. I wanted to add that
>> Einstein at Home runs both CPU and GPU jobs and they validate, so
>> those parts don't have any hard failures.
>>
>> And lm_sensors shows temperatures in the 30-50 °C range depending on
>> what's running.
>>
>> And the system has been running well for over a year so I don't think
>> it's a build problem.
>>
>> I'm looking for any way to test more.
>>
>> Joe
>>
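
Steven's suggestion above, replaying the atop log that was being written when the machine froze, could be sketched roughly as follows. This is only an illustration: the helper names latest_atop_log and replay_command are hypothetical, and the atop_* file name pattern is an assumption about how the atop service names its daily logs in /var/log/atop. Adjust it if your files are named differently.

```python
from pathlib import Path


def latest_atop_log(log_dir="/var/log/atop"):
    """Return the most recently modified atop log in log_dir, or None."""
    directory = Path(log_dir)
    if not directory.is_dir():
        return None
    # Assumption: daily logs are named atop_<date>; change the pattern if not.
    logs = [p for p in directory.glob("atop_*") if p.is_file()]
    # The file with the newest modification time is the one that was
    # being updated when the machine froze.
    return max(logs, key=lambda p: p.stat().st_mtime, default=None)


def replay_command(log_path):
    """Build the 'atop -r <file>' command that replays a log interactively."""
    return ["atop", "-r", str(log_path)]


if __name__ == "__main__":
    log = latest_atop_log()
    if log is None:
        print("no atop logs found")
    else:
        # Print the command rather than running it, since atop is interactive.
        print(" ".join(replay_command(log)))
```

Run after rebooting from a freeze, this prints the command to paste into a terminal. Inside the replay, 't' steps forward one sample and 'T' steps back, so the last samples in the file are the ones recorded just before the freeze.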