How to debug Linux hang?



We are using a custom board based on the BeagleBone Black, running kernel version 3.12.

The system hangs during one of the init scripts (the one that brings up WiFi). The hang occurs after a random number of power cycles.

Nothing works during the hang. The system appears completely frozen and does not even respond to SysRq keys.

I assume the hang is in ISR code, which would explain why nothing else runs.

Unfortunately, when we enable 'Detect hung tasks' (DETECT_HUNG_TASK), we no longer see the issue. :(

The only thing that works is the watchdog: if it is enabled, the system reboots and recovers once the watchdog timer expires.

However, we want to find out where the issue is.

Any suggestions?

I thought of using softdog with a repair script to print some messages, but I assume an external interrupt has higher priority, so if its handler hangs, the softdog timer will never get a chance to run. Is that right?

The randomness of the bug makes it much more difficult to debug. :(

Any help is appreciated.


































  • Did you look carefully at the logs when you rebooted after it hung?

    – Julie Pelletier
    Jun 6 '16 at 6:23

  • Yes, we don't see anything there. :( We suspect the WiFi-over-SDIO driver, but we are not sure about it: when we don't load the WiFi driver module, we don't see the hang. However, sometimes we don't see it even when the module is enabled.

    – AnkurTank
    Jun 6 '16 at 6:32

  • @JuliePelletier Logs usually cannot be written after a hard hang, just as with a kernel panic.

    – user140866
    Jun 6 '16 at 6:46

  • Did you try redirecting the console to serial and booting the kernel with loglevel=7 and without quiet (if present)? Were any obscure messages coming from the kernel?

    – user140866
    Jun 6 '16 at 6:47

  • DETECT_HUNG_TASK is usually for userspace tasks that hang inside a system call. If the hang comes from kernel code (for example, a driver), it is useless.

    – user140866
    Jun 6 '16 at 6:52
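
The serial-console suggestion in the comments amounts to a kernel command line along the following lines. This is only a sketch: the console device name `ttyO0` and the baud rate `115200n8` are assumptions for a BeagleBone Black style board on a 3.x kernel and are not stated anywhere in this thread, so check them against your own bootloader environment.

```text
console=ttyO0,115200n8 loglevel=7
```

With `loglevel=7`, everything more severe than debug-level messages reaches the console. Any `quiet` already present in the boot arguments should be removed, since it suppresses most kernel messages on the console.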

















linux linux-kernel embedded














edited Jun 6 '16 at 7:00
AnkurTank

asked Jun 6 '16 at 6:16
AnkurTank






















1 Answer














Well, we read through the code, as suggested in the comments, and found the section of the patch where the system may enter an infinite loop (in an IRQ) and never come out of it.

However, when we put a printk in that IRQ function, the issue could no longer be reproduced. (A timing issue, you know!)

So finally my colleague tried the old-school method of toggling a GPIO, and it helped. Even that was difficult, as more than two GPIO toggle points would prevent the issue from reproducing.

Inside the function, he used the GPIO toggle as follows:

    func()
    {
        // set GPIO high
        some doubtful code ...
        ....
        // set GPIO low
    }

That is how he tracked down the problematic code. Its solution is available in linux-4.1; he has fixed it and is testing it.

@ShankarSM: If you are reading this, all credit goes to you for tracking it down :-)
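
For readers who want to see the idea in isolation, the marker technique can be imitated in user space. This is a minimal sketch and not the driver code from this answer: every name in it (`marker_set`, `suspect_section`, `instrumented_handler`, `SIM_POLL_BUDGET` and the two status functions) is hypothetical. In a real driver the marker write would be the platform's GPIO call, and a hang would simply never return; here a poll budget stands in for "forever" so that the demo terminates.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the real GPIO write (in a kernel driver this
 * would be a call such as gpio_set_value()). It only records the level so
 * the sketch can run in user space. */
static int marker_level = 0;

static void marker_set(int level)
{
    marker_level = level;
}

/* Simulation only: number of polls after which the loop is treated as hung.
 * A real hang never returns; this budget just lets the demo terminate. */
#define SIM_POLL_BUDGET 1000

/* Hypothetical suspect section: polls a status source until it is ready.
 * Returns true if the section completed, false if it "hung". */
static bool suspect_section(bool (*status_ready)(void))
{
    for (int polls = 0; polls < SIM_POLL_BUDGET; polls++) {
        if (status_ready())
            return true;
    }
    return false;
}

/* Bracket the suspect section with the marker pin, as in the answer. */
static bool instrumented_handler(bool (*status_ready)(void))
{
    marker_set(1);                      /* pin high: entering suspect code */
    if (!suspect_section(status_ready))
        return false;                   /* models the hang: pin stays high */
    marker_set(0);                      /* pin low: suspect code finished */
    return true;
}

/* Two hypothetical status sources for the demo. */
static int ready_after = 3;

static bool status_eventually_ready(void)
{
    return --ready_after <= 0;          /* ready on the third poll */
}

static bool status_never_ready(void)
{
    return false;                       /* models hardware that never answers */
}
```

Calling `instrumented_handler(status_eventually_ready)` leaves `marker_level` at 0, while `instrumented_handler(status_never_ready)` leaves it at 1. On real hardware that second case is what an oscilloscope or logic analyzer would show as a pin stuck high. A single register write disturbs timing far less than a printk, which fits the observation above that printk made the bug disappear.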






answered Jun 8 '16 at 10:53
AnkurTank


























