Code Crashing...?

The other day when we were driving our robot during practice the robot got stuck in a loop and then the controls wouldn’t work anymore.

I’ve heard that it could happen because the code may be too complex, but then I look at 1103’s code and I doubt that my robot’s code is too complex.

I use EasyC V4, we have a NC-2 brain, and both the joysticks and cortex are updated to the newest master code.

I can post my code if necessary, I just want to get to the bottom of this.

Please help
Thanks

We can’t get anywhere without seeing your code, so zip the two files together and post them here.

How did you diagnose that the code was stuck in a loop?
There are many other reasons for the controls to not work anymore.

If it is an intentional loop you could program in a time out.

Here is my program I am currently using.

When we are driving the robot, every now and then it will randomly stop receiving controls from the controllers and it will be stuck doing the last thing we told it to with the joysticks.

Also when it happens we look at the LEDs on the controllers and the brain and all three lights are green.
569C Worlds Program.zip (90.3 KB)

Stuck in what type of loop? Why do you draw this conclusion? Has this code run successfully before? Sometimes if code exhibits weird behavior one time only you have to (reluctantly) move on unless you can find a way of duplicating the problem.

I don’t think that complex code would necessarily crash the cortex, but code that used too many resources (such as too much memory) could possibly do that. One common problem is an illegal access to an array that corrupts another variable.

What is “the newest”? Is that version 3.21?

Probably the only way we can help.

Code that tests a flag or waits for a sensor to reach a certain value should always have some type of timeout. This caught out one of our teams last weekend when the robot arm got caught on the chassis as it was trying to raise, the code was waiting for a potentiometer to reach a particular point but, as the arm was not moving, the PTC tripped. Had the code understood that the arm normally only takes two seconds to reach that point it could have timed out and saved the motors until the driver was able to perhaps shake it loose.
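The timeout idea above can be sketched in plain C. This is only a minimal sketch under stated assumptions: `WaitForSensorOrTimeout`, the two function-pointer types, and the simulated arm and clock are hypothetical names made up for illustration, not EasyC functions. On a real robot the two hooks would wrap whatever sensor-read and timer calls your environment provides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical hooks (not EasyC API): one reads the sensor, one reads a
   millisecond clock. */
typedef int (*ReadSensorFn)(void);
typedef unsigned long (*ReadMillisFn)(void);

/* Wait until the sensor reaches `target`, but give up after `timeout_ms`.
   Returns true if the target was reached, false if the wait timed out. */
bool WaitForSensorOrTimeout(ReadSensorFn read_sensor, ReadMillisFn read_ms,
                            int target, unsigned long timeout_ms)
{
    assert(read_sensor != NULL && read_ms != NULL);
    unsigned long start = read_ms();
    while (read_sensor() < target) {
        if (read_ms() - start >= timeout_ms) {
            return false;   /* arm is probably stuck: let the caller stop the motors */
        }
    }
    return true;
}

/* Simulated stand-ins so the sketch can be tried on a PC. */
unsigned long g_fake_ms = 0;
unsigned long FakeMillis(void) { g_fake_ms += 10; return g_fake_ms; }  /* 10 ms per call */
int StuckArm(void) { return 100; }                                     /* never moves */
int g_fake_pos = 0;
int MovingArm(void) { g_fake_pos += 50; return g_fake_pos; }           /* rises steadily */
```

When the function returns false, the caller would stop the arm motors instead of continuing to drive them against the obstruction.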

Well, having looked at it for just a few minutes, I don’t see anything obvious. I will run it on a cortex tonight, but in the meantime, as I see you have an LCD display, I suggest that you use it to help in debugging. You first want to find out if the code is in fact crashing, so perhaps put a counter in your driver_control while loop and display that on the LCD as a means of checking that the loop is running. You could perhaps also put up the value of a couple of joystick controls to see if you are receiving them.

To confirm, are you using EasyC 4.0.4.6?

Ok thank you very much.

I just checked and I am in fact using the 4.0.4.6 version of EasyC V4.

I will try the timer like you said.

Hopefully we can find the root of this problem.

Yes, just declare a variable such as

int MyCounter = 0;

then in the while loop

while(1)
{
    MyCounter++;
    SetLCDText ( 1 , 1 , "%d" , MyCounter ) ;

    // etc. (the rest of your driver control code)
}

or something like that. It doesn’t matter that it will count very fast and overflow; you just want to see that this loop continues to run when you lose control. Debugging can be slow, but the idea is to gradually find out where the code stops (or not).

Are you sure you are not just losing VEXnet connection? Do you have the competition switch connected? When did you upgrade to 4.0.4.6? Was the code working before that?

The good news is I can duplicate your problem. After a random amount of time the code crashes and control is lost; it has done this to me several times. I’ve also seen the program reset itself after a crash.

The other good news is that as far as I can tell your code is good, there is no obvious logical error that I can find.

The bad news is that I don’t know what is causing this to happen yet and EasyC is a pain to debug for this type of problem.

I’m going to try a few ideas and then revert back to 4.0.2.7.

Ok. Thank you very much!

I can’t even express how grateful my team is.

Please let me know if you uncover anything else.

Are you going to Only revert the Master Code??? Or the Whole Tool Chain??

[EDIT]
A quick Binary File Check shows the Default Firmware Files for 4.0.2.7, 4.0.2.8 and 4.0.4.6 are Bit-wise Identical…

But the easyCRuntimeLib.lib is different between each version… Also v4.1.0.1 is available, most likely with another version of the easyCRuntimeLib.lib.
[/EDIT]

Mark

I’m leaving master firmware at 3.20 and using 4.0.2.7, 4.0.4.6 and 4.1.0.1. I run each version in a virtual machine for ease. I’m about to post a reply and a proposed fix for Dpbailey. As far as I can tell it may be a bug due to excessive serial communication to the LCD. Give me a few minutes to compose it because it’s going to be lengthy.

OK…

I can’t change my Master Code, because the VEXnet Upgrade Utility says I have a Programming Board Found… It might be because of the Prolific Adapter for my computer that is Not from Vex…

[EDIT]
Hmm… It was my Generic Prolific… Dumb Detection Circuit!!

Hmm… The IFI/Intelitek Loader says my ( Cortex ) Master Code is 2.6 ( Update to 3.21 )… Another “Fly in the Ointment”…

[/EDIT]

Debugging any type of code can be very time consuming, and debugging EasyC can be particularly hard as there is no breakpoint capability or anything else. To give you some idea of the process, let me describe how I went about looking at your problem and the possible cause and workaround. This is not going to be great English, I’m just going to bash it out so it’s done today.

So I started by adding the code I had suggested earlier: a variable to count each iteration of the while loop in your driver control function and a call to display that variable on the LCD. Having done this, I noticed that you are writing to the LCD every time around the loop, usually just “569C - Cthulu!!!” but always some type of information. The code crashed after a few minutes, and by restarting the code several times I was able to duplicate this.

So my next thought was “is the code actually crashing or is it just the LCD display that stops working?” I added code to flash an LED on digital output 12 each time around the while loop and again ran the code. When the code crashes, the LED also stops flashing, so it’s a good assumption that it is indeed the code rather than the LCD.

Next step: remove the SetLCDText calls and see if we still crash. The first thing to notice is that the loop now runs considerably faster, so the LED cannot be seen flashing; slow down the LED and re-run. After 20 minutes I concluded the code will probably not crash, so it may well be the excessive serial communication to the LCD.
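Slowing down the LED, as mentioned in the walkthrough above, could be done by toggling it only once every so many loop iterations. The helper below is a hypothetical sketch (`HeartbeatLedState` and the divider value are assumptions for illustration, not the code that was actually used in the test):

```c
#include <assert.h>

/* Returns the LED state (0 or 1) for a given loop count. The state flips
   once every `divider` iterations, so a very fast loop still produces a
   blink slow enough to see. */
int HeartbeatLedState(unsigned int loop_counter, unsigned int divider)
{
    assert(divider > 0);
    return (int)((loop_counter / divider) & 1u);
}
```

Inside the driver loop this might be used as `SetDigitalOutput ( 12 , HeartbeatLedState ( LoopCounter , 1000 ) ) ;`, with the divider tuned by eye until the blink is visible.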

Next test was to repeat all these tests in an older version of EasyC, 4.0.2.7. I ran the same code as before, monitoring the loop counter, and waited 20 minutes: no crash. So one solution may be to revert to the older version of EasyC; however, there is a newer version out, so let’s try that first.

Test everything again under 4.1.0.1, things look promising but then after 10-15 minutes, same problem. It is better but the underlying problem is still there.

Final test: try to reproduce this with some simple code we can send to Intelitek. I created the following small test program.

#include "Main.h"

void OperatorControl ( unsigned long ulTime )
{
      int LoopCounter = 0; 

      InitLCD ( 1 ) ;
      SetLCDLight ( 1 , 1 ) ;
      while ( 1 ) // Insert Your RC Code Below
      {
            SetLCDText ( 1 , 1 , "%d" , LoopCounter++ ) ;
            SetDigitalOutput ( 12 , LoopCounter & 1 ) ;
      }
}

And voila, code crashes after 10 minutes.

So how can we fix this? Well, one simple solution is to omit all calls to SetLCDText; however, that removes significant functionality from the program. The better alternative (although still not 100% guaranteed to solve it) would be to only send new text to the LCD when needed and not continuously. To achieve this I wrote a small wrapper function to save the current text sent to each line and compare new text to that before sending to the LCD. Here is the user function; it’s modeled after the SetLCDText function so should be a drop-in replacement.

// LcdBuffered.c : implementation file
#include "Main.h"
#include <stdio.h>
#include <string.h>
#include <stdarg.h>

unsigned char
SetLCDBuffered(unsigned char ucPort, unsigned char nLine, const char *szMsg, ...)
{
    va_list args;
    char    str[20];
    // last text sent to each line; shared by all ports, so this
    // assumes a single LCD
    static  char LcdData[2][20];
    unsigned char ret = 0;

    // bounds check line
    if((nLine < 1) || (nLine > 2))
        return(0);

    // create the formatted output string, truncated to fit the buffer
    // so that a long message cannot overrun str
    va_start(args, szMsg);
    vsnprintf(str, sizeof(str), szMsg, args );
    va_end(args);

    // See if we sent this already
    if( strcmp( LcdData[nLine-1], str ) != 0 )
        {
        // save
        strcpy( LcdData[nLine-1], str );
        // send to LCD (note: str is passed as the format string, so
        // text containing a '%' would be interpreted again)
        ret = SetLCDText(ucPort, nLine, str);
        }

    return(ret);
}

New text will not be sent to the LCD until it changes from what was sent before.

I will attach the whole project for you, but to summarize:

I believe versions 4.0.4.6 and later have introduced a bug in the serial transmit code that can cause the cortex to crash. Lots of communication with the LCD can cause this bug to happen. We need to report this possible problem (as I cannot definitively prove it without lots more testing) so they can implement a fix.

Here is a revised copy of your code with calls to SetLCDText replaced with calls to SetLCDBuffered.

Enjoy
569C Worlds Program revised.zip (93.6 KB)

I finally got my Cortex to Talk to my computer, Updated the Master Code to 3.21 and I believe, I Crashed Dpbailey’s Code, because I could not get an Init of LCD on UART 1, but I did on UART2.
( I thought maybe the Cortex had bad Hardware, but after Compiling and Downloading the “UART 1 & 2 LCD” Sample, both UARTS worked as programmed )

Implementing jpearman’s simple test, I got a Crash at about Loop 8100…

I then changed the UART from Port 1 to Port 2, Recompiled, Downloaded and Executed … I then got a crash at 4162.

I then changed the UART from Port 2, back to Port 1, Recompiled, Downloaded and Executed … I then got a crash at 46338, after 25 Minutes.

I then changed the UART from Port 1, back again to Port 2, Recompiled, Downloaded and Executed … I then got a crash at 29422, after 15 Minutes.

All the Above was done with the EasyC v4.0.4.6 Compiler, Master Code 3.21 on a NC-1, Modded to NC-2 Cortex, with 2000mAh 7.2VDC battery and the Backup Battery, connected to my Computer’s USB Port with One Vex LCD Screen, and a Red/Green LED in Digital Output #12.
There were No Power Downs between Tests…

Time for Bed…

Thank you guys SOOOOOO much!!!

I would have never been able to diagnose this problem by myself.

Thanks for the educational walkthrough of debug methods.
You mention, “not 100% guaranteed”.
I’d assume that the proposed code also fails the update counter test.

If you are intending to leave 100% or incremental improvements as an exercise for the readers, I’ll suggest this brainstorm: something repeating timer something
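One possible reading of the repeating-timer brainstorm is to let the LCD update only at a fixed interval instead of on every pass through the loop. The sketch below is a guess at that idea, not a tested fix: `LcdRateLimiter` and `LcdUpdateDue` are hypothetical names, and the caller is assumed to supply the current time in milliseconds from whatever timer is available.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical rate limiter: remembers when the LCD was last updated. */
typedef struct {
    unsigned long interval_ms;  /* minimum time between LCD updates */
    unsigned long last_ms;      /* time of the most recent update */
    bool started;               /* false until the first update */
} LcdRateLimiter;

/* Returns true when enough time has passed that the LCD may be written
   again, and records `now_ms` as the time of that update. */
bool LcdUpdateDue(LcdRateLimiter *rl, unsigned long now_ms)
{
    assert(rl != NULL);
    if (!rl->started || (now_ms - rl->last_ms) >= rl->interval_ms) {
        rl->started = true;
        rl->last_ms = now_ms;
        return true;
    }
    return false;
}
```

With an interval of 100 ms, a counter display would be refreshed about ten times a second regardless of how fast the loop runs. Whether that is enough to avoid the crash would still have to be tested.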

Yes, it will also fail.

I’m not sure yet if there is a 100% fix that does not involve eliminating calls to SetLCDText. Also, although this routine is apparently the culprit, it may not be the root cause and may just cause some other underlying problem to be seen.

The runtime library in EasyC 4.1.0.1 is built using a newer toolchain ( GCC: Sourcery G++ Lite 2010.09-51 4.5.1 ) compared to 4.0.2.8 (GCC: Sourcery G++ Lite 2008q3-66 4.3.2) but both use the usart library from STMicroelectronics (stm32f10x_usart.c) so the problem shouldn’t be in that. I do notice the code in the new version is more optimized than the old one and, as far as I can tell, serial output is polled rather than using an interrupt. One theory I have is that there is a delay between each character in the older version ( I see a variable g_nLCDDelay ) that is not in the new one, perhaps I can test this tonight with the scope.

I would almost think a Timing Issue is the Culprit, rather than an Overflowed Value… Although, possibly the Depletion of Send Buffers could be an Issue: when the call to “SetLCDText ( 1 , 1 , "%d" , LoopCounter++ ) ;” is invoked, if there is No More Room in the Buffer, and without Adequate Error Checking, the Function hangs, rather than Returning a Failure…

There might be Four or Five Functions in the Call Chain, that are invoked from the SetLCDText Function Call. Any One could be the Culprit, but without Access to the Source Code, it is much more difficult to troubleshoot…
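To illustrate the difference between hanging and returning a failure, here is a purely hypothetical sketch of a polled transmit with a bounded wait. None of this is the EasyC runtime's code, whose source is not available; `SendWithBoundedWait`, the hook types, and the simulated transmitter are all made-up names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical hooks: one polls "transmitter ready", one writes a byte. */
typedef bool (*TxReadyFn)(void);
typedef void (*TxWriteFn)(char);

/* Sends up to `len` bytes, polling for "ready" before each one. Instead of
   spinning forever when the transmitter never becomes ready, it gives up
   after `max_polls` checks and returns the number of bytes actually sent. */
int SendWithBoundedWait(const char *data, int len,
                        TxReadyFn tx_ready, TxWriteFn tx_write,
                        unsigned long max_polls)
{
    assert(data != NULL && tx_ready != NULL && tx_write != NULL);
    for (int i = 0; i < len; i++) {
        unsigned long polls = 0;
        while (!tx_ready()) {
            if (++polls >= max_polls) {
                return i;   /* report failure instead of hanging */
            }
        }
        tx_write(data[i]);
    }
    return len;
}

/* Simulated transmitter so the sketch can be tried on a PC. */
int g_bytes_written = 0;
bool AlwaysReady(void) { return true; }
bool NeverReady(void) { return false; }
void CountingWrite(char c) { (void)c; g_bytes_written++; }
```

A caller that gets back fewer bytes than it asked for can skip that LCD update and carry on with the driver loop.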

EasyC 4.0.4.6 has the GCC: Sourcery G++ Lite 2010.09-51 4.5.1 for the BackEnd. I would assume that the Runtime Library is compiled in the same version…

Is the Runtime Library compiled in Sourcery G++ Lite 2010.09-51 4.5.1 performing Output as Polled, versus the Runtime Library compiled in Sourcery G++ Lite 2008q3-66 4.3.2 performing Output as Interrupt Driven?? Can you tell how the Runtime Library for EasyC 4.0.2.7 is compiled, since this one is known to work correctly??