- To save bank switching, group variables that are used together into the same bank.
- In initialization code at program startup, watch the order of initialization: first all variables in bank0, then bank1, then bank2, then bank3.
- Some variables may not need initialization at all.
- Where possible, reorder operations so the compiler can avoid redundant loads of the W register or temporary locations.
- Use variables from the same bank in arithmetic expressions to avoid bank switching.
- Where possible, use byte arithmetic instead of word arithmetic.
- Where possible, use pointers to array elements instead of indexes. Note, however, that in small loops that manipulate pointers, the loop overhead cancels out the saving from using pointers, so the two are about equivalent.
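As a rough illustration of the pointer and byte-arithmetic tips, here is a minimal sketch in portable C. The function names `sum_indexed` and `sum_pointer` are hypothetical and do not come from any compiler library. Whether the pointer form is actually smaller depends on the compiler and on the size of the loop, so compare the generated listings before committing to one form.

```c
/* Hypothetical example: sum a byte buffer two ways. */

/* Indexed form: the compiler may recompute the address buf + i on each pass. */
unsigned char sum_indexed(const unsigned char *buf, unsigned char len)
{
    unsigned char i;
    unsigned char sum = 0;      /* byte arithmetic: the sum wraps modulo 256 */

    for (i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}

/* Pointer form: walk the array with a pointer and count len down to zero. */
unsigned char sum_pointer(const unsigned char *buf, unsigned char len)
{
    unsigned char sum = 0;

    for (; len != 0; len--)
        sum += *buf++;
    return sum;
}
```

Both functions return the same value for the same input, so either can be swapped in while measuring code size.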
A series of:
if
else if
else if ... often generates smaller code than the equivalent case statement.
In switch/case statements, change the constants to be sequential numbers, without gaps.
Depending on the bank switching required:
var = value1;
if (!flag)
    var = value2;
generates more efficient code than:
if (flag)
    var = value1;
else
    var = value2;
Just make sure that var won't be used in an interrupt while this code executes.
- Clearing, incrementing, and decrementing a byte are single-instruction operations. Assigning a value to a byte requires 2 instructions (value -> W, then W -> byte).
- Use bits instead of unsigned chars whenever possible. Bit sets, clears, and test-and-skips are all single instructions. Since you can't declare bits in a function, you may benefit from a globally declared bit.
- There is overhead to making function calls. Try replacing some of your smaller functions with macros.
- Large blocks of duplicated code should be replaced with a function and function calls, if stack space allows.
- Optimization of existing logic. I have yet to be given a project with non-changing requirements, so I try to write my code to be very flexible. As it gets closer to the end of the project, I find some of the flexibility isn't needed, and it can be removed to save code.
Thanks to Ivan Cenov [imc@okto7.com] and Michael Dipperstein [mdippers@harris.com].
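The tip about replacing small functions with macros can be sketched like this. The helper names `low_nibble_fn` and `LOW_NIBBLE` are made up for illustration, and the actual saving depends on how often the helper is used, since a macro is expanded at every use.

```c
/* Hypothetical helper written as a function: every use costs a call and a return. */
unsigned char low_nibble_fn(unsigned char x)
{
    return (unsigned char)(x & 0x0F);
}

/* The same helper as a macro: expanded inline, so there is no call overhead.
   The argument is parenthesized so that expressions such as a + b expand safely. */
#define LOW_NIBBLE(x) ((unsigned char)((x) & 0x0F))
```

A macro that is used in many places can end up costing more code space than the function it replaced, so this fits best for very small helpers.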
Compare the assembly for signed and unsigned variables, and you will find that there are a few more instructions for doing comparisons on signed variables.
Conclusion 1:
Use unsigned integers and/or chars if possible.
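To see the difference for yourself, a pair of functions like the following can be compiled and their listings compared. This is only a sketch: the function names are hypothetical, and the exact instruction counts depend on the compiler version and options.

```c
/* Hypothetical range checks, identical except for the operand types. */

/* Signed compare: the compiler has to take the sign bit into account. */
unsigned char below_limit_signed(signed char value, signed char limit)
{
    return value < limit;
}

/* Unsigned compare: typically a subtract followed by a carry test. */
unsigned char below_limit_unsigned(unsigned char value, unsigned char limit)
{
    return value < limit;
}
```

Note that the two are not interchangeable for negative inputs: the bit pattern for -1 is 255 when read as an unsigned char.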
Optimization Tip 2: Byte Loops
OK, here are two pieces of code that do exactly the same thing. Yet one of them finishes about 25% faster and uses less memory! Can you pick which one?
unsigned char i;
for(i=0;i<250;i++) do_func(); //executes do_func() 250 times, in 3.25ms
for(i=250;i!=0;i--) do_func(); //executes do_func() 250 times, in 2.5ms
To figure this out, have a look at the assembly produced.
for(i=0;i<250;i++) do_func(); //executes 250 times in 3251 cycles
1617 01B8 clrf 0x38
1618 260F call 0x60F
1619 0AB8 incf 0x38
161A 3008 movlw 0xFA
161B 0238 subwf 0x38,W
161C 1C03 btfss 0x3,0x0
161D 2E18 goto 0x618

for(i=250;i!=0;i--) do_func(); //executes 250 times in 2502 cycles
1621 3008 movlw 0xFA
1622 00B8 movwf 0x38
1623 260F call 0x60F
1624 0BB8 decfsz 0x38
1625 2E23 goto 0x623
Conclusion 2:
Have your loops decrement to zero if possible. It's fast to check a RAM variable against zero.
However, note that in the incrementing loop, do_func() was called one cycle earlier. If you want speed of entry, choose the incrementing loop.
Optimization Tip 3: Integer Timeout Loops
If you want to poll a port, or execute a function a certain number of times before timing out, you need a timeout loop.
unsigned int timeout;
#define hibyte(x) ((unsigned char)((x)>>8))
#define lobyte(x) ((unsigned char)((x)&0xff))
//the optimizer takes care of using the correct hi/lo byte of the integer
- Loops to avoid with timeouts: 320000 to 380000 cycles for 20000 iterations.
for(timeout=0;timeout<20000;timeout++) do_func(); //380011 cycles
for(timeout=20000;timeout!=0;timeout--) do_func(); //320011 cycles
- Best loop for a timeout: 295000 cycles for 20000 iterations.
//we want to execute do_func() approx. 20000 times before timing out
timeout=(20000/0x100)*0x100; //keeps lobyte(timeout)==0, which speeds up assignments
for(;hibyte(timeout)!=0;timeout--) do_func(); //295704 cycles
Notice the features of the loop shown above.
1. It only tests the high byte of the integer each time around the loop.
2. It checks this byte against zero, very fast.
3. When initializing the variable timeout, it takes advantage of the fact that initializing a RAM variable to zero takes one assembly instruction, whereas assigning it a nonzero number takes two.
Conclusion 3:
- Have your loops decrement to zero if possible; it's easy to check a RAM variable against zero.
- Only test the high byte of an integer in a timeout loop; it's faster.
- When assigning integers, it's faster to assign zero to a RAM variable than a nonzero number.
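Putting the three points together, a polling routine with a timeout might look like the sketch below. The function name `wait_nonzero` and its pointer parameter are assumptions made so the example stays portable; on a real PIC you would more likely test a port bit directly, and the cycle counts quoted above do not apply to this version.

```c
#define hibyte(x) ((unsigned char)((x) >> 8))

/* Hypothetical poll: read *port until it is nonzero or the high byte of
   the counter reaches zero. Returns 1 on success, 0 on timeout. */
unsigned char wait_nonzero(volatile const unsigned char *port)
{
    /* 20000 rounded down to a multiple of 0x100, so the low byte starts at zero. */
    unsigned int timeout = (20000 / 0x100) * 0x100;

    for (; hibyte(timeout) != 0; timeout--) {
        if (*port != 0)
            return 1;   /* port went nonzero before the timeout */
    }
    return 0;           /* high byte reached zero: timed out */
}
```

Because the loop stops as soon as the high byte is zero, it runs slightly fewer than 20000 times, which is acceptable for an approximate timeout.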
Of course, the fastest form of timeout is to use the built-in PIC timers, and check for an interrupt. This is typically 70% faster than using your own timeout loops.
//set up tmr0 to set flag T0IF high when it rolls over
while(RA0==0 && !T0IF); //wait until port goes high
Conclusion 4:
Use the built-in timers and/or interrupt flags whenever possible.
Optimization Tip 5: Case statements
Slow and inefficient:
c=getch();
switch(c)
{
    case 'A': { do something; break; }
    case 'H': { do something; break; }
    case 'Z': { do something; break; }
}
Fast and efficient:
c=getch();
switch(c)
{
    case 0: { do something; break; }
    case 1: { do something; break; }
    case 2: { do something; break; }
}
Conclusion 5:
- Use sequential numbers in case statements whenever possible.
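One way to keep case values sequential without losing readable names is an enum. The sketch below is an assumption-laden illustration: the command names and the return codes are invented, and whether the compiler turns the switch into a compact jump table depends on the compiler.

```c
/* Hypothetical command codes, numbered 0, 1, 2 with no gaps. */
enum command { CMD_START = 0, CMD_HALT = 1, CMD_ZERO = 2 };

/* Returns a made-up result code for each command, or 0xFF for unknown values. */
unsigned char dispatch(unsigned char cmd)
{
    switch (cmd) {
    case CMD_START: return 10;
    case CMD_HALT:  return 20;
    case CMD_ZERO:  return 30;
    default:        return 0xFF;
    }
}
```

If the values arrive as characters such as 'A', 'H', and 'Z', they need to be translated to these codes at the point where they are received.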
If you use Hi-Tech C, and there is any mathematical division at all in the entire program, this uses up between 13 and 23 bytes in bank0, and some EPROM/flash.
This occurs even if the variables used are not in bank0.
Occurrence | RAM usage | ROM/flash usage | Fix/Explanation |
Any mathematical division at all in the entire program using a variable of type 'long', even if none of the variables reside in bank0. | 23 bytes in bank0 | large; it has to include ldiv routines | Use combinations of bit shifts, e.g. x=x*6 is replaced by x1=x;x2=x;x=(x1<<2)+(x2<<1) |
Any mathematical division at all in the entire program using a variable of type 'unsigned int', even if none of the variables reside in bank0. | 13 bytes in bank0 | large; it has to include ldiv routines | Use combinations of bit shifts |
Any mathematical division involving a divisor that is a power of 2, e.g. x=x/64; | - | low | Use combinations of bit shifts |
Any mathematical division involving a divisor that is not a power of 2, e.g. x=x/65; | - | high | Make your divisors a power of 2, e.g. 2^5=32. |
Conclusion 6:
If necessary, make it easy on the C compiler and use bit shifts, and divisors that are a power of 2. Divisors that are a power of 2, such as 256=2^8, can be optimized into a bit shift by the C compiler.
If you don't use any division at all in the program, you will save 23 bytes in bank0 and the ROM taken up by the ldiv() routines.
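The shift replacements can be sketched as below. The function names `times6` and `div64` are hypothetical. The parentheses around each shift matter: in C, + binds more tightly than <<, so an unparenthesized x<<2 + x<<1 does not compute x*6. The right-shift form is only a safe substitute for division when the operand is unsigned.

```c
/* x * 6 computed as x*4 + x*2 using shifts. */
unsigned int times6(unsigned int x)
{
    return (x << 2) + (x << 1);
}

/* x / 64 for unsigned x: 64 = 2^6, so shift right by 6. */
unsigned int div64(unsigned int x)
{
    return x >> 6;
}
```

For example, times6(7) gives 42 and div64(640) gives 10.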