Re: Hey, what is all this 'off topic' posting?



On Sun, 04 Sep 2005 23:06:31 +0100, Pooh Bear
<rabbitsfriendsandrelations@xxxxxxxxxxx> wrote:

>John Fields wrote:
>
>> >I sometimes use the 'show code' function with PL/M to see what asm has been
>> >generated by the compiler.
>> >
>> >Invariably it produces code that's very efficient and I don't have to worry
>> >about it.
>> >
>> >Best example I've come across comes to mind. 8031 with 12MHz crystal.
>> >Interrupt service routine needs to compare 2 x *16* bit numbers and return
>> >a conditional result.
>> >
>> >It took ~ 70 us including the vectoring in and out overhead.
>> >
>> >Even better. I deliberately sent interrupts at greater than the interrupt
>> >service routine's period. You might expect a 'lock-up'. Not so. Reduce the
>> >interrupt rate and normal service is restored !
>> >
>> >Now beat that hand coding in asm !
>>
>> ---
>> OK.
>>
>> ;Motorola 68HSC705C8A:
>> ;*****allocate RAM:
>>
>> org $XXXX ;start of RAM
>> lobytea ds1 ;storage for low byte of word a
>> hibytea ds1 ;storage for high byte of word a
>> lobyteb ds1 ;storage for low byte of word b
>> hibyteb ds1 ;storage high byte of word b
>> temp ds1 ;temp register
>> status ds1 ;status register
>>
>> ;*****Here we go!
>>
>> start: code
>> .
>> .
>> swi 10 ;goto interrupt routine
>> .
>> .
>> code
>> .
>> .
>> end
>>
>> ;*****Done.
>>
>>
>> ;*****ISR:
>>
>> cmplo:  lda   lobytea    3   ;get low byte of word a
>>         sta   temp       4   ;store it
>>         lda   lobyteb    3   ;get low byte of word b
>>         cmp   temp       3   ;compare it with what's in temp
>>         beq   cmphi      3   ;branch if they're equal
>>         bclr  0,status   5   ;if they're not, clear status bit 0
>>         bra   out        3   ;and return
>> cmphi:  bset  0,status   5   ;set status bit 0 (lobytea = lobyteb)
>>         lda   hibytea    3   ;get high byte of word a
>>         sta   temp       4   ;store it
>>         lda   hibyteb    3   ;get high byte of word b
>>         cmp   temp       3   ;compare it with what's in temp
>>         beq   out        3   ;return if they're equal
>>         bclr  0,status   5   ;else clear status bit 0
>> out:    rti              9   ;and return
>>
>>                          ^
>>                          |
>>              internal processor cycles
>>
>> On rti, bit 0 in the status register will be set if
>> word a = word b, otherwise it'll be cleared.
>>
>> Assuming an external interrupt takes as long as a software interrupt
>> does to set up everything before the ISR and to clear it all up
>> after, that's 69 internal cycles and, using a 4MHz clock, that comes
>> to a total of 34.5 microseconds since Freescale/Motorola divides the
>> external clock by two to get the internal clock.
>
>The number of internal cycles actually sounds similar to the result I got from
>the PL/M51 compiler.
>
>The standard 8051 isn't as fast wrt the crystal freq though.
>
>I inspected the code the compiler produced and concluded it couldn't be hand
>optimised any.

---
Well, that doesn't necessarily mean it _couldn't_; it just means you
couldn't figure out how to tighten it up. ;)
---

>
>So, it seems you're saying you can't outrun my HLL example, simply equal it.

---
If you want to play that way I can go to an old 705J1A with a 2 MHz
clock and get 69µs, so I'd say I was outrunning you about 6:1 in
terms of clock speed.
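
The arithmetic behind those figures, as a minimal Python sketch. It
assumes the divide-by-two from external to internal clock described
above; the function name and the divider parameter are made up for
illustration.

```python
def isr_time_us(cycles, osc_mhz, divider=2):
    """Execution time in microseconds for `cycles` internal cycles.

    Assumes the internal clock is the oscillator frequency divided by
    `divider` (2 for the Motorola parts discussed in the post).
    """
    return cycles * divider / osc_mhz


# 69 internal cycles with a 4 MHz clock, then with a 2 MHz clock.
print(isr_time_us(69, 4.0))
print(isr_time_us(69, 2.0))

# Ratio of the 12 MHz 8031 crystal to the 2 MHz clock.
print(12.0 / 2.0)
```

The first two lines print 34.5 and 69.0 microseconds, and the last
gives the 6:1 clock ratio.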
---

>Your choice of CPU is quicker compared to my 1989 example however. Of course
>8051 family parts are now routinely available with 33 and 40 MHz clock options -
>not to mention a faster instruction cycle time vs clock speed.

---
And you think everybody else has been standing still?
---

>It would be easy to do the same task now in around 12-15us.

---
That wasn't the point.

I showed you how to do it with 16 instructions (counting the vector
into the ISR as an instruction) in 69 cycles with handwritten code.
Now you show me the code your compiler came up with and the number
of cycles it takes to execute it, OK?

Actually, when the low bytes don't match, the code can be tightened
up to:

cmplo:  lda   lobytea    3   ;get low byte of word a
        sta   temp       4   ;store it
        lda   lobyteb    3   ;get low byte of word b
        cmp   temp       3   ;compare it with what's in temp
        beq   cmphi      3   ;branch if they're equal
        bclr  0,status   5   ;if they're not, clear status bit 0
        rti              9   ;and return
cmphi:  bset  0,status   5   ;set status bit 0 (lobytea = lobyteb)
        lda   hibytea    3   ;get high byte of word a
        sta   temp       4   ;store it
        lda   hibyteb    3   ;get high byte of word b
        cmp   temp       3   ;compare it with what's in temp
        beq   out        3   ;return if they're equal
        bclr  0,status   5   ;else clear status bit 0
out:    rti              9   ;and return

which saves three cycles (from 40 to 37) if there's not a valid
compare on the low byte. A good compare on the high byte stays at
64 cycles, and a non-compare stays at 69 cycles.
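
As a cross-check, the per-instruction counts in the listings can be
summed along each execution path. This is a rough Python sketch, not
output from any toolchain: it assumes the cycle counts exactly as
listed and treats the 10-cycle swi as the entry cost, as the post
does. All names are made up for illustration.

```python
SWI = 10  # assumed entry cost: external interrupt taken to equal swi

# (mnemonic, cycles) pairs copied from the listings above.
LOW = [("lda", 3), ("sta", 4), ("lda", 3), ("cmp", 3), ("beq", 3)]
HIGH = [("bset", 5), ("lda", 3), ("sta", 4), ("lda", 3),
        ("cmp", 3), ("beq", 3)]
BCLR, BRA, RTI = ("bclr", 5), ("bra", 3), ("rti", 9)


def cycles(path):
    """Total cycles for one pass through `path`, including entry."""
    return SWI + sum(count for _, count in path)


paths = {
    "low bytes differ, original (bra out)": LOW + [BCLR, BRA, RTI],
    "low bytes differ, tightened (rti)": LOW + [BCLR, RTI],
    "words equal": LOW + HIGH + [RTI],
    "high bytes differ": LOW + HIGH + [BCLR, RTI],
}

for name, path in paths.items():
    print(f"{name}: {cycles(path)} cycles")

# Every listed instruction of the original ISR added up once,
# whether or not a single pass can execute all of them.
straight_sum = cycles(LOW + [BCLR, BRA] + HIGH + [BCLR, RTI])
print(f"straight sum of the listing: {straight_sum} cycles")
```

With the listed counts this gives 43 and 40 cycles for a low-byte
mismatch before and after tightening, so the saving of three cycles
holds. It also gives 56 cycles when the words match and 61 when only
the high bytes differ. The straight sum of every listed instruction
plus the swi is 69, so the per-path totals come out lower than the
figures quoted above.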

--
John Fields
Professional Circuit Designer