Posts: 6
Joined: Dec. 22, 2006
From: Richmond, VA USA
Status: offline
I am working with Microchip's FFT routines in C on a dsPIC30F3012. (Thankyou Microchip for that code - It would take a mere mortal like myself months to code that) What I am after is the power spectrum of the transformed data. I am using the SquareMagnitudeCplx function to accomplish (real^2 + imaj^2), but using the standard (floating-point) sqrt() function on the result is WAY too time-consuming. Even just calculating the power of a single bin is costly. I'v seen the Microchip PDF that describes an integer method of sqrt() http://ww1.microchip.com/downloads/en/AppNotes/91040a.pdf Does anybody have an algorithm like that coded in C for use on the dsPIC30F family? Maybe even an in-line assembler deal like SquareMagnitudeCplx?
If it's really a power spectrum you're after, then you don't need to take the square root. Could you describe in some more detail what you're trying to do?
Posts: 6
Joined: Dec. 22, 2006
From: Richmond, VA USA
Status: offline
It's an industrial application - I'm bringing in audio from a microphone/preamp/anti-aliasing filter, and sampling it at 32 ksp with the on-board 12bit ADC. I'm doing in-place fft on 128 sample blocks, and looking for several specific components to be x dB above the noise-floor. The issue is, I need to calculate the power of the noise floor to figure out that S/N ratio. I guess I could leave the (R^2 + I^2) squared and somehow compensate for the huge values.
It's an industrial application - I'm bringing in audio from a microphone/preamp/anti-aliasing filter, and sampling it at 32 ksp with the on-board 12bit ADC. I'm doing in-place fft on 128 sample blocks, and looking for several specific components to be x dB above the noise-floor. The issue is, I need to calculate the power of the noise floor to figure out that S/N ratio. I guess I could leave the (R^2 + I^2) squared and somehow compensate for the huge values.
So what you would really look for is the energy content of the useful signal in comparison with the energy content of the noise floor -- no square root here, just addding up the modulus squared of the respective bins. You just have to check whether the dB number for the required S/N ratio refers to noise power or noise amplitude.
Posts: 6
Joined: Dec. 22, 2006
From: Richmond, VA USA
Status: offline
After reading your post, I found several conflicting explainations of power-spectrum calculation on the web - some say sqrt(r^2 + j^2), and some say (r^2 + j^2). The bottom line is that I can calculate my S/N on amplitude, and it should work fine - thanks for the excellent advice. The part that hurts my feelings is that in converting a complex number from rectangular to polar, the magnitude is the squareroot of the sum of the squares. Are we getting around the squareroot because of the "squared" relationship that signal amplitude has to power?
Haven't seen that method before. And not sure I'd agree it was easy arithmetic: there are 5 multi-digit long divisions done to 3 decimal places. Done by hand? Must be! Or else would use the tables/sliderule/calculator to extract the square root directly.
Back in the 70s I learnt a different method for square roots from my mum, who must have learnt it in the 40s. Telephoned her to ask her to tell me again - I'd forgotten. And it's included in the pdf (strip off the .txt) for those weird, bent souls interested in mathematics
But when converted to binary it becomes quite easy - doable with only rotates, subtracts, compares; no multiply or divide needed. So could even do it on 12- and 14-bit cores.
Even more interesting is that when you code it up it's quite fast. I wrote it in assembly for 18f452 and it did a 32-bit unsigned int square root in about 520 - 550 cycles, which is almost twice as fast as the algorithm in the OP's link. And my code is quick (not so quick ) and dirty and could certainly be cleaned up and optimised some more.
Code and project included too, in case anyone is interested.
Not saying it is optimum but it certainly worked for me. I wrote mine in C and it compiled into 220 lines of assembly with the only call being to the divide function. Since most instructions execute in 1 cycle and 2 worst case, that puts it in the 220 cycle range. The two divides take 18 cycles ea plus a little overhead. This includes error checking, scaling up the intermediate results to avoid losing precision, and 2 passes of the Babylonian approx which gets the correct answer withing a few percent, if I remember correctly. On the dsPIC, divides and multiplies are fast.
I still think you get a better seed using the sqrt(x) ~ 2^(x_bits/2) method than the x = 3^D method shown as an alternate and it is computationally easier to do. The closer the seed is to the final answer, the faster it converges.
I will check out your method too, always looking for other ways to do things.
Your method is quicker... except perhaps on 12/14-bit cores with no easy way to divide. I learnt my method before calculators were invented, when you had to do longhand. I was really replying to yeaux that your method is difficult by hand because of the divides - unless you have some tricks there? And because I'd never seen that method but it is good.
When I tried a binary example of my way by hand it seemed it would be very easy and efficient to code, though it turned out to be slower and more cumbersome than expected. Probably it could be greatly improved - I'm a hobbyist, not a professional programmer; and usually in C not assembly. Chose the 18F for direct comparison with the OP's link, and because I've never used the 30F which seems a whole new language.
Rob- you are absolutely right that if divides are not readily available, the Babylonian approximation is not as easy by hand. I found references to this method on another forum and tried it myself. I was impressed how quickly the result converged to the right answer.
You should do a search on the forum for square root. There was a nice method proposed a few months back using the 30F multiplier and shifts which is determinate and much faster than the methods you are discussing.
If I rememeber correctly, you just shift to the left a bit one field at a time, multiply the word by itself, if it's bigger than X from SQRT(X) then make the bit 0, otherwise, keep the one bit. Keep shifting/multiplying for 16 bits. Should be much less than 100 cycles.
< Message edited by OccamMD -- Jan. 4, 2007 3:10:31 AM >
There was a nice method proposed a few months back using the 30F multiplier and shifts which is determinate and much faster than the methods you are discussing.
If I rememeber correctly, you just shift to the left a bit one field at a time, multiply the word by itself, if it's bigger than X from SQRT(X) then make the bit 0, otherwise, keep the one bit. Keep shifting/multiplying for 16 bits. Should be much less than 100 cycles.
The massmind has implementations of a variety of square root algorithms http://massmind.org/techref/microchip/math/sqrt/ . The Microchip app note TB040 for the PIC18Cxx2 (as mentioned by the original poster on this thread, zerodbu) and Scott Dattalo's code for the PIC18Cxx2 both look like they both use the square-and-test algorithm you suggested. The faster one ( Scott Dattalo's code for the PIC18Cxx2 ) finds the square root of a 16 bit integer in 108 cycles max. (There's another piece of code there that does a 32 bit integer square root in 439 cycles on the 17c43, which looks like it does digit-by-digit calculation without multiplies or divides, apparently the same algorithm that rmoline suggested).
If anyone writes a faster square root routine than Scott Dattalo's hand-optimized assembler, I will be impressed.
Returning to the original poster's question: "an algorithm like that coded in C for use on the dsPIC30F family?"
Is this what you wanted?
/* from Scott Dattalo */ /* http://www.dattalo.com/technical/theory/sqrt.html */ unsigned char sqrt(unsigned int N){ unsigned int x,j; x = 0; for(j= 1<<7; j<>0; j>>=1){ x = x + j; if( x*x > N) x = x - j; }; return(x); }
< Message edited by davidcary -- Jan. 4, 2007 9:08:30 AM >
You should do a search on the forum for square root. There was a nice method proposed a few months back using the 30F multiplier and shifts which is determinate and much faster than the methods you are discussing.
If I rememeber correctly, you just shift to the left a bit one field at a time, multiply the word by itself, if it's bigger than X from SQRT(X) then make the bit 0, otherwise, keep the one bit. Keep shifting/multiplying for 16 bits. Should be much less than 100 cycles.
sorry it took so long, finally got to it this AM. Tested out fine, please critique and see if we can speed it up, takes about 66 cycles right now for an unsigned 16 bit sqrt. This will interface with the C30 compiler via W0 passing, but you'll need to save w4 and w5 in this routine:
.include "p30f2010.inc"
.global _usqrt
; w0 has caller, returns sqrt ; w1 shifter ; w2 result ; w3 N/A ; w4:w5 scratch